The authenticity and security of data are becoming more important. The exchange
of data over public infrastructure is increasing because of geographic distances
and (inter)national collaboration. It is therefore important to adopt technology
that is widely implemented, is easy to use and offers good security.
Just like in the real world, security of data within digital systems is largely based on trust. This trust is the cornerstone of digital signing and encryption. Without trust, all the math and technology behind digital signing and encryption is useless.
One of the largest applications of encryption is the exchange of data between a web browser and a web server. This has a huge installed base, mostly for commercial or financial websites. But usage for privacy and content protection on websites with other content is also increasing (just look at Google and Facebook). The same basic technology is also used for signing and encrypting emails, and it is based on certificates.
With everyday users depending so heavily on this technology, it is useful to understand the basic workings. We need to look at the basics of encryption first.
As a simple step towards digital encryption, think about moving all the letters in the alphabet a couple of positions. In encryption terms, the method of moving the letters within the alphabet is called the algorithm. The key is what controls the working of the algorithm, in this case how many positions the letters are moved.
Let's say A becomes C, B becomes D, etc. So we move all the letters 2 positions. The key being used is 2.
Using this technique the word "test" would become "vguv". Not something someone would easily understand, but easy to decode once you know the key (moving all the letters 2 positions back).
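The letter-shifting example above can be sketched in a few lines of Python. This is only an illustration of the idea: the function name shift_letters is made up for this sketch, and it handles lowercase letters only.

```python
import string

ALPHABET = string.ascii_lowercase

def shift_letters(text: str, key: int) -> str:
    """Move every lowercase letter `key` positions, wrapping around after 'z'."""
    shifted = []
    for ch in text:
        if ch in ALPHABET:
            position = (ALPHABET.index(ch) + key) % len(ALPHABET)
            shifted.append(ALPHABET[position])
        else:
            shifted.append(ch)  # anything that is not a lowercase letter is left alone
    return "".join(shifted)

encrypted = shift_letters("test", 2)      # encrypt with key 2
decrypted = shift_letters(encrypted, -2)  # decrypt by moving 2 positions back

print(encrypted)  # vguv
print(decrypted)  # test
```

Note that the same function and the same key are used in both directions; only the direction of the shift changes.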
There are of course far more advanced and complex functions than the one described above. And instead of just moving the letters a couple of positions, longer keys can be used. The more choices there are for the key, the more difficult it is for someone who doesn't have the key to decrypt a message. But the basic principle stays the same: you use the same function and the same key for both encrypting and decrypting. This is called symmetric encryption. One of the nice things about it is that, although the functions might be complicated, they are relatively simple for a computer. So encoding and decoding large amounts of data is relatively fast.
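To see why the number of possible keys matters, here is a small sketch that simply tries every key of the letter-shifting function. The helper names are hypothetical and only serve this illustration. With 25 useful keys an attacker is done almost instantly; a modern 128-bit key has 2**128 possible values, which cannot be tried one by one in any practical amount of time.

```python
import string

ALPHABET = string.ascii_lowercase

def shift_letters(text: str, key: int) -> str:
    """Move every lowercase letter `key` positions, wrapping around after 'z'."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + key) % 26] if ch in ALPHABET else ch
        for ch in text
    )

def try_all_keys(ciphertext: str) -> dict:
    """Return one candidate decryption for each of the 25 possible keys."""
    return {key: shift_letters(ciphertext, -key) for key in range(1, 26)}

candidates = try_all_keys("vguv")
print(len(candidates))  # 25
print(candidates[2])    # test

# Number of possible keys for a 128-bit symmetric key, for comparison.
print(2 ** 128)
```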
There are some difficulties with symmetric encryption.
In order for the other party to decode your message, he or she needs to know
the key. So how do you safely transfer the key to the other person?
Also, how can you make sure that no one else has the key? Anyone who knows the key can decode the messages.
And how far can you trust the other party to be careful with the key? They might
write it down and put it under their keyboard for others to find.
And what do you do when you want to exchange encrypted messages with more than one party? All of them need to know the key, and if you want to exclude one party from the secret communication you need to change the key for all remaining parties.
A way to solve some of these problems is asymmetric encryption, which uses a pair of related keys: a public key and a private key.
As the names imply, the public key is public and should be known to as many people as possible. The private key needs to remain private; no one should be able to use it except the person who owns the key. The nice thing about the separation of keys is that sending someone an encrypted message is quite simple. You only need to have their public key. You use their public key to encrypt the message. After encryption you can't read the message yourself; only the person who has the private key that belongs to the public key can read the message.
To make this possible there is some complicated math involved which we won't go into (look up RSA or ElGamal if you want to know more). It suffices to say that the function being used is more complex than moving the letters in the alphabet a couple of positions.
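Without going into the math, a toy sketch can still show the behaviour of a key pair. The code below uses textbook RSA with two tiny primes, chosen purely for illustration. Real keys use primes that are hundreds of digits long, add padding, and come from a vetted library, so treat this as a demonstration of the principle and nothing more. The function names are made up for this sketch.

```python
# Toy key pair built from two tiny primes (illustration only, not secure).
p, q = 61, 53
n = p * q                  # 3233, shared by both keys
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent: modular inverse of e (Python 3.8+)

public_key = (n, e)        # may be handed to anyone
private_key = (n, d)       # must stay with the owner

def encrypt(message: int, key: tuple) -> int:
    """Encrypt a number smaller than n with the public key."""
    modulus, exponent = key
    return pow(message, exponent, modulus)

def decrypt(ciphertext: int, key: tuple) -> int:
    """Decrypt with the matching private key."""
    modulus, exponent = key
    return pow(ciphertext, exponent, modulus)

ciphertext = encrypt(65, public_key)
print(ciphertext)                        # 2790
print(decrypt(ciphertext, private_key))  # 65
```

The sender only ever needs public_key. Turning 2790 back into 65 requires the private exponent d, which is never sent anywhere.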
Because of the complicated nature of the functions, it is quite a lot of work (even for a computer) to encode and decode messages this way. To make the best of this you can combine both technologies.
For example, parties A and B want to exchange data in a safe way. Both parties have the public key of their key pair publicly available. Communication would go something like this:
A: Retrieve the public key of party B. Maybe from a website or in a mail they received from party B before.
A: Generate a long and complicated key that can be used for symmetric encryption later on.
A: Make a message with the symmetric key as the content and encrypt it with B's public key. The encrypted message can now only be read by B; neither A nor anyone else can decrypt it.
A: Send the message to B.
B: Receive the message from A.
B: Use the private key to decode the message received from A. B now has the
content of the encrypted message from A. So both B and A now have the same key
that was generated by A to use for symmetric encryption.
After this initial exchange A and B can continue communication by using quick symmetric encryption, without other parties knowing the key.
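The exchange between A and B can be sketched as follows. Everything here is a stand-in chosen for illustration: B's key pair is textbook RSA built from two well-known Mersenne primes (easy to write down, not secure), and the symmetric cipher is a simple XOR with a SHA-256 based keystream rather than a real cipher such as AES. The names used are hypothetical.

```python
import hashlib
import secrets

# B's toy key pair (illustration only; these primes are public knowledge).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                               # B's public exponent
D = pow(E, -1, (P - 1) * (Q - 1))       # B's private exponent

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: the same call encrypts and decrypts."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

# A: generate a random symmetric key and encrypt it with B's public key (N, E).
symmetric_key = secrets.token_bytes(16)
wrapped_key = pow(int.from_bytes(symmetric_key, "big"), E, N)

# B: recover the symmetric key with the private key (N, D).
unwrapped_key = pow(wrapped_key, D, N).to_bytes(16, "big")
print(unwrapped_key == symmetric_key)  # True

# From here on both sides use the fast symmetric cipher.
message = b"meet me at noon"
ciphertext = keystream_xor(symmetric_key, message)   # A encrypts
plaintext = keystream_xor(unwrapped_key, ciphertext) # B decrypts
print(plaintext)  # b'meet me at noon'
```

Only the short symmetric key goes through the slow asymmetric step; the actual data is handled by the fast symmetric function.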
To sign a message or document, first a fingerprint is made of the data. This is done mathematically by a special function (known as a hash function).
A good hash function takes data of any size and generates a short, fixed-length sequence of data in return. The characteristic of the hash is that, although the data returned by the function is very short compared to the original, it is practically infeasible to find two different inputs that generate the same answer. Modifying only 1 character in the original document will generate a completely different output from the hash function.
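As a small illustration, the sketch below uses SHA-256 from Python's standard hashlib module as the hash function. The choice of SHA-256 and the example texts are assumptions made for this sketch; the principle is the same for other good hash functions.

```python
import hashlib

original = b"Pay 100 euro to Alice"
modified = b"Pay 900 euro to Alice"   # only one character differs
large = b"x" * 1_000_000              # a much larger input

fingerprint_original = hashlib.sha256(original).hexdigest()
fingerprint_modified = hashlib.sha256(modified).hexdigest()
fingerprint_large = hashlib.sha256(large).hexdigest()

# The fingerprint has the same length regardless of the input size.
print(len(fingerprint_original), len(fingerprint_large))  # 64 64

# Changing a single character gives a different fingerprint.
print(fingerprint_original == fingerprint_modified)  # False
```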
Now that the fingerprint of the data is known, the encryption part comes
into play. Instead of encrypting the fingerprint with the public key of the
receiver, the fingerprint is encrypted with the private key of the sender. This
makes the fingerprint only readable with the public key that is linked to the
private key of the sender.
The receiver can now use the public key of the sender to decrypt the fingerprint that was sent by the sender. The receiver then uses the same hash function to calculate the fingerprint of the data the sender has sent. If the fingerprints match, the data has not been modified. And because the public key of the sender was used to decrypt the fingerprint, the receiver is also sure the sender was the one who sent the data.
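Signing and verifying can be sketched with the same kind of toy key pair. The assumptions are stated loudly: this is textbook RSA over two well-known Mersenne primes, with the SHA-256 fingerprint reduced to fit the small modulus. Real signatures use much larger keys and a padding scheme from a vetted library. The function names are made up for this sketch.

```python
import hashlib

# Sender's toy key pair (illustration only, not secure).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                               # sender's public exponent
D = pow(E, -1, (P - 1) * (Q - 1))       # sender's private exponent

def fingerprint(data: bytes) -> int:
    """Hash the data and reduce the result so it fits the toy modulus."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") % N

def sign(data: bytes) -> int:
    """Sender: encrypt the fingerprint with the private key."""
    return pow(fingerprint(data), D, N)

def verify(data: bytes, signature: int) -> bool:
    """Receiver: decrypt with the public key and compare fingerprints."""
    return pow(signature, E, N) == fingerprint(data)

document = b"I agree to the terms."
signature = sign(document)

print(verify(document, signature))                       # True
print(verify(b"I agree to other terms.", signature))     # False
```

The receiver needs only the document, the signature and the sender's public key (N, E).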
Used correctly, the certificate provides the receiving party with a means to verify the sender is really who they say they are. And as the certificate includes the public key of the sender, it is a starting point to initiate an encrypted connection as we've seen before.
Usually this contains information about the company (for a website) or a person (for an email certificate). For a website the certificate also needs to contain the website name, so software can validate the certificate belongs to the website that presents it. The same goes for the email address in an email certificate. The resulting document is the certificate signing request. This request can now be presented to a certificate authority together with information on who you are.
Depending on the level of trust required, the certificate authority will perform a number of actions to make sure you really are the person or company named in the request. This can be a very simple action, like sending an email with a validation link that has to be opened in the web browser, or something as thorough as an in-person check of your passport. As you can understand, these actions take resources, so they cost money. The more trust you require, the more extensive the validation by the certificate authority needs to be, and the more money it costs. And, because things can change in the future, the certificate is valid for only a specific period of time. Usually a year, sometimes longer.
When the certificate authority has done all its checks, it will sign the certificate signing request with its private key, resulting in a certificate. This certificate, together with the private key of the requester, can now be used in a web server or email program.
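The issuing and checking of a certificate can be sketched in the same toy style. This is a simplified model, not the real certificate format: real certificates follow the X.509 standard and are handled by dedicated libraries. The field names, the example website name and the functions below are all hypothetical, and the certificate authority's key pair is textbook RSA over two well-known Mersenne primes.

```python
import hashlib
import json
from datetime import date

# Toy key pair of the certificate authority (illustration only, not secure).
CA_P, CA_Q = 2**61 - 1, 2**89 - 1
CA_N = CA_P * CA_Q
CA_E = 65537                                     # CA public exponent
CA_D = pow(CA_E, -1, (CA_P - 1) * (CA_Q - 1))    # CA private exponent

def fingerprint(fields: dict) -> int:
    """Hash the certificate fields in a fixed order."""
    encoded = json.dumps(fields, sort_keys=True).encode()
    return int.from_bytes(hashlib.sha256(encoded).digest(), "big") % CA_N

def issue_certificate(request: dict, not_before: date, not_after: date) -> dict:
    """CA: add a validity period and sign the request with the CA private key."""
    fields = dict(request,
                  not_before=not_before.isoformat(),
                  not_after=not_after.isoformat())
    return {"fields": fields, "signature": pow(fingerprint(fields), CA_D, CA_N)}

def is_valid(certificate: dict, website_name: str, today: date) -> bool:
    """Receiver: check the CA signature, the website name and the dates."""
    fields = certificate["fields"]
    signature_ok = pow(certificate["signature"], CA_E, CA_N) == fingerprint(fields)
    name_ok = fields["website_name"] == website_name
    date_ok = fields["not_before"] <= today.isoformat() <= fields["not_after"]
    return signature_ok and name_ok and date_ok

# The certificate signing request: who you are, plus your public key.
request = {"website_name": "www.example.org",
           "organisation": "Example Org",
           "public_key": [3233, 17]}

certificate = issue_certificate(request, date(2024, 1, 1), date(2024, 12, 31))

print(is_valid(certificate, "www.example.org", date(2024, 6, 1)))  # True
print(is_valid(certificate, "www.other.org", date(2024, 6, 1)))    # False
print(is_valid(certificate, "www.example.org", date(2025, 6, 1)))  # False
```

Anyone who trusts the certificate authority's public key can run the check, and with it trusts the public key inside the certificate.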