Certificate of trust

version 2013/02/03

The authenticity and security of data are becoming more important. The exchange of data over public infrastructure is increasing because of geographic distances and (inter)national collaboration. It is therefore important to adopt technology that is very widely implemented, is easy to use and offers good security.

Just like in the real world, security of data within digital systems is largely based on trust. This trust is the cornerstone of digital signing and encryption. Without trust, all the math and technology behind digital signing and encryption is useless.

One of the largest implementations of encryption is the exchange of data between a web browser and a web server. This has a huge installed base, mostly for commercial or financial websites. But its use for privacy and content protection on websites with other content is also increasing (just look at Google and Facebook). The same basic technology is also used for signing and encrypting emails, and it is based on certificates.

With everyday users depending so heavily on this technology, it is very useful to understand how it basically works. We need to look at the basics of encryption first.

Symmetric encryption

One of the most common ways to encrypt data is symmetric encryption. This relies on both parties knowing a secret. The secret can be something very basic. Maybe you remember an old trick from your childhood: writing with a blunt object on a wet piece of paper. Once the paper had dried you couldn't read the text; it only became readable when the paper was made wet again. So the secret both parties had to know was to make the paper wet. Of course, anyone who knew that secret would be able to read your message.

Coming closer to digital encryption, think about moving all the letters in the alphabet a couple of positions. In encryption, the actual procedure of moving the letters within the alphabet is called the algorithm. The key is what controls the working of the algorithm, or in this case how many positions the letters are moved.

Let's say A becomes C, B becomes D etc. So we move all the letters 2 positions. The key being used is 2.

Using this technique the word "test" would become "vguv". Not something someone would easily understand, but easy to decode once you know the key (moving all the letters 2 positions back).
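
The letter shift described above can be sketched in a few lines of Python. This is only an illustration of the idea; the function name shift_letters is made up for this example, and it only handles lowercase letters.

```python
import string

ALPHABET = string.ascii_lowercase

def shift_letters(text, key):
    """Move every lowercase letter `key` positions forward in the alphabet."""
    result = []
    for ch in text:
        if ch in ALPHABET:
            # Wrap around at the end of the alphabet (y -> a, z -> b for key 2).
            result.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            result.append(ch)  # leave spaces, digits etc. untouched
    return "".join(result)

encrypted = shift_letters("test", 2)
decrypted = shift_letters(encrypted, -2)  # same algorithm, key applied backwards
print(encrypted)  # vguv
print(decrypted)  # test
```

Note that the same function and the same key (2) are used in both directions, which is exactly what makes this a symmetric scheme.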

There are of course far more advanced and complex algorithms than the one described above. And instead of just moving the letters a couple of positions, longer keys can be used. The more choices there are for the key, the more difficult it will be for someone who hasn't got the key to decrypt a message. But the basic principle stays the same: you use the same algorithm and the same key for encrypting and decrypting. That is one of the nice things about symmetric encryption: although the algorithms might be complicated, they are relatively simple for a computer. So encoding and decoding large amounts of data is relatively fast.
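
To see why the number of possible keys matters, here is a small sketch that simply tries every key of the letter shift. The helper names are hypothetical, and the 128-bit key length at the end is an assumed example of a "longer key", not something prescribed by this article.

```python
import string

ALPHABET = string.ascii_lowercase

def shift_letters(text, key):
    """Shift lowercase letters by `key` positions, wrapping around the alphabet."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + key) % 26] if ch in ALPHABET else ch
        for ch in text
    )

def try_all_keys(ciphertext):
    """Return every possible decryption of a shifted message, indexed by key."""
    return {key: shift_letters(ciphertext, -key) for key in range(1, 26)}

candidates = try_all_keys("vguv")
print(len(candidates))  # 25 keys, easy to try them all
print(candidates[2])    # test

# For comparison: the number of possible keys when the key is 128 bits long.
print(2 ** 128)
```

With only 25 possible keys an attacker just reads through the list until something sensible appears. With a long key that approach stops being practical.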

There are some difficulties with symmetric encryption.
In order for the other party to decode your message, he or she needs to know the key. So how do you safely transfer the key to the other person?
Also, how can you make sure that no one else has the key? Anyone that knows the key can decode the messages.
And how far can you trust the other party to be careful with the key? They might write it down, put it under their keyboard for others to find.
And what do you do when you want to exchange encrypted messages with more than one party? All of them need to know the key, and if you want to exclude one party from the secret communication you need to change the key for all remaining parties.

A way to solve some of these problems is asymmetric encryption.

Asymmetric encryption

With symmetric encryption there was just one key to encode and decode messages. With asymmetric encryption the encoding and decoding are split. Each party has two keys, one for encrypting messages and one for decrypting messages. The keys are commonly known by two names:

Public key: used to encrypt a message
Private key: used to decrypt a message

The two keys belong together: they form a key pair.

As the names imply, the public key is public and should be known to as many people as possible. The private key needs to remain private: no one should be able to use it except the person who owns the key. The nice thing about this separation of keys is that sending someone an encrypted message is quite simple. You only need their public key, which you use to encrypt the message. After encryption you can't read the message yourself; only the person who has the private key that belongs to that public key can read it.

To make this possible there is some complicated math involved which we won't go into (look up RSA or ElGamal if you want to know more). It suffices to say that the function being used is more complex than moving the letters in the alphabet a couple of positions.
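
For the curious, the key pair idea can be demonstrated with a toy version of RSA. The sketch below uses tiny, well-known textbook numbers chosen for this example; real keys use primes that are hundreds of digits long, and real software adds padding and other safeguards that are left out here.

```python
# Toy RSA with tiny primes. Never use numbers this small for anything real.
p, q = 61, 53
n = p * q                  # 3233, shared by both keys
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent, needs Python 3.8 or newer

public_key = (e, n)        # may be handed to anyone
private_key = (d, n)       # must stay with the owner

def apply_key(number, key):
    """Encrypting and decrypting are the same operation with a different key."""
    exponent, modulus = key
    return pow(number, exponent, modulus)

message = 65                                  # a message encoded as a number below n
ciphertext = apply_key(message, public_key)   # anyone can do this
recovered = apply_key(ciphertext, private_key)  # only the key owner can do this

print(ciphertext)  # 2790
print(recovered)   # 65
```

Knowing only (e, n) is enough to encrypt, but turning 2790 back into 65 requires d, which is hard to work out from the public key when the primes are large.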

Because of the complicated nature of the functions, it is quite a lot of work (even for a computer) to encode and decode messages this way. To make the best of this you can combine both technologies.

Hybrid encryption

Symmetric encryption is fast, but to make it safe you need a long key, and you have to get that key to the other party so they can decrypt the message. So what if we let the computer take care of this? Let the computer generate a long and complicated key for the symmetric encryption. Then we use asymmetric encryption to transport the key in a safe way to the other party. Once they have the key we can use the much faster symmetric encryption to exchange the actual data we want to transfer.

For example, parties A and B want to exchange data in a safe way. Both parties have the public key of their key pair publicly available. Communication would go something like this:

A: Retrieve the public key of party B. Maybe from a website or in a mail they received from party B before.
A: Generate a long and complicated key that can be used for symmetric encryption later on.
A: Make a message with the symmetric key as the content and encrypt it with B's public key. The encrypted message can now only be decrypted by B; neither A nor anyone else can decrypt it.
A: Send the message to B.
B: Receive the message from A.
B: Use the private key to decode the message received from A. B now has the content of the encrypted message from A. So both B and A now have the same key that was generated by A to use for symmetric encryption.

After this initial exchange A and B can continue communicating using quick symmetric encryption, without other parties knowing the key.
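
The exchange between A and B can be sketched as follows. Everything here is a simplified assumption made for illustration: B's key pair is built from two publicly known primes (so it offers no real security), and symmetric_crypt is a made-up stand-in for a real symmetric cipher.

```python
import hashlib
import secrets

# B's toy key pair, built from two well-known primes (insecure by design).
p, q = 2**127 - 1, 2**521 - 1
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))   # needs Python 3.8 or newer
b_public, b_private = (e, n), (d, n)

def symmetric_crypt(data, key):
    """XOR data with a keystream derived from key; the same call encrypts and decrypts."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

# A: generate a random 128-bit symmetric key and encrypt it with B's public key.
session_key = secrets.token_bytes(16)
wrapped = pow(int.from_bytes(session_key, "big"), b_public[0], b_public[1])

# B: decrypt the received message with the private key to get the symmetric key.
unwrapped = pow(wrapped, b_private[0], b_private[1]).to_bytes(16, "big")

# Both sides now hold the same key and switch to the fast symmetric cipher.
ciphertext = symmetric_crypt(b"meet me at noon", session_key)
plaintext = symmetric_crypt(ciphertext, unwrapped)
print(plaintext)  # b'meet me at noon'
```

The slow asymmetric step is used exactly once, on 16 bytes. All further traffic goes through the cheap symmetric function.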

Signing

It is not always required to encrypt a message. Often it is more important to know that the message wasn't tampered with. To let the receiving party check that the message is the same as the message that was sent, you can sign it. The message could of course also be a document or other data. Digital signing uses the same technology as encryption.

To sign a message or document first a fingerprint is made of the data. This is done mathematically by a special function (known as a hash function).

A good hash function takes data and returns a short, fixed-length sequence of data. The characteristic of the hash is that although the output is very short compared to the original, it is practically impossible to find two different inputs that generate the same output. Modifying just 1 character in the original document will generate a completely different output from the hash function.
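
As a small illustration, the sketch below fingerprints two messages that differ in a single character. SHA-256 is picked here as an example of a widely used hash function; the messages are invented for the example.

```python
import hashlib

original = b"Pay 100 euro to Alice"
modified = b"Pay 900 euro to Alice"   # a single character changed

fp_original = hashlib.sha256(original).hexdigest()
fp_modified = hashlib.sha256(modified).hexdigest()

print(fp_original)
print(fp_modified)
print(len(fp_original))            # 64 hex characters, whatever the input size
print(fp_original == fp_modified)  # False
```

The two printed fingerprints look completely unrelated, even though the inputs are almost identical.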

Now that the fingerprint of the data is known, the encryption part comes into play. Instead of encrypting the fingerprint with the public key of the receiver, the sender encrypts it with their own private key. The encrypted fingerprint can then only be decrypted with the public key that belongs to the sender's private key. It is sent along with the data as the signature.

The receiver can now use the public key of the sender to decrypt the fingerprint that was sent by the sender. The receiver then uses the same hash function to calculate the fingerprint of the data the sender has sent. If the fingerprints match, the data has not been modified. And because only the sender's public key could decrypt the fingerprint, the receiver is also sure that the owner of the matching private key was the one who sent the data.
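
Signing and verifying can be sketched like this. As before, the key pair is a toy built from publicly known primes, and the function names are assumptions made for this example; real signature schemes add padding and other details that are omitted here.

```python
import hashlib

# Sender's toy key pair from two well-known primes (for illustration only).
p, q = 2**127 - 1, 2**521 - 1
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))   # needs Python 3.8 or newer
sender_public, sender_private = (e, n), (d, n)

def fingerprint(data):
    """Hash the data and return the fingerprint as a number."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def sign(data, private_key):
    """Encrypt the fingerprint with the sender's private key."""
    exponent, modulus = private_key
    return pow(fingerprint(data), exponent, modulus)

def verify(data, signature, public_key):
    """Decrypt the signature with the public key and compare fingerprints."""
    exponent, modulus = public_key
    return pow(signature, exponent, modulus) == fingerprint(data)

document = b"The meeting is moved to Friday."
signature = sign(document, sender_private)

print(verify(document, signature, sender_public))                            # True
print(verify(b"The meeting is moved to Monday.", signature, sender_public))  # False
```

The second check fails because the tampered text produces a different fingerprint than the one locked inside the signature.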

Trust

With all of this encrypting and signing in place it should be possible to exchange data in a safe way without eavesdropping. However, how do you know that the party you are exchanging data with is actually that party? What assurance is there that they are who they say they are? You could receive an email from the CEO of your company from his or her private mail address, signed with the private key that matches the public key they made available on the internet. But how do you know if t44199a@gmail.com is really the email address of the CEO of your company? In order to really benefit from the encryption and signing, trust is required. You need to know that the other party is who they say they are.

Certificates

A common technique used to establish trust is the certificate. A certificate is a short digital document providing information about a party you might want to communicate with. The certificate contains information like the name of the party, maybe their street address, email address and their public key. The certificate is digitally signed by a third party, called a certificate authority. They use the signing procedure explained earlier to sign the certificate. To verify that the certificate was signed by a certain certificate authority, the public key of that certificate authority is used to check the signature on the certificate.

Now, anyone can set up their own certificate authority; you only need a private and a public key. But there are a couple of companies (with all kinds of procedures in place to protect their private key) that are widely trusted. Their public keys are included in or used by a lot of software (like your web browser). When your software is presented with a certificate signed by one of these certificate authorities, it will be able to validate the signature with the public key it has on file. If the certificate is signed by an unknown certificate authority, the software will usually display a warning to the user that it is not able to validate the other party that you are trying to communicate with.

As you might understand, the whole basis of this scheme is trusting the public keys of the certificate authorities. Unfortunately this can go wrong, as we've seen with the Diginotar certificate authority. Their systems were hacked and outsiders gained the ability to sign with their private key. So they were able to manufacture certificates signed with Diginotar's private key, whose public key was in the list of trusted certificate authorities in lots of software.

Used correctly, the certificate provides the receiving party with a means to verify the sender is really who they say they are. And as the certificate includes the public key of the sender, it is a starting point to initiate an encrypted connection as we've seen before.
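
To make the idea concrete, here is a much simplified sketch of a certificate authority signing a certificate and of software validating it. Real certificates follow the X.509 format and are far richer; the dictionary layout, the names and the toy key pairs below are all assumptions made for this example.

```python
import hashlib
import json

def make_key_pair(p, q, e=65537):
    """Build a toy (public, private) key pair from two primes."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))   # needs Python 3.8 or newer
    return (e, n), (d, n)

def fingerprint(fields):
    """Hash the certificate fields in a fixed order and return a number."""
    encoded = json.dumps(fields, sort_keys=True).encode("utf-8")
    return int.from_bytes(hashlib.sha256(encoded).digest(), "big")

def issue_certificate(fields, ca_private):
    """The certificate authority signs the fields with its private key."""
    d, n = ca_private
    return {"fields": fields, "signature": pow(fingerprint(fields), d, n)}

def certificate_is_valid(certificate, ca_public):
    """Software checks the signature with the authority's public key on file."""
    e, n = ca_public
    return pow(certificate["signature"], e, n) == fingerprint(certificate["fields"])

# Well-known primes are used here, so these keys are for demonstration only.
ca_public, ca_private = make_key_pair(2**127 - 1, 2**521 - 1)
site_public, site_private = make_key_pair(2**89 - 1, 2**107 - 1)

certificate = issue_certificate(
    {"name": "Example Company", "website": "www.example.com",
     "public_key": list(site_public)},
    ca_private,
)
print(certificate_is_valid(certificate, ca_public))   # True

# Changing any field afterwards breaks the signature.
certificate["fields"]["website"] = "www.attacker.example"
print(certificate_is_valid(certificate, ca_public))   # False
```

Anyone holding ca_private can issue certificates that pass this check, which is why a stolen or misused authority key undermines the whole scheme.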

Making a certificate

To make a certificate you first have to fill out all information that needs to be in the final certificate.

Usually this contains information about the company (for a website) or a person (for an email certificate). For a website the certificate also needs to contain the website name, so software can validate the certificate belongs to the website that presents it. The same goes for the email address in an email certificate. The resulting document is the certificate signing request. This request can now be presented to a certificate authority together with information on who you are.

Depending on the level of trust required, the certificate authority will perform a number of actions to make sure you are the person or company you claim to be. This can range from a very simple action, like sending an email with a validation link that has to be opened in the web browser, to an in-person check of your passport. Because these actions take resources, they cost money. The more trust you require, the more extensive the validation by the certificate authority needs to be and the more money it costs. And, because things can change in the future, the certificate is valid for only a specific period of time. Usually a year, sometimes longer.

When the certificate authority has done all its checks, it will sign the certificate signing request with its private key, resulting in a certificate. This certificate, together with the private key of the requester, can now be used in a web server or email program.

Using the certificate

Certificates are often used on websites. Let's assume the company Company.com bought a certificate for their website www.company.com. They configured their web server to use this certificate to initiate encrypted communication with the web browsers of people connecting to their website.

The web browser connects to the web server.
The web server provides the certificate to the web browser.
The web browser checks the signature on the certificate using the public keys of the certificate authorities it has on file.
If the web browser can verify the signature it will continue, otherwise it will notify the user that the certificate can't be validated.
Next the web browser compares the website name in the URL with the website name listed in the certificate. They should match. If they don't, the web browser will alert the user.
Once all the checks are ok, the web browser will continue to set up secure communication with the web server using the public key that is available in the certificate.
The setup of the secure communication is basically the same as explained before under hybrid encryption.
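
The same checks are available to any program, not just web browsers. As a sketch, Python's standard ssl module can be asked to behave like the browser in the steps above. The helper fetch_certificate is a name invented for this example, and calling it requires a network connection.

```python
import socket
import ssl

# A default client context loads the certificate authorities trusted by the
# system and switches on both checks from the steps above.
context = ssl.create_default_context()

print(context.verify_mode == ssl.CERT_REQUIRED)  # True: signature must validate
print(context.check_hostname)                    # True: website name must match

def fetch_certificate(hostname, port=443):
    """Connect, let the ssl module validate the certificate, and return its fields."""
    with socket.create_connection((hostname, port), timeout=10) as sock:
        # wrap_socket raises ssl.SSLCertVerificationError when a check fails.
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()
```

A call such as fetch_certificate("www.company.com") would return the certificate fields only if the signature chain and the website name both check out.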

Usually the web browser will indicate the validity of the certificate and the encrypted communication with an icon in the URL bar or somewhere else in the interface.

Last notes

With the whole infrastructure in place, the framework is there to validate the party you are communicating with and make sure no one eavesdrops. However, it is very important to be careful with all private keys that are used in the whole infrastructure. If any of the private keys falls into the wrong hands, the trust is gone. Make sure that when you as a user have a private key, it is properly protected. Usually software provides a password mechanism to protect your private key. Although it might seem a nuisance to enter a password every time you use a key (for example a certificate for email signing), it is required to keep the whole chain of trust working. If your private key falls into the wrong hands, your communication will no longer be secure.