Introduction to web SSL certificates

A high-level overview

Last year I wrote about how I used Letsencrypt to handle the SSL certificates for this site. In this entry I’m going to take a step back and discuss the basics of what an SSL certificate is and the steps involved in managing them. There’s a lot of jargon involved, which can make this seem more complicated than it already is.

Note that in this post I’m likely to use the words “SSL” and “TLS” interchangeably. TLS is the successor to SSL (in some respects you can think of TLS 1.0 as the next version of SSL; on the wire it identifies itself as version 3.1) and many of the same things apply. The basics of certificates are the same. I started playing with SSL over 20 years ago, and tend to use that term (even when I should really use TLS).

Note also you may see the phrase “X509 certificate” used in some places. The format typically used to manage SSL certificates is X509; you may say that an SSL cert is an X509 cert with the “server authentication” attribute. X509 certs may also be used elsewhere (e.g. for code signing), but if you’re reading some SSL tutorial and it tells you to do stuff with the X509 file then we’re talking about the same thing.

SSL certs can be present in various services (e.g. SMTPS, IMAPS, NNTPS) but the use case most people see is HTTPS (encrypted web sites, like this one). I’ll describe things in those terms, but the same concepts apply elsewhere.

What is an SSL certificate used for

An SSL cert does two things for you:

  1. Identifies the server
  2. Provides a means of performing public key encryption with the server

So what this means is that when you go to this site your browser will be told “Hi, I’m www.sweharris.org!”, and will be given a way of verifying that the certificate is not lying to you.

It’s important to note that the certificate doesn’t tell you who runs the site but we’ll see, later, how this can be handled.

How is it generated

Modern tools such as Letsencrypt can hide all this from you, but it may be useful to know the underlying basics.

We start off by generating a public/private key pair. Typically, today, this would be an RSA 2048 or 4096 bit key. As with all public key systems the private key is the important part and should be kept protected.

We then create a Certificate Signing Request (CSR). This contains the public part of the key pair, along with some information about the site.

  1. The common name (CN) of the site (stored in the Subject field)
  2. Other names the website may use (Subject Alternative Names - SAN)
  3. (Optionally) other information such as the city/state/country or the name of the site owner (also stored in the Subject field)
  4. A signature for the request (typically SHA256 based).

So, for example, this web site has

    Public Key: (4096 bit RSA key)
       Subject: CN=www.sweharris.org
           SAN: DNS:www.sweharris.org, DNS:sweharris.org
     Signature: (SHA256 sig)

The certificate for www.google.com has:

    Public Key: (2048 bit RSA key)
       Subject: C=US, ST=California, L=Mountain View, O=Google Inc, CN=www.google.com  
           SAN: DNS:www.google.com
     Signature: (SHA256 sig)

Other fields can be added to this request, but these are the minimum.

This CSR is then sent to the Certificate Authority (CA) for signing.

NOTE: As you can see, the CN field isn’t too well defined, whereas the SAN field is better organised. Recent versions of Chrome have started to require SAN entries and will only use those to validate the site.
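To make the SAN check a little more concrete, here’s a minimal sketch in Python of how a client might compare the hostname it asked for against the SAN entries. The function name `hostname_matches_san` is my own invention for illustration, and real browsers apply more rules than this (around internationalised names and where wildcards are permitted, for example), so treat it as an approximation rather than what any particular browser does.

```python
def hostname_matches_san(hostname: str, san_dns_names: list[str]) -> bool:
    """Return True if hostname matches any DNS entry from the SAN list."""
    host_labels = hostname.lower().rstrip(".").split(".")
    for entry in san_dns_names:
        pattern_labels = entry.lower().rstrip(".").split(".")
        if len(pattern_labels) != len(host_labels):
            continue
        if pattern_labels[0] == "*":
            # A wildcard only stands in for the single leftmost label.
            if pattern_labels[1:] == host_labels[1:]:
                return True
        elif pattern_labels == host_labels:
            return True
    return False


san = ["www.sweharris.org", "sweharris.org"]
print(hostname_matches_san("www.sweharris.org", san))   # True
print(hostname_matches_san("mail.sweharris.org", san))  # False
```

Note that the CN doesn’t appear anywhere in this check; only the SAN list is consulted.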

What is certificate signing

Once the CA receives the CSR it needs to validate that the request is legitimate (am I allowed to request a certificate for www.sweharris.org, or is this a naughty person trying to pretend to be me?).

Once the CA is happy with the request it takes the CSR and adds various fields (most importantly a serial number, a start/end date, its own Issuer field, and the purposes the certificate can be used for). The result is then cryptographically signed (basically a hash is taken of the data, and encrypted with the CA’s own private key). The signed result is returned to the requester and is the certificate to be used.
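The “hash it, then encrypt the hash with the private key” idea can be shown with a toy example. This is a sketch only: it uses textbook RSA with no padding and a key built from two small well-known Mersenne primes, and the hash is reduced to fit that key. Real CAs use proper padding schemes and much larger keys. The names `ca_sign` and `verify`, and the certificate contents, are made up for illustration.

```python
import hashlib

# Toy key pair built from two Mersenne primes. Far too small and too
# structured to be secure; it only exists to show the arithmetic.
p, q = 2**61 - 1, 2**89 - 1
n = p * q                               # public modulus
e = 65537                               # public exponent
d = pow(e, -1, (p - 1) * (q - 1))       # private exponent (Python 3.8+)


def digest(data: bytes) -> int:
    # Reduce the SHA-256 hash so it fits inside the toy modulus.
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def ca_sign(data: bytes) -> int:
    """The CA 'encrypts' the hash with its private key."""
    return pow(digest(data), d, n)


def verify(data: bytes, signature: int) -> bool:
    """Anyone holding the public key (n, e) can check the signature."""
    return pow(signature, e, n) == digest(data)


cert_body = b"CN=www.sweharris.org; serial=1234; issuer=Toy CA"
signature = ca_sign(cert_body)
print(verify(cert_body, signature))                  # True
print(verify(cert_body + b" tampered", signature))   # expected: False
```

The point to take away is that signing needs the private key, while checking only needs the public one.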

There are three common types of validation:

Domain validation (DV)

This is the simplest and most basic form of checking. Basically the CA tries to verify that the requester has some form of control over the domain. This may be something like:

  1. Put a file on the webserver at /my/check.html containing the string “wowzer”
  2. Edit your DNS so you have a TXT record “check_this” with the string “foobar”
  3. Read the email sent to webmaster and click on the link in it.

It’s important to note that this doesn’t ask anything about the organisation behind the request, just whether it has some control over the domain. A phishing organisation could register the DNS name sweharr1s.org and request a cert for that.
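The file-based check above can be sketched in a few lines of Python. This is an assumption-laden illustration, not how any real CA implements it: the `fetch` callable stands in for an HTTP request to the site, the dictionary stands in for the web server, and the function names are hypothetical. The path is the one from the example list.

```python
import hmac
import secrets

CHALLENGE_PATH = "/my/check.html"   # path taken from the example above


def issue_challenge() -> str:
    """The CA picks an unguessable token for the requester to publish."""
    return secrets.token_urlsafe(16)


def domain_is_validated(fetch, expected_token: str) -> bool:
    """The CA fetches the file and compares it with the token it issued."""
    published = fetch(CHALLENGE_PATH)
    if published is None:
        return False
    # Constant-time comparison of the published and expected values.
    return hmac.compare_digest(published.strip().encode(), expected_token.encode())


site_files = {}                      # stand-in for the requester's web server
token = issue_challenge()

print(domain_is_validated(site_files.get, token))   # False: nothing published
site_files[CHALLENGE_PATH] = token + "\n"
print(domain_is_validated(site_files.get, token))   # True: requester controls the site
```

Nothing in this check looks at who the requester is, which is exactly the limitation described above.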

A DV site typically shows up in browsers with a green padlock, but not much else.

[Image: browser address bar for this site, showing the padlock for a DV certificate]

The advantage of DV certificates is that the creation and signing can be done very cheaply, or even automatically (which is how Letsencrypt works).

Extended validation (EV)

EV certs take a deeper approach to validating the request; the CA also checks the ownership claims. So, for example, I might be able to get a DV cert for chas3.com, but I would not be able to get an EV cert saying “this is owned by JPMorgan Chase”. An EV cert is meant to help reduce phishing attacks by helping the web user spot if an EV cert is in use.

In a web browser the owner of an EV cert is typically displayed:

[Image: browser address bar for chase.com, showing the owner name for an EV certificate]

When you compare that to my DV cert you can see that we both have the green padlock, showing the certs are good, but the Chase entry has more details about the owner next to the padlock.

EV certification takes more resources from the CA (typically people are involved and it’s a multi-day affair) and may require a registered legal entity (e.g. a registered company name) in your country.

The idea behind EV is that you can now get a level of trust of the identity of the organisation behind a web site, which can be very important for a site such as a bank. “JPMorgan Chase and Co run this site and it is called www.chase.com” is a stronger message than “this site is called www.chas3.com”.

Organisation Validated (OV)

This is somewhat of a middle layer between DV and EV. In this setup the owner of the site is still checked. Originally this was meant to be the “higher quality” certificate but there were no strict rules around how this validation was performed (which is why the CA/Browser Forum - see later - introduced EV). That means there’s a level of assurance in the owner of the domain, but it’s not necessarily as strong as that presented by an EV cert.

OV certificates include organisational details inside the certificate, but these are rarely presented directly to the user by default. Indeed, the user may need to go digging to distinguish between DV and OV, so I’m not sure if these provide any real benefit any more.

How does your browser know the certificate is valid?

This is where the CA’s signature comes in. You can use the CA’s public key to verify the signature is accurate (calculate the hash yourself and then use the public key to decrypt the signature they provided; do they match?).

The problem is where to get the CA’s public key from?

To solve this your operating system and web browser come with a set of CA keys pre-installed. For example, the Chromium browser on my Debian desktop shows a lot of CAs, including:

[Image: list of trusted CAs in the Chromium certificate manager]
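You can get a rough view of the same pre-installed store from code. This sketch uses Python’s standard `ssl` module to load the platform’s default CA set; what it prints depends entirely on your machine. On systems where the store is a directory of certificates they may be loaded lazily, so a count of zero doesn’t necessarily mean nothing is trusted.

```python
import ssl

# Build a client context that trusts the platform's default CA set.
context = ssl.create_default_context()

stats = context.cert_store_stats()
print("CA certificates loaded:", stats["x509_ca"])

# Show the owner of the first few CA certificates, if any were loaded.
for ca in context.get_ca_certs()[:5]:
    # The subject is a sequence of RDNs, each holding (name, value) pairs.
    subject = dict(rdn[0] for rdn in ca["subject"])
    print(subject.get("organizationName") or subject.get("commonName"))
```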

The sheer number of CAs can cause a problem; do you trust them all? Inherently you do, even if you don’t know it! Are you sure some random CA won’t sign a certificate for your site? Or allow some bad person to pretend to be your bank? The CA/Browser Forum, combined with Certificate Transparency, tries to keep on top of the problem but it’s not clear if there is a good solution.

One advantage of this model, though, is that an enterprise can create their own “internal” CA; they just need to add the CA’s public key to their corporate image (or deploy it using Microsoft group policies, or similar) and their own internally created SSL certificates are trusted, which can enable internal SSL encryption to happen without needing an external CA.

What is certificate chaining

Typically the CA doesn’t sign requests with the “root” key that the browser knows about. It’s a lot of work to get new keys distributed (it’s not just browsers but also other software, such as Java, that is impacted) so these keys are long lived (20 or 30 years, in some cases). If they were stolen then there would be potential for false signings for a long time. So the root key is locked in a vault and kept inaccessible, which doesn’t make it much use for day-to-day signing. Instead an intermediate certificate (that has been signed by the root cert) is used to do the signing. These may be shorter lived (say 5 years or less).

The problem, now, is in telling the browser about this. So instead of the server just presenting the signed certificate from the CA it sends both that cert and the intermediate’s certificate.

The browser can check the intermediate cert is good because it knows about the root cert. Once it trusts the intermediate it can then use this to check and trust the web server’s certificate.

So, for example, we can see the chain for www.google.com

[Image: certificate chain for www.google.com as shown by the browser]

Here the website certificate has been signed by a Google CA, which has been signed by GeoTrust… and that is in the browser/OS certificate store and is trusted. So the browser can verify the site is correct.
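The walk up the chain can be sketched structurally in Python. This is deliberately simplified: it only checks that each certificate names the next one as its issuer and that the last issuer is in the trust store. A real client also verifies every signature, the validity dates and the usage constraints at each step. The CA names here are hypothetical.

```python
def chain_is_trusted(chain: list[dict], trusted_roots: set[str]) -> bool:
    """chain[0] is the server cert, followed by any intermediates it sent."""
    if not chain:
        return False
    # Each cert must have been issued by the next cert in the list.
    for cert, issuer_cert in zip(chain, chain[1:]):
        if cert["issuer"] != issuer_cert["subject"]:
            return False
    # The last cert presented must be issued by a root we already trust.
    return chain[-1]["issuer"] in trusted_roots


trusted_roots = {"Toy Root CA"}
server_cert = {"subject": "www.example.org", "issuer": "Toy Intermediate CA"}
intermediate = {"subject": "Toy Intermediate CA", "issuer": "Toy Root CA"}

print(chain_is_trusted([server_cert, intermediate], trusted_roots))  # True
print(chain_is_trusted([server_cert], trusted_roots))                # False
```

The second call is the classic deployment mistake: the server forgot to send the intermediate, so the client can’t link the site’s cert back to a root it knows.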

Certificate chaining mistakes are common in new SSL deployments, so it’s definitely worth verifying you get this correct!

What happens when a certificate expires?

Essentially nothing happens on the server. However web browsers will start to complain. As part of their validation process they check the “not before” and “not after” dates inside the CA signed certificate and compare them to the local time. Once the certificate has expired the browser will no longer trust it and will give the user a nasty, scary-looking screen.

[Image: browser warning page for an expired certificate]

If the user decides to click through then they can still get to your site, and it will still be encrypted… but don’t do this. Don’t encourage users to ignore these warnings. Make sure your cert is renewed in time!
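The date check itself is simple enough to sketch. This example assumes the dates arrive as strings in the format Python’s `ssl` module reports them (e.g. from `getpeercert()`), and uses the standard `ssl.cert_time_to_seconds` helper to parse them. The function name `validity_state` and the dates are made up for illustration.

```python
import ssl
import time
from typing import Optional


def validity_state(not_before: str, not_after: str,
                   now: Optional[float] = None) -> str:
    """Compare a certificate's validity window against the clock."""
    start = ssl.cert_time_to_seconds(not_before)
    end = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()          # the client's local idea of the time
    if now < start:
        return "not yet valid"
    if now > end:
        return "expired"
    return "valid"


not_before = "Jan  1 00:00:00 2017 GMT"
not_after = "Apr  1 00:00:00 2017 GMT"
check_time = ssl.cert_time_to_seconds("Feb  1 00:00:00 2017 GMT")

print(validity_state(not_before, not_after, now=check_time))  # valid
print(validity_state(not_before, not_after))                  # expired
```

Because the comparison uses the client’s clock, a machine with the wrong date can reject a perfectly good certificate.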

What happens when a certificate is lost or stolen?

It doesn’t matter if the signed cert is stolen; all that information is public. The important part is the private key that we generated all the way back at the beginning of this post while creating the CSR. This private key is what is used to prove communication is to the server, so if someone steals that then they can decrypt your data, or even impersonate you.

If the private key is stolen the certificate needs to be revoked. There are two ways a revocation can be published, and both are under the control of the CA:

  1. Certificate Revocation List (CRL). In this version the CA posts a complete list of revoked certificates. The CRL URI is in a field in the signed certificate so the client knows where to look.
    e.g.

            X509v3 CRL Distribution Points:
    
                Full Name:
                  URI:http://sr.symcb.com/sr.crl
    

    The client must download this list and then check for the serial number of the certificate to see if it has been revoked. The Symantec list mentioned here is currently 167K in size and contains 4700 entries so clients using this approach may cache data. (Of course expired certs may be removed from the CRL, helping to keep the size down).

  2. Online Certificate Status Protocol (OCSP). This version is more API oriented; rather than downloading a complete CRL the web browser can ask specifically for the state of the certificate. Again the URL is present in the signed certificate:

            Authority Information Access:
                OCSP - URI:http://sr.symcd.com
    

    Entries typically have a validity time associated with them, to tell the browser how long to cache the results for.

    OCSP failures have been the cause of sites becoming unreachable, and OCSP has some privacy concerns, so a version called OCSP stapling was created, where the web server itself presents a signed OCSP response. Apache documentation claims this doesn’t work properly for intermediate certs, but standard OCSP caching may help here.
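Both approaches boil down to “look up a serial number, and remember the answer for a while”. Here’s a minimal sketch of that idea; the class name, the serial numbers and the one-hour validity are all made up, and a real client would also check the CA’s signature on the CRL or OCSP response before believing it.

```python
import time


class RevocationCache:
    """Caches per-serial answers until their validity time runs out."""

    def __init__(self, lookup, clock=time.time):
        self._lookup = lookup    # callable: serial -> (revoked, valid_for_seconds)
        self._clock = clock
        self._cache = {}         # serial -> (revoked, expires_at)

    def is_revoked(self, serial: int) -> bool:
        now = self._clock()
        hit = self._cache.get(serial)
        if hit is not None and hit[1] > now:
            return hit[0]        # cached answer is still within its validity
        revoked, valid_for = self._lookup(serial)
        self._cache[serial] = (revoked, now + valid_for)
        return revoked


revoked_serials = {0x1A2B, 0x3C4D}     # stand-in for a downloaded CRL


def crl_lookup(serial: int):
    # CRL style: check the serial against the full list.
    return serial in revoked_serials, 3600


cache = RevocationCache(crl_lookup)
print(cache.is_revoked(0x1A2B))   # True
print(cache.is_revoked(0x5E6F))   # False
```

Swapping `crl_lookup` for a function that asks an OCSP responder about a single serial would give the second approach without changing the caching logic.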

Unfortunately, revocation checking isn’t necessarily reliable, nor as well performed as it should be. For example, Chrome doesn’t do OCSP at all, and claims to have its own method. Revocation checking is purely a function of the client, and non-browser clients rarely perform all the validation (or, indeed, any validation) on the certificates.

Summary

This was meant to be a brief introduction to some of the features of an SSL certificate, but it grew a little large! There are a lot of features available (and I didn’t even talk about client certs, mutual TLS, code signing certs…) and it can get a little complicated.

Fortunately modern tools such as Letsencrypt handle most of this for you (it can handle the key generation, the CSR, the Domain Validation, the chaining and the rotation when certificates are due to expire… magic!). In an enterprise world, companies such as Venafi can act as an interface to hide the complexity, and integrate with your CA of choice.

So, hopefully, most people won’t need to deal with these messy internals so much. But knowing about SSL certs and how they work for assigning identity to a machine will help you understand their limitations and what considerations need to go into any security model relying on them.