In this chapter, we will discuss one of the cornerstones of secure Internet communication: TLS. The topic, like so many things in cryptography, is a big one, filled with fiddly parameters, subtle pitfalls, and breathtaking logic. Let’s find out more!
Intercepting Traffic
Eve is very proud of herself. She managed to get into computer rooms across East Antarctica and install “sniffing” software. Basically, she has managed to intercept HTTP (web) traffic and exfiltrate it for later analysis by the intelligence officers for her agency (the “West Antarctica Central Knights Office,” or WACKO).
The HTTP protocol natively supports proxying. An HTTP client can connect to a server through an intermediary HTTP server (the proxy). When the client first connects to the proxy, it sends a special HTTP command called CONNECT that tells the proxy where the real destination is. Once the proxy has connected to the true server, it serves as a simple pass-through, forwarding the data from one party to the other.
HTTP Proxy
This HTTP proxy prints out everything it receives from either endpoint. Eve’s real proxy doesn’t do this. Instead, it sends the intercepted data over the network to a command and control server. Alternatively, she could have made it save the data to disk for later extraction.
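To make the CONNECT step concrete, here is a minimal sketch of how a proxy might pull the real destination out of the first bytes a client sends. The function name parse_connect_request is hypothetical, and the sketch deliberately ignores the pass-through relaying that a working proxy would also need.

```python
def parse_connect_request(raw: bytes):
    """Return (host, port) from a CONNECT request, or None if it is not one."""
    # Only the first line matters here, e.g. "CONNECT www.example.com:80 HTTP/1.1".
    request_line = raw.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
    parts = request_line.split()
    if len(parts) != 3 or parts[0].upper() != "CONNECT":
        return None
    # The target is written as host:port; split on the last colon.
    host, _, port = parts[1].rpartition(":")
    if not host or not port.isdigit():
        return None
    return host, int(port)


request = b"CONNECT www.example.com:80 HTTP/1.1\r\nHost: www.example.com:80\r\n\r\n"
print(parse_connect_request(request))
```

Once the destination is known, the proxy would open its own connection to that host and port and copy bytes in both directions.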
Python’s http.client module has some built-in methods for interacting with HTTP servers. It also has HTTP proxying capabilities. In the example code, the HTTPConnection object was configured with the proxy’s IP address and port. The set_tunnel method reconfigures the object so that it connects to the proxy but requests "www.example.com" through it via the CONNECT method.
After it gets the response, the read method gets the output. You should see something akin to an HTML document as a result. This represents the data received by the WA user’s browser when they navigate to www.example.com.
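A client along these lines could be sketched as follows. The proxy address 127.0.0.1:8080 and the helper names are assumptions for illustration; nothing touches the network until fetch_page is called.

```python
import http.client


def build_tunneled_connection(proxy_host, proxy_port, target_host):
    # The connection object points at the proxy, not at the destination.
    conn = http.client.HTTPConnection(proxy_host, proxy_port)
    # set_tunnel records the real destination; the CONNECT request is
    # sent to the proxy when the socket is actually opened.
    conn.set_tunnel(target_host)
    return conn


def fetch_page(conn, path="/"):
    # Issue a normal GET; it travels through the tunnel to the real server.
    conn.request("GET", path)
    response = conn.getresponse()
    return response.status, response.read()


conn = build_tunneled_connection("127.0.0.1", 8080, "www.example.com")
```

With a proxy actually listening on that address, fetch_page(conn) would return the status code and the HTML body.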
Note: Finding an HTTP Site
For this exercise to work, you need to browse to a web site that still supports HTTP. More and more web sites are disabling HTTP altogether, and you can only connect to them via HTTPS. At the time of this writing, www.example.com still supports both.
You will notice that the proxy sees the entire communications stream between the client (e.g., browser) and the web server. Eve has hit upon a fantastic source of intelligence.
Warning: Multiple Proxy Methods
Our proxy is using the CONNECT method. There are multiple ways to configure a web proxy, and our basic source code only supports this one method. Thus, it will not work with browsers or tools that attempt to make use of other methods.
Eve is happily collecting traffic on her enemies one day when suddenly everything stops working. To be clear, the proxy is still proxying data. In fact, the CONNECT method still appears, but almost all of the data that flows across the proxy is unreadable!
Do you see the difference? Almost everything is the same except for one thing: the port. Eve used to see browsers connecting to www.example.com on port 80. Now it’s on port 443. What is going on?
It turns out that the EA adversaries have switched to using HTTPS (“HTTP Secure”). While HTTP uses port 80 by default, HTTPS uses port 443. Just to be clear, it is not the port that is making things secure, it is the new protocol. The port difference is merely Eve’s first clue that something has intentionally been changed.
This code is literally different by just one character. Do you see it? We changed HTTPConnection to HTTPSConnection.
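As a hedged sketch of what such a client might look like, the following builds an HTTPSConnection that tunnels through a proxy. The proxy address and helper name are assumptions; constructing the object does not open any sockets.

```python
import http.client


def build_secure_tunneled_connection(proxy_host, proxy_port, target_host):
    # Same shape as a plain HTTP client, but with HTTPSConnection.
    conn = http.client.HTTPSConnection(proxy_host, proxy_port)
    # After the proxy accepts the CONNECT, TLS is negotiated with the
    # real destination through the tunnel, so the proxy only sees ciphertext.
    conn.set_tunnel(target_host, 443)
    return conn


secure_conn = build_secure_tunneled_connection("127.0.0.1", 8080, "www.example.com")
```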
Eve, disturbed that she can no longer read the network traffic that she is intercepting, heads back to WA to do some research on HTTPS. She learns that HTTPS encapsulates HTTP traffic inside another protocol called TLS. This protocol allows a client to verify the identity of a server and for the two parties to establish a secret key between them. This key remains secret even if an eavesdropper (like Eve) is listening to the entire communication stream. TLS, in theory, will completely shut Eve out from snooping on Alice, Bob, and the EA!
Eve is frustrated by this discovery. But, being the determined person that she is, she decides to start searching for weaknesses. If there’s one thing she’s learned throughout this book, it’s that cryptography is often done incorrectly and is therefore exploitable.
Exercise 8.1. What’s In Web Traffic?
Pretend to be Eve and examine some of your own encrypted traffic. That is, configure your browser to use your proxy, navigate to some HTTPS web sites, and spy on your own data. Hint: Are there parts of the secure communications that are still in plaintext?
If you don’t know how to configure your browser for proxying, please do some searching on the search engine of your choice! Be aware that you may not be able to configure your browser to use your proxy correctly for unencrypted (HTTP) traffic. We personally tested Chrome and found that it uses the CONNECT method for HTTPS but not for HTTP.
Digital Identities: X.509 Certificates
To start searching for weaknesses, Eve first turns to the authentication part of the TLS protocol.
She learns that TLS uses a public key infrastructure (PKI) to establish identities and secure communications. Parties that wish to have an identity for use with TLS (typically) require an X.509 certificate.
In Chapter 5, we introduced the concept of certificates. At the time, to keep things simple, we used fake certificates that were nothing more than dictionaries we serialized with the Python json library. Now it’s time to dig into real X.509 certificates, the most common type of certificate used on the Internet today.
X.509 Fields
Somewhat similar to our dictionary-based certificates, X.509 is a collection of key/value pairs. These pairs could also be represented using a dictionary, although X.509’s fields permit hierarchical subfields.
1. Certificate
   (a) Version Number
   (b) Serial Number
   (c) Signature Algorithm ID
   (d) Issuer Name
   (e) Validity Period
       i. Not Before
       ii. Not After
   (f) Subject Name
   (g) Subject Public Key Info
       i. Public Key Algorithm
       ii. Subject Public Key
   (h) Issuer Unique Identifier (optional)
   (i) Subject Unique Identifier (optional)
   (j) Extensions (optional)
2. Certificate Signature Algorithm
3. Certificate Signature
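To illustrate the hierarchy, the outline above could be modeled as a nested dictionary. This is only a toy model: every value below is made up, and real certificates are encoded in ASN.1, not as Python dictionaries.

```python
# Toy model of the X.509 field hierarchy; all values are hypothetical.
certificate = {
    "certificate": {
        "version": 3,
        "serial_number": 1234,
        "signature_algorithm_id": "sha256WithRSAEncryption",
        "issuer_name": {"CN": "Example Root CA"},
        "validity_period": {
            "not_before": "2024-01-01T00:00:00Z",
            "not_after": "2024-01-31T00:00:00Z",
        },
        "subject_name": {"CN": "www.example.com"},
        "subject_public_key_info": {
            "public_key_algorithm": "rsaEncryption",
            "subject_public_key": "<public key bytes>",
        },
        "extensions": {"basic_constraints": {"ca": False}},
    },
    "certificate_signature_algorithm": "sha256WithRSAEncryption",
    "certificate_signature": "<signature bytes>",
}


def get_field(cert, path):
    """Walk a colon-separated path such as 'certificate:validity_period:not_after'."""
    node = cert
    for key in path.split(":"):
        node = node[key]
    return node


print(get_field(certificate, "certificate:validity_period:not_after"))
```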
Versions 1 and 2 of X.509 are subsets of version 3. The most important addition in version 3 is the extensions field. These extensions are used in making certificate-enabled PKI more secure by, for example, limiting what a certificate can be used for. Nevertheless, version 1 certificates still exist and are usable, as we will see in a moment when we start generating some samples.
The primary purpose of a certificate is to tie a subject’s identity to a public key under the signature of an issuer. The fields that identify the subject, the public key, and the issuer are the most critical, but the other fields provide contextual information necessary to understand and interpret the data.
For example, the validity period is used to determine when a certificate should be considered valid. While the “Not Before” field is important and must be checked, in practice the “Not After” period usually gets the most attention. Certificates with a higher risk of compromise can be issued with a shorter validity period to mitigate the damage done if a compromise occurs.
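The validity check itself is a simple range comparison. A minimal sketch, assuming timezone-aware datetimes and a hypothetical helper name:

```python
import datetime


def is_within_validity(not_before, not_after, now=None):
    """Return True if 'now' falls inside the certificate's validity period."""
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)
    # Both ends must be checked: not yet valid is as bad as expired.
    return not_before <= now <= not_after


utc = datetime.timezone.utc
not_before = datetime.datetime(2024, 1, 1, tzinfo=utc)
not_after = datetime.datetime(2024, 1, 31, tzinfo=utc)
print(is_within_validity(not_before, not_after, datetime.datetime(2024, 1, 15, tzinfo=utc)))
```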
Another important piece of context with an X.509 certificate is found in the fields for identifying the certificate creation algorithms used and the type of public key embedded within it. Unlike most of our toy examples in this book, real cryptographic systems make use of a wide range of algorithms, and certificates have to be flexible enough to support them.
Scanning through the preceding X.509 fields, there is a “Certificate:Signature Algorithm ID” field that identifies how the certificate is signed. Because it specifies all the details for the actual signature embedded in the certificate, it includes both the signing algorithm (e.g., RSA) and the message digest (e.g., SHA-256).
The “Certificate:Subject Public Key Info:Public Key Algorithm” field, on the other hand, specifies what type of public key is being used by the certificate owner.
The last contextual field we will mention is the serial number. This is a unique number (per issuer) that identifies the certificate uniquely. This number is useful for revocation purposes discussed later in the chapter.
Now let’s go back to the real reason we have certificates in the first place: identifying the subject, the subject’s public key, and the trusted third party that “proves” this.
The subject and issuer names are each made up of the following subfields:

1. CN: CommonName
2. OU: OrganizationalUnit
3. O: Organization
4. L: Locality
5. S: StateOrProvinceName
6. C: CountryName
Not all of these subfields have to be filled in, but CN (Common Name) is typically the critical subfield. Later, when we look to validate a certificate, the subject’s common name is used as the primary identifier. Additionally, most modern certificates include a field called “Subject Alternative Name” (a version 3 extension) to store alternative subject names. While in many of our examples we have been using agent (code) names (e.g., “Charlie”) as the subject name, certificates associated with TLS-protected web servers have to identify the host name (such as google.com) as the subject’s identity.
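In the Python cryptography library, these name subfields can be expressed with x509.Name and NameOID. The sketch below uses made-up values for Eve’s agency (the country code "AQ" and the organizational details are assumptions) and also shows a Subject Alternative Name extension object.

```python
from cryptography import x509
from cryptography.x509.oid import NameOID

HOSTNAME = "wacko.westantarctica.southpole.gov"

# A subject name built from the subfields described above (values are made up).
subject = x509.Name([
    x509.NameAttribute(NameOID.COUNTRY_NAME, "AQ"),
    x509.NameAttribute(NameOID.STATE_OR_PROVINCE_NAME, "West Antarctica"),
    x509.NameAttribute(NameOID.LOCALITY_NAME, "South Pole"),
    x509.NameAttribute(NameOID.ORGANIZATION_NAME, "WACKO"),
    x509.NameAttribute(NameOID.ORGANIZATIONAL_UNIT_NAME, "Research"),
    x509.NameAttribute(NameOID.COMMON_NAME, HOSTNAME),
])

# Look up a single subfield by its OID.
common_name = subject.get_attributes_for_oid(NameOID.COMMON_NAME)[0].value

# The version 3 extension that lists alternative names for the subject.
alt_names = x509.SubjectAlternativeName([x509.DNSName(HOSTNAME)])
print(common_name, alt_names.get_values_for_type(x509.DNSName))
```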
You may also have noticed that the certificate included “Issuer Unique Identifier” and “Subject Unique Identifier” fields, but these can usually be left out and are not discussed here.
With the subject and the issuer identified, the remaining fields are the public key and the signature computed on the certificate’s contents. The signature is calculated over a binary encoding of the certificate known as DER (“Distinguished Encoding Rules”). The signature proves both that the certificate was signed by the true issuer and that it has not been modified.
Certificate Signing Requests
To create a certificate in real life, a party creates a certificate signing request (CSR) and transmits it to a certificate authority (CA). The CSR has almost all of the same fields as an X.509 certificate but is missing, for example, an issuer (since issuance is what we’re trying to obtain with the request). Once the CA has the CSR, it uses its own certificate and associated private key to generate the finalized certificate, filling in fields as necessary. One of the most important fields is the “Issuer” field. The issuer of one certificate should be identical to the “Subject” field of the signer’s certificate. Once all of the fields are populated, the CA signs the certificate with its own private key.
Note: Private Keys Are Still Private
The party requesting a certificate did not send its private key to the CA. It only sent a CSR with its public key! Nobody, not even the CA, should have the private key!
We mentioned earlier that certificates are encoded in a format known as DER before signing. The DER format is, as we said, a binary format. Most on-disk representations of certificates (and CSRs and private keys) are actually in a text (ASCII) format known as PEM (“Privacy-Enhanced Mail”). Because all of the binary data has been encoded as ASCII, it is easy to send these certificates by text-based transmission systems, for example, email.
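The relationship between the two formats is simple enough to sketch by hand: a PEM certificate is the base64 encoding of the DER bytes, wrapped at 64 characters and placed between header and footer lines. The helper names below are hypothetical, and the sample bytes are arbitrary data standing in for a real DER certificate.

```python
import base64

PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def der_to_pem(der_bytes: bytes) -> str:
    """Wrap binary DER data in ASCII PEM armor."""
    b64 = base64.b64encode(der_bytes).decode("ascii")
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return "\n".join([PEM_HEADER, *lines, PEM_FOOTER]) + "\n"


def pem_to_der(pem_text: str) -> bytes:
    """Strip the PEM armor and decode back to the original binary data."""
    body = pem_text.strip()
    if not (body.startswith(PEM_HEADER) and body.endswith(PEM_FOOTER)):
        raise ValueError("not a PEM certificate")
    b64 = body[len(PEM_HEADER):-len(PEM_FOOTER)]
    return base64.b64decode("".join(b64.split()))


sample = bytes(range(256))  # arbitrary bytes, not a real certificate
assert pem_to_der(der_to_pem(sample)) == sample
```

Python’s standard ssl module offers equivalent helpers, DER_cert_to_PEM_cert and PEM_cert_to_DER_cert.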
Armed with this knowledge about certificates, Eve decides to create a certificate. Because Eve doesn’t have a certificate authority (CA) to sign her certificate, she will experiment with two alternative approaches: self-signed and signed by a “fake” CA she creates herself.
One common method for generating an X.509 certificate is using openssl from the command line. As you’re using the cryptography module (which uses OpenSSL libraries under the hood) for the exercises in this book, you should have OpenSSL installed. Eve does, so she is going to use it.
1. Generate an RSA key.
2. Create a CSR from the key.
3. Send to a certificate authority for signing (or sign it herself).
Generate a Key
In the various instructions for generating RSA keys that litter the Internet, there are many guides and walk-throughs that use a different OpenSSL command called genrsa. Please note that genpkey, which is more general, has superseded genrsa. Eve’s example command says to generate a 2048-bit private key using the RSA algorithm. The output will be saved in domain_key.pem (in PEM format).
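For comparison, a roughly equivalent key could be produced with the Python cryptography library. This is a sketch, not a reproduction of Eve’s command; the choice of the unencrypted PKCS8 output format is an assumption, and writing the file is left commented out.

```python
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa

# Generate a 2048-bit RSA private key.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

# Serialize it to PEM (unencrypted here only to keep the sketch short).
pem = private_key.private_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PrivateFormat.PKCS8,
    encryption_algorithm=serialization.NoEncryption(),
)

# To save it next to Eve's other files:
# with open("domain_key.pem", "wb") as key_file:
#     key_file.write(pem)
```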
Create a CSR from a Key
Once Eve enters all of these fields, OpenSSL produces the CSR file and saves it to disk (also in PEM format). Eve uses the same utility (openssl req) to load the CSR from disk and view the fields in a human-readable format.
You’ll notice that the version of Eve’s CSR is version 1 and not version 3. OpenSSL always assigns version 1 unless version 3 extensions are in use. But remember, this is just the request, not the actual certificate. When CAs generate the actual certificate, they may insert V3 extensions for security reasons, resulting in a certificate that is using X.509 version 3.
Additionally, some certificate fields are not present, such as “Serial Number.” Those will also be added when the CSR is signed by the CA.
In looking over Eve’s shoulder, you may also have been surprised to see that the CSR already has a signature (the data on the line following Signature Algorithm). Where did that come from? Aren’t signatures created when the issuer signs the certificate?
CSRs are typically signed by their own key as a way of indicating that the private key is actually held by the requester. Anybody could throw anyone’s public key into a CSR. By having it be self-signed, this proves to the CA that the requester is in control of the private key, sometimes called “proof of possession.” The real signing by the CA to produce a certificate is a separate process, and the next step.
Signing a CSR to Produce a Certificate
To review, let’s remember that a certificate always has to be signed by the CA/issuer. If Eve, for example, created a web site and wanted a TLS certificate for it, she would generate the CSR and send it to a CA for a signature as we discussed. This signature is their stamp of approval that Eve’s certificate is valid and she is permitted to claim the requested identity. The CA is responsible for a certain level of verification. If Eve requests an identity within the East Antarctica government, for example, the CA should determine, as part of their verification process, that she can’t claim that identity. They would then deny her request. On the other hand, she can claim an identity within her native West Antarctica and may need to provide the government with physical documentation and have an in-person meeting with a representative of the CA to prove it.
Eve does have another option besides sending her CSR to a CA. She could sign the certificate herself using the same private key. This is called generating a self-signed certificate. All root certificates (e.g., root certificates held by a CA) are self-signed. After all, the chain has to stop somewhere.
We’re getting ahead of ourselves. What is a certificate chain anyway?
We mentioned the concept briefly in Chapter 5. If you recall, when we were using our simplified (not very real) certificates, we discussed having an issuer of an issuer chain that could be arbitrarily long. That is, a party’s certificate (say Eve’s certificate) could be signed by an issuer, that is in turn signed by a “higher” issuer, that is signed by an even higher issuer, until some root certificate is the highest level issuer for the entire chain. The root certificate is signed by itself! In fact, the subject and issuer sections of a root certificate are identical.
This is one reason why verifying a certificate requires great care. You have to ensure that your certificate chain ends with a root that is trustworthy. The entire security of the system rests on this requirement. Anybody, including Eve in West Antarctica, you, or a Mafia Mob Boss in America, can create a self-signed certificate for any identity (the West Antarctica government, Google, Amazon, your bank, etc.). The only reason your browser won’t trust Eve’s self-signed certificate is because it isn’t signed by an issuer that it (the browser) already trusts.
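The shape of this check can be sketched with toy dictionary certificates in the spirit of Chapter 5. This sketch only follows issuer names up to a self-signed root and asks whether that root is trusted; it is not real validation, which must also verify every signature, validity period, and usage constraint along the way. All names here are hypothetical.

```python
def find_trusted_root(leaf_subject, certs_by_subject, trusted_roots, max_depth=10):
    """Follow issuer links from a leaf; return the root's subject if it is trusted."""
    subject = leaf_subject
    for _ in range(max_depth):
        cert = certs_by_subject.get(subject)
        if cert is None:
            return None  # the chain is broken: an issuer is missing
        if cert["issuer"] == cert["subject"]:
            # Self-signed: this is the root. Anyone can make one,
            # so it only counts if we already trust it.
            return subject if subject in trusted_roots else None
        subject = cert["issuer"]
    return None  # chain too long (or a loop)


chain = {
    "localhost": {"subject": "localhost", "issuer": "Eve CA"},
    "Eve CA": {"subject": "Eve CA", "issuer": "Eve CA"},
}
print(find_trusted_root("localhost", chain, trusted_roots={"Some Real CA"}))
```

With Eve’s root absent from the trusted set, the lookup returns None, which mirrors what a browser does with her self-signed certificate.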
How does a browser know which root certificates to trust? Most browsers are shipped with certain trusted root certificates baked in. In our hypothetical Antarctic example, East Antarctica and West Antarctica could produce browsers with only government-authorized CAs installed. This would literally prevent the two countries from communicating with each other (at least over HTTPS or TLS).
This command creates a certificate valid until 30 days from now. It is signed by domain_key.pem, which is the same key associated with the CSR. The self-signed certificate is saved in the file domain_cert.crt.
Now all of the fields are filled in. For example, Eve did not specify a serial number so one was automatically generated. The issuer field is also filled in and, as expected for a self-signed certificate, it has the same identity as the subject.
This EC key is based on the P-256 curve, which is a very popular and widely used curve and a reasonable choice.
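In the Python cryptography library, P-256 goes by the name SECP256R1. A minimal sketch of generating such a key:

```python
from cryptography.hazmat.primitives.asymmetric import ec

# SECP256R1 is the cryptography library's name for the P-256 curve.
ec_private_key = ec.generate_private_key(ec.SECP256R1())
print(ec_private_key.curve.name, ec_private_key.key_size)
```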
Now Eve has a request to create a certificate, not a signed certificate. Not yet, anyway. To create the certificate, Eve needs to sign with domain_key.pem, as she is treating that key and certificate like a CA key/cert.
She is also going to add some X.509 V3 options. These options are used for limiting how a certificate can be used. For example, Eve wants to use her first certificate and private key (domain_cert.crt and domain_key.pem) to sign her second certificate. She wants her first certificate to be able to be used as a CA. She does not, however, want her second certificate (for localhost) to be able to sign other certificates. Using V3 extensions, it is possible for Eve to encode these limitations directly into the certificate itself.
To see why this is important, imagine that Eve is granted a certificate by a real CA for wacko.westantarctica.southpole.gov. If this certificate does not have limitations on its use, nothing stops Eve from using it to sign a new certificate granting her the identity of eatsa.eastantarctica.southpole.gov. This would give Eve a chain of authority back to the CA for an identity she shouldn’t have. Thus, in order for certificate chains to mean something, Eve’s certificate must deny her the right to create other certificates.
Key Usage
Basic Constraints
Eve is going to use these fields to express that this new certificate should not be used as a CA. In fact, it will say so expressly in the “Basic Constraints” field. The “Key Usage” field will include normal key uses such as “Digital Signature,” but it will leave out uses such as signing “Certificate Revocation Lists” (CRLs).
When signing with a CA key and certificate, the signkey parameter is removed and the CA and CAkey parameters are added. The CA parameter specifies the certificate of the CA/issuer, and the CAkey parameter specifies the associated private key used for signing. Eve plugs in the private key and self-signed certificate from her first experiment.
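As an illustration, the same two restrictions can be written as extension objects in the Python cryptography library. The particular mix of key usages below is an assumption chosen for the sketch, not a copy of Eve’s configuration.

```python
from cryptography import x509

# "Basic Constraints": this certificate is explicitly not a CA.
basic_constraints = x509.BasicConstraints(ca=False, path_length=None)

# "Key Usage": ordinary uses are allowed, while signing certificates
# and signing CRLs are not. All nine flags must be given explicitly.
key_usage = x509.KeyUsage(
    digital_signature=True,
    content_commitment=False,
    key_encipherment=True,
    data_encipherment=False,
    key_agreement=False,
    key_cert_sign=False,
    crl_sign=False,
    encipher_only=False,
    decipher_only=False,
)
```

Objects like these are attached to a certificate or CSR through the builder’s add_extension method.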
Although not required when creating a self-signed certificate, Eve now has to explicitly specify a serial number when signing with a CA key and certificate. A real CA must not reuse serial numbers and must keep a record of the serial numbers issued in case the certificate needs to be revoked.
As you would expect, the issuer is not the same as the subject this time around. In fact, the issuer field from this certificate matches the subject field from the signing certificate. This is required for correct certificate chain validation.
Also, the public key algorithm is elliptic curve now instead of RSA, but the Signature Algorithm is still sha256WithRSAEncryption. That’s because this certificate is signed by the domain_cert.crt Eve created earlier, and that is still RSA.
As you can see, the X.509 V3 extensions are present and the version of the certificate is now listed as “3” as well.
This command starts the server listening on port 8888 (for your own tests, make sure your HTTP proxy is turned off or else pick a different port). It uses the localhost cert as its identity certificate, but uses the domain cert file as a list of certificates for use in building chains. The build_chain option instructs the server to attempt to build a complete chain of certificates for transmission to clients. In other words, it sends the entire chain to the client, not just the identity certificate.
Figure 8-2(b) is an image with details about the certificate chain, specifically. Notice that it received the chain (both the certificate and its issuing certificate). It recognizes the domain certificate Eve created (identified in this figure by the common name wacko.westantarctica.southpole.gov) as a “root” certificate, because it is self-signed. But it says that this root certificate is not a trusted certificate. If a root certificate is not trusted, then the security of the entire chain cannot be established.
There are ways to add a root certificate to a browser’s trusted certificate store. Eve studies this concept very closely as she might be able to use this approach to defeat TLS. We are not going to include the details in this book, however, as it is actually a pretty bad and dangerous idea. It is probably the most dangerous thing we have discussed so far. If you install a new root certificate into your browser, your browser will trust any certificate signed by that root. If, somehow, your ill-conceived trusted root escaped into the wild, an attacker could basically convince your browser that any web site was authentic.
Speaking of which, how do browsers trust any certificate authorities at all? The answer, which is uncomfortably arbitrary, is that a small number of “authorities” have established themselves as reliable root authorities. These organizations and companies have their root public keys installed by default in popular computer systems and browsers. All other trust must be derived from these arbitrary authorities. Does that make you feel safe?
In summary, however, for TLS to work correctly, it has to have correctly configured (and correctly limited) trust anchors. There may be times when engineering teams need to use self-signed certificates for testing and other temporary purposes. Generally, though, browsers will not trust them, and any TLS-enabled code you write should not trust them either.
Exercise 8.2. Certificate Practice
Generate some different TLS certificates experimenting with different algorithms and parameters (such as key sizes).
Exercise 8.3. Fantasy Certificates
Create some “fantasy” certificates for some of your favorite organizations. Self-sign a certificate or two that reads amazon.com or google.com. You can’t use these as nobody’s browser will accept them. But it is kind of a fun game.
Maybe you could print out a copy of OpenSSL’s text representation and frame it. After all, how many of your friends have an Amazon TLS certificate?
Creating Keys, CSRs, and Certificates in Python
After getting done with her OpenSSL certificate tests, Eve explores creating these same objects programmatically using the Python cryptography library. Using this library, Eve can generate self-signed certificates, certificate requests, signed certificates, and keys. Alice, Bob, Eve, and you have already generated keys in previous chapters, so let’s skip ahead to certificate requests.
The CertificateSigningRequestBuilder follows the object-oriented “builder pattern” wherein each building method returns a new copy of the builder object. This is handy for when Eve decides to construct multiple CSRs with partially overlapping parameters. One builder can be configured with the overlapping parameters, and then individual builders are created when the parameters diverge.
As a side note about X.509 extensions, you will note that the CSR created in our example sets ca=False. As with our earlier OpenSSL example, we are explicitly marking this certificate as not being able to sign other certificates (i.e., act as a CA). In this example, it also sets path_length=None, but that’s a superfluous piece of data because path_length only applies when ca=True. The critical flag indicates that this is a mandatory extension that must be understood by any software that processes the certificate.
When ready, Eve uses the sign method to build the actual CSR request object using a private key. Recall that CSRs are self-signed to ensure that the requester has the private key corresponding to the embedded public key. The sign method extracts the public key from the private key, inserts it into the CSR, and then signs with the private key. The object built by this method is an instance of CertificateSigningRequest.
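A minimal sketch of this flow, including the reuse of a partially configured builder, might look like the following. The host names and the make_name helper are hypothetical, and a single key is shared by both CSRs only to keep the sketch short.

```python
from cryptography import x509
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.x509.oid import NameOID


def make_name(common_name):
    return x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, common_name)])


key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

# Shared configuration: every CSR built from here is marked as a non-CA.
base_builder = x509.CertificateSigningRequestBuilder().add_extension(
    x509.BasicConstraints(ca=False, path_length=None), critical=True
)

# Each builder method returns a new builder, so base_builder is unchanged
# and can be extended in two different directions.
csr_a = base_builder.subject_name(make_name("a.example.test")).sign(key, hashes.SHA256())
csr_b = base_builder.subject_name(make_name("b.example.test")).sign(key, hashes.SHA256())

# The self-signature is the "proof of possession" a CA can check.
print(csr_a.is_signature_valid, csr_b.is_signature_valid)
```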
For making certificates, Eve discovers that the cryptography library follows a similar pattern as it did for making CSRs. There is a builder class and a read-only certificate class that can also be serialized to and from disk.
Interestingly, there is no method for creating a certificate from a CSR. The cryptography documentation explicitly identifies that the purpose of the certificate builder class is to generate self-signed certificates. There is no reason to start from a CSR.
Even if Eve wanted to establish a CA (for her own West Antarctic colleagues), it would be better for her to not automate CSR signing. As we discussed earlier, CAs need to verify CSR information very carefully, and sometimes manually; correctness and validity must be established before signing.
Still, Eve finds that if she needs to create a certificate from a CSR, she can load the CSR and then use its data fields to fill in the certificate builder.
From the cryptography documentation, Listing 8-2 contains an example for building a self-signed certificate. After this code runs, the certificate variable has what we need.
Note: Dot Chaining
We take advantage of the fact that each operation on the builder returns a builder. This allows the method “dot chaining” approach you see. Since the final call to “sign” returns a certificate, not a builder, we can assign the result of this long chained expression to the certificate variable.
TLS Builder
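A minimal sketch of building a self-signed certificate with the cryptography library’s CertificateBuilder might look like this. It is an illustration under stated assumptions: the common name reuses Eve’s domain, the 30-day lifetime and the CA-enabled Basic Constraints extension are choices made for the sketch, and the key is generated in place rather than loaded from disk.

```python
import datetime

from cryptography import x509
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.x509.oid import NameOID

key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

# For a self-signed certificate, subject and issuer are the same name.
name = x509.Name([
    x509.NameAttribute(NameOID.COMMON_NAME, "wacko.westantarctica.southpole.gov"),
])
now = datetime.datetime.now(datetime.timezone.utc)

certificate = (
    x509.CertificateBuilder()
    .subject_name(name)
    .issuer_name(name)
    .public_key(key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(now)
    .not_valid_after(now + datetime.timedelta(days=30))
    .add_extension(x509.BasicConstraints(ca=True, path_length=None), critical=True)
    .sign(key, hashes.SHA256())  # signed with its own private key
)
```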
To modify this example to create the certificate from a CSR, Eve can extract the subject name, public key, and optional extensions directly from the CSR object and copy them into the certificate builder. To sign the certificate with a CA certificate/key pair, Eve needs to load the CA certificate and key, copy the “Subject” field from the signing certificate into the “Issuer” field of the certificate builder, and sign using the CA’s private key.
Certificates can be loaded using load_pem_x509_certificate, then serialized for storage or transmission using the public_bytes method.
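The steps above can be sketched end to end as follows. Everything here is a stand-in: the CA is a freshly generated self-signed RSA certificate, the requester uses an EC key for localhost, and sign_csr is a hypothetical helper. Copying every requested extension without review is done only for brevity; a careful CA would decide for itself which extensions to grant.

```python
import datetime

from cryptography import x509
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import ec, rsa
from cryptography.x509.oid import NameOID


def make_name(cn):
    return x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, cn)])


def validity(days):
    start = datetime.datetime.now(datetime.timezone.utc)
    return start, start + datetime.timedelta(days=days)


# Stand-in CA: a self-signed RSA certificate.
ca_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
ca_name = make_name("wacko.westantarctica.southpole.gov")
ca_start, ca_end = validity(30)
ca_cert = (
    x509.CertificateBuilder()
    .subject_name(ca_name).issuer_name(ca_name)
    .public_key(ca_key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(ca_start).not_valid_after(ca_end)
    .add_extension(x509.BasicConstraints(ca=True, path_length=None), critical=True)
    .sign(ca_key, hashes.SHA256())
)

# Requester: an EC key and a CSR that asks to be a non-CA certificate.
leaf_key = ec.generate_private_key(ec.SECP256R1())
csr = (
    x509.CertificateSigningRequestBuilder()
    .subject_name(make_name("localhost"))
    .add_extension(x509.BasicConstraints(ca=False, path_length=None), critical=True)
    .sign(leaf_key, hashes.SHA256())
)


def sign_csr(csr, ca_cert, ca_key, days=30):
    if not csr.is_signature_valid:  # proof of possession
        raise ValueError("CSR signature check failed")
    start, end = validity(days)
    builder = (
        x509.CertificateBuilder()
        .subject_name(csr.subject)
        .issuer_name(ca_cert.subject)  # issuer = subject of the signing cert
        .public_key(csr.public_key())
        .serial_number(x509.random_serial_number())
        .not_valid_before(start).not_valid_after(end)
    )
    for ext in csr.extensions:  # copied blindly here; a real CA would review
        builder = builder.add_extension(ext.value, critical=ext.critical)
    return builder.sign(ca_key, hashes.SHA256())


leaf_cert = sign_csr(csr, ca_cert, ca_key)

# Serialize to PEM and load it back, as one would for storage or transmission.
pem = leaf_cert.public_bytes(serialization.Encoding.PEM)
reloaded = x509.load_pem_x509_certificate(pem)
```

Note that the reloaded certificate carries an elliptic curve public key but an RSA-based signature, because the signature algorithm is determined by the CA’s key and not the subject’s.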
Exercise 8.4. OpenSSL To Python And Back
Generate a CSR with Python and sign it with OpenSSL.
Generate a CSR with OpenSSL, open it in Python, and create a self-signed certificate from it.
Exercise 8.5. Certificate Intercept In The Middle
In the next section, we will talk about TLS, the security protocol that underlies HTTPS. TLS relies on the certificates you learned about in this section. Going back to your HTTP proxy, intercept some more HTTPS traffic and see if you can figure out when the certificate is being sent.
This is a tough exercise and more for those interested in experimentation and tinkering. As a hint, certificates are not sent in PEM format, but DER. This is a binary format. But it’s not encrypted. You can try poking around for certain binary byte combinations. You could also use openssl to convert the certificates you’ve created into DER format and examine them in a hex editor to see if there are common bytes to look for.
Exercise 8.6. Certificate Modification In The Middle
If you do manage to find when certificates are going over the wire, modify your HTTP proxy program to intercept and modify them. At the very least, you could just have a certificate of your own pre-loaded that you send instead. How does your browser feel about this?
An Overview of TLS 1.2 and 1.3
With a little bit of knowledge about X.509 certificates under her belt, Eve turns to studying the TLS protocol. As you follow along, you should recognize that the TLS protocol draws on cryptography components that we have studied through all the preceding chapters. This is a chance for you and Eve to see how all of the pieces are put together in a modern security protocol.
The goal of the TLS protocol is to provide transport security (TLS stands for “Transport Layer Security”). The TCP/IP protocol suite, upon which the Internet is built, does not have any security guarantees. It does not provide confidentiality, which is why Eve was able to use an HTTP proxy to read the data being sent between two parties.
At least as bad, if not worse, is the fact that TCP/IP does not provide authenticity either. Eve could have used her HTTP proxy with a few modifications to masquerade as the true destination (example.com), and Alice and Bob would have had no idea. The TCP/IP protocol suite also does not provide message integrity. The proxy could change the data and the change would not be detected.
TLS is designed to add these security features on top of TCP/IP. The protocol originated as the “Secure Sockets Layer” (SSL) protocol from Netscape in the mid-1990s. Version 2 was the first public release, followed by version 3 shortly thereafter. Subsequently, it received a few changes and was renamed TLS 1.0. The updated versions since that time have been released to update cryptography and alleviate problems with the cryptographic protocol. Version 1.2 has been around for a number of years and is still considered current. Recently, version 1.3 was also released, but is not currently being described as a replacement to 1.2 (both versions are considered current).
How does TLS work? It starts with a handshake. That handshake is extremely critical. Keep in mind that TLS has two major goals: first, establish identity and, second, mutually derive session keys for secure transport. These two goals are typically achieved by a successful TLS handshake.
The handshake is also where the various TLS versions are most different from one another. For this section, we will review the TLS 1.2 handshake and then briefly discuss how TLS 1.3’s handshake is different. The TLS 1.3 changes will make more sense after the TLS 1.2 handshake has been explained.
Please note that this section is somewhat academic. There isn’t much in the way of programming for Eve to experiment with. This background will help her understand how TLS is supposed to work and places where it has gone wrong in the past. Eve can use this information to figure out which servers are going to be easier to crack than others.
At the same time, you, the reader, will benefit from watching Eve’s attempts to break through the cryptography shield that TLS is supposed to provide. Throughout this entire book, we’ve been pounding it into your head that you shouldn’t create your own algorithms and shouldn’t create your own implementations whenever a well-tested library is present.
TLS is actually a protocol you can and should use, and there is plenty of library support for it in Python, which helps a lot. Still, you want to know what kinds of things Eve will be looking for if she wants to attack your system. Let’s dive in.
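As a taste of that library support, here is a minimal sketch using Python’s standard ssl module. The helper names are hypothetical. The default context loads the system’s trusted root certificates and turns on certificate and host name verification, so a self-signed certificate like Eve’s would be rejected.

```python
import socket
import ssl


def make_client_context():
    # Sensible defaults: trusted system roots, certificate verification,
    # and host name checking are all enabled.
    return ssl.create_default_context()


def describe_tls_connection(host, port=443, timeout=5.0):
    """Perform a TLS handshake and report the negotiated version and cipher."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The handshake happens inside wrap_socket.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version(), tls.cipher()


context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Calling describe_tls_connection("www.example.com") on a machine with network access would perform a real handshake and return the negotiated protocol version and cipher suite.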
The Introductory “Hellos”
TLS 1.2 begins with a client sending a client “hello” message to the server. The client hello message includes information about its TLS configuration, as well as a nonce. One of those bits of configuration is the client’s list of cipher suites. Possibly one of the most confusing characteristics of TLS to newcomers is that the TLS protocol is actually a combination of protocols that work together. And it supports a number of different algorithms and protocol combinations.
The hello message must, out of necessity, get the client and the server preparing to communicate using the same algorithms and component protocols. The client sends a list of cipher suites to indicate all the different ways that it is willing to talk and the server will select one in its response (presuming there is any overlap between the cipher suites they support).
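The selection step can be sketched in a few lines. This sketch assumes the server walks its own preference list and picks the first suite the client also offered; whether the server’s or the client’s ordering wins is a configuration matter in practice. The suite names in the example are real TLS 1.2 suite names, but the preference lists are made up.

```python
def select_cipher_suite(client_suites, server_preferences):
    """Return the first server-preferred suite that the client also offered."""
    offered = set(client_suites)
    for suite in server_preferences:
        if suite in offered:
            return suite
    return None  # no overlap: the handshake cannot continue


client = [
    "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384",
    "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
]
server = [
    "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
    "TLS_RSA_WITH_AES_128_CBC_SHA",
]
print(select_cipher_suite(client, server))
```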
A cipher suite for TLS typically includes one choice of algorithm each for key exchange, signing, bulk encryption, and hashing. As we said, TLS brings together all the different elements you have been learning about in this book, so these terms should look familiar! Consider, for example, the cipher suite TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384, which breaks down as follows:
TLS: The protocol the cipher suite is meant for. Easy enough.
ECDHE: As described in Chapter 6, the client and server will use ECDHE to create a symmetric key.
ECDSA: Recall from learning about ECDHE that it is not authenticated. In order to be sure that the server is who it claims to be, it will use ECDSA signatures on some of the handshake data.
AES_256_CBC: After the handshake is over, the client and server will send data protected by AES-256 in CBC mode.
SHA_384: This parameter has to do with two different parts of the TLS operation. The SHA-384 algorithm will be used in a key derivation function during the handshake. Additionally, the bulk encryption messages sent after the handshake (encrypted by AES-256 in CBC mode) will be protected from modification by HMAC-SHA-384.
These elements will make more sense as we go through the rest of the TLS protocol. Meanwhile, it is a good introduction to the number of components that are a part of TLS operations.
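The naming convention can be pulled apart mechanically. The sketch below assumes the suite described above is written TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384 and that names follow the TLS 1.2 pattern with a WITH separator. The function name split_cipher_suite is our own invention, not part of any library.

```python
def split_cipher_suite(name):
    """Split a TLS 1.2 style cipher suite name into labeled components."""
    protocol, _, rest = name.partition("_")
    exchange_and_auth, _, cipher_and_hash = rest.partition("_WITH_")

    parts = exchange_and_auth.split("_")
    key_exchange = parts[0]
    # Suites such as TLS_RSA_WITH_... name one algorithm that fills both roles.
    authentication = parts[1] if len(parts) > 1 else parts[0]

    # The hash is the last underscore-separated token; the rest is the cipher.
    bulk_cipher, _, hash_name = cipher_and_hash.rpartition("_")

    return {
        "protocol": protocol,
        "key_exchange": key_exchange,
        "authentication": authentication,
        "bulk_cipher": bulk_cipher,
        "hash": hash_name,
    }


suite = split_cipher_suite("TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384")
for role, algorithm in suite.items():
    print(role, "=", algorithm)
```

This is only a string exercise, but it is a handy way for Eve (or you) to scan a list of suites and flag the ones that use a key exchange or cipher mode known to be weak.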
Note: ECDH vs. ECDHE
Throughout this book, we have not made too much of a distinction between DH/ECDH and DHE/ECDHE. As a reminder, the “E” stands for “ephemeral.” When DH/ECDH is used in ephemeral mode, the public/private key pair is used once and discarded.
The reason we haven’t made an effort to say “DHE” instead of “DH” is that in many contexts DH is implicitly ephemeral.
This is not the case with TLS. There are modes of operation that are not ephemeral at all. Accordingly, we will use the full DHE/ECDHE term throughout this chapter to be explicit.
Notice that the strength of TLS depends greatly on its cipher suite. What is a little frightening is that two servers can be “using” TLS 1.2 where one server is strongly protected and the other is vulnerable to attack because of the choice of cipher suite. Don’t ignore the hello part of the TLS handshake! It is really important!
The server responds to the client hello with its own set of messages:
Hello: The server’s hello message includes its own random nonce.
Certificate: The server’s TLS certificate or certificate chain, the details of which we covered earlier in this chapter.
Key exchange: If the cipher suite uses DHE or ECDHE, the server will also transmit its portion of the Diffie-Hellman exchange along with the hello. For RSA key transport, the server does not send this element.
Finished: An end-of-message kind of marker.
Exercise 8.7. Who Goes There?
If you’ve been practicing with your HTTP proxy (specifically in the previous exercise), you should already be getting a feel for the back and forth exchanges in TLS. So, now that you know how a TLS handshake starts with the initial hellos, try to reverse engineer it a little. Remember, this part of the communication is clear text!
Can you figure out whether you’re looking at a TLS 1.2 or 1.3 handshake? That’s a great start!
Client Authentication
The most popular configuration of TLS today authenticates only the server when the server’s certificate is sent with the ServerHello. Unless explicitly requested by the server, the client will not send a certificate to authenticate itself.
For a lot of Internet applications, this is sufficient. The servers are running on the Internet, broadcasting their information to the world. They want to prove to the world that they are who they say they are. Anyone is welcome to come and visit without proving their identity. Plus, exactly what a client’s identity should be is less clear. A server’s identity is usually tied to a domain name (e.g., google.com) or an IP address. But when you are browsing the Internet, what should your computer’s identity be?
For circumstances where the server needs to identify the client, like bank transactions, or any other kind of account access, the user’s identity (rather than the machine’s) is what really matters. Usernames and passwords (or other kinds of personal identification) are what the server concerns itself with in those cases. Conceptually, by first authenticating the server and creating a shared key with it, the user can then safely identify themselves to the server using something like a password without worrying about disclosing their confidential information to the wrong party.
There are times, however, when security policy dictates that the client device must also be authenticated. When TLS is thus configured, it is referred to as “Mutual TLS” (MTLS). In this mode, the server lets the client know that it requires a certificate and proof of certificate ownership.
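On the server side, asking for a client certificate is mostly a matter of configuration. Here is a minimal sketch using Python’s ssl module. The function name and its file path parameters are hypothetical, and the file loading is skipped when no paths are supplied so that the example runs without any certificates on disk.

```python
import ssl


def make_mutual_tls_server_context(cert_file=None, key_file=None, client_ca_file=None):
    """Build a server context that demands a certificate from every client."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)

    if cert_file:
        # The server's own certificate chain and private key.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if client_ca_file:
        # The CA certificate(s) used to verify client certificates.
        context.load_verify_locations(cafile=client_ca_file)

    # CERT_REQUIRED makes the handshake fail unless the client presents a
    # certificate that verifies against the loaded CA certificates.
    context.verify_mode = ssl.CERT_REQUIRED
    return context


context = make_mutual_tls_server_context()
print(context.verify_mode)
```

A real deployment would pass all three paths. Without a loaded certificate chain the context cannot actually complete a handshake.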
Exercise 8.8. Client Authentication Research
Mutual TLS is not used very frequently, but it is used. The authentication of clients, even when certificates are used, is often a little different. Do a little Internet search about how to configure a browser with a client certificate, how one obtains such a certificate, and what kind of identifier is chosen for the subject.
Deriving Session Keys
Recall from Chapter 6 that a very common configuration for cryptography is to have asymmetric operations used to exchange or generate a symmetric session key. In that same chapter, we discussed two different ways of doing that: key transport and key agreement.
In the TLS 1.2 handshake, the goal is to get both the client and the server the same copy of a symmetric key. Actually, that’s not completely true. The goal is to get what is called the “pre-master secret” (PMS). The PMS, along with some other non-secret data, will be used to generate the “master secret.” The master secret will be used to generate all the necessary session keys for bulk data communications.
TLS 1.2, through its various cipher suites, provides both key transport and key agreement approaches to provisioning the PMS.
TLS cipher suites that begin with TLS_RSA use RSA encryption for key transport. One example is the cipher suite TLS_RSA_WITH_3DES_EDE_CBC_SHA. You might notice that for ECDHE in our previous example we also required ECDSA signatures. Why do we not need RSA or ECDSA signatures with RSA key transport?
As we said in the previous section, if ECDHE or DHE is being used for key exchange, the server sends those parameters along with the server hello. But if RSA key transport is used, it sends nothing. Instead, in RSA key transport mode, the client receives the server’s certificate that was sent with the server’s hello, extracts the public key, and encrypts the PMS with the public key. It transmits the encrypted PMS to the server and only the server can subsequently decrypt it. Now both client and server have the same PMS.
DHE and ECDHE behave differently. They are called key agreement protocols because the PMS is not transmitted. Instead, both sides exchange DH/ECDH ephemeral public keys that can be used to simultaneously derive the PMS on both sides. As a reminder, exchanged DH/ECDH public keys are not like RSA or ECDSA public keys in the certificate. The DH/ECDH public keys are generated on the spot and are used only once. That’s what makes them ephemeral.
That is also why they can’t be trusted. If the public key was just made up on the spot, how does the client know that the public key really came from the server? How does the server know that the public key it received really came from the client?
The long-term RSA or ECDSA private key is used by the server to sign its DHE or ECDHE public key and parameters (e.g., curve). When the client receives them, it can use the server’s public key in the certificate to verify that the DHE or ECDHE data came from the proper source. As discussed in the previous section, usually the client does not sign anything.
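The agreement half of this process can be sketched with nothing more than modular arithmetic. The example below uses classic finite-field Diffie-Hellman with toy parameters that we chose purely for illustration (a 127-bit prime and a generator of 3). They are far too weak for real use, and the signature over the server’s public value described above is deliberately left out.

```python
import secrets

# Toy parameters for illustration only. Real TLS uses large standardized
# groups or elliptic curves, never a 127-bit prime like this one.
P = 2**127 - 1
G = 3


def make_ephemeral_key_pair():
    """Generate a fresh private exponent and the matching public value."""
    private = secrets.randbelow(P - 3) + 2  # a value in the range [2, P - 2]
    public = pow(G, private, P)
    return private, public


def derive_shared_secret(own_private, peer_public):
    """Combine our private exponent with the peer's public value."""
    return pow(peer_public, own_private, P)


# Each side generates a key pair on the spot and sends only the public value.
client_private, client_public = make_ephemeral_key_pair()
server_private, server_public = make_ephemeral_key_pair()

# Both sides arrive at the same secret without ever transmitting it.
client_pms = derive_shared_secret(client_private, server_public)
server_pms = derive_shared_secret(server_private, client_public)
print(client_pms == server_pms)
```

Because nothing in this sketch ties server_public to the server’s certificate, Eve could substitute her own public value in transit. That is exactly the gap the server’s signature closes.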
The security of these two approaches is very different. As we have already discussed in Chapter 6, the DH/ECDH approach provides perfect forward secrecy, while the RSA encryption approach does not. Furthermore, the RSA encryption approach has the pre-master secret generated entirely by the client. The server has to trust that the client is not reusing the same pre-master secret (or generating them from poor sources of randomness).
Even though the session key derivation from the pre-master secret depends on additional data—including the ClientHello nonce and the ServerHello nonce—that prevents a trivial replay attack, reusing the pre-master secret is suboptimal and potentially reduces the security of the system. On the other hand, when using DH/ECDH the server and the client both contribute to the generation of the key material, ensuring that the server is not wholly dependent on the client for this value.
The RSA encryption scheme is problematic for one other reason: it uses PKCS #1 v1.5 padding. You found that this scheme was vulnerable to a padding oracle attack in Chapter 4. TLS 1.2 has “countermeasures” designed to eliminate the oracle (remember, for the attack to work, the attacker needs to know when the padding was accepted), but unfortunately they aren’t always successful. As described in more detail later in this chapter, this attack is still a threat.
For these reasons and others, most security experts are encouraging TLS servers to stop using RSA encryption for key transport. At the very least, this form of key exchange should be an option of last resort.
Exercise 8.9. Key Exercises
Try re-creating TLS’s key transport and key agreement operations. Let’s start with key transport. Start by taking one of the RSA certificates you’ve generated. If you were a browser, this is what you would receive over the wire. Create a Python program to import the certificate, extract the RSA public key, and use it to encrypt some random bytes (i.e., like a key) that you write back to disk.
There were already exercises in Chapter 6 for key agreement, even over a network. If you didn’t do those exercises, try them now.
Switching to the New Cipher
Once the client has finished sending the key exchange information (either using RSA encryption or DHE/ECDHE), it no longer needs to send data in the clear. All subsequent information should be sent encrypted and authenticated.
To signal this, the client sends a message called a ChangeCipherSpec message to the server. This basically says that everything else sent from the client from this point forward will be sent using the negotiated cipher. Once the server has received the client key exchange data, it can also derive the session keys. As with the client, there is no further reason to communicate in the clear and the server sends its own ChangeCipherSpec message.
Each side then sends a Finished message to complete the handshake. The Finished message has a hash of all the handshake messages sent thus far, and because it is sent after the ChangeCipherSpec message, it is encrypted and authenticated under the new cipher suite.
The purpose of this hash of handshake messages is to prevent an attacker from altering any of the messages sent in the clear before the changed cipher spec. For example, if an attacker intercepted and altered the client hello message, they could eliminate tougher ciphers and leave weak ones enabled, decreasing the difficulty of cracking the system. However, both sides keep a record of the messages sent and transmit a hash over all of these messages under the new cipher suite. If the hashes don’t match, then what one side sent is not what the other side received. The communications channel is considered compromised in this case and is immediately closed.
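The detection logic is easy to simulate. In the sketch below, each side hashes its own record of the handshake and the two digests are compared. The message contents are made-up placeholders rather than real TLS encodings, and the sketch compares bare hashes for simplicity. In real TLS 1.2 the hash is additionally run through a keyed function using the master secret before it is placed in the Finished message.

```python
import hashlib
import hmac


def transcript_hash(handshake_messages):
    """Hash every handshake message, in order, into a single digest."""
    digest = hashlib.sha256()
    for message in handshake_messages:
        digest.update(message)
    return digest.digest()


# What the client actually sent (placeholder bytes, not real TLS messages).
sent_by_client = [b"client hello: suites=STRONG,WEAK", b"client key exchange"]

# What the server received after an attacker stripped the strong suite.
seen_by_server = [b"client hello: suites=WEAK", b"client key exchange"]

untampered = hmac.compare_digest(
    transcript_hash(sent_by_client), transcript_hash(list(sent_by_client))
)
tampered = hmac.compare_digest(
    transcript_hash(sent_by_client), transcript_hash(seen_by_server)
)

print("untampered handshake matches:", untampered)
print("tampered handshake matches:", tampered)
```

Any change to any message, no matter how small, produces a different digest, so the downgrade attempt is caught before application data flows.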
Deriving Keys and Bulk Data Transfer
At this point, the TLS 1.2 handshake is over. The client has verified the server’s identity using public key certificates, and both sides share a pre-master secret.
Regardless of how the pre-master secret is generated, both client and server derive keys using it. These keys are used to create a secure authenticated channel using symmetric encryption and message authentication. Application data is sent using this channel. But first, let’s talk about these derived keys.
In this book, we’ve derived keys from data using a number of methods. Many are built in one form or another around hashing. In TLS 1.2, the pre-master secret is expanded into the “master secret” using what the specification calls the “pseudo-random function” (PRF). By default, the PRF is built using HMAC-SHA256 using an expansion mechanism based on HMAC being called repeatedly; the output from one call is fed into another to expand data to any arbitrary size. The PRF can also be built using a different underlying mechanism if specified by the cipher suite.
As a reminder, the idea of key expansion is simply to take a secret and expand it into more bytes. In the case of TLS, we expand the pre-master secret, whatever size it is, into 48 bytes. This is the master secret. The master secret is, itself, expanded into as many bytes as necessary for all of the session keys and IVs required by a cipher suite. Different suites require different parameters and different sizes, so the final output of the master secret expansion, called the key_block, is of variable length. The key_block is partitioned, in order, into the following values:
client write MAC key
server write MAC key
client write key
server write key
client write IV
server write IV
You will notice that there are no read keys listed. That’s because these are symmetric keys. In other words, the server’s write key is the client’s read key.
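To see how the expansion and the partitioning fit together, here is one possible sketch of the HMAC-SHA256 PRF as described in RFC 5246, followed by the derivation of a master secret and a key_block. The helper names are our own, the random inputs stand in for values a real handshake would supply, and the sizes at the bottom assume an AES-128-GCM style suite (no MAC keys, 16-byte encryption keys, 4-byte implicit IVs).

```python
import hashlib
import hmac
import os


def p_sha256(secret, seed, length):
    """Chain HMAC-SHA256 calls until at least `length` bytes are produced."""
    output = b""
    a = seed  # A(0) is the seed itself
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i) = HMAC(secret, A(i-1))
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return output[:length]


def prf(secret, label, seed, length):
    """The TLS 1.2 PRF: P_hash over the label concatenated with the seed."""
    return p_sha256(secret, label + seed, length)


def split_key_block(key_block, mac_len, key_len, iv_len):
    """Slice the key_block into the six values, in the order listed above."""
    layout = [
        ("client write MAC key", mac_len),
        ("server write MAC key", mac_len),
        ("client write key", key_len),
        ("server write key", key_len),
        ("client write IV", iv_len),
        ("server write IV", iv_len),
    ]
    keys, offset = {}, 0
    for name, size in layout:
        keys[name] = key_block[offset:offset + size]
        offset += size
    return keys


# Stand-ins for the values exchanged during a real handshake.
pre_master_secret = os.urandom(48)
client_random = os.urandom(32)
server_random = os.urandom(32)

master_secret = prf(
    pre_master_secret, b"master secret", client_random + server_random, 48
)

mac_len, key_len, iv_len = 0, 16, 4  # assumed sizes for an AES-128-GCM suite
key_block = prf(
    master_secret,
    b"key expansion",
    server_random + client_random,
    2 * (mac_len + key_len + iv_len),
)
keys = split_key_block(key_block, mac_len, key_len, iv_len)

for name, value in keys.items():
    print(name, value.hex())
```

Note that the nonces are concatenated client-first for the master secret and server-first for the key expansion. A suite that uses a separate HMAC would pass a nonzero mac_len and receive two MAC keys at the front of the block.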
Exercise 8.10. Implement The PRF
Look in RFC 5246, available online, and look up the PRF. There is a description of how it works on pages 13 and 14. Implement the PRF for HMAC-SHA256 and try out some key expansion. Generate a hundred bytes or so and divide some up for different keys.
Not all of these parameters are used for every cipher suite either. AEAD algorithms such as AES-GCM and AES-CCM do not need a MAC key. Even so, every cipher suite provides both confidentiality and authentication. This either involves encrypting and applying a MAC or using AEAD encryption.
Speaking of which, the AES-CBC modes in TLS 1.0 are vulnerable to a padding oracle attack because they apply MAC first, then encrypt. This is vulnerable to the same attack you performed as an exercise in Chapter 3. While TLS 1.2 should theoretically not be vulnerable to this, some implementations did not follow the specification correctly and were found to be vulnerable. For this reason, CBC modes of operation have fallen out of favor in recent years.
It’s also good to understand where the MAC is applied. We had a brief discussion about this issue back in Chapter 5. You might remember that we talked about how much data would someone want to encrypt before they include a MAC. In a communications context, would you wait until the very end of a communications session to send a MAC of all data transmitted? That’s probably a bad idea. After all, what if the communications session lasted a month! It would be a terrible thing to reach the end of the month and find out that all of the data received was bogus. TLS chooses instead to put a MAC on every packet (after the ChangeCipherSpec).
If you’re not familiar with C-style structs, this is really just a raw data structure. It’s kind of like a class in Python but without any methods. The structure has type, version, and length fields that are reasonably straightforward. The exact types of ContentType and ProtocolVersion are defined elsewhere in the document, but the intent is clear even without looking them up.
The select statement is perhaps a little more confusing. What this part of the struct is expressing is that there is a fragment field, but its type is one of three options: GenericStreamCipher, GenericBlockCipher, and GenericAEADCipher. Each of these three options represents a different kind of cipher.
Just to be clear, the struct shown here is conceptual. This struct shows how data is laid out and concatenated in binary form in a way that is easy to understand, as well as hierarchical (data structures within data structures). When sending data, TLS constructs a stream of binary data with these pieces in it, in this order.
The content field for both the GenericStreamCipher and GenericBlockCipher types is the plaintext (potentially compressed). The stream-ciphered and block-ciphered keywords in front of the respective structs indicate that the binary data is encrypted. The MAC for both of these cipher types is within the enciphered structure. The documentation states that these MACs are computed over the content type, version, length, and the plaintext itself, together with a record sequence number. Obviously, this is a MAC-Then-Encrypt scheme.
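The MAC computation for a stream or block cipher record can be sketched as follows. RFC 5246 prepends a 64-bit record sequence number to the type, version, and length before the plaintext, and that is what this example does. The function name is our own, the key is random filler, and HMAC-SHA-384 is chosen only to match the example cipher suite from earlier in the chapter.

```python
import hashlib
import hmac
import os
import struct

APPLICATION_DATA = 23  # ContentType value for application data
TLS_1_2 = (3, 3)       # TLS 1.2 is encoded on the wire as version 3.3


def record_mac(mac_key, sequence_number, content_type, version, fragment):
    """Compute the MAC over the sequence number, header fields, and plaintext."""
    # 8-byte sequence number, 1-byte type, 2-byte version, 2-byte length.
    header = struct.pack(
        "!QBBBH", sequence_number, content_type, version[0], version[1], len(fragment)
    )
    return hmac.new(mac_key, header + fragment, hashlib.sha384).digest()


mac_key = os.urandom(48)  # stands in for the client write MAC key
message = b"GET / HTTP/1.1\r\n\r\n"

mac_for_record_0 = record_mac(mac_key, 0, APPLICATION_DATA, TLS_1_2, message)
mac_for_record_1 = record_mac(mac_key, 1, APPLICATION_DATA, TLS_1_2, message)

print(mac_for_record_0.hex())
print(mac_for_record_0 != mac_for_record_1)
```

In a MAC-Then-Encrypt suite, the sender would next encrypt the plaintext together with this MAC (and any padding). Including the sequence number means that the same message sent twice produces two different MACs, which keeps Eve from replaying or reordering records unnoticed.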
The GenericAEADCipher struct has no separate MAC field because the authentication tag is included by default in the AEAD output. Recall from Chapter 7 that the “AD” in AEAD means “additional data” that is authenticated, but not encrypted. In the case of TLS AEAD ciphers, the AD includes the same data to which the MAC is applied in the stream and block ciphers, namely, the content type, version, and length. By inserting this AD directly into the decryption process, the algorithm will not decrypt the plaintext unless the contextual data is correct. This helps reduce errors and ensure correctness.
Importantly, because there is a MAC for each record, the AEAD encryption is finalized for each TLSCiphertext chunk. In Chapter 7, we discussed the idea of not wanting to wait for gigabytes of data before determining that the ciphertext has been modified. Accordingly, the AEAD algorithm is run with a unique key and IV (nonce) combination on each one of these TLSCiphertext structures (the same key and IV pair must never be reused after finalizing an encryption and producing a tag).
The GenericAEADCipher struct defined for TLS includes a nonce_explicit field that carries a certain amount of IV/nonce data. For AEAD algorithms, it is common to have an implicit part of the IV and an explicit part of the IV. The implicit part is calculated. For TLS 1.2, the server (or client) IV derived in the key derivation operation is the implicit part. Both parties calculate this internally without sending it over the network. The explicit part included in the fragment makes up the rest of the IV/nonce, permitting the nonce to be unique for each packet.
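As a concrete illustration, the AES-GCM suites for TLS 1.2 use a 4-byte implicit part and an 8-byte explicit part to form a 12-byte nonce. The sketch below assembles such a nonce along with the additional data for a record. Using the record sequence number as the explicit part is one permitted choice that we assume here, and the function names are our own.

```python
import os
import struct

APPLICATION_DATA = 23  # ContentType value for application data
TLS_1_2 = (3, 3)       # TLS 1.2 is encoded on the wire as version 3.3


def build_gcm_nonce(implicit_iv, sequence_number):
    """Join the 4-byte implicit IV with an 8-byte explicit part."""
    nonce_explicit = struct.pack("!Q", sequence_number)
    return implicit_iv + nonce_explicit, nonce_explicit


def build_additional_data(sequence_number, content_type, version, plaintext_length):
    """Assemble the bytes that are authenticated but not encrypted."""
    return struct.pack(
        "!QBBBH", sequence_number, content_type, version[0], version[1], plaintext_length
    )


client_write_iv = os.urandom(4)  # stands in for the derived implicit IV

nonce_0, explicit_0 = build_gcm_nonce(client_write_iv, 0)
nonce_1, explicit_1 = build_gcm_nonce(client_write_iv, 1)
additional_data = build_additional_data(0, APPLICATION_DATA, TLS_1_2, 18)

print(nonce_0.hex(), nonce_1.hex())
print(additional_data.hex())
```

Only the explicit 8 bytes travel with the record. The receiver already holds the implicit part, rebuilds the full nonce, and feeds the same additional data into decryption so that any change to the record header causes the tag check to fail.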
Exercise 8.11. The TLS 1.2 Pieces
Try stringing together something similar to TLS 1.2 from the other exercises in the chapter so far. Exchange a certificate over the network (you can leave it in PEM format if it’s easier). Once you get the server’s certificate, have the client either send back a PMS encrypted or use ECDHE to generate the PMS on both sides.
You can leave out all of TLS’s complicated stuff. You don’t need to negotiate cipher suites, create an underlying record layer, or do the hash over all messages at the end. Exchange a certificate, get a PMS, and derive some keys. For “packet” structure, you can use the same JSON dictionaries you did for the Kerberos exercises.
TLS 1.3
The TLS 1.3 protocol represents the biggest change to the handshake process in the history of TLS. It also pares the list of cipher suites down to just five:
TLS_AES_256_GCM_SHA384
TLS_CHACHA20_POLY1305_SHA256
TLS_AES_128_GCM_SHA256
TLS_AES_128_CCM_8_SHA256
TLS_AES_128_CCM_SHA256
Basically, TLS 1.3 supports AES-GCM, AES-CCM, and ChaCha20-Poly1305. You have seen all three of these algorithms in this book. By reducing the cipher suites available and requiring AEAD, TLS 1.3 makes it much harder for servers to accidentally or unknowingly secure their web site with weak encryption or authentication.
RSA encryption is also no longer available as a key transport mechanism.
Technically, there is a second message from the client in the form of a “finished” message, but as shown in the figure, it can be piggybacked with the client’s first application message. The server may have already transmitted application data piggybacked with its handshake message as well.
This speedup is especially important for stateless protocols like HTTP. Most HTTP messages are single-shot, one-time transmissions. Setting up a new TLS 1.2 tunnel for every single message really slows down a web site’s speed and responsiveness. Cutting that latency in half makes a big difference for web communications.
More importantly, weak ciphers and modes have been removed. By eliminating RSA key transport, for example, TLS 1.3 makes forward secrecy mandatory! Limiting algorithms to AEAD is also an important improvement.
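If you control the client, you can insist on these protections. The following sketch configures a Python ssl context to refuse anything older than TLS 1.3. It assumes a Python version of 3.7 or later and checks the HAS_TLSv1_3 flag because support depends on the underlying OpenSSL build.

```python
import ssl

context = ssl.create_default_context()

if ssl.HAS_TLSv1_3:
    # Handshakes that cannot negotiate at least TLS 1.3 will now fail.
    context.minimum_version = ssl.TLSVersion.TLSv1_3

# List the suites that the library reports as belonging to TLS 1.3.
tls13_suites = [
    suite["name"] for suite in context.get_ciphers() if suite["protocol"] == "TLSv1.3"
]
print(context.minimum_version)
print(tls13_suites)
```

The trade-off is compatibility: a client configured this way cannot talk to servers that only support TLS 1.2 or earlier.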
There are other differences and details for both protocols not covered here, but this is sufficient for an introduction.
Warning: Extra Terrible Lacking Security (eTLS)
There is a “variation” of TLS 1.3 being promoted called eTLS. We put variation in quotes because it is not a standard developed by the IETF, the standards body behind TLS. It takes TLS 1.3 and removes some of its most important security features including forward secrecy.
The purported motivations are data loss prevention (DLP), performance, and other usability reasons. But we, ourselves, do not support cryptographic standards that intentionally weaken protocols and algorithms. We strongly recommend that you not use eTLS under any circumstances and applaud browsers that refuse to support it. Be aware that eTLS will be renamed Enterprise Transport Security (ETS) in a future release [9].
Exercise 8.12. What’s Broken Now?
Do some research to see if you can find new vulnerabilities that have been uncovered in TLS (any version) since the publication of this book. It’s important to stay up to date on vulnerabilities happening all around you and a mitigation path forward. It’s a terrible thing when bad guys find out you’re vulnerable before you do.
Certificate Verification and Trusting Trust
So far, Eve has identified several areas of TLS that she might attack:
Padding oracle attack against RSA encryption in some versions and implementations of TLS.
Padding oracle attack against AES-CBC encryption in some versions and implementations of TLS.
Attempting to coerce the client and server into using a weak cipher suite.
There are defenses to all of these, but they are areas that Eve can examine. Maybe she’ll get lucky and find a poorly configured server. We will explore these attacks, and a few others, shortly. But first, Eve decides to look at one other potentially massive vulnerability: certificate checking.
In the preceding section, we made only the briefest of references to certificate verification. When a client receives a server’s certificate, the client must ensure that the certificate is valid and trusted. The server’s certificate may rely on a chain of CAs, and the verification process is said to follow a certificate path. The path must terminate with a trusted root. Along the way, the client must check the following:
The server certificate’s subject name must match the expected host name from the URI (e.g., if we navigated to https://google.com, then google.com needs to be the subject of the TLS certificate).
The host name can match the subject’s common name, or
The host name can match one of the subject’s alternate names (V3 extension).
None of the certificates in the path can be expired.
None of the certificates in the path can be revoked.
The issuer of a certificate must be the subject of the next certificate in the chain until the root is reached.
Certificate limitations (such as KeyUsage and BasicConstraints) are enforced.
Policies are enforced related to maximum path length, name constraints, and so forth.
Eve realizes that this is a complicated process. There are a lot of checks to be made, and an error in any one of them might grant her access. Many TLS exploits have less to do with the protocol and more to do with programmer or user errors.
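A classic programmer error is switching these checks off, often “temporarily” to get past a certificate warning during development. The sketch below contrasts Python’s default client context, which performs the checks, with a context where they have been disabled. The variable names are our own.

```python
import ssl

# The default context verifies the certificate path and the host name.
safe_context = ssl.create_default_context()
print(safe_context.verify_mode == ssl.CERT_REQUIRED)
print(safe_context.check_hostname)

# What NOT to do: this context accepts any certificate from anyone.
unsafe_context = ssl.create_default_context()
unsafe_context.check_hostname = False  # must be disabled before verify_mode
unsafe_context.verify_mode = ssl.CERT_NONE
print(unsafe_context.verify_mode == ssl.CERT_NONE)
```

A connection made with unsafe_context is still encrypted, but the client has no idea who is on the other end. For Eve, that is as good as no TLS at all.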
The entire security of TLS depends on certificates being issued to authorized parties. If Eve can get an unauthorized certificate, steal a private key, or convince Alice or Bob (or you) that she has an authorized certificate, the rest of the security breaks down. The most powerful certificate attack Eve could attempt is to convince Alice or Bob (or you) to install an evil root certificate! If that happens, TLS will accept any certificate Eve chooses to send!
Certificate Revocation
We mentioned in Chapter 5 that certificates have a big weakness in the realm of revocation. Unfortunately, revoking a certificate is a major pain, and Eve is looking closely at how she can exploit this.
There are two classic approaches to revoking certificates. The first is a certificate revocation list (CRL). As the name suggests, this is just a static record of certificates that have been revoked. To keep the size of the CRL manageable, the certificate is identified by its serial number. CRLs are often CA-specific and are signed by the CA, so it is important that the CA keep track of issued serial numbers. It must ensure that no serial number is used more than once, and it must ensure that the serial number matches the expected owner information. CRLs tend to be published on a fixed schedule (e.g., once per day).
Certificate verification systems, such as one used in TLS, must keep a list of all revoked certificates so that any such detected certificates can be invalidated during the verification process.
The other classic approach to checking for revocation is to use the Online Certificate Status Protocol (OCSP). As with CRLs, this protocol is used to check the validity of a certificate by serial number lookup. Unlike CRLs, however, this protocol is used with an online server in real time and can be executed during the certificate validation process. Once again, the issuing CA is often the OCSP responder for certificates that they have issued.
Obviously, OCSP will have more up-to-date information than static CRLs. OCSP, however, introduces additional latency into a TLS handshake setup. Worse, what should a client, like a browser, do if the OCSP responder doesn’t respond? Should it not connect? Should it tell the user that “I’m sorry, I can’t let you do online banking today because the OCSP server is down?”
Most browsers refuse to take this hard line. If the browser can’t get an OCSP response, it just moves forward and assumes that the certificate isn’t revoked. This makes Eve super excited. If she can get a revoked certificate (or a certificate that is immediately revoked once her theft is discovered), she can use it against Alice’s and Bob’s browsers. If the browsers try to reach out to OCSP servers, she will just execute a denial-of-service attack and ensure that the OCSP responses are never received. It’s an easy way around the security measure.
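The difference between the two policies fits in a few lines. The sketch below is a simulation rather than a real OCSP client: the responder functions, the serial numbers, and the status strings are all invented for illustration.

```python
def is_certificate_acceptable(serial_number, query_responder, hard_fail=False):
    """Decide whether to accept a certificate based on a revocation lookup."""
    try:
        status = query_responder(serial_number)
    except TimeoutError:
        # Soft-fail: no answer is treated as "not revoked".
        # Hard-fail: no answer means the connection is refused.
        return not hard_fail
    return status == "good"


REVOKED_SERIALS = {0x1234}  # hypothetical serial number of a stolen certificate


def working_responder(serial_number):
    """Simulates a responder that answers every query."""
    return "revoked" if serial_number in REVOKED_SERIALS else "good"


def blocked_responder(serial_number):
    """Simulates Eve's denial-of-service attack on the responder."""
    raise TimeoutError("no OCSP response received")


print(is_certificate_acceptable(0x1234, working_responder))
print(is_certificate_acceptable(0x1234, blocked_responder))
print(is_certificate_acceptable(0x1234, blocked_responder, hard_fail=True))
```

With the soft-fail policy, the revoked certificate is rejected only when the responder can be reached. Once Eve blocks the response, the same certificate sails through.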
For these and many other reasons, CRLs and OCSP are considered obsolete. Many browsers, such as Google Chrome, don’t even have an option to turn these features on.
The truth is, revocation is still a hard problem and Eve is going to do everything she can to exploit this fact.
The good news is, new forms of certificate revocation are being explored right now including mandatory OCSP stapling. The concept for this is that a server includes an OCSP response along with their certificate. The OCSP response is only good for a relatively short period of time, so the server has to refresh regularly. The full details of this approach are beyond the scope of this book, but this might be a good topic of research for Alice and Bob.
Untrustworthy Roots, Pinning, and Certificate Transparency
Unfortunately for us (and to Eve’s delight), as with all known approaches to establishing trust, TLS requires a trusted third party. And, as the Roman poet Juvenal would say, “Quis custodiet ipsos custodes?” (“Who guards the guards?” or “Who watches the watchmen?”)
What is problematic about CAs is that if a CA private key is compromised, the thief can generate certificates for themselves for any domain. This is not a theoretical problem. By way of example, there was a successful attack on the now defunct DigiNotar CA in 2011 [8]. The attacker infiltrated their servers and managed to generate forged certificates including a “wild card” certificate for google.com, plus additional certificates for Yahoo, WordPress, Mozilla, and TOR. The DigiNotar CA had to be removed from the trusted CA list of browsers and mobile devices. Unsurprisingly, DigiNotar went out of business almost immediately after the attack was uncovered.
For a more recent, and in some ways more disturbing, example, Trustico, a TLS certificate reseller, asked DigiCert to revoke more than 20,000 certificates. That, by itself, was not problematic. The certificates were being revoked because of a loss of trust in the issuer. What was shocking was the admission that Trustico had the private keys for these certificates and had sent them to DigiCert by email [4]! This means that the reseller was generating the key pairs for their customers and holding on to the private key. Although reportedly kept in “cold storage,” in theory the reseller, an employee of the reseller, or a disgruntled former employee of the reseller could have taken a customer’s private key and assumed their digital identity.
This particular problem of a CA keeping customer private keys cannot be solved technologically. If a party gives up their private key, there are no mechanisms for keeping them secure. All cryptography rests on keeping secrets secret.
The issue of fraudulent and misused certificates is more serious and more common. Eve desperately wants to compromise a CA or a CA’s cert if she can (specifically one trusted by Alice or Bob). Stealing one cert only gives her one fraudulent identity. Stealing a CA cert gives her an unlimited number of fraudulent identities.
Fortunately, there are methods that Alice and Bob can use to protect themselves. Let’s look at two of them.
The first is “certificate pinning.” The term is used in a number of different ways, so make sure you are careful in your research. The basic concept is that a client like Alice or Bob has, one way or another, an expectation of what a certificate should be before receiving it. When the certificate is received, it is compared to the expected version—the “pinned” version—and a policy is invoked if there is a mismatch. It is assumed that a mismatch means, with high probability, that Eve is using a fraudulent certificate.
Although pinning is more general, some sources treat the more specific HTTP Public Key Pinning (HPKP) as a synonym. Perhaps this is because there was a time when some parties, including Google, were pushing for this technology as a general solution to identifying and rejecting compromised certificates. Since then, there has been a general consensus that this approach is insufficient and the new move is toward “certificate transparency” (CT).
Pinning (as a general concept) continues to have its uses even so, especially in mobile applications. An app on a phone, for example, can have its author’s certificate baked into the app itself. This pinned version of the cert is always compared against the cert received in the TLS handshake. If it doesn’t match, something is wrong. Should the company need to change out their certificate or rotate a key, they can push a new pinned version in an app upgrade. Mobile applications aside, the Chrome and Firefox browsers also do this kind of static pinning.
This is effective. Google actually discovered the issue with the compromised DigiNotar-issued Google certificate because of static pinning.
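One simple way to implement static pinning is to store a hash (a “fingerprint”) of the expected certificate and compare it against the certificate received. The sketch below does this with SHA-256. The certificate bytes are placeholders, and the function names are our own. In a real client, the DER bytes could come from calling getpeercert(binary_form=True) on a connected ssl socket.

```python
import hashlib
import hmac


def certificate_fingerprint(der_bytes):
    """Return the SHA-256 fingerprint of a DER-encoded certificate."""
    return hashlib.sha256(der_bytes).hexdigest()


def matches_pin(der_bytes, pinned_fingerprint):
    """Compare the received certificate against the pinned fingerprint."""
    return hmac.compare_digest(certificate_fingerprint(der_bytes), pinned_fingerprint)


# Placeholder bytes standing in for real DER-encoded certificates.
expected_certificate = b"DER bytes of the certificate shipped with the app"
forged_certificate = b"DER bytes of a certificate issued to Eve"

# The pin is computed once and baked into the application.
PINNED_FINGERPRINT = certificate_fingerprint(expected_certificate)

print(matches_pin(expected_certificate, PINNED_FINGERPRINT))
print(matches_pin(forged_certificate, PINNED_FINGERPRINT))
```

Pinning the whole certificate means the pin must change whenever the certificate is renewed. Some deployments pin only the public key instead so that the pin survives a reissued certificate.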
Exercise 8.13. Monitor Certificate Rotation
Assuming you successfully intercepted TLS certificates in your HTTP proxy program, visit a site multiple times and see if you receive the same certificate every time. How often do you expect a server’s certificate to change?
HPKP, on the other hand, is a general purpose, dynamic pinning technology that relies on trust-on-first-use (TOFU) principles. Basically, the first time a client visits a web site, that web site can request that the client pin the certificate for a certain period of time. Should the certificate change within that period of time, it should treat the modified certificate as an imposter. The idea is interesting and reasonable, but it introduces a number of problems and can still be exploited by attackers in unhappy ways. Hence, the idea is already dying out.
Instead, the aforementioned certificate transparency (CT ) is a second method of addressing certificate issues that is gaining momentum. The basic idea is in some ways similar to blockchain and distributed ledgers. Whenever a certificate is issued, it is also submitted to a public log. The public log is hosted by a third party, perhaps even the CA that issued the certificate, but it is verifiable so that the third party does not have to be trusted.
The purpose of the log is transparency: CAs are thus essentially audited for the certificates they produce. The goal is to have all issued certificates publicly available for inspection in a cryptographically verifiable way. Browsers will eventually be configured to reject any certificate that is not found in such a log.
What do we get from using CT logs? It’s deceptively simple but surprisingly helpful. Suppose that Eve attempts to create a fake certificate for an EA server. If EA browsers will not accept the cert unless it is published, Eve will have to submit it to one of the public logs. If that happens, EA can immediately detect that a forged certificate has been generated. While this does require that EA be monitoring the logs, it is easy to deploy an automated system that checks to see if any new certificates have been issued that shouldn’t have been. EA knows (or should know) which certs it has legitimately issued and can flag ones that aren’t.
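The automated check described here boils down to a set difference. The sketch below assumes the log entries for EA’s domains have already been downloaded and reduced to certificate fingerprints. The function name and the sample fingerprints are hypothetical, and no real CT log API is used.

```python
def find_unexpected_certificates(logged_fingerprints, issued_fingerprints):
    """Return fingerprints that appear in the public log but were never issued by us."""
    return sorted(set(logged_fingerprints) - set(issued_fingerprints))


# Hypothetical data: what EA issued versus what a public log reports for EA's domain.
issued = {"a1b2", "c3d4"}
logged = ["a1b2", "c3d4", "ffee"]
suspicious = find_unexpected_certificates(logged, issued)
```

Anything left in suspicious is a certificate that somebody else obtained for EA’s name and deserves immediate attention.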
Even if Eve is so clever as to somehow interfere with East Antarctica’s auditing system and does manage to get away with some subterfuge, once the attack is detected, the public logs will enable a thorough investigation of the problem and an accurate assessment of the damage. It is terrifying that in the DigiNotar hack, investigators were unable to even fully identify all the certificates that had been generated! To this day, nobody knows exactly how many certificates the attacker created. That is one reason why DigiNotar had to completely shut down. It was impossible to identify all of the certificates that needed to be revoked.
CT is still somewhat new, so it may continue evolving over time. It does not, for example, provide a mechanism for verifying revocation, and there is already a proposal for “revocation transparency” to be added to it. This is definitely the technology to watch and to start using as soon as possible.
Known Attacks Against TLS
Eve will always be trying to break certificates in some way or another. If she gets past that gate, everything else is broken. Of course, if Alice and Bob are using DHE or ECDHE for forward secrecy, only future traffic is exposed; sessions recorded in the past stay protected.
Beyond certificates, there are some other contemporary attacks against TLS to be aware of. The following is a brief overview of well-known attacks against TLS and how to prevent them.
POODLE
POODLE stands for “Padding Oracle On Downgraded Legacy Encryption.” TLS 1.0, as we’ve discussed, could be exploited when using CBC mode. At the time, the block cipher was DES, but the attack works on DES or AES so long as the mode of operation is CBC.
TLS 1.1 and 1.2 were supposed to fix this problem by changing how the CBC encryption was padded. But the POODLE attack showed that even servers running 1.1 and 1.2 could be forced to negotiate down to TLS 1.0, where they could be attacked.
Worse, it was later discovered that some TLS 1.1 and 1.2 implementations were using the same padding as TLS 1.0 (contrary to specifications). This kind of error caused no problem with normal communications because the two padding schemes are compatible for legitimate traffic. It is only when the data is attacked that it becomes clear that the padding is wrong. Implementations with this flaw were vulnerable even without the downgrade.
1. Disable TLS 1.0 (and 1.1 really).
2. Verify that TLS 1.2 is not vulnerable using an auditing tool.
FREAK and Logjam
The Logjam attack, like POODLE, relies on forcing a downgrade. In this case, though, the target of the downgrade is the cipher suite rather than the protocol version.
In the 1990s, the US government had a policy of not allowing strong cryptography to be exported to foreign countries. The government’s policy treated these kinds of algorithms as weapons. Security software still bears the scars of this policy, and there were specific TLS cipher suites that were called EXPORT algorithms. These algorithms were, in fact, very weak.
In Logjam, an attacker intercepts the client’s message and removes all of the proposed cipher suites and replaces them with EXPORT variants of Diffie-Hellman (DH). The server picks weak parameters accordingly and sends them back to the client. The client doesn’t know that anything is wrong and just accepts the server’s poorly chosen configuration.
The resulting keys are easily broken.
Notice that the Finished message of the TLS protocol should detect this attack. The whole point of sending a message with a hash of all messages exchanged during the handshake was to reveal this kind of manipulation.
The problem is that the Finished message is sent encrypted under the new (weak) key. If Eve is attempting this attack, she can intercept the real message while still cracking the key. Once the key is cracked, she can create a false Finished message and encrypt it now using the cracked session key. Unless the time it takes to get the key cracked is longer than internal timeouts, Eve can succeed.
FREAK is a very similar attack to Logjam, but uses “export” RSA parameters instead.
1. Disable weak cipher suites, especially “export” ciphers, on the server.
2. Use clients that unconditionally refuse to accept weak parameters (e.g., DH/ECDH or RSA parameters that are weak).
Sweet32
The Sweet32 attack is a little different from the ones we’ve seen before. It is designed specifically for block ciphers that have a block size of 64 bits. For most TLS 1.2 installations, there is only one cipher in use that has such a block size: 3DES.
Although a full explanation of 3DES is beyond the scope of this book, it uses DES underneath. It is slow, but it at least isn’t as weak as DES. DES keys can be compromised in fairly reasonable time; 3DES cannot, yet.
Nevertheless, 3DES uses a 64-bit block size. The block size of an algorithm impacts how much data should be encrypted under a single key before rotation. The math is outside the scope of this book, but for a block size of n bits, security breaks down once more than $2^{n/2}$ blocks have been encrypted. For 64-bit block sizes, the limit is about 32 GB of data, which is easily generated on modern computers. Even worse, $2^{n/2}$ is an upper bound! Vulnerabilities creep in much sooner in practice.
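The 32 GB figure is easy to reproduce. The short calculation below is only a sketch of the arithmetic; the function name birthday_bound_bytes is made up for this example.

```python
def birthday_bound_bytes(block_bits):
    """Bytes encrypted under one key when 2^(n/2) blocks of n bits have been processed."""
    blocks = 2 ** (block_bits // 2)   # the 2^(n/2) block limit
    bytes_per_block = block_bits // 8
    return blocks * bytes_per_block


GIB = 2 ** 30
limit_64 = birthday_bound_bytes(64)    # 3DES and other 64-bit block ciphers
limit_128 = birthday_bound_bytes(128)  # AES
print(limit_64 // GIB, "GiB for a 64-bit block")
```

For a 128-bit block cipher such as AES, the same formula gives 2^68 bytes, which is why the limit only becomes a practical concern with 64-bit blocks.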
Sadly, many TLS implementations do not enforce maximum data limits with a key. The Sweet32 attack exploits this to send enough data to force collisions and recover data.
Disable 3DES-based cipher suites (and any other 64-bit ciphers if any happen to be present).
ROBOT
Recall that in Chapter 4 we spent a lot of time beating up on RSA. We showed that it was trivially defeated when used without padding. We also showed that certain forms of padding could be exploited as well. In particular, PKCS #1 v1.5 is vulnerable to a padding oracle attack. This is the very padding that is used for RSA encryption in TLS, up to and including version 1.2.
Bleichenbacher published the attack against PKCS #1 v1.5 in 1998. Obviously, that was long before TLS version 1.2. Why wasn’t it changed?
For compatibility reasons, the designers behind TLS decided to keep the same padding scheme and insert countermeasures. As we mentioned earlier in the chapter, the padding oracle attack requires an oracle! If the TLS protocol can keep from revealing the success or failure of padding, it should eliminate the attack.
Unfortunately, it isn’t that simple. ROBOT stands for “Return Of Bleichenbacher’s Oracle Threat.” What the researchers behind ROBOT found is that TLS countermeasures aren’t always successful. They also found new ways to extract oracle information from TLS, and they were able to demonstrate that their attack was practical. They could, for example, sign messages for Facebook without access to the appropriate private keys.
Disable all cipher suites that use RSA encryption for the key exchange (any cipher that starts with TLS_RSA).
CRIME, TIME, and BREACH
TLS version 1.2 provides for compression of data before encryption. Compression has been removed from TLS 1.3. The problem with compression is that it leaks information to people like Eve. That information can be used to recover information within the ciphertext.
CRIME, which stands for “Compression Ratio Info-leak Made Easy,” was first demonstrated in 2012. The problem with compression is that it really only works well if data is repeated. So, even if you only have the ciphertext of some compressed plaintext, if you can insert or partially insert messages, a drop in the ciphertext size strongly suggests that there was some repeated data resulting in a better compression ratio. This information can be used to recover small numbers of bytes. Any loss of data, no matter how small, is unacceptable. But if the data being attacked is already small (e.g., a web cookie with authentication information), a small number of bytes lost can be catastrophic.
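The size leak is easy to observe with Python’s zlib module. The following is a simplified sketch and not the CRIME attack itself: the cookie value is made up, the attacker’s data is simply appended to the secret, and a whole guess is tested at once rather than one byte at a time.

```python
import zlib

# Made-up secret that the victim's client sends on every request.
SECRET = b"Cookie: session=7f3a9c2e41"


def compressed_length(attacker_data):
    """Length Eve would observe after the secret and her data are compressed together."""
    return len(zlib.compress(SECRET + b"\r\n" + attacker_data))


# A guess that repeats the secret compresses better than one that does not.
right_guess = compressed_length(b"Cookie: session=7f3a9c2e41")
wrong_guess = compressed_length(b"Cookie: session=XqZpLmWvKb")
print(right_guess, wrong_guess)
```

Encryption hides the bytes but not the length, so Eve can tell which guess was closer just by measuring the ciphertext.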
CRIME was followed by TIME, which was slightly more effective. It also inspired BREACH, which is a different attack, but also uses compression to reveal information.
Disable compression.
Heartbleed
Heartbleed deserves a special mention in our list because it is not a vulnerability in TLS itself. Rather, it was a bug in OpenSSL’s implementation (yes, the library you’ve been using). Specifically, it was a bug in an extension to TLS that enables heartbeats for detecting dead connections. Although an extension, it is a commonly used one.
The problem with OpenSSL’s implementation was that it did no bounds checking on heartbeat requests received from the other side. A typical heartbeat request included some data to echo back and the length of the data. If the length was longer than the data to echo, the incorrect implementation simply read contents out of memory. Although there were no guarantees on what would be included in those contents, it might include private keys and other secrets.
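The missing check can be illustrated with a small simulation. This is not OpenSSL’s code: a Python bytes object stands in for process memory, and both function names are invented for the example.

```python
def echo_without_bounds_check(memory, claimed_length):
    """Flawed handler: trusts the length field supplied by the peer."""
    return memory[:claimed_length]


def echo_with_bounds_check(memory, payload_length, claimed_length):
    """Fixed handler: refuses requests that claim more data than was actually sent."""
    if claimed_length > payload_length:
        return None  # drop the malformed heartbeat
    return memory[:claimed_length]


# Simulated memory: the 4-byte heartbeat payload followed by unrelated secrets.
payload = b"ping"
memory = payload + b"PRIVATE-KEY"
leaked = echo_without_bounds_check(memory, 15)
```

With a claimed length of 15 and only 4 bytes of payload, the flawed handler returns the neighboring secret along with the echo.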
The point of this vulnerability is to indicate that not all attacks are on the protocols themselves but sometimes on the implementations. It is important to watch for both kinds of issues.
Keep TLS libraries and applications up to date.
Using OpenSSL with Python for TLS
We have done a lot of talking in this chapter, but not a lot of programming. This background was helpful to Eve, though, and hopefully helpful to you. Let’s get our hands dirty just a little bit to wrap up.
Many of Python’s built-in networking operations have TLS support (often under parameter names referencing SSL because that name has persisted even after 20 years of TLS). Eve is concerned about TLS keeping her from sniffing traffic. From what she’s learned in this chapter, however, she has seen that there are a lot of ways to do things wrong. Eve decides to walk through some examples to see what she might exploit.
She begins by connecting to a TLS server like Alice and Bob might do. Execute the code from the beginning of this chapter but, for simplicity, this time without the HTTP proxy snooping in the middle.
The bad news for Eve (and the good news for you) is that Python is trying to make sure programmers don’t shoot themselves in the foot. This code, by default, tries to do a number of things reasonably correctly where SSL is concerned. The default parameter loads the system’s trusted certificates, validates the host name, and verifies the certificate. These things might sound obvious, but some APIs require the programmer to implement all of these checks on their own, increasing the risk of leaving something out or implementing it incorrectly.
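A connection with the default checks might look like the following sketch. The helper names default_client_context and fetch_over_tls are assumptions for this example, and the listing is not necessarily the one Eve ran.

```python
import http.client
import ssl


def default_client_context():
    """Build the context Python recommends for TLS clients."""
    return ssl.create_default_context()


def fetch_over_tls(host, path="/"):
    """Fetch a page over HTTPS; a bad certificate raises ssl.SSLCertVerificationError."""
    conn = http.client.HTTPSConnection(
        host, context=default_client_context(), timeout=10
    )
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

Calling fetch_over_tls("www.example.com") should succeed against a properly configured public site, while a server presenting a certificate that does not chain to a trusted root should be refused.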
It rejected Eve’s certificate, as is to be expected. After all, it has no reason to trust it. The certificate sent by the server (s_server) is not rooted in a valid certificate authority. The Python code, by default, did the right thing. Eve curses under her breath.
Still, after searching through Python documentation, Eve discovers that Python will let you shoot yourself in the foot if you really, really want to.
Eve is pleased! She successfully received a response from s_server. Why?
The SSLContext object contains TLS configuration parameters and controls (at least partially) the processing of the TLS handshake including certificate checking. An empty SSLContext does no checking on certificates.
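For comparison, here is a sketch of a context with the checks switched off. It makes one assumption worth stating: the context is created with ssl.PROTOCOL_TLS_CLIENT, which enables both checks by default in current Python versions, so they have to be disabled explicitly. The function name is hypothetical.

```python
import ssl


def unverified_client_context():
    """Build a context that accepts ANY certificate. For experiments only."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Host name checking must be turned off before verification can be disabled.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context
```

A client using this context will talk to any server that completes a handshake, which is exactly what Eve is hoping for.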
In fact, the Python documentation recommends not creating an SSLContext in this way. Instead, programmers should typically use ssl.create_default_context(). This function creates an SSLContext that performs the default checks Eve encountered earlier that resulted in a rejected certificate.
To verify that the trust system is working, Eve re-runs this test with the verify_mode = ssl.CERT_REQUIRED left in but the load_verify_locations left out. It results in the failed certificate check she saw earlier. Only by telling her context where her roots of trust are was she able to get her certificates validated.
There’s yet another check that is currently disabled: host name checking. Recall that when validating a certificate, the certificate should have the same subject name (either in the distinguished name’s Common Name or in the Subject Alternative Name extension) as the host URI. Eve created this localhost certificate with the common name of 127.0.0.1 on purpose so she could run host name matching tests. When she browses to https://127.0.0.1, she wants the certificate’s subject name to match.
She re-runs the test code and it still works. Even though the URI is https://127.0.0.1 and the subject common name is wacko.westantarctica.southpole.gov, the data was permitted. Without host checking enabled, this mismatch doesn’t result in an error.
Once host name checking is turned on, the same connection fails: TLS complains that the host name (127.0.0.1) doesn’t match the subject name (wacko.westantarctica.southpole.gov).
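The two switches Eve has been toggling, the roots of trust and host name checking, can be gathered into one helper. This is a sketch under assumptions: the function name custom_trust_context and its parameters are invented here, and the caller supplies the path to whatever root certificate file they trust.

```python
import ssl


def custom_trust_context(ca_file=None, check_hostname=False):
    """Context that requires a valid certificate chain from explicitly chosen roots."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Host name matching is optional here so that both of Eve's tests can be run.
    context.check_hostname = check_hostname
    context.verify_mode = ssl.CERT_REQUIRED
    if ca_file is not None:
        # Without this call the context trusts no roots, so every certificate fails.
        context.load_verify_locations(cafile=ca_file)
    return context
```

Leaving ca_file out reproduces the failed certificate check, while passing check_hostname=True should reproduce the host name mismatch error when the URI and the subject name differ.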
In general, programmers that don’t want Eve getting fake certificates past them shouldn’t be messing around with these parameters. The default context with its default checking is a good start.
Exercise 8.14. Social Engineering
This is a thought exercise; there is no programming involved. How might Eve try to get others using less secure software? What could she do to convince them to use a poorly configured SSL context?
The additional functionality does have important uses, though. What if Alice and Bob would like to do static certificate pinning? Maybe Bob is running a command and control server, and Alice is in the field with a Python program that needs to securely communicate with it. How can Alice pin the certificate to Bob’s server? There isn’t an API for the SSLContext to do this. It can only specify trusted CA certificates. It has no method for specifying a trusted server certificate.
Instead, the application can hash the certificate it receives from the server. The hash can be compared against a pinned value to ensure that it’s the expected certificate. Certificate pinning, especially static certificate pinning, might be a good idea in certain contexts.
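One way to do the comparison is sketched below. It assumes a SHA-256 fingerprint over the DER-encoded certificate, which is a common convention but only one possible choice, and all three function names are hypothetical.

```python
import hashlib
import hmac


def certificate_fingerprint(der_bytes):
    """SHA-256 fingerprint (hex) of a DER-encoded certificate."""
    return hashlib.sha256(der_bytes).hexdigest()


def matches_pin(der_bytes, pinned_fingerprint):
    """Compare a certificate against the fingerprint baked into the program."""
    actual = certificate_fingerprint(der_bytes)
    return hmac.compare_digest(actual, pinned_fingerprint.lower())


def peer_matches_pin(tls_socket, pinned_fingerprint):
    """Check the certificate presented on an already connected ssl.SSLSocket."""
    der_bytes = tls_socket.getpeercert(binary_form=True)
    return der_bytes is not None and matches_pin(der_bytes, pinned_fingerprint)
```

This check belongs after the normal handshake validation, not in place of it. If the pin does not match, the program should close the connection before sending anything sensitive.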
Unfortunately for Alice and Bob, there isn’t yet an API for using CT logs. The Python cryptography library is starting to add support, but it appears right now to be limited to extensions in X.509 certificates. There is no API for submitting a serial number to get a CT response nor a mechanism for submitting a certificate to a log for insertion.
Again, keep your eyes on this (Eve certainly will). There will probably be new additions to Python libraries soon.
If Eve had her way, she would love to see Alice and Bob writing their own certificate-checking algorithms. She wishes they would do something like that instead of using Python’s built-in checker.
Note that the tbs_certificate_bytes are the DER-encoded (not PEM-encoded) bytes that are hashed for signing the certificate. So, in the sample code, the issuer’s public key is used to check the signature in the certificate over those bytes. To repeat, the signature is not over the PEM data.
The reason Eve wants Alice and Bob to do this is that this is just a small part of real certificate validation! In the preceding code, there are no checks for valid dates, no checks against revocation lists, and not even checks that the client certificate’s issuer matches the subject line for the issuing certificate. There are a lot of ways to get this wrong, and Eve is far more likely to find a vulnerability if Alice and Bob use their own methods.
If you are smarter than Alice and Bob, leave certificate verification up to library operations. If you really feel that you want to do some specialized verification, do it in addition to, not in place of, these widely deployed and widely tested library functions.
Finally, beyond correct certificate checking, there is one other set of parameters that Eve decides to investigate: the supported TLS versions and supported cipher suites.
With respect to versions, even though TLS 1.0 and 1.1 are deprecated, most TLS implementations continue to support them for backward compatibility and legacy operations. This is almost always the wrong thing to do. Servers and clients should be disabling TLS 1.0 and 1.1 by default and only re-enabling them if this causes some kind of real, concrete, unresolvable problem. Eve hopes to find that she can use attacks like POODLE, Logjam, and FREAK against servers that still support these legacy versions.
Happily for Eve, she finds out that these vulnerable versions are still very much present. SSLv3 and SSLv2 are disabled, but this isn’t enough. TLS 1.0 absolutely must be disabled and TLS 1.1 should be as well.
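Turning off the legacy versions on the client side can be done through the context’s minimum_version attribute (available in Python 3.7 and later). The function name below is an assumption for this sketch.

```python
import ssl


def modern_client_context():
    """Default client context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()
    # TLS 1.0 and 1.1 handshakes will now fail instead of being negotiated.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

Newer Python releases may already default to this floor, but setting it explicitly documents the intent and guards against older installations.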
The default list on Eve’s test computer is very bad for her (good for us!). No RSA encryption for key exchange, no AES-CBC mode ciphers, and no 3DES. It doesn’t look like Alice and Bob need to make any changes. According to the Python documentation, most of the weak ciphers have already been disabled. Still, it doesn’t hurt to check.
If Alice and Bob do have any ciphers that use RSA encryption for key exchange (e.g., TLS_RSA_WITH_AES_128_CBC_SHA), they should remove them from the cipher suites by curating the list returned by get_ciphers and then update the SSLContext using the set_ciphers method.
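The curation step might look like the following sketch. It relies on the kea and protocol fields of the dictionaries returned by get_ciphers, which are reported by current OpenSSL builds, and it leaves TLS 1.3 suites alone because set_ciphers does not control them. Note that get_ciphers reports OpenSSL’s own names (for example, AES128-SHA), not the TLS_RSA_... names. The function name is hypothetical.

```python
import ssl


def drop_rsa_key_exchange(context):
    """Remove cipher suites that use RSA encryption for the key exchange."""
    keep = [
        cipher["name"]
        for cipher in context.get_ciphers()
        if cipher.get("kea") != "kx-rsa"          # skip RSA key exchange
        and cipher.get("protocol") != "TLSv1.3"   # TLS 1.3 suites are configured separately
    ]
    if keep:
        # set_ciphers takes an OpenSSL cipher string: names separated by colons.
        context.set_ciphers(":".join(keep))
    return context


hardened = drop_rsa_key_exchange(ssl.create_default_context())
```

On a system whose defaults are already clean, the function changes nothing of substance, which is the outcome Alice and Bob want to confirm.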
Eve sighs and then leaves the room. She’s on her way back to East Antarctica to try some new approaches to stealing information. She might try to fake a certificate, or she might try to find a vulnerable TLS implementation. It might be a challenge; it might take some time, but Eve is patient, crafty, and persistent. And she’s always listening.
Exercise 8.15. Learn To Poke Around
One of the best things you can do with your newly acquired (or improved) cryptography knowledge is learn to poke around. Most of the example code for this chapter was written as if executed in a Python shell on purpose. Get comfortable using the shell to poke a server or test a connection. There are many tools for testing publicly accessible TLS servers, but what about internal ones? If you find that your company is using poor security for internal TLS connections, let IT know. It’s important to be aware of what’s going on around you.
With that in mind, write a diagnostic program in Python that connects to a given server and looks for weak algorithms or configuration data. For example, you have seen that the SSLSocket class has the getpeercert() method to get the remote certificate. Write a program that, upon connecting to a server, obtains the certificate and reports if the signature on the certificate uses a SHA-1 hash (very broken and unlikely) or still supports RSA encryption (more probable).
You can also use the SSLSocket object to check the current cipher using cipher(). Which cipher suite is the server picking out of all the ones proposed? Is that a good choice?
Building on this cipher check, change your Python SSLContext to only support weak ciphers. That is, create a context that disables strong ciphers and re-enables weak ones. You can set a context’s ciphers using the SSLContext.set_ciphers() function. The list of available cipher suites, for each version of TLS, can be found at www.openssl.org/docs/manmaster/man1/ciphers.html. The goal of this test is to see if a server is still supporting older, deprecated ciphers.
Should your analysis tool uncover any weaknesses, report them to the appropriate IT or administrative staff with recommendations for remediation.
The End of the Beginning
Well, reader, this is the end of this book. Hopefully it’s a beginning for you. There is a lot to learn about cryptography and, to repeat for the thousandth time, this is just an introduction. You have learned much, but you are not a (crypto) Jedi yet!
Eve, representative of the EVEr listening EaVEsdropper, is not to be underestimated. Eve, along with Alice and Bob, was sometimes made out to be a little behind the times throughout most of this book. The truth is that Eve is always on the forefront of technology. There are still a lot of ways TLS servers get successfully attacked on a regular basis. Keep an eye out for news and updates about TLS. Unfortunately, new vulnerabilities and weaknesses are discovered more often than we’d like, and there are many who love to see and exploit them.
The good news is that, with strong cipher suites in use and legacy versions of TLS disabled, you already have a lot of good security in place. This chapter is an introduction to TLS security in Python programming. If you can understand the concepts in this chapter, it will be a good foundation to build on, but keep learning! Eve’s most effective weapon against us is ignorance.
Python aside, if you’re running a TLS-enabled web site, take time to occasionally have your site reviewed by a TLS audit program. For example, Qualys SSL Labs currently runs a free project to report on a site’s TLS hygiene. You can try it out here: www.ssllabs.com/ssltest/index.html.
Also, check in on the cryptodoneright.org web site. This project aims to keep crypto users as informed and well advised as possible.
In short, let’s make Eve’s life as difficult as possible. There will always be risks, but don’t give her any easy wins. Make any victories painful and short-lived. After all, she is always keeping us on our toes, so we should return the favor!
Exercise 8.16. Three Cheers!
This is the last exercise in the book! Give yourself a round of applause for reaching this point.
And as you close the cover, please feel free to send the authors feedback, good or bad. We would especially like to know if we’ve missed anything!