In case you haven’t heard, there’s a new SSL/TLS vulnerability making the rounds. Nicknamed Logjam, the new attack is ‘special’ in that it may admit complete decryption or hijacking of any TLS connection you make to an improperly configured web or mail server. Worse, there’s at least circumstantial evidence that similar (and more powerful) attacks might already be in the toolkit of some state-level attackers such as the NSA.
This work is the result of an unusual collaboration between a fantastic group of co-authors spread all around the world, including institutions such as the University of Michigan, INRIA Paris-Rocquencourt, INRIA Paris-Nancy, Microsoft Research, Johns Hopkins and the University of Pennsylvania. It’s rare to see this level of collaboration between groups with so many different areas of expertise, and I hope to see a lot more like it. (Disclosure: I am one of the authors.)
The absolute best way to understand the Logjam result is to read the technical research paper. This post is mainly aimed at people who want a slightly less technical form. For those with even shorter attention spans, here’s the TL;DR:
It appears that the Diffie-Hellman protocol, as currently deployed in SSL/TLS, may be vulnerable to a serious downgrade attack that restores it to 1990s “export” levels of security, and offers a practical “break” of the TLS protocol against poorly configured servers. Even worse, extrapolation of the attack requirements, combined with evidence from the Snowden documents, provides some reason to speculate that a similar attack could be leveraged against protocols (including TLS, IPSec/IKE and SSH) using 768- and 1024-bit Diffie-Hellman.
I’m going to tackle this post in the usual ‘fun’ question-and-answer format I save for this sort of thing.
What is Diffie-Hellman and why should I care about TLS “export” ciphersuites?
Diffie-Hellman is probably the most famous public key cryptosystem ever invented. Publicly discovered by Whit Diffie and Martin Hellman in the late 1970s (and a few years earlier, in secret, by UK GCHQ), it allows two parties to negotiate a shared encryption key over a public connection.
Diffie-Hellman is used extensively in protocols such as SSL/TLS and IPSec, which rely on it to establish the symmetric keys that are used to transport data. To do this, both parties must agree on a set of parameters to use for the key exchange. In traditional (‘mod p‘) Diffie-Hellman, these parameters consist of a large prime number p, as well as a ‘generator’ g. The two parties now exchange keys as shown below:
|Classical Diffie-Hellman (source).|
TLS supports several variants of Diffie-Hellman. The one we’re interested in for this work is the ‘ephemeral’ non-elliptic (“DHE”) protocol variant, which works in a manner that’s nearly identical to the diagram above. The server takes the role of Alice, selecting (p, g, g^a mod p) and signing this tuple (and some nonces) using its long-term signing key. The client responds with g^b mod p and the two sides then calculate a shared secret.
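To make the exchange concrete, here is a minimal sketch in Python. The parameters (p = 23, g = 5) are toy values picked purely for readability and the helper name `dh_keypair` is made up for this example. Real DHE also has the server sign its tuple, which is omitted here.

```python
import secrets

# Toy parameters (assumption: chosen for illustration, hopelessly small for real use).
p = 23   # prime modulus
g = 5    # generator of the multiplicative group mod 23

def dh_keypair(p, g):
    """Pick a random secret exponent x and return (x, g^x mod p)."""
    x = secrets.randbelow(p - 3) + 2   # secret in the range [2, p-2]
    return x, pow(g, x, p)

a, A = dh_keypair(p, g)   # server (Alice): sends p, g and A = g^a mod p
b, B = dh_keypair(p, g)   # client (Bob): responds with B = g^b mod p

server_secret = pow(B, a, p)   # (g^b)^a mod p
client_secret = pow(A, b, p)   # (g^a)^b mod p

print(server_secret == client_secret)
```

Both sides arrive at the same value without ever sending a or b over the wire, which is the whole point.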
Just for fun, TLS also supports an obsolete ‘export’ variant of Diffie-Hellman. These export ciphersuites are a relic from the 1990s when it was illegal to ship strong encryption out of the country. What you need to know about “export DHE” is simple: it works identically to standard DHE, but limits the size of p to 512 bits. Oh yes, and it’s still out there today. Because the Internet.
How do you attack Diffie-Hellman?
The best known attack against a correct Diffie-Hellman implementation involves capturing the value g^a and solving to find the secret key a. The problem of finding this value is known as the discrete logarithm problem, and it’s thought to be mathematically intractable, at least when Diffie-Hellman is implemented in cryptographically strong groups (e.g., when p is of size 2048 bits or more).
Unfortunately, the story changes dramatically when p is relatively small, for example 512 bits in length. Given a value g^a mod p for a 512-bit p, it should at least be possible to efficiently recover the secret a and read traffic on the connection.
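To get a feel for why the size of p matters, here is a sketch that recovers an exponent for a tiny (roughly 20-bit) prime. It uses baby-step giant-step, a generic square-root-time algorithm, and not the Number Field Sieve used in the actual attack. The prime, the base and the secret are all arbitrary illustrative choices.

```python
from math import isqrt

def discrete_log_bsgs(g, h, p):
    """Return some x with g^x = h (mod p), or None if none exists.

    Baby-step giant-step: about sqrt(p) time and memory.
    """
    m = isqrt(p - 1) + 1
    # Baby steps: remember g^j -> j for every j in [0, m).
    table = {}
    cur = 1
    for j in range(m):
        table.setdefault(cur, j)
        cur = cur * g % p
    # Giant steps: strip off g^m at a time until we land in the table.
    factor = pow(g, -m, p)   # negative exponents need Python 3.8+
    gamma = h % p
    for i in range(m):
        if gamma in table:
            return i * m + table[gamma]
        gamma = gamma * factor % p
    return None

p, g = 1000003, 2            # small prime and base, for illustration only
secret_a = 123456            # the exponent an eavesdropper is not supposed to learn
public = pow(g, secret_a, p)

recovered = discrete_log_bsgs(g, public, p)
print(pow(g, recovered, p) == public)
```

The recovered exponent may differ from the original one if g does not generate the whole group, but it produces the same public value and therefore the same shared secret. This generic approach stops scaling long before 512 bits, which is where NFS comes in.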
Most TLS servers don’t use 512-bit primes, so who cares?
The good news here is that weak Diffie-Hellman parameters are almost never used purposely on the Internet. Only a trivial fraction of the SSL/TLS servers out there today will organically negotiate 512-bit Diffie-Hellman. For the most part these are crappy embedded devices such as routers and video-conferencing gateways.
However, there is a second class of servers that are capable of supporting 512-bit Diffie-Hellman when clients request it, using a special mode called the ‘export DHE’ ciphersuite. Disgustingly, these servers amount to about 8% of the Alexa top million sites (and a whopping 29% of SMTP/STARTTLS mail servers). Thankfully, most decent clients (AKA popular browsers) won’t willingly negotiate ‘export-DHE’, so this would also seem to be a dead end.
|ServerKeyExchange message (RFC 5246)|
You see, before SSL/TLS peers can start engaging in all this fancy cryptography, they first need to decide which ciphers they’re going to use. This is done through a negotiation process in which the client proposes some options (e.g., RSA, DHE, DHE-EXPORT), and the server picks one.
This all sounds simple enough. However, one of the early, well known flaws in SSL/TLS is the protocol’s failure to properly authenticate these ‘negotiation’ messages. In very early versions of SSL they were not authenticated at all. SSLv3 and TLS tacked on an authentication process, but one that takes place only at the end of the handshake.*
This is particularly unfortunate given that TLS servers often have the ability to authenticate their messages using digital signatures, but don’t really take advantage of this. For example, when two parties negotiate Diffie-Hellman, the parameters sent by the server are transmitted within a signed message called the ServerKeyExchange (shown at right). The signed portion of this message covers the parameters, but neglects to include any information about which ciphersuite the server thinks it’s negotiating. If you remember that the only difference between DHE and DHE-EXPORT is the size of the parameters the server sends down, you might start to see the problem.
Here it is in a nutshell: if the server supports DHE-EXPORT, the attacker can ‘edit’ the negotiation messages sent from the client (even if the client doesn’t support export DHE), replacing the client’s list of supported ciphers with only export DHE. The server will in turn send back a signed 512-bit export-grade Diffie-Hellman tuple, which the client will blindly accept, because it doesn’t realize that the server is negotiating the export version of the ciphersuite. From its perspective this message looks just like ‘standard’ Diffie-Hellman with really crappy parameters.
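The gap is easy to see if you write down what actually gets signed. The sketch below builds the byte string covered by the ServerKeyExchange signature as I understand RFC 5246: the two nonces followed by the DH parameters. The helper names, the suite labels and the parameter values are placeholders invented for this example, and no real signature is computed.

```python
import hashlib

def opaque16(value: int) -> bytes:
    """Encode an integer as a TLS-style vector with a 2-byte length prefix."""
    raw = value.to_bytes((value.bit_length() + 7) // 8 or 1, "big")
    return len(raw).to_bytes(2, "big") + raw

def signed_input(client_random: bytes, server_random: bytes,
                 p: int, g: int, ys: int) -> bytes:
    # Covered by the signature: nonces and (p, g, g^a mod p).
    # Not covered: which ciphersuite the server believes it negotiated.
    return client_random + server_random + opaque16(p) + opaque16(g) + opaque16(ys)

client_random = bytes(32)            # placeholder nonces
server_random = bytes(range(32))
p_512 = (1 << 511) | 1               # 512-bit placeholder, not a real prime
g, ys = 2, 12345                     # placeholder generator and server public value

# The server thinks it is speaking DHE_EXPORT; the client thinks plain DHE.
server_view = ("DHE_EXPORT", signed_input(client_random, server_random, p_512, g, ys))
client_view = ("DHE", signed_input(client_random, server_random, p_512, g, ys))

same_bytes = hashlib.sha256(server_view[1]).digest() == hashlib.sha256(client_view[1]).digest()
print(same_bytes)
```

Since the ciphersuite never enters `signed_input`, a signature produced for an export handshake verifies just as well in a handshake the client believes is ordinary DHE.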
|Overview of the Logjam active attack (source: paper).|
All this tampering should run into a huge snag at the end of the handshake, when the client and server exchange Finished messages that include a MAC of the transcript. At this point the client should learn that something funny is going on, i.e., that what it sent no longer matches what the server is seeing. However, the loophole is this: if the attacker can recover the Diffie-Hellman secret quickly (before the handshake ends) she can forge her own Finished messages. In that case the client and server will be none the wiser.
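Here is a rough sketch of why knowing the secret is enough. It replaces the real TLS PRF with a single HMAC-SHA256 call, which is a simplification, and the transcripts and secret are made-up placeholder strings rather than real handshake bytes.

```python
import hashlib
import hmac

def finished_mac(master_secret: bytes, label: bytes, transcript: bytes) -> bytes:
    """Simplified stand-in for the Finished computation (not the real TLS PRF)."""
    digest = hashlib.sha256(transcript).digest()
    return hmac.new(master_secret, label + digest, hashlib.sha256).digest()

# The attacker rewrote the negotiation, so the two sides saw different transcripts.
client_transcript = b"ClientHello[DHE]|ServerHello[DHE]|ServerKeyExchange[512-bit]"
server_transcript = b"ClientHello[DHE_EXPORT]|ServerHello[DHE_EXPORT]|ServerKeyExchange[512-bit]"

master_secret = b"placeholder: derived from the DH shared secret"

# The honest server MACs what it saw; the client checks against what it saw.
server_finished = finished_mac(master_secret, b"server finished", server_transcript)
expected_by_client = finished_mac(master_secret, b"server finished", client_transcript)
tampering_detected = not hmac.compare_digest(server_finished, expected_by_client)

# An attacker who has solved the discrete log knows master_secret and can
# simply recompute the MAC over the client's view of the handshake.
forged = finished_mac(master_secret, b"server finished", client_transcript)
forgery_accepted = hmac.compare_digest(forged, expected_by_client)

print(tampering_detected, forgery_accepted)
```

Without the secret the mismatch gets caught. With it, the attacker just hands each side the Finished message it expects.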
The upshot is that executing this attack requires the ability to solve a 512-bit discrete logarithm before the client and server exchange Finished messages. That seems like a tall order.
Can you really solve a discrete logarithm before a TLS handshake times out?
In practice, the fastest route to solving the discrete logarithm in finite fields is via an algorithm called the Number Field Sieve (NFS). Using NFS to solve a single 512-bit discrete logarithm instance requires several core-years (or about a week of wall-clock time given a few thousand cores), which would seem to rule out solving discrete logs in real time.
However, there is a complication. In practice, NFS can actually be broken up into two different steps:
- Pre-computation (for a given prime p). This includes the process of polynomial selection, sieving, and linear algebra, all of which depend only on p. The output of this stage is a table for use in the second stage.
- Solving to find a (for a given g^a mod p). The final stage, called the descent, uses the table from the precomputation. This is the only part of the algorithm that actually involves a specific g and g^a.
The important thing to know is that the first stage of the attack consumes the vast majority of the time, up to a full week on a large-scale compute cluster. The descent stage, on the other hand, requires only a few core-minutes. Thus the attack cost depends primarily on where the server gets its Diffie-Hellman parameters from. The best case for an attacker is when p is hard-coded into the server software and used across millions of machines. The worst case is when p is re-generated routinely by the server.
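A back-of-the-envelope sketch shows how quickly the pre-computation amortizes once a prime is shared. The inputs are assumptions chosen to roughly match the figures above (a week on 2,000 cores, three core-minutes per descent); they are not measurements from the paper.

```python
def core_hours_per_connection(precompute_core_hours: float,
                              descent_core_hours: float,
                              connections: int) -> float:
    """Average cost per broken connection when one precomputation is reused."""
    return precompute_core_hours / connections + descent_core_hours

# Assumed inputs: one week on 2,000 cores, and 3 core-minutes per descent.
PRECOMPUTE = 2000 * 7 * 24      # core-hours for one prime p
DESCENT = 3 / 60                # core-hours for one g^a mod p

for n in (1, 1_000, 1_000_000):
    cost = core_hours_per_connection(PRECOMPUTE, DESCENT, n)
    print(f"{n:>9} connections: {cost:.2f} core-hours each")
```

With a single connection the attacker pays for the whole table. Spread across a million connections that use the same p, the per-connection cost is dominated by the descent.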
I’ll let you guess what real TLS servers actually do.
In fact, large-scale Internet scans by the team at University of Michigan show that the most popular web server software tends to re-use a small number of primes across thousands of server instances. This is done because generating prime numbers is scary, so implementers default to using a hard-coded value or a config file supplied by your Linux distribution. The situation for export Diffie-Hellman is particularly awful, with only two (!) primes used across up to 92% of enabled Apache/mod_ssl sites.
|Number of seconds to solve a 512-bit discrete log (source: paper).|
The upshot of all of this is that about two weeks of pre-computation is sufficient to build a table that allows you to perform the downgrade against most export-enabled servers in just a few minutes (see the chart at right). This is fast enough that it can be done before the TLS connection timeout. Moreover, even if this is not fast enough, the connection can often be held open longer by using clever protocol tricks, such as sending TLS warning messages to reset the timeout clock.
Keep in mind that none of this shared prime craziness matters when you’re using sufficiently large prime numbers (on the order of 2048 bits or more). It’s only a practical issue if you’re using small primes, like 512-bit, 768-bit or (and here’s a sticky one I’ll come back to in a minute) 1024-bit.
How do you fix the downgrade to export DHE?
The best and most obvious fix for this problem is to exterminate export ciphersuites from the Internet. Unfortunately, these awful configurations are the default in a number of server software packages (looking at you Postfix), and getting people to update their configurations is surprisingly difficult (see e.g., FREAK).
A simpler fix is to upgrade the major web browsers to resist the attack. The easy way to do this is to enforce a larger minimum size for received DHE keys. The problem here is that the fix itself causes some collateral damage — it will break a small but significant fraction of lousy servers that organically negotiate (non-export) DHE with 512 bit keys.
The good news here is that the major browsers have decided to break the Internet (a little) rather than allow it to break them. Each has agreed to raise the minimum size limit to at least 768 bits, and some to a minimum of 1024 bits. It’s still not perfect, since 1024-bit DHE may not be cryptographically sound against powerful attackers, but it does address the immediate export attack. In the longer term the question is whether to use larger negotiated DHE groups, or abandon DHE altogether and move to elliptic curves.
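In spirit, the client-side fix is a one-line check on the modulus the server sends. The sketch below is an illustration only: the function name and the 1024-bit default are assumptions, not any browser’s actual code, and the test values are placeholders rather than real primes.

```python
MIN_DH_BITS = 1024   # assumed policy threshold; browsers chose 768 or 1024

def dh_modulus_acceptable(p: int, min_bits: int = MIN_DH_BITS) -> bool:
    """Reject DHE parameters whose modulus is shorter than the client's minimum."""
    return p.bit_length() >= min_bits

export_grade_p = (1 << 511) | 1    # 512-bit placeholder value
stronger_p = (1 << 2047) | 1       # 2048-bit placeholder value

print(dh_modulus_acceptable(export_grade_p))
print(dh_modulus_acceptable(stronger_p))
```

A client enforcing this aborts the handshake as soon as it sees the downgraded 512-bit tuple, regardless of which ciphersuite it believes is in play. The cost is that it also refuses servers that organically use short keys.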
What does this mean for larger parameter sizes?
The good news so far is that 512-bit Diffie-Hellman is only used by a fraction of the Internet, even when you account for active downgrade attacks. The vast majority of servers use Diffie-Hellman moduli of length at least 1024 bits. (The widespread use of 1024 is largely due to a hard-cap in older Java clients. Go away Java.)
While 2048-bit moduli are generally believed to be outside of anyone’s reach, 1024-bit DHE has long been considered to be at least within groping range of nation-state attackers. We’ve known this for years, of course, but the practical implications haven’t been quite clear. This paper tries to shine some light on that, using Internet-wide measurements and software/hardware estimates.
If you recall from above, the most critical aspect of the NFS attack is the need to perform large amounts of pre-computation on a given Diffie-Hellman prime p, followed by a relatively short calculation to break any given connection that uses p. At the 512-bit size the pre-computation only requires about a week. The question then is, how much does it cost for a 1024-bit prime, and how common are shared primes?
While there’s no exact way to know how much the 1024-bit attack would cost, the paper attempts to provide some extrapolations based on current knowledge. With software, the cost of the pre-computation seems quite high — on the order of 35 million core-years. Making this happen for a given prime within a reasonable amount of time (say, one year) would appear to require billions of dollars of computing equipment if we assume no algorithmic improvements. Even if we rule out such improvements, it’s conceivable that this cost might be brought down to a few hundred million dollars using hardware. This doesn’t seem out of bounds when you consider leaked NSA cryptanalysis budgets.
What’s interesting is that the descent stage, required to break a given Diffie-Hellman connection, is much faster. Based on some implementation experiments by the CADO-NFS team, it may be possible to break a Diffie-Hellman connection in as little as 30 core-days, with parallelization hugely reducing the wall-clock time. This might even make near-real-time decryption of Diffie-Hellman connections practical.
Is the NSA actually doing this?
So far all we’ve noted is that NFS pre-computation is at least potentially feasible when 1024-bit primes are re-used. That doesn’t mean the NSA is actually doing any of it.
There is some evidence, however, that suggests the NSA has decryption capability that’s at least consistent with such a break. This evidence comes from a series of Snowden documents published last winter in Der Spiegel. Together they describe a large-scale effort at NSA and GCHQ, capable of decrypting ‘vast’ amounts of Internet traffic, including IPSec, SSH and HTTPS connections.
|NSA slide illustrating exploitation of IPSec encrypted traffic (source: Spiegel).|
While the architecture described by the documents mentions attacks against many protocols, the bulk of the energy seems to be around the IPSec and IKE protocols, which are used to establish Virtual Private Networks (VPNs) between individuals and corporate networks such as financial institutions.
The nature of the NSA’s exploit is never made clear in the documents, but the diagram at right gives a lot of the architectural details. The system involves collecting Internet Key Exchange (IKE) handshakes, transmitting them to the NSA’s Cryptanalysis and Exploitation Services (CES) enclave, and feeding them into a decryption system that controls substantial high performance computing resources to process the intercepted exchanges. This is at least circumstantially consistent with Diffie-Hellman cryptanalysis.
Of course it’s entirely possible that the attack is based on a bad random number generator, weak symmetric encryption, or any number of engineered backdoors. There are a few pieces of evidence that militate towards a Diffie-Hellman break, however:
- IPSec (or rather, the IKE key exchange) uses Diffie-Hellman for every single connection, meaning that it can’t be broken without some kind of exploit, although this doesn’t rule out the other explanations.
- The IKE exchange is particularly vulnerable to pre-computation, since IKE uses a small number of standardized prime numbers called the Oakley groups, which are going on 17 years old now. Large-scale Internet scanning by the Michigan team shows that a majority of responding IPSec endpoints will gladly negotiate using Oakley Group 1 (768 bit) or Group 2 (1024 bit), even when the initiator offers better options.
- The NSA’s exploit appears to require the entire IKE handshake as well as any pre-shared key (PSK). These inputs would be necessary for recovery of IKEv1 session keys, but are not required in a break that involves only symmetric cryptography.
- The documents explicitly rule out the use of malware, or rather, they show that such malware (‘TAO implants’) is in use — but that malware allows the NSA to bypass the IKE handshake altogether.
I would stipulate that beyond the Internet measurements and computational analysis, this remains firmly in the category of ‘crazy-eyed informed speculation’. But while we can’t rule out other explanations, this speculation is certainly consistent with a hardware-optimized break of Diffie-Hellman 768 and 1024-bit, along with some collateral damage to SSH and related protocols.
So what next?
The paper gives a detailed set of recommendations on what to do about these downgrade attacks and (relatively) weak DHE groups. The website provides a step-by-step guide for server administrators. In short, probably the best long-term move is to switch to elliptic curves (ECDHE) as soon as possible. Failing this, clients and servers should enforce at least 2048-bit Diffie-Hellman across the Internet. If you can’t do that, stop using common primes.
Making this all happen on anything as complicated as the Internet will probably consume a few dozen person-lifetimes. But it’s something we have to do, and will do, to make the Internet work properly.
* There are reasons for this. Some SSL/TLS ciphersuites (such as the RSA encryption-based ciphersuites) don’t use signatures within the protocol, so the only way to authenticate the handshake is to negotiate a ciphersuite, run the key exchange protocol, then use the resulting shared secret to authenticate the negotiation messages after the fact. But SSL/TLS DHE involves digital signatures, so it should be possible to achieve a stronger level of security than this. It’s unfortunate that the protocol does not.