Yesterday, David Benjamin posted a pretty esoteric note on the IETF’s TLS mailing list. At a superficial level, the post describes some seizure-inducingly boring flaws in older Canon printers. To most people that was a complete snooze. To me and some of my colleagues, however, it was like that scene in X-Files where Mulder and Scully finally learn that aliens are real.
Those fossilized printers confirmed a theory we’d developed in 2014, but had been unable to prove: namely, the existence of a specific feature in RSA’s BSAFE TLS library called “Extended Random” — one that we believe to be evidence of a concerted effort by the NSA to backdoor U.S. cryptographic technology.
Before I get to the details, I want to caveat this post in two different ways. First, I’ve written about the topic of cryptographic backdoors way too much. In 2013, the Snowden revelations revealed the existence of a campaign to sabotage U.S. encryption systems. Since that time, cryptographers have spent thousands of hours identifying, documenting, and trying to convince people to care about these backdoors. We’re tired and we want to do more useful things.
The second caveat covers a problem with any discussion of cryptographic backdoors. Specifically, you never really get absolute proof. There’s always some innocent or coincidental explanation that could sort of fit the evidence — maybe it was all a stupid mistake. So you look for patterns of unlikely coincidences, and use Occam’s razor a lot. You don’t get a Snowden every day.
With all that said, let’s talk about Extended Random, and what this tells us about the NSA. First some background.
Dual_EC_DRBG and RSA BSAFE
To understand the context of this discovery, you need to know about a standard called Dual EC DRBG. This was a proposed random number generator that the NSA developed in the early 2000s. It was standardized by NIST in 2007, and later deployed in some important cryptographic products — though we didn’t know it at the time.
Dual EC has a major problem, which is that it likely contains a backdoor. This was pointed out in 2007 by Shumow and Ferguson, and effectively confirmed by the Snowden leaks in 2013. Drama ensued. NIST responded by pulling the standard. (For an explainer on the Dual EC backdoor, see here.)
Somewhere around this time the world learned that RSA Security had made Dual EC the default random number generator in their popular cryptographic library, which was called BSAFE. RSA hadn’t exactly kept this a secret, but it was such a bonkers thing to do that nobody (in the cryptographic community) had known. So for years RSA shipped their library with this crazy algorithm, which made its way into all sorts of commercial devices.
The RSA drama didn’t quite end there, however. In late 2013, Reuters reported that RSA had taken $10 million to backdoor their software. RSA sort of denies this. Or something. It’s not really clear.
Regardless of the intention, it’s known that RSA BSAFE did incorporate Dual EC. This could have been an innocent decision, of course, since Dual EC was a NIST standard. To shed some light on that question, in 2014 my colleagues and I decided to reverse-engineer the BSAFE library to see if it the alleged backdoor in Dual EC was actually exploitable by an attacker like the NSA. We figured that specific engineering decisions made by the library designers could be informative in tipping the scales one way or the other.
It turns out they were.
In the course of reverse engineering the Java version of BSAFE, we discovered a funny inclusion. Specifically, we found that BSAFE supports a non-standard extension to the TLS protocol called “Extended Random”.
The Extended Random extension is an IETF Draft proposed by an NSA employee named Margaret Salter (at some point the head of NSA’s Information Assurance Directorate, which worked on “defensive” crypto for DoD) along with Eric Rescorla as a contractor. (Eric was very clearly hired to develop a decent proposal that wouldn’t hurt TLS, and would primarily be used on government machines. The NSA did not share their motivations with him.)
It’s important to note that Extended Random by itself does not introduce any cryptographic vulnerabilities. All it does is increase the amount of random data (“nonces”) used in a TLS protocol connection. This shouldn’t hurt TLS at all, and besides it was largely intended for U.S. government machines.
The only thing that’s interesting about Extended Random is what happens when that random data is generated using the Dual EC algorithm. Specifically, this extra data acts as “rocket fuel”, significantly increasing the efficiency of exploiting the Dual EC backdoor to decrypt TLS connections.
In short, if you’re an agency like the NSA that’s trying to use Dual EC as a backdoor to intercept communications, you’re muchbetter off with a system that uses both Dual EC DRBG and Extended Random. Since Extended Random was never standardized by the IETF, it shouldn’t be in any systems. In fact, to the best of our knowledge, BSAFE is the only system in the world that implements it.
In addition to Extended Random, we discovered a variety of features that, combined with the Dual EC backdoor, could make RSA BSAFE fairly easy to exploit. But Extended Random is by far the strangest and hardest to justify.
So where did this standard come from? For those who like technical mysteries, it turns out that Extended Random isn’t the only funny-smelling proposal the NSA made. It’s actually one of four failed IETF proposals made by NSA employees, or contractors who work closely with the NSA, all of which try to boost the amount of randomness in TLS. Thomas Ptacek has a mind-numbingly detailed discussion of these proposals and his view of their motivation in this post.
Oh my god I never thought spies could be so boring. What’s the new development?
Despite the fact that we found Extended Random in RSA BSAFE (a free version we downloaded from the Internet), a fly in the ointment was that it didn’t actually seem to be enabled. That is: the code was there but the switches to enable it were hard-coded to “off”.
This kind of put a wrench in our theory that RSA might have included Extended Random to make BSAFE connections more exploitable by the NSA. There might be some commercial version of BSAFE out there with this code active, but we were never able to find it or prove it existed. And even worse, it might appear only in some special “U.S. government only” version of BSAFE, which would tend to undermine the theory that there was something intentional about including this code — after all, why would the government spy on itself?
Which finally brings us to the news that appeared on the TLS mailing list the other day. It turns out that certain Canon printers are failing to respond properly to connections made using the new version of TLS (which is called 1.3), because they seem to have implemented an unauthorized TLS extension using the same number as an extension that TLS 1.3 needs in order to operate correctly. Here’s the relevant section of David’s post:
The web interface on some Canon printers breaks with 1.3-capable ClientHello messages. We have purchased one and confirmed this with a PIXMA MX492. User reports suggest that it also affects PIXMA MG3650 and MX495 models. It potentially affects a wide range of Canon printers.
These printers use the RSA BSAFE library to implement TLS and this library implements the extended_random extension and assigns it number 40. This collides with the key_share extension and causes 1.3-capable handshakes to fail.
So in short, this news appears to demonstrate that commercial (non-free) versions of RSA BSAFE did deploy the Extended Random extension, and made it active within third-party commercial products. Moreover, they deployed it specifically to machines — specifically off-the-shelf commercial printers — that don’t seem to be reserved for any kind of special government use.
(If these turn out to be special Department of Defense printers, I will eat my words.)
Ironically, the printers are now the only thing that still exhibits the features of this (now deprecated) version of BSAFE. This is not because the NSA was targeting printers. Whatever devices they were targeting are probably gone by now. It’s because printer firmware tends to be obsolete and yet highly persistent. It’s like a remote pool buried beneath the arctic circle that preserves software species that would otherwise vanish from the Internet.
Which brings us to the moral of the story: not only are cryptographic backdoors a terrible idea, but they totally screw up the assigned numbering system for future versions of your protocol.
Actually no, that’s a pretty useless moral. Instead, let’s just say that you can deploy a cryptographic backdoor, but it’s awfully hard to control where it will end up.
A few months ago it was starting to seem like you couldn’t go a week without a new attack on TLS. In that context, this summer has been a blessed relief. Sadly, it looks like our vacation is over, and it’s time to go back to school.
Today brings the news that Karthikeyan Bhargavan and Gaëtan Leurent out of INRIA have a new paper that demonstrates a practical attack on legacy ciphersuites in TLS (it’s called “Sweet32”, website here). What they show is that ciphersuites that use 64-bit blocklength ciphers — notably 3DES — are vulnerable to plaintext recovery attacks that work even if the attacker cannot recover the encryption key.
While the principles behind this attack are well known, there’s always a difference between attacks in principle and attacks in practice. What this paper shows is that we really need to start paying attention to the practice.
So what’s the matter with 64-bit block ciphers?
Block ciphers are one of the most widely-used cryptographic primitives. As the nameimplies, these are schemes designed to encipher data in blocks, rather than a single bit at a time.
The two main parameters that define a block cipher are its block size (the number of bits it processes in one go), and its key size. The two parameters need not be related. So for example, DES has a 56-bit key and a 64-bit block. Whereas 3DES (which is built from DES) can use up to a 168-bit key and yet still has the same 64-bit block. More recent ciphers have opted for both larger blocks and larger keys.
When it comes to the security provided by a block cipher, the most important parameter is generally the key size. A cipher like DES, with its tiny 56-bit key, is trivially vulnerable to brute force attacks that attempt decryption with every possible key (often using specialized hardware). A cipher like AES or 3DES is generally not vulnerable to this sort of attack, since the keys are much longer.
However, as they say: key size is not everything. Sometimes the block size matters too.
You see, in practice, we often need to encrypt messages that are longer than a single block. We also tend to want our encryption to be randomized. To accomplish this, most protocols use a block cipher in a scheme called a mode of operation. The most popular mode used in TLS is CBC mode. Encryption in CBC looks like this:
The nice thing about CBC is that (leaving aside authentication issues) it can be proven (semantically) secure if we make various assumptions about the security of the underlying block cipher. Yet these security proofs have one important requirement. Namely, the attacker must not receive too much data encrypted with a single key.
The reason for this can be illustrated via the following simple attack.
Imagine that an honest encryptor is encrypting a bunch of messages using CBC mode. Following the diagram above, this involves selecting a random Initialization Vector () of size equal to the block size of the cipher, then XORing with the first plaintext block (), and enciphering the result (). The is sent (in the clear) along with the ciphertext.
Most of the time, the resulting ciphertext block will be unique — that is, it won’t match any previous ciphertext block that an attacker may have seen. However, if the encryptor processes enough messages, sooner or later the attacker will see a collision. That is, it will see a ciphertext block that is the same as some previous ciphertext block. Since the cipher is deterministic, this means the cipher’s input () must be identical to the cipher’s previous input that created the previous block.
In other words, we have , which can be rearranged as . Since the IVs are random and known to the attacker, the attacker has (with high probability) learned the XOR of two (unknown) plaintexts!
What can you do with the XOR of two unknown plaintexts? Well, if you happen to know one of those two plaintext blocks — as you might if you were able to choose some of the plaintexts the encryptor was processing — then you can easily recover the other plaintext. Alternatively, there are known techniques that can sometimes recover useful data even when you don’t know both blocks.
The main lesson here is that this entire mess only occurs if the attacker sees a collision. And the probabilityof such a collision is entirely dependent on the size of the cipher block. Worse, thanks to the (non-intuitive) nature of the birthday bound, this happens much more quickly than you might think it would. Roughly speaking, if the cipher block is b bits long, then we should expect a collision after roughly encrypted blocks.
In the case of a 64-bit blocksize cipher like 3DES, this is somewhere in the vicinity of , or around 4 billion enciphered blocks.
(As a note, the collision does not really need to occur in the first block. Since all blocks in CBC are calculated in the same way, it could be a collision anywhere within the messages.)
Whew. I thought this was a practical attack. 4 billion is a big number!
It’s true that 4 billion blocks seems like an awfully large number. In a practical attack, the requirements would be even larger — since the most efficient attack is for the attacker to know a lot of the plaintexts, in the hope that she will be able to recover one unknown plaintext when she learns the value (P ⊕ P’).
However, it’s worth keeping in mind that these traffic numbers aren’t absurd for TLS. In practice, 4 billion 3DES blocks works out to 32GB of raw ciphertext. A lot to be sure, but not impossible. If, as the Sweet32 authors do, we assume that half of the plaintext blocks are known to the attacker, we’d need to increase the amount of ciphertext to about 64GB. This is a lot, but not impossible.
The Sweet32 authors take this one step further. They imagine that the ciphertext consists of many HTTPS connections, consisting of 512 bytes of plaintext, in each of which is embedded the same secret 8-byte cookie — and the rest of the session plaintext is known. Calculating from these values, they obtain a requirement of approximately 256GB of ciphertext needed to recover the cookie with high probability.
That is really a lot.
How does the TLS attack work?
While the cryptographic community has been largely pushing TLS away from ciphersuites like CBC, in favor of modern authenticated modes of operation, these modes still exist in TLS. And they exist not only for use not only with modern ciphers like AES, but they are often available for older ciphersuites like 3DES. For example, here’s a connection I just made to Google:
Of course, just because a server supports 3DES does not mean that it’s vulnerable to this attack. In order for a particular connection to be vulnerable, both the client and server must satisfy three main requirements:
The client and server must negotiate a 64-bit cipher. This is a relatively rare occurrence, but can happen in cases where one of the two sides is using an out-of-date client. For example, stock Windows XP does not support any of the AES-based ciphersuites. Similarly, SSL3 connections may negotiate 3DES ciphersuites.
The server and client must support long-lived TLS sessions, i.e., encrypting a great deal of data with the same key. Unfortunately, most web browsers place no limit on the length of an HTTPS session if Keep-Alive is used, provided that the server allows the session. The Sweet32 authors scanned and discovered that many servers (including IIS) will allow sessions long enough to run their attack.Across the Internet, the percentage of vulnerable servers is small (less than 1%), but includes some important sites.
So what do we do now?
While this is not an earthshaking result, it’s roughly comparable to previous results we’ve seen with legacy ciphers like RC4.
In short, while these are not the easiest attacks to run, it’s a big problem that there even exist semi-practical attacks that undo the encryption used in standard encryption protocols. This is a problem that we should address, and these attack papers help to make those problems more clear.
To every thing there is a season. And in the world of cryptography, today we have the first signs of the season of TLS vulnerabilities.
This year’s season is off to a roaring start with not one, but two serious bugs announcements by the OpenSSL project, each of which guarantees that your TLS connections are much less than private than you’d like them to be. I can’t talk about both vulnerabilities and keep my sanity, so today I’m going to confine myself to the more dramatic of the two vulnerabilities: a new cross-protocol attack on TLS named “DROWN”.
Technically DROWN stands for “Decrypting RSA using Obsolete and Weakened eNcryption”, but honestly feel free to forget that because the name itself is plenty descriptive. In short, due to a series of dumb mistakes on the part of a vast number of people, DROWN means that TLS connections to a depressingly huge slice of the web (and mail servers, VPNs etc.) are essentially open to attack by fairly modest adversaries.
So that’s bad news. The worse news — as I’ll explain below — is that this whole mess was mostly avoidable.
For a detailed technical explanation of DROWN, you should go read the complete technical paper by Aviram et al. Or visit the DROWN team’s excellent website. If that doesn’t appeal to you, read on for a high level explanation of what DROWN is all about, and what it means for the security of the web. Here’s the TL;DR:
If you’re running a web server configured to use SSLv2, and particularly one that’s running OpenSSL (even with all SSLv2 ciphers disabled!), you may be vulnerable to a fast attack that decrypts many recorded TLS connections made to that box. Most worryingly, the attack does not require the client to ever make an SSLv2 connection itself, and it isn’t a downgrade attack. Instead, it relies on the fact that SSLv2 — and particularly the legacy “export” ciphersuites it incorporates — are pure poison, and simply having these active on a server is enough to invalidate the security of all connections made to that device.
For the rest of this post I’ll use the “fun” question and answer format I save for this kind of attack. First, some background.
What are TLS and SSLv2, and why should I care?
Transport Layer Security (TLS) is the most important security protocol on the Internet. You should care about it because nearly every transaction you conduct on the Internet relies on TLS (v1.0, 1.1 or 1.2) to some degree, and failures in TLS can flat out ruin your day.
But TLS wasn’t always TLS. The protocol began its life at Netscape Communications under the name “Secure Sockets Layer”, or SSL. Rumor has it that the first version of SSL was so awful that the protocol designers collected every printed copy and buried them in a secret New Mexico landfill site. As a consequence, the first public version of SSL is actually SSL version 2. It’s pretty terrible as well — but not (entirely) for the reasons you might think.
Let me explain.
The reason you might think SSLv2 is terrible is because it was a product of the mid-1990s, which modern cryptographers view as the “dark ages of cryptography”. Many of the nastier cryptographic attacks we know about today had not yet been discovered. As a result, the SSLv2 protocol designers were forced to essentially grope their way in the dark, and so were frequently devoured by grues — to their chagrin and our benefit, since the attacks on SSLv2 offered priceless lessons for the next generation of protocols.
And yet, these honest mistakes are not worst thing about SSLv2. The most truly awful bits stem from the fact that the SSLv2 designers were forced to ruin their own protocol. This was the result of needing to satisfy the U.S. government’s misguided attempt to control the export of cryptography. Rather than using only secure encryption, the designers were forced to build in a series of “export-grade ciphersuites” that offered abysmal 40-bit session keys and other nonsense. I’ve previously written about the effect of export crypto on today’s security. Today we’ll have another lesson.
Wait, isn’t SSLv2 ancient history?
For some time in the early 2000s, SSLv2 was still supported by browsers as a fallback protocol, which meant that active attackers could downgrade an SSLv3 or TLS connection by tricking a browser into using the older protocol. Fortunately those attacks are long gone now: modern web browsers have banished SSLv2 entirely — along with export cryptography in general. If you’re using a recent version of Chrome, IE or Safari, you should never have to worry about accidentally making an SSLv2 connection.
The problem is that while clients (such as browsers) have done away with SSLv2, many servers still support the protocol. In most cases this is the result of careless server configuration. In others, the blame lies with crummy and obsolete embedded devices that haven’t seen a software update in years — and probably never will. (You can see if your server is vulnerable here.)
And then there’s the special case of OpenSSL, which helpfully provides a configuration option that’s intended to disable SSLv2 ciphersuites — but which, unfortunately, does no such thing. In the course of their work, the DROWN researchers discovered that even when this option is set, clients may still request arbitrary SSLv2 ciphersuites. (This issue was quietly patched in January. Upgrade.)
The reason this matters is that SSL/TLS servers do a very silly thing. You see, since people don’t like to buy multiple certificates, a server that’s configured to use both TLS and SSLv2 will generally use the same RSA private keyto support both protocols. This means any bugs in the way SSLv2 handles that private key could very well affect the security of TLS.
And this is where DROWN comes in.
So what is DROWN?
DROWN is a classic example of a “cross protocol attack”. This type of attack makes use of bugs in one protocol implementation (SSLv2) to attack the security of connections made under a different protocol entirely — in this case, TLS. More concretely, DROWN is based on the critical observation that while SSLv2 and TLS both support RSA encryption, TLS properly defends against certain well-known attacks on this encryption — while SSLv2’s export suites emphatically do not.
I will try to make this as painless as possible, but here we need to dive briefly into the weeds.
You see, both SSLv2 and TLS use a form of RSA encryption padding known as RSA-PKCS#1v1.5. In the late 1990s, a man named Daniel Bleichenbacher proposed an amazing attack on this encryption scheme that allows an attacker to decrypt an RSA ciphertext efficiently — under the sole condition that they can ask an online server to decrypt many related ciphertexts, and give back only one bit of information for each one — namely, the bit representing whether decryption was successful or not.
Bleichenbacher’s attack proved particularly devastating for SSL servers, since the standard SSL RSA-based handshake involves the client encrypting a secret (called the Pre-Master Secret, or PMS) under the server’s RSA public key, and then sending this value over the wire. An attacker who eavesdrops the encrypted PMS can run the Bleichenbacher attack against the server, sending it thousands of related values (in the guise of new SSL connections), and using the server’s error responses to gradually decrypt the PMS itself. With this value in hand, the attacker can compute SSL session keys and decrypt the recorded SSL session.
The main SSL/TLS countermeasure against Bleichenbacher’s attack is basically a hack. When the server detects that an RSA ciphertext has decrypted improperly, it lies. Instead of returning an error, which the attacker could use to implement the attack, it generates a random pre-master secret and continues with the rest of the protocol as though this bogus value was what it actually decrypted. This causes the protocol to break down later on down the line, since the server will compute essentially a random session key. But it’s sufficient to prevent the attacker from learning whether the RSA decryption succeeded or not, and that kills the attack dead.
Now let’s take a moment to reflect and make an observation.
If the attacker sends a valid RSA ciphertext to be decrypted, the server will decrypt it and obtain some PMS value. If the attacker sends the same valid ciphertext a second time, the server will decrypt and obtain the same PMS value again. Indeed, the server will always get the same PMS even if the attacker sends the same valid ciphertext a hundred times in a row.
On the other hand, if the attacker repeatedly sends the same invalid ciphertext, the server will choose a different PMS every time. This observation is crucial.
In theory, if the attacker holds a ciphertext that might be valid or invalid — and the attacker would like to know which is true — they can send the same ciphertext to be decrypted repeatedly. This will lead to two possible conditions. In condition (1) where the ciphertext is valid, decryption will produce the “same PMS every time”. Condition (2) for an invalid ciphertext will produce a “different PMS each time”. If the attacker could somehow tell the difference between condition (1) and condition (2), they could determine whether the ciphertext was valid. That determination alone would be enough to resurrect the Bleichenbacher attack. Fortunately in TLS, the PMS is never used directly; it’s first passed through a strong hash function and combined with a bunch of random nonces to obtain a Master Secret. This result then used in further strong ciphers and hash functions. Thanks to the strength of the hash function and ciphers, the resulting keys are so garbled that the attacker literally cannot tell whether she’s observing condition (1) or (2).
And here we finally we run into the problem of SSLv2.
You see, SSLv2 implementations include a similar anti-Bleichenbacher countermeasure. Only here there are some key differences. In SSLv2 there is no PMS — the encrypted value is used as the Master Secret and employed directly to derive the encryption session key. Moreover, in export modes, the Master Secret may be as short as 40 bits, and used with correspondingly weak export ciphers. This means an attacker can send multiple ciphertexts, then brute-force the resulting short keys. After recovering these keys for a tiny number of sessions, they will be able to determine whether they’re in condition (1) or (2). This would effectively resurrect the Bleichenbacher attack.
This still sounds like an attack on SSLv2, not on TLS. What am I missing?
And now we come to the full horror of SSLv2.
Since most servers configured with both SSLv2 and TLS support will use the same RSA private key for decrypting sessions from either protocol, a Bleichenbacher attack on the SSLv2 implementation — with its vulnerable crappy export ciphersuites — can be used to decrypt the contents of a normal TLS-based RSA ciphertext. After all, both protocols are using the same darned secret key. Due to formatting differences in the RSA ciphertext between the two protocols, this attack doesn’t work all the time — but it does work for approximately one out of a thousand TLS handshakes.
To put things succinctly: with access to a whole hell of a lot of computation, an attacker can intercept a TLS connection, then at their leisure make many thousands of queries to the SSLv2-enabled server, and decrypt that connection. The “general DROWN” attack actually requires watching about 1,000 TLS handshakes to find a vulnerable RSA ciphertext, about 40,000 queries to the server, and about 2^50 offline operations.
LOL. That doesn’t sound practical at all. You cryptographers suck.
First off, that isn’t really a question, it’s more a rude statement. But since this is exactly the sort of reaction cryptographers often get when they point out perfectly practical theoretical attacks on real protocols, I’d like to take a moment to push back.
While the attack described above seems costly, it can be conducted in several hours and $440 on Amazon EC2. Are your banking credentials worth $440? Probably not. But someone else’s probably are. Given all the things we have riding on TLS, it’s better for it not to be broken at all.
More urgently, the reason cryptographers spend time on “impractical attacks” is that attacks always get better. And sometimes they get better fast.
The attack described above is called “General DROWN” and yes, it’s a bit impractical. But in the course of writing just this single paper, the DROWN researchers discovered a second variant of their attack that’s many orders of magnitude faster than the general one described above. This attack, which they call “Special DROWN” can decrypt a TLS RSA ciphertext in about one minute on a single CPU core.
This attack relies on a bug in the way OpenSSL handles SSLv2 key processing, a bug that was (inadvertently) fixed in March 2015, but remains open across the Internet. The Special DROWN bug puts DROWN squarely in the domain of script kiddies, for thousands of websites across the Internet.
So how many sites are vulnerable?
This is probably the most depressing part of the entire research project. According to wide-scale Internet scans run by the DROWN researchers, more than 2.3 million HTTPS servers with browser-trusted certificates are vulnerable to special DROWN, and 3.5 million HTTPS servers are vulnerable to General DROWN. That’s a sizeable chunk of the encrypted web, including a surprising chunk of the Chinese and Colombian Internet.
And while I’ve focused on the main attacks in this post, it’s worth pointing out that DROWN also affects other protocol suites, like TLS with ephemeral Diffie-Hellman and even Google’s QUIC. So these vulnerabilities should not be taken lightly.
In January, OpenSSL patched the bug that allows the SSLv2 ciphersuites to remain alive. Last March, the project inadvertently fixed the bug that makes Special DROWN possible. But that’s hardly the end. The patch they’re announcing today is much more direct: hopefully it will make it impossible to turn on SSLv2 altogether. This will solve the problem for everyone… at least for everyone willing to patch. Which, sadly, is unlikely to be anywhere near enough.
More broadly, attacks like DROWN illustrate the cost of having old, vulnerable protocols on the Internet. And they show the terrible cost that we’re still paying for export cryptography systems that introduced deliberate vulnerabilities in encryption so that intelligence agencies could pursue a small short-term advantage — at the cost of long-term security.
In case you haven’t heard, there’s a new SSL/TLS vulnerability making the rounds. Nicknamed Logjam, the new attack is ‘special’ in that it may admit complete decryption or hijacking of any TLS connection you make to an improperly configured web or mail server. Worse, there’s at least circumstantial evidence that similar (and more powerful) attacks might already be in the toolkit of some state-level attackers such as the NSA.
This work is the result of an unusual collaboration between a fantastic group of co-authors spread all around the world, including institutions such as the University of Michigan, INRIA Paris-Rocquencourt, INRIA Paris-Nancy, Microsoft Research, Johns Hopkins and the University Of Pennsylvania. It’s rare to see this level of collaboration between groups with so many different areas of expertise, and I hope to see a lot more like it. (Disclosure: I am one of the authors.)
The absolute best way to understand the Logjam result is to read the technical research paper. This post is mainly aimed at people who want a slightly less technical form. For those with even shorter attention spans, here’s the TL;DR:
It appears that the the Diffie-Hellman protocol, as currently deployed in SSL/TLS, may be vulnerable to a serious downgrade attack that restores it to 1990s “export” levels of security, and offers a practical “break” of the TLS protocol against poorly configured servers. Even worse, extrapolation of the attack requirements — combined with evidence from the Snowden documents — provides some reason to speculate that a similar attack could be leveraged against protocols (including TLS, IPSec/IKE and SSH) using 768- and 1024-bit Diffie-Hellman.
I’m going to tackle this post in the usual ‘fun’ question-and-answer format I save for this sort of thing.
What is Diffie-Hellman and why should I care about TLS “export” ciphersuites?
Diffie-Hellman is probably the most famous public key cryptosystem ever invented. Publicly discovered by Whit Diffie and Martin Hellman in the late 1970s (and a few years earlier, in secret, by UK GCHQ), it allows two parties to negotiate a shared encryption key over a public connection.
Diffie-Hellman is used extensively in protocols such as SSL/TLS and IPSec, which rely on it to establish the symmetric keys that are used to transport data. To do this, both parties must agree on a set of parameters to use for the key exchange. In traditional (‘mod p‘) Diffie-Hellman, these parameters consist of a large prime number p, as well as a ‘generator’ g. The two parties now exchange keys as shown below:
TLS supports several variants of Diffie-Hellman. The one we’re interested in for this work is the ‘ephemeral’ non-elliptic (“DHE”) protocol variant, which works in a manner that’s nearly identical to the diagram above. The server takes the role of Alice, selecting (p, g, ga mod p) and signing this tuple (and some nonces) using its long-term signing key. The client responds gb mod p and the two sides then calculate a shared secret.
Just for fun, TLS also supports an obsolete ‘export’ variant of Diffie-Hellman. These export ciphersuites are a relic from the 1990s when it was illegal to ship strong encryption out of the country. What you need to know about “export DHE” is simple: it works identically to standard DHE, but limits the size of p to 512 bits. Oh yes, and it’s still out there today. Because the Internet.
How do you attack Diffie-Hellman?
The best known attack against a correct Diffie-Hellman implementation involves capturing the value ga and solving to find the secret key a. The problem of finding this value is known as the discrete logarithm problem, and it’s thought to be a mathematically intractable, at least when Diffie-Hellman is implemented in cryptographically strong groups (e.g., when p is of size 2048 bits or more).
Unfortunately, the story changes dramatically when p is relatively small — for example, 512 bits in length. Given a value ga mod p for a 512-bit p, itshould at least be possible to efficiently recover the secret a and read traffic on the connection.
Most TLS servers don’t use 512-bit primes, so who cares?
The good news here is that weak Diffie-Hellman parameters are almost never used purposely on the Internet. Only a trivial fraction of the SSL/TLS servers out there today will organically negotiate 512-bit Diffie-Hellman. For the most part these are crappy embedded devices such as routers and video-conferencing gateways.
However, there is a second class of servers that are capable of supporting 512-bit Diffie-Hellman when clients request it, using a special mode called the ‘export DHE’ ciphersuite. Disgustingly, these servers amount to about 8% of the Alexa top million sites (and a whopping 29% of SMTP/STARTLS mail servers). Thankfully, most decent clients (AKA popular browsers) won’t willingly negotiate ‘export-DHE’, so this would also seem to be a dead end.
ServerKeyExchange message (RFC 5246)
You see, before SSL/TLS peers can start engaging in all this fancy cryptography, they first need to decide which ciphers they’re going to use. This is done through a negotiation process in which the client proposes some options (e.g., RSA, DHE, DHE-EXPORT), and the server picks one.
This all sound simple enough. However, one of the early, well known flaws in SSL/TLS is the protocol’s failure to properly authenticate these ‘negotiation’ messages. In very early versions of SSL they were not authenticated at all. SSLv3 and TLS tacked on an authentication process — but one that takes place only at the end of the handshake.*
This is particularly unfortunate given that TLS servers often have the ability to authenticate their messages using digital signatures, but don’t really take advantage of this. For example, when two parties negotiate Diffie-Hellman, the parameters sent by the server are transmitted within a signed message called the ServerKeyExchange (shown at right). The signed portion of this message covers the parameters, but neglects to include any information about which ciphersuite the server thinks it’s negotiating. If you remember that the only difference between DHE and DHE-EXPORT is the size of the parameters the server sends down, you might start to see the problem.
Here it is in a nutshell: if the server supports DHE-EXPORT, the attacker can ‘edit’ the negotiation messages sent from the a client — even if the client doesn’t support export DHE — replacing the client’s list of supported ciphers with only export DHE. The server will in turn send back a signed 512-bit export-grade Diffie-Hellman tuple, which the client will blindly accept — because it doesn’t realize that the server is negotiating the export version of the ciphersuite. From its perspective this message looks just like ‘standard’ Diffie-Hellman with really crappy parameters.
Overview of the Logjam active attack (source: paper).
All this tampering should run into a huge snag at the end of the handshake, when he client and server exchange Finished messages embedding include a MAC of the transcript. At this point the client should learn that something funny is going on, i.e., that what it sent no longer matches what the server is seeing. However, the loophole is this: if the attacker can recover the Diffie-Hellman secret quickly — before the handshake ends — she can forge her own Finished messages. In that case the client and server will be none the wiser.
The upshot is that executing this attack requires the ability to solve a 512-bit discrete logarithm before the client and server exchange Finished messages. That seems like a tall order.
Can you really solve a discrete logarithm before a TLS handshake times out?
In practice, the fastest route to solving the discrete logarithm in finite fields is via an algorithm called the Number Field Sieve (NFS). Using NFS to solve a single 512-bit discrete logarithm instance requires several core-years — or about week of wall-clock time given a few thousand cores — which would seem to rule out solving discrete logs in real time.
However, there is a complication. In practice, NFS can actually be broken up into two different steps:
Pre-computation (for a given prime p). This includes the process of polynomial selection, sieving, and linear algebra, all of which depend only on p. The output of this stage is a table for use in the second stage.
Solving to find a (for a given ga mod p). The final stage, called the descent, uses the table from the precomputation. This is the only part of the algorithm that actually involves a specific g andga.
The important thing to know is that the first stage of the attack consumes the vast majority of the time, up to a full week on a large-scale compute cluster. The descent stage, on the other hand, requires only a few core-minutes. Thus the attack cost depends primarily on where the server gets its Diffie-Hellman parameters from. The best case for an attacker is when p is hard-coded into the server software and used across millions of machines. The worst case is when p is re-generated routinely by the server.
I’ll let you guess what real TLS servers actually do.
In fact, large-scale Internet scans by the team at University of Michigan show that most popular web servers software tends to re-use a small number of primes across thousands of server instances. This is done because generating prime numbers is scary, so implementers default to using a hard-coded value or a config file supplied by your Linux distribution. The situation for export Diffie-Hellman is particularly awful, with only two (!) primes used across up 92% of enabled Apache/mod_ssl sites.
Number of seconds to solve a 512-bit discrete log (source: paper).
The upshot of all of this is that about two weeks of pre-computation is sufficient to build a table that allows you to perform the downgrade against most export-enabled servers in just a few minutes (see the chart at right). This is fast enough that it can be done before the TLS connection timeout. Moreover, even if this is not fast enough, the connection can often be held open longer by using clever protocol tricks, such as sending TLS warning messages to reset the timeout clock.
Keep in mind that none of this shared prime craziness matters when you’re using sufficiently large prime numbers (on the order of 2048 bits or more). It’s only a practical issue you’re using small primes, like 512-bit, 768-bit or — and here’s a sticky one I’ll come back to in a minute — 1024 bit.
How do you fix the downgrade to export DHE?
The best and most obvious fix for this problem is to exterminate export ciphersuites from the Internet. Unfortunately, these awful configurations are the default in a number of server software packages (looking at you Postfix), and getting people to update their configurations is surprisingly difficult (see e.g., FREAK).
A simpler fix is to upgrade the major web browsers to resist the attack. The easy way to do this is to enforce a larger minimum size for received DHE keys. The problem here is that the fix itself causes some collateral damage — it will break a small but significant fraction of lousy servers that organically negotiate (non-export) DHE with 512 bit keys.
The good news here is that the major browsers have decided to break the Internet (a little) rather than allow it to break them. Each has agreed to raise the minimum size limit to at least 768 bits, and some to a minimum of 1024 bits. It’s still not perfect, since 1024-bit DHE may not be cryptographically sound against powerful attackers, but it does address the immediate export attack. In the longer term the question is whether to use larger negotiated DHE groups, or abandon DHE altogether and move to elliptic curves.
What does this mean for larger parameter sizes?
The good news so far is that 512-bit Diffie-Hellman is only used by a fraction of the Internet, even when you account for active downgrade attacks. The vast majority of servers use Diffie-Hellman moduli of length at least 1024 bits. (The widespread use of 1024 is largely due to a hard-cap in older Java clients. Go away Java.)
While 2048-bit moduli are generally believed to be outside of anyone’s reach, 1024-bit DHE has long been considered to be at least within groping range of nation-state attackers. We’ve known this for years, of course, but the practical implications haven’t been quite clear. This paper tries to shine some light on that, using Internet-wide measurements and software/hardware estimates.
If you recall from above, the most critical aspect of the NFS attack is the need to perform large amounts of pre-computation on a given Diffie-Hellman prime p, followed by a relatively short calculation to break any given connection that uses p. At the 512-bit size the pre-computation only requires about a week. The question then is, how much does it cost for a 1024-bit prime, and how common are shared primes?
While there’s no exact way to know how much the 1024-bit attack would cost, the paper attempts to provide some extrapolations based on current knowledge. With software, the cost of the pre-computation seems quite high — on the order of 35 million core-years. Making this happen for a given prime within a reasonable amount of time (say, one year) would appear to require billions of dollars of computing equipment if we assume no algorithmic improvements. Even if we rule out such improvements, it’s conceivable that this cost might be brought down to a few hundred million dollars using hardware. This doesn’t seem out of bounds when you consider leaked NSA cryptanalysis budgets.
What’s interesting is that the descent stage, required to break a given Diffie-Hellman connection, is much faster. Based on some implementation experiments by the CADO-NFS team, it may be possible to break a Diffie-Hellman connection in as little as 30 core-days, with parallelization hugely reducing the wall-clock time. This might even make near-real-time decryption of Diffie-Hellman connections practical.
Is the NSA actually doing this?
So far all we’ve noted is that NFS pre-computation is at least potentially feasible when 1024-bit primes are re-used. That doesn’t mean the NSA is actually doing any of it.
There is some evidence, however, that suggests the NSA has decryption capability that’s at least consistent with such a break. This evidence comes from a series of Snowden documents published last winter in Der Spiegel. Together they describe a large-scale effort at NSA and GCHQ, capable of decrypting ‘vast’ amounts of Internet traffic, including IPSec, SSH and HTTPS connections.
NSA slide illustrating exploitation of IPSec encrypted traffic (source: Spiegel).
While the architecture described by the documents mentions attacks against many protocols, the bulk of the energy seems to be around the IPSec and IKE protocols, which are used to establish Virtual Private Networks (VPNs) between individuals and corporate networks such as financial institutions.
The nature of the NSA’s exploit is never made clear in the documents, but diagram at right gives a lot of the architectural details. The system involves collecting Internet Key Exchange (IKE) handshakes, transmitting them to the NSA’s Cryptanalysis and Exploitation Services (CES) enclave, and feeding them into a decryption system that controls substantial high performance computing resources to process the intercepted exchanges. This is at least circumstantially consistent with Diffie-Hellman cryptanalysis.
Of course it’s entirely possible that the attack is based on a bad random number generator, weak symmetric encryption, or any number of engineered backdoors. There are a few pieces of evidence that militate towards a Diffie-Hellman break, however:
IPSec (or rather, the IKE key exchange) uses Diffie-Hellman for every single connection, meaning that it can’t be broken without some kind of exploit, although this doesn’t rule out the other explanations.
The IKE exchange is particularly vulnerable to pre-computation, since IKE uses a small number of standardized prime numbers called the Oakley groups, which are going on 17 years old now. Large-scale Internet scanning by the Michigan team shows that a majority of responding IPSec endpoints will gladly negotiate using Oakley Group 1 (768 bit) or Group 2 (1024 bit), even when the initiator offers better options.
The NSA’s exploit appears to require the entireIKE handshake as well as any pre-shared key (PSK). These inputs would be necessary for recovery of IKEv1 session keys, but are not required in a break that involves only symmetric cryptography.
The documents explicitly rule out the use of malware, or rather, they show that such malware (‘TAO implants’) is in use — but that malware allows the NSA to bypass the IKE handshake altogether.
I would stipulate that beyond the Internet measurements and computational analysis, this remains firmly in the category of ‘crazy-eyed informed speculation’. But while we can’t rule out other explanations, this speculation is certainly consistent with a hardware-optimized break of Diffie-Hellman 768 and 1024-bit, along with some collateral damage to SSH and related protocols.
So what next?
The paper gives a detailed set of recommendations on what to do about these downgrade attacks and (relatively) weak DHE groups. The website provides a step-by-step guide for server administrators. In short, probably the best long-term move is to switch to elliptic curves (ECDHE) as soon as possible. Failing this, clients and servers should enforce at least 2048-bit Diffie-Hellman across the Internet. If you can’t do that, stop using common primes.
Making this all happen on anything as complicated as the Internet will probably consume a few dozen person-lifetimes. But it’s something we have to do, and will do, to make the Internet work properly.
Notes: * There are reasons for this. Some SSL/TLS ciphersuites (such as the RSA encryption-based ciphersuites) don’t use signatures within the protocol, so the only way to authenticate the handshake is to negotiate a ciphersuite, run the key exchange protocol, then use the resulting shared secret to authenticate the negotiation messages after the fact. But SSL/TLS DHE involves digital signatures, so it should be possible to achieve a stronger level of security than this. It’s unfortunate that the protocol does not.
This is the story of how a handful of cryptographers ‘hacked’ the NSA. It’s also a story of encryption backdoors, and why they never quite work out the way you want them to.
But I think I’m getting ahead of myself a bit here.
Today’s Washington Post has the story of a nasty bug in some TLS/SSL servers and clients, one that has the potential to downgrade the security of your TLS connections to something that isn’t really secure at all. In this post I’m going to talk about the technical aspects of the attack, why it matters, and how bad it is.
If you don’t want to read a long blog post, let me give you a TL;DR:
A group of cryptographers at INRIA, Microsoft Research and IMDEA have discovered some serious vulnerabilities in OpenSSL (e.g., Android) clients and Apple TLS/SSL clients (e.g., Safari) that allow a ‘man in the middle attacker’ to downgrade connections from ‘strong’ RSA to ‘export-grade’ RSA. These attacks are real and exploitable against a shocking number of websites — including government websites. Patch soon and be careful.
You can find a detailed description of the work by the researchers — Beurdouche, Bhargavan, Delignat-Lavaud, Fournet, Kohlweiss, Pironti, Strub, Zinzindohoue, Zanella-Béguelin — at their site SmackTLS.com. You should go visit that site and read about the exploits directly. The proof of concept implementation also involved contributions from Nadia Heninger at U. Penn.
I’m going to explain the rest of it in the ‘fun’ question and answer format I save for this kind of attack.
What is SSL/TLS and what are ‘EXPORT cipher suites’ anyway?
In case you’re not familiar with SSL and its successor TLS, what you should know is that they’re the most important security protocols on the Internet. In a world full of untrusted networks, SSL and TLS are what makes modern communication possible.
Or rather, that’s the theory. In practice, SSL and TLS have been a more like a work in progress. In part this is because they were developed during an era when modern cryptographic best practices weren’t nailed down yet. But more to the point: it’s because even when the crypto is right, many software implementations still get things wrong.
With all that in mind, there’s a third aspect of SSL/TLS that doesn’t get nearly as much attention. That is: the SSL protocol itself was deliberately designed to be broken.
Let me explain what I mean by that.
Back in the early 1990s when SSL was first invented at Netscape Corporation, the United States maintained a rigorous regime of export controls for encryption systems. In order to distribute crypto outside of the U.S., companies were required to deliberately ‘weaken’ the strength of encryption keys. For RSA encryption, this implied a maximum allowed key length of 512 bits.*
The 512-bit export grade encryption was a compromise between dumb and dumber. In theory it was designed to ensure that the NSA would have the ability to ‘access’ communications, while allegedly providing crypto that was still ‘good enough’ for commercial use. Or if you prefer modern terms, think of it as the original “golden master key“.
The need to support export-grade ciphers led to some technical challenges. Since U.S. servers needed to support both strong and weak crypto, the SSL designers used a ‘cipher suite’ negotiation mechanism to identify the best cipher both parties could support. In theory this would allow ‘strong’ clients to negotiate ‘strong’ ciphersuites with servers that supported them, while still providing compatibility to the broken foreign clients.
This story has a happy ending, after a fashion. The U.S eventually lifted the most onerous of its export policies. Unfortunately, the EXPORT ciphersuites didn’t go away. Today they live on like zombies — just waiting to eat our flesh.
If EXPORT ciphers are known to be broken, what’s the news here?
We don’t usually worry about export-grade cipher suites very much, because supposedly they aren’t very relevant to the modern Internet. There are three general reasons we don’t think they matter anymore:
Most ‘modern’ clients (e.g., web browsers) won’t offer export grade ciphersuites as part of the negotiation process. In theory this means that even if the server supports export-grade crypto, your session will use strong crypto.
Almost no servers, it was believed, even offer export-grade ciphersuites anymore.
Even if you do accidentally negotiate an export-grade RSA ciphersuite, a meaningful attack still requires the attacker to factor a 512-bit RSA key (or break a 40-bit symmetric cipher). This is doable, but it’s generally considered too onerous if you have to do it for every single connection.
This was the theory anyway. It turns out that theory is almost always different than practice. Which brings us to the recent work by Beurdouche et al. from INRIA, Microsoft Research and IMDEA.
You see, it turns out that some modern TLS clients — including Apple’s SecureTransport and OpenSSL — have a bug in them. This bug causes them to accept RSA export-grade keys even when the client didn’t ask for export-grade RSA. The impact of this bug can be quite nasty: it admits a ‘man in the middle’ attack whereby an active attacker can force down the quality of a connection, provided that the client is vulnerable and the server supports export RSA.
The MITM attack works as follows:
In the client’s Hello message, it asks for a standard ‘RSA’ ciphersuite.
The MITM attacker changes this message to ask for ‘export RSA’.
The server responds with a 512-bit export RSA key, signed with its long-term key.
The client accepts this weak key due to the OpenSSL/SecureTransport bug.
The attacker factors the RSA modulus to recover the corresponding RSA decryption key.
When the client encrypts the ‘pre-master secret’ to the server, the attacker can now decrypt it to recover the TLS ‘master secret’.
From here on out, the attacker sees plaintext and can inject anything it wants.
So that’s bad news and it definitely breaks our assumption in point (1) above. But at least in theory we should still be safe based on points (2) and (3).
How common are export-enabled TLS servers?
No matter how bad you think the Internet is, it can always surprise you. The surprise in this case is that export-grade RSA is by no means as extinct as we thought it was.
(Facebook have updated their configuration as a result of this work.)
Factoring an RSA key seems pretty expensive for breaking one session.
This brings us to the most awful part of this attack. You don’t have to be that fast.
You see, it turns out that generating fresh RSA keys is a bit costly. So modern web servers don’t do it for every single connection. In fact, Apache mod_ssl by default will generate a single export-grade RSA key when the server starts up, and will simply re-use that key for the lifetime of that server.
What this means is that you can obtain that RSA key once, factor it, and break every session you can get your ‘man in the middle’ mitts on until the server goes down. And that’s the ballgame.
PoC or GTFO.
Fortunately, a proof of concept for this attack requires only a few ingredients. First, you need some tooling to actually run the MITM attack. Then you need the ability to (quickly) factor 512-bit RSA keys. From there it’s just a question of finding a vulnerable client and server.
This is what happens to EC2 spot pricing
when Nadia runs 75 ‘large’ instances to factor a 512-bit key.
Just because someone saysan implementation is vulnerable doesn’t mean it actually is. You should ask for proof.
The guts of the PoC were put together by Karthik Bhargavan and Antoine Delignat-Lavaud at INRIA. They assembled an MITM proxy that can intercept connections and re-write them to use export-RSA against a willing website.
To factor the 512-bit export keys, the project enlisted the help of Nadia Heninger at U. Penn, who has been working on “Factoring as a Service” for exactly this purpose. Her platform uses cado-nfs on a cluster of EC2 virtual servers, and (with Nadia doing quite a bit of handholding to deal with crashes) was able to factor a bunch of 512-bit keys — each in about 7.5 hours for $104 in EC2 time.
From there all you need is a vulnerable website.
Since the NSA was the organization that demanded export-grade crypto, it’s only fitting that they should be the first site affected by this vulnerability. There’s great video on the SmackTLS site. After a few hours of factoring, one can take the original site (which looked like this):
And change it into this:
Attack images courtesy Karthik, Antoine INRIA.
Some will point out that an MITM attack on the NSA is not really an ‘MITM attack on the NSA’ because NSA outsources its web presence to the Akamai CDN (see obligatory XKCD at right). These people may be right, but they also lack poetry in their souls.
Is it patched?
The most recent of OpenSSL does have a patch. This was announced (though not very loudly) in January of this year.
Apple is working on a patch.
Akamai and other CDNs are also rolling out a patch to solve these problems. Over the next two weeks we will hopefully see export ciphersuites extinguished from the Internet. In the mean time, try to be safe.
What does it all mean?
You might think this is all a bit absurd and doesn’t affect you very much. In a strictly technical sense you’re probably right. The client bugs will soon be patched (update your devices! unless you have Android in which case you’re screwed). With good luck, servers supporting export-grade RSA cipher suites will soon be rare curiosity.
Still, to take this as the main lesson of the work would, I think, be missing the forest for the trees. There’s a much more important moral to this story.
The export-grade RSA ciphers are the remains of a 1980s-vintage effort to weaken cryptography so that intelligence agencies would be able to monitor foreign traffic. This was done badly. So badly, that while the policies were ultimately scrapped, they’re still hurting us today.
This might be an academic point if it was only a history lesson. However, for the past several months, U.S. and European politicians have been publicly mooting the notion of a new set of cryptographicbackdoors in systems we use today. While the proposals aren’t explicit, they would presumably involve deliberately weakening encryption tech so that governments can intercept and read our conversations. While officials carefully avoid the term “back door” — or any suggestion of weakening our encryption systems against real attackers — this is wishful thinking. These systems are already so complex that even normal issues stress them to the breaking point. There’s just no room for new backdoors.
To be blunt about it, the moral is pretty simple:
Encryption backdoors will always turn around and bite you in the ass. They are never worth it.
Special thanks to Karthik and Antoine for sharing this with me, Nadia for factoring, Ivan Ristic for interrupting his vacation to get us data, and the CADO-NFS team for the software that made this possible. Notes:
* Export controls might have made some sense in the days when ‘encryption’ meant big clunky pieces of hardware, but it was nonsensical in a world of software. Non-U.S. users could easily skirt the paltry IP-address checks to download strong versions of browsers such as Netscape, and — when that was too much trouble — they could easily re-implement the crypto themselves or use foreign open source libraries. (The requirements became so absurd that mainstream U.S. companies like RSA Security wound up hiring foreign developers to build their encryption libraries, since it was easier to import strong encryption than to export it.)
“To be clear, Superfish comes with Lenovo consumer products only and is a technology that helps users find and discover products visually,” the representative continued. “The technology instantly analyses images on the web and presents identical and similar product offers that may have lower prices, helping users search for images without knowing exactly what an item is called or how to describe it in a typical text-based search engine.”
The problem here is not just that this is a lousy idea. It’s that Lenovo used the same certificate on every single Laptop it shipped with Superfish. And since the proxy software also requires the corresponding private key to decrypt and modify your web sessions, that private key was also shipped on every laptop. It took all of a day for a numberof researchers to find that key and turn themselves into Lenovo-eating interception proxies. This sucks for Lenovo users.
Instead, what I’d like to discuss is some of the options for large-scale automated fixes to this kind of vulnerability. It’s quite possible that Lenovo will do this by themselves — pushing an automated patch to all of their customers to remove the product — but I’m not holding my breath. If Lenovo does not do this, there are roughly three options:
Lenovo users live with this and/or manually patch. If the patch requires manual effort, I’d estimate it’ll be applied to about 30% of Lenovo laptops. Beware: the current uninstall package does not remove the certificate from the root store!
Microsoft drops the bomb. Microsoft has a nuclear option themselves in terms of cleaning up nasty software — they can use the Windows Update mechanism or (less universally) the Windows Defender tool to remove spyware/adware. Unfortunately not everyone uses Defender, and Microsoft is probably loath to push out updates like this without massive testing and a lot of advice from the lawyers.
Google and Mozilla fix internally. This seems like a more promising option. Google Chrome in particular is well known for quickly pushing out security updates that revoke keys, add public key pins, and generally make your browsing experience more secure.
It seems unlikely that #1 and #2 will happen anytime soon, so the final option looks initially like the most promising. Unfortunately it’s not that easy. To understand why, I’m going to sum up some reasoning given to me (on Twitter) by a couple of members of the Chrome security team.
The obvious solution to fixing things at the Browser level is to have Chrome and/or Mozilla push out an update to their browsers that simply revokes the Superfish certificate. There’s plenty of precedent for that, and since the private key is now out in the world, anyone can use it to build their own interception proxy. Sadly, this won’t work! If Google does this, they’ll instantly break every Lenovo laptop with Superfish still installed and running. That’s not nice, or smart business for Google.
A more promising option is to have Chrome at least throw up a warning whenever a vulnerable Lenovo user visits a page that’s obviously been compromised by a Superfish certificate. This would include most (secure) sites any Superfish-enabled Lenovo user visits — which would be annoying — and just a few pages for those users who have uninstalled Superfish but still have the certificate in their list of trusted roots.
This seems much nicer, but runs into two problems. First, someone has to write this code — and in a hurry, because attacks may begin happening immediately. Second, what action item are these warnings going to give people? Manually uninstalling certificates is hard, and until a very nice tool becomes available a warning will just be an irritation for most users.
One option for Google is to find a way to deal with these issues systemically — that is, provide an option for their browser to tunnel traffic through some alternative (secure) protocol to a proxy, where it can then go securely to its location without being molested by Superfish attackers of any flavor. This would obviously require consent by the user — nobody wants their traffic being routed through Google otherwise. But it’s at least technically feasible.
Google even has an extension for Android/iOS that works something like this: it’s a compressing proxy extension that you can install in Chrome. It will shrink your traffic down and send it to a proxy (presumably at Google). Unfortunately this proxy won’t work even if it was available for Windows machines — because Superfish will likely just intercept its connections too 😦
So that’s out too, and with it the last obvious idea I have for dealing with this in a clean, automated way. Hopefully the Google team will keep going until they find a better solution.
The moral of this story, if you choose to take one, is that you should never compromise security for the sake of a few bucks — because security is so terribly, awfully difficult to get back.
If you don’t follow NSA news obsessively, you might have missed yesterday’s massive Snowden document dump from Der Spiegel. The documents provide a great deal of insight into how the NSA breaks our cryptographic systems. I was very lightly involved in looking at some of this material, so I’m glad to see that it’s been published.
Unfortunately with so much material, it can be a bit hard to separate the signal from the noise. In this post I’m going to try to do that a little bit — point out the bits that I think are interesting, the parts that are old news, and the things we should keep an eye on.
Those who read this blog will know that I’ve been wondering for a long time how NSA works its way around our encryption. This isn’t an academic matter, since it affects just about everyone who uses technology today.
What we’ve learned since 2013 is that NSA and its partners hoover up vast amounts of Internet traffic from fiber links around the world. Most of this data is plaintext and therefore easy to intercept. But at least some of it is encrypted — typically protected by protocols such as SSL/TLS or IPsec.
Conventional wisdom pre-Snowden told us that the increasing use of encryption ought to have shut the agencies out of this data trove. Yet the documents we’ve seen so far indicate just the opposite. Instead, the NSA and GCHQ have somehow been harvesting massive amounts of SSL/TLS and IPSEC traffic, and appear to be making inroads into other technologies such as Tor as well.
How are they doing this? To repeat an old observation, there are basically three ways to crack an encrypted connection:
Go after the mathematics. This is expensive and unlikely to work well against modern encryption algorithms (with a fewexceptions). The leaked documents give very little evidence of such mathematical breaks — though a bit more on this below.
Go after the implementation. The new documents confirm a previously-reported and aggressive effort to undermine commercial cryptographic implementations. They also provide context for how important this type of sabotage is to the NSA.
Steal the keys. Of course, the easiest way to attack any cryptosystem is simply to steal the keys. Yesterday we received a bit more evidence that this is happening.
I can’t possibly spend time on everything that’s covered by these documents — you should go read them yourself — so below I’m just going to focus on the highlights.
Not so Good Will Hunting
First, the disappointing part. The NSA may be the largest employer of cryptologic mathematicians in the United States, but — if the new story is any indication — those guys really aren’t pulling their weight.
In fact, the only significant piece of cryptanalytic news in the entire stack comes is a 2008 undergraduate research project looking at AES. Sadly, this is about as unexciting as it sounds — in fact it appears to be nothing more than a summer project by a visiting student. More interesting is the context it gives around the NSA’s efforts to break block ciphers such as AES, including the NSA’s view of the difficulty of such cryptanalysis, and confirmation that NSA has some ‘in-house techniques’.
Additionally, the documents include significant evidence that NSA has difficulty decrypting certain types of traffic, including Truecrypt, PGP/GPG, Tor and ZRTP from implementations such as RedPhone. Since these protocols share many of the same underlying cryptographic algorithms — RSA, Diffie-Hellman, ECDH and AES — some are presenting this as evidence that those primitives are cryptographically strong.
As with the AES note above, this ‘good news’ should also be taken with a grain of salt. With a small number of exceptions, it seems increasingly obvious that the Snowden documents are geared towards NSA’s analysts and operations staff. In fact, many of the systems actually seem aimed at protecting knowledge of NSA’s cryptanalytic capabilities from NSA’s own operational staff (and other Five Eyes partners). As an analyst, it’s quite possible you’ll never learn why a given intercept was successfully decrypted.
To put this a bit more succinctly: the lack of cryptanalytic red meat in these documents may not truly be representative of the NSA’s capabilities. It may simply be an artifact of Edward Snowden’s clearances at the time he left the NSA.
One of the most surprising aspects of the Snowden documents — to those of us in the security research community anyway — is the NSA’s relative ineptitude when it comes to de-anonymizing users of the Tor anonymous communications network.
The reason for our surprise is twofold. First, Tor was never really designed to stand up against a global passive adversary — that is, an attacker who taps a huge number of communications links. If there’s one thing we’ve learned from the Snowden leaks, the NSA (plus GCHQ) is the very definition of the term. In theory at least, Tor should be a relatively easy target for the agency.
The real surprise, though, is that despite this huge signals intelligence advantage, the NSA has barely even tested their ability to de-anonymize users. In fact, this leak provides the first concrete evidence that NSA is experimenting with traffic confirmation attacks to find the source of Tor connections. Even more surprising, their techniques are relatively naive, even when compared to what’s going on in the ‘research’ community.
This doesn’t mean you should view Tor as secure against the NSA. It seems very obvious that the agency has identified Tor as a high-profile target, and we know they have the resources to make much more headway against the network. The real surprise is that they haven’t tried harder. Maybe they’re trying now.
SSL/TLS and IPSEC
A few months ago I wrote a long post speculating about how the NSA breaks SSL/TLS. Because it’s increasingly clear that the NSA does break these protocols, and at relatively large scale.
The new documents don’t tell us much we didn’t already know, but they do confirm the basic outlines of the attack. The first portion requires endpoints around the world that are capable of performing the raw decryption of SSL/TLS sessions provided they know the session keys. The second is a separate infrastructure located on US soil that can recover those session keys when needed.
All of the real magic happens within the key recovery infrastructure. These documents provide the first evidence that a major attack strategy for NSA/GCHQ involves key databases containing the private keys for major sites. For the RSA key exchange ciphersuites of TLS, a single private key is sufficient to recover vast amounts of session traffic — in real time or even after the fact.
The interesting question is how the NSA gets those private keys. The easiest answer may be the least technical. A different Snowden leak shows gives some reason to believe that the NSA may have relationships with employees at specific named U.S. entities, and may even operate personnel “under cover”. This would certainly be one way to build a key database.
But even without the James Bond aspect of this, there’s every reason to believe that NSA has other means to exfiltrate RSA keys from operators. During the period in question, we know of at least one vulnerability (Heartbleed) that could have been used to extract private keys from software TLS implementations. There are still other, unreported vulnerabilities that could be used today.
Pretty much everything I said about SSL/TLS also applies to VPN protocols, with the additional detail that many VPNs use broken protocols and relatively poorly-secured pre-shared secrets that can in some cases be brute-forced. The NSA seems positively gleeful about this.
Open Source packages: Redphone, Truecrypt, PGP and OTR
The documents provide at least circumstantial evidence that some open source encryption technologies may thwart NSA surveillance. These include Truecrypt, ZRTP implementations such as RedPhone, PGP implementations, and Off the Record messaging. These packages have a few commonalities:
They’re all open source, and relatively well studied by researchers.
They’re not used at terribly wide scale (as compared to e.g., SSL or VPNs)
They all work on an end-to-end basis and don’t involve service providers, software distributers, or other infrastructure that could be corrupted or attacked.
What’s at least as interesting is which packages are not included on this list. Major corporate encryption protocols such as iMessage make no appearance in these documents, despite the fact that they ostensibly provide end-to-end encryption. This may be nothing. But given all we know about NSA’s access to providers, this is definitely worrying.
A note on the ethics of the leak
Before I finish, it’s worth addressing one major issue with this reporting: are we, as citizens, entitled to this information? Would we be safer keeping it all under wraps? And is this all ‘activist nonsense‘?
This story, more than some others, skates close to a line. I think it’s worth talking about why this information is important.
To sum up a complicated issue, we live in a world where targeted surveillance is probably necessary and inevitable. The evidence so far indicates that NSA is very good at this kind of work, despite some notable failuresin actually executing on the intelligence it produces.
Unfortunately, the documents released so far also show that a great deal of NSA/GCHQ surveillance is not targeted at all. Vast amounts of data are scooped up indiscriminately, in the hope that some of it will someday prove useful. Worse, the NSA has decided that bulk surveillance justifies its efforts to undermine many of the security technologies that protect our own information systems. The President’s own hand-picked review council has strongly recommended this practice be stopped, but their advice has — to all appearances — been disregarded. These are matters that are worthy of debate, but this debate hasn’t happened.
Unfortunate if we can’t enact changes to fix these problems, technology is probably about all that’s left. Over the next few years encryption technologies are going to be widely deployed, not only by individuals but also by corporations desperately trying to reassure overseas customers who doubt the integrity of US technology.
In that world, it’s important to know what works and doesn’t work. Insofar as this story tells us that, it makes us all better off.