Tuesday, December 22, 2015

On the Juniper backdoor

You might have heard that a few days ago, Juniper Networks announced the discovery of "unauthorized code" in the ScreenOS software that underlies the NetScreen line of devices. As a result of this discovery, the company announced a pair of separate vulnerabilities, CVE-2015-7755 and CVE-2015-7756, and urged their customers to patch immediately.

The first of these CVEs (#7755) was an authentication vulnerability, caused by a malicious hardcoded password in SSH and Telnet. Rapid7 has an excellent writeup of the issue. This is a pretty fantastic vulnerability, if you measure by the impact on the security of NetScreen users. But on the technological awesomeness scale it rates about a two out of ten, maybe a step above 'hit the guy with a wrench'.

The second vulnerability is a whole different animal. The advisory notes that CVE-2015-7756 -- which is independent of the first issue -- "may allow a knowledgeable attacker who can monitor VPN traffic to decrypt that traffic." This is the kind of vulnerability that makes applied cryptographers cry tears of joy. It certainly did that for me.

And while every reasonable person knows you can't just drop "passive decryption vulnerability" and expect the world to go on with its business, this is exactly what Juniper tried to do. Since they weren't talking about it, it fell to software experts to try to work out what was happening by looking carefully at firmware released by the company.

Now I want to be clear that I was not one of those software experts. IDA scares the crap out of me. But I'm fortunate to know some of the right kind of people, like Steve Checkoway, who I was able to get on the job, mostly by encouraging him to neglect his professional obligations. I also follow some talented folks on Twitter, like H.D. Moore and Ralf Philipp Weinmann. So I was fortunate enough to watch them work, and occasionally (I think?) chip in a helpful observation.

And yes, it was worth it. Because what Ralf and Steve et al. found is beyond belief. Ralf's excellent post provides all of the technical details, and you should honestly just stop reading now and go read that. But since you're still here, the TL;DR is this:
For the past several years, it appears that Juniper NetScreen devices have incorporated a potentially backdoored random number generator, based on the NSA's Dual_EC_DRBG algorithm. At some point in 2012, the NetScreen code was further subverted by some unknown party, so that the very same backdoor could be used to eavesdrop on NetScreen connections. While this alteration was not authorized by Juniper, it's important to note that the attacker made no major code changes to the encryption mechanism -- they only changed parameters. This means that the systems were potentially vulnerable to other parties, even beforehand. Worse, the nature of this vulnerability is particularly insidious and generally messed up.
In the rest of this post I'm going to try to fill in the last part of that statement. But first, some background.

Dual EC DRBG

Pretty much every cryptographic system depends on a secure random number generator, or RNG. These algorithms produce the unpredictable random bits that are consumed by cryptographic protocols. The key word in this description is unpredictable: if an attacker can predict the output of your RNG, then virtually everything you build on it will end up broken.

This fact has not been lost on attackers! The most famous (alleged) example of deliberate random number generator subversion was discovered in 2007 by Dan Shumow and Niels Ferguson from Microsoft, when they announced the possibility of a backdoor in a NIST standard called Dual_EC_DRBG.

I've written extensively about the design of the Dual_EC generator and the process that led to its standardization. Omitting the mathematics, the short version is that Dual EC relies on a special 32-byte constant called Q, which -- if generated by a malicious attacker -- can allow said attacker to predict future outputs of the RNG after seeing a mere 30 bytes of raw output from your generator.
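To make the shape of that backdoor concrete, here is a toy Python sketch. Everything about it is scaled down and hypothetical: the curve (y^2 = x^3 + 2x + 2 over GF(17)) is a textbook example rather than the real P-256, and the toy leaks a whole output point where the real generator emits roughly 30 bytes of an x-coordinate. The state-recovery trick, however, is the same one the real backdoor would rely on.

# Toy Dual EC sketch on a tiny curve y^2 = x^3 + 2x + 2 over GF(17), group order 19.
p, a, n = 17, 2, 19

def inv(x):
    return pow(x, p - 2, p)

def add(P1, P2):
    # Standard affine point addition; None is the point at infinity.
    if P1 is None: return P2
    if P2 is None: return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    lam = ((3 * x1 * x1 + a) * inv(2 * y1) if P1 == P2
           else (y2 - y1) * inv(x2 - x1)) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def mul(k, P1):
    R = None
    while k:
        if k & 1: R = add(R, P1)
        P1 = add(P1, P1)
        k >>= 1
    return R

Q = (5, 1)     # public generator point
e = 7          # the backdoor: whoever chose the points knows P = e*Q
P = mul(e, Q)

def dualec_step(s):
    # One iteration: new state s' = x(s*P); output is (the x-coordinate of) s'*Q.
    # This toy leaks the whole output point to avoid a small brute-force step.
    s_new = mul(s, P)[0]
    return s_new, mul(s_new, Q)

seed = 11
s1, out1 = dualec_step(seed)   # victim's first output
s2, out2 = dualec_step(s1)     # victim's second output

# The attacker sees only out1. Since e*out1 = s1*(e*Q) = s1*P, its x-coordinate
# is the victim's next internal state -- so every later output is predictable.
s2_guess = mul(e, out1)[0]
assert s2_guess == s2
assert mul(s2_guess, Q) == out2

The only thing the attacker needs is the secret relationship between P and Q -- which is exactly why the provenance of Q matters so much.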

The NIST specification of Dual_EC comes with a default value for Q that was generated by the NSA. Nobody has ever been able to determine how NSA generated this value, but leaks by Edward Snowden in 2013 provide strong evidence that NSA may have generated it maliciously. It certainly doesn't smell good.

But enough with the ancient history. We're here to talk about Juniper.

Juniper ScreenOS -- before the "unauthorized code"

Although it was not widely publicized before this week, Juniper's ScreenOS devices have used Dual EC for some time -- probably since before Juniper acquired NetScreen Technologies. Prior to this week, the only place you might have learned about this was on this tiny page posted by the company after the Snowden leaks.


The decision to use Dual EC is strange all by itself. But it gets weirder.

First, ScreenOS doesn't use the NSA's default Q. Instead, they use an alternative Q value that was generated by Juniper and/or NetScreen. If Juniper had really wanted to rule out the possibility of surveillance, this would have been a good opportunity to do so. They could have generated the new constant in a "provably" safe way -- e.g., by hashing some English-language string. Since we have no evidence that they did so, a conservative view holds that the Juniper/NetScreen constant may also have been generated maliciously, rendering it useful for eavesdropping.

Next, ScreenOS uses Dual EC in a strange, non-standard way. Rather than generating all of their random numbers with Dual EC (which would be slow), they only use Dual EC to generate a seed for a fast 3DES-based generator called ANSI X9.17. Since that generator is actually FIPS-140 approved and generally believed to be sufficient to the purpose, it's not clear what value Dual EC is really adding to the system in the first place -- except, of course, its usefulness as a potential backdoor.
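For the curious, here is a rough Python sketch of that X9.17/X9.31-style construction (the decompiled code below calls it ANSI_X9_31_generate_block; X9.17 and X9.31 describe essentially the same 3DES-based design). AES-128 stands in for 3DES purely to keep the sketch short and the API current -- the structure, and the fact that its key and seed come straight from Dual EC, is the point.

from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def x931_block(k: bytes, V: bytes, DT: bytes):
    # One block of an ANSI X9.17/X9.31-style generator. AES-128 stands in
    # for the 3DES cipher ScreenOS actually uses; k, V and DT are 16 bytes.
    def E(block: bytes) -> bytes:
        enc = Cipher(algorithms.AES(k), modes.ECB()).encryptor()
        return enc.update(block) + enc.finalize()
    def xor(x, y):
        return bytes(i ^ j for i, j in zip(x, y))
    I = E(DT)               # encrypt a timestamp/counter
    R = E(xor(I, V))        # this is the output block
    V_next = E(xor(R, I))   # this becomes the next state
    return R, V_next

# Hypothetical usage with placeholder bytes. In ScreenOS, the key and seed fed
# into this construction are themselves produced by Dual EC -- and those raw
# Dual EC bytes are exactly what must never leave the box if the cascade is to
# hide the backdoor.
R, V = x931_block(b"\x00" * 16, b"\x11" * 16, b"\x22" * 16)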

The good news here is that the post-processing by ANSI X9.17 should kill the Dual EC backdoor, since the attack relies on the attacker seeing raw output from Dual EC. The ANSI generator appears to completely obfuscate this output, thus rendering Dual EC "safe". This is indeed the argument Juniper made in 2013 when it decided to leave the Dual EC code in ScreenOS.

The problem with this argument is that it assumes that no other code could ever "accidentally" exfiltrate a few bytes of raw Dual EC output. Yet this is exactly the kind of threat you'd worry about in a deliberately backdoored system -- the threat that, just maybe, the developers are not your friend. Thus Dual EC is safe only if you assume no tiny bug in the code could accidentally leak out 30 bytes or so of raw Dual EC output. If it did, this would make all subsequent seeding calls predictable, and thus render all numbers generated by the system predictable. In general, this would spell doom for the confidentiality of VPN connections.

And unbelievably, amazingly, who coulda thunk it, it appears that such a bug does exist in many versions of ScreenOS, dating to both before and after the "unauthorized code" noted by Juniper. This issue was discovered by Willem Pinckaers and can be illustrated by the following code snippet, which Steve Checkoway decompiled (see full context here):

void prng_generate()
{
  int index; // eax@4
  unsigned int bytes_generated; // ebx@4
  int v2; // eax@9
  int v3; // eax@12
  char v4; // ST04_1@12
  int time[2]; // [sp+10h] [bp-10h]@1

  time[0] = 0;
  time[1] = get_cycles();
  prng_output_index = 0; // this is a global
  ++blocks_generated_since_reseed;
  if ( !do_not_need_to_reseed() )
    // the routine below fills a buffer with raw Dual EC output
    prng_reseed(); // internally sets prng_output_index to 32
  for ( ; (unsigned int)prng_output_index <= 0x1F; prng_output_index += 8 )
  {
    // this loop is supposed to process the same buffer
    // through the ANSI (3DES) generator. However, since the
    // value of prng_output_index was set to 32 above, it never executes
    memcpy(prev_prng_seed, prng_seed, 8);
    memcpy(prev_prng_block, prng_block, 8);
    ANSI_X9_31_generate_block(time, prng_seed, prng_key, prng_block);
    ...


Thus what comes out from this function is 32 bytes of raw Dual EC output, which is all you need to recover the internal Dual EC generator state and predict all future outputs.

So if this was the authorized code, what the hell was the unauthorized code?

The creepiest thing about CVE-2015-7756 is that there doesn't seem to be any unauthorized code. Indeed, what's changed in the modified versions is simply the value of the Q point. According to Ralf this point changed in 2012, presumably to a value that the hacker(s) generated themselves. This would likely have allowed these individuals to passively decrypt ScreenOS VPN sessions.

In the more recent Juniper patch that fixes the vulnerability, Q is simply set back to the original Juniper/NetScreen value.

The attacker also replaced some test vectors. But that appears to be it.

To sum up, some hacker or group of hackers noticed an existing backdoor in the Juniper software, which may have been intentional or unintentional -- you be the judge! They then piggybacked on top of it to build a backdoor of their own, something they were able to do because all of the hard work had already been done for them. The end result was a period in which someone -- maybe a foreign government -- was able to decrypt Juniper traffic in the U.S. and around the world.

And all because Juniper had already paved the road.

So why does this matter?

For the past several months I've been running around with various groups of technologists, doing everything I can to convince important people that the sky is falling. Or rather, that the sky will fall if they act on some of the very bad, terrible ideas that are currently bouncing around Washington -- namely, that our encryption systems should come equipped with "backdoors" intended to allow law enforcement and national security agencies to access our communications.

One of the most serious concerns we raise during these meetings is the possibility that encryption backdoors could be subverted. Specifically, that a backdoor intended for law enforcement could somehow become a backdoor for people who we don't trust to read our messages. Normally when we talk about this, we're concerned about failures in storage of things like escrow keys. What this Juniper vulnerability illustrates is that the danger is much broader and more serious than that.

The problem with cryptographic backdoors isn't that they're the only way that an attacker can break into our cryptographic systems. It's merely that they're one of the best. They take care of the hard work, the laying of plumbing and electrical wiring, so attackers can simply walk in and change the drapes.

This post made possible by Ralf Philipp Weinmann, H. D. Moore, Steve Checkoway, Willem Pinckaers, Nadia Heninger, Shaanan Cohney and the letter Q.

Wednesday, November 11, 2015

Why the Tor attack matters

Earlier today, Motherboard posted a court document filed in a prosecution against a Silk Road 2.0 user, indicating that the user had been de-anonymized on the Tor network thanks to research conducted by a "university-based research institute".

Source: Motherboard.
As Motherboard pointed out, the timing of this research lines up with an active attack on the Tor network that was discovered and publicized in July 2014. Moreover, the details of that attack were eerily similar to the abstract of a (withdrawn) BlackHat presentation submitted by two researchers at the CERT division of Carnegie Mellon University (CMU).

A few hours later, the Tor Project made the allegations more explicit, posting a blog entry accusing CMU of accepting $1 million to conduct the attack. A spokesperson for CMU didn't exactly deny the allegations, but demanded better evidence and stated that he wasn't aware of any payment. No doubt we'll learn more in the coming weeks as more documents become public.

You might wonder why this is important. After all, the crimes we're talking about are pretty disturbing. One defendant is accused of possessing child pornography, and if the allegations are true, the other was a staff member on Silk Road 2.0. If CMU really did conduct Tor de-anonymization research for the benefit of the FBI, the people they identified were allegedly not doing the nicest things. It's hard to feel particularly sympathetic.

Except for one small detail: there's no reason to believe that the defendants were the only people affected.

If the details of the attack are as we understand them, a group of academic researchers deliberately took control of a significant portion of the Tor network. Without oversight from the University research board, they exploited a vulnerability in the Tor protocol to conduct a traffic confirmation attack, which allowed them to identify Tor client IP addresses and hidden services. They ran this attack for five months, and potentially de-anonymized thousands of users. Users who depend on Tor to protect them from serious harm.

It's quite possible that the CMU researchers exercised strict protocols to ensure that they didn't accidentally de-anonymize innocent bystanders. This would be standard procedure in any legitimate research involving human subjects, particularly research that has the potential to do harm. If the researchers did take such steps, it would be nice to know about them. CMU hasn't even admitted to the scope of the research project, nor have they published any results, so we just don't know.

While most of the computer science researchers I know are fundamentally ethical people, as a community we have a blind spot when it comes to the ethical issues in our field. There's a view in our community that Institutional Review Boards are for medical researchers, and we've somehow been accidentally caught up in machinery that wasn't meant for us. And I get this -- IRBs are unpleasant to work with. Sometimes the machinery is wrong.

But there's also a view that computer security research can't really hurt people, so there's no real reason for this sort of ethical oversight machinery in the first place. This is dead wrong, and if we want to be taken seriously as a mature field, we need to do something about it.

We may need different machinery, but we need something. That something begins with the understanding that active attacks that affect vulnerable users can be dangerous, and should never be conducted without rigorous oversight -- if they must be conducted at all. It begins with the idea that universities should have uniform procedures for both faculty researchers and quasi-government organizations like CERT, if they live under the same roof. It begins with CERT and CMU explaining what went on with their research, rather than treating it like an embarrassment to be swept under the rug.

Most importantly, it begins with researchers looking beyond their own research practices. So far the response to the Tor news has been a big shrug. It's wonderful that most of our community is responsible. But none of that matters if we look the other way when others in our community fail to act responsibly.

Wednesday, October 21, 2015

A riddle wrapped in a curve

If you’re looking for a nice dose of crypto conspiracy theorizing and want to read a paper by some very knowledgeable cryptographers, I have just the paper for you. Titled “A Riddle Wrapped in an Enigma” by Neal Koblitz and Alfred J. Menezes, it tackles one of the great mysteries of the year 2015. Namely: why did the NSA just freak out and throw its Suite B program down the toilet?

Koblitz and Menezes are noted cryptographers and mathematicians, together responsible for scores of papers, including a recent series of position papers such as “Another Look at Provable Security” (mentioned previously on this blog) and “The Brave New World of Bodacious Assumptions in Cryptography”. (Yes, that is a real paper title.)

Their latest paper is highly accessible and contains almost no mathematics, so it’s worth your time to go read it now. If you don’t, or can’t handle reading things typeset in LaTeX, I’m going to sum up the gist of their arguments as best I can, then talk about why this all matters.

Let’s start with some background.

What is Suite B, and what do you mean by NSA ‘freaking out’?

For most of the past century, the NSA (and its predecessors) have jealously guarded a monopoly on advanced cryptography. This approach worked reasonably well through the 1980s, then rapidly became unsustainable. There were a couple of reasons for this. First, the PC revolution made it impossible for the NSA to keep encryption out of the hands of ordinary users and industry -- as much as the NSA tried to make this so. Second, Congress got fed up with paying for expensive bespoke computer systems, and decided the DoD should buy its computing equipment off the shelf like everyone else.

The result was the Cryptographic Modernization Program, and Suite B — the first public cryptography standard to include non-classified algorithms certified for encrypting Secret and Top Secret data. Suite B was first announced publicly in 2005, and was notable for its exclusive reliance on the (then new) field of elliptic curve cryptography (ECC) for public key encryption and key agreement.

At the time of Suite B’s adoption, ECC was relatively new to the non-classified world. Many industry and academic cryptographers didn’t feel it had been reasonably studied (Koblitz and Menezes’ anecdotes on this are priceless). ECC was noteworthy for using dramatically shorter keys than alternative public-key algorithms such as RSA and “classical” Diffie-Hellman, largely because new sub-exponential attacks that worked in those settings did not seem to have any analogue in the ECC world. 

The NSA pushed hard for adoption. Since they had the best mathematicians, and moreover, clearly had the most to lose if things went south, the standards bodies were persuaded to go along. The algorithms of Suite B were standardized and — aside from a few intellectual property concerns — have been gradually adopted even by the non-classified community.

Then, in August of this year, NSA freaked out.

It was a quiet freakout, as you would expect from the world’s largest spy agency. But it made big waves with cryptographers. Citing concerns about future developments in quantum computing, the Agency updated the Suite B website to announce a rapid transition away from Suite B, and to a new set of quantum-resistant algorithms in the coming years.

Why all the sudden and immediate concern about quantum computers? Well, the NSA obviously isn’t willing to talk about that. But as of 2013, leaked slides from the Snowden cache show no real evidence of any massive quantum breakthroughs on the horizon that would necessitate a rapid transition from ECC.

Which quantum-resistant algorithms are we supposed to move to? Good question. Cryptographers haven't agreed on any yet. Which makes the timing of this announcement particularly mysterious.

So let me spell this out: despite the fact that quantum computers seem to be a long ways off and reasonable quantum-resistant replacement algorithms are nowhere to be seen, NSA decided to make this announcement publicly and not quietly behind the scenes. Weirder still, if you haven’t yet upgraded to Suite B, you are now being urged not to. In practice, that means some firms will stay with algorithms like RSA rather than transitioning to ECC at all. And RSA is also vulnerable to quantum attacks.

When reporters finally reached the NSA for comment, the only sound on the line was breaking glass and the sound of digging. (No, I’m kidding about that — but NSA did update their site again just to make it clear they weren’t playing some kind of deeply misguided April Fool’s prank.)

So that’s the background of the Koblitz/Menezes paper. The riddle is: what does the NSA know that we don’t?

No quantum advance?

Koblitz and Menezes exhaustively detail the possible reasons for the NSA’s announcement, ranging from the straightforward — that NSA actually has made a huge advance in developing quantum computers — to the downright bizarre. They even moot the possibility that the NSA is so embarrassed about getting caught tampering with national standards that they’ve just decided to flip the table and act like crazy people.

I’m going to skip most of the details, not because they aren’t interesting, but rather because I’d like to focus on what I think is the most intriguing hypothesis in the paper. Namely, that the NSA isn’t worried about quantum computers at all, but rather, that they’ve made a major advance in classical cryptanalysis of the elliptic curve discrete logarithm problem — and panic is the result.

Let me lay the groundwork. The security of most EC cryptosystems rests on the presumed intractability of a few basic mathematical problems. The most important of these is called the Elliptic Curve Discrete Logarithm Problem (ECDLP). This problem must be supremely hard to solve, or virtually every cryptosystem built on ECC unravels very quickly.

The definition of “hard” is important here. What we mean in the context of ECC is that the best known algorithm for solving the ECDLP requires a number of operations that is fully exponential in the security parameter. In practice, this means we can achieve 128-bit equivalent security using an elliptic curve set in a 256 bit field — which implies similarly-small keys and ciphertexts. To achieve equivalent security with RSA, where sub-exponential algorithms apply, your keys need to be at least 3072 bits (!) long.
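If you want to sanity-check those numbers, here's a back-of-the-envelope Python sketch. The ECDLP figure is just the square-root cost of the best generic attack; the RSA figure uses the crude asymptotic GNFS formula with constants and the o(1) term ignored, so treat it as a ballpark rather than a calibrated equivalence.

import math

def ec_security_bits(field_bits):
    # Best generic attack on the ECDLP (Pollard rho) costs roughly sqrt(n)
    # group operations, i.e. half the field size in bits.
    return field_bits / 2

def rsa_security_bits(modulus_bits):
    # Crude GNFS cost: exp((64/9)^(1/3) * (ln N)^(1/3) * (ln ln N)^(2/3)),
    # ignoring the o(1) term and all constant factors.
    ln_n = modulus_bits * math.log(2)
    work = (64 / 9) ** (1 / 3) * ln_n ** (1 / 3) * math.log(ln_n) ** (2 / 3)
    return work / math.log(2)

print(ec_security_bits(256))     # 128.0 -- P-256 gives ~128-bit security
print(rsa_security_bits(3072))   # ~139 by this crude formula; NIST's calibrated
                                 # tables put 3072-bit RSA at roughly 128 bits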

But while the ability to use (relatively) tiny elliptic curve points is wonderful for implementers, it leaves no room for error. If NSA’s mathematicians began to make even modest, but sustained advances in the state of the art for solving the ECDLP, it would put the entire field at risk. Beginning with the smallest of the standard curves, P-256, which would now provide less than the required 128-bit security.

Did I mention that as part of the recent announcement, NSA also deprecated P-256?

Of course, there’s no reason to believe that classical cryptanalysis is at fault here. It’s possible that NSA really has reason to believe that quantum computing is moving faster than it was in 2013. It’s also possible that they’re just sloppy, and didn’t realize that having a very public freakout about cryptography you just promoted for years might not be such a good idea. But that seems doubtful.

You can read the Koblitz and Menezes paper for their take on all of the alternative hypotheses. I want to take a last few paragraphs to cover a slightly tangential issue, one that’s been the subject of a great deal of panic on the part of our community.

Did the NSA really backdoor the NIST elliptic curves????1??1

One of the nicer points of the Koblitz and Menezes paper is that it tackles the argument that the NSA deliberately backdoored the NIST elliptic curves. 

This argument can be found in various places around the Internet. It originates with accomplished cryptographers who are quite rightly applying skepticism to an oddball and ad hoc process. It then heads off into crazytown. Whether people admit it or not, this concern has been hugely influential in the recent adoption of non-NIST alternative curves by the CFRG and IETF.

The accusation being leveled at NSA can be described simply. It’s known that the NIST pseudorandom curves were generated by the NSA hashing “seeds” through SHA1 to generate the curve parameters. This process was supposed to reassure the public, by eliminating the NSA’s ability to tamper with the parameters at will. However, nobody thought to restrict the NSA’s choice of input seeds — leading to a potential attack where they tried many different random seeds until they found a curve that contained a weakness the NSA could later exploit.
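Here's a deliberately simplified Python sketch of the search that this accusation describes. The seed-to-curve derivation below is a stand-in for the real ANSI X9.62 procedure (which concatenates several SHA-1 outputs and rejects curves that fail order and cofactor checks), and the is_weak test is purely hypothetical -- nobody outside the accusation knows of such a test, which is precisely where the Koblitz/Menezes counting argument bites.

import hashlib, os

P256_PRIME = 2**256 - 2**224 + 2**192 + 2**96 - 1
A = -3   # the NIST curves fix a = -3; only the coefficient b comes from the seed

def curve_b_from_seed(seed: bytes) -> int:
    # Simplified stand-in for the ANSI X9.62 derivation: hash the seed and
    # interpret the digest as the curve coefficient b.
    digest = hashlib.sha1(seed).digest() + hashlib.sha1(seed + b"\x01").digest()
    return int.from_bytes(digest, "big") % P256_PRIME

def is_weak(b: int) -> bool:
    # The entirely hypothetical part: a secret membership test for some large
    # class of weak curves. No such test is publicly known.
    return False

def search_for_weak_curve(max_tries: int = 1000):
    for _ in range(max_tries):
        seed = os.urandom(20)
        b = curve_b_from_seed(seed)
        if is_weak(b):
            return seed, b    # publish the innocent-looking seed, keep quiet
    return None

print(search_for_weak_curve())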

Koblitz and Menezes tackle this argument on grounds of basic common sense, history, and mathematics. The common sense argument, of course, is that you don’t sh*t where you eat. In other words, the NSA wouldn’t deliberately choose weak elliptic curves given that they planned to use them for encrypting Secret and Top Secret data for the next 20 years. 

But common sense convinces nobody. The mathematical argument is a bit more subtle, and may work where self interest didn't. Roughly speaking, Koblitz and Menezes point out that the NSA, despite their vast computing power, can't try every possible seed. This is especially true given that they were generating these curves using 1990s equipment, and the process of evaluating (counting points on) an elliptic curve is computationally intensive. Thus, they assume that the NSA could only have tested 2^48 or so different curves for their hypothetical weakness.

By calculating the number of possible curve families, Koblitz and Menezes show that a vast proportion of curves (for P-256, around 2^209 out of 2^257) would have to be weak in order for the NSA to succeed in this attack. The implications of such a large class of vulnerable curves would be very bad for the field of ECC. It dwarfs every previously known weak curve class and would call into question the decision to use ECC at all.

In other words, Koblitz and Menezes are saying that if you accept the weak curve hypothesis into your heart, the solution is not to replace the NIST elliptic curves with anything at all, but rather, to leave the building as rapidly as possible and perhaps not shut the door on the way out. No joke.

On the gripping hand, this sounds very much like the plan NSA is currently implementing. Perhaps we should be worried.

Tuesday, September 8, 2015

Let's talk about iMessage (again)

Source: Gizmodo
Yesterday's New York Times carried a story entitled "Apple and other tech companies tangle with U.S. over data access". It's a vague headline that manages to obscure the real thrust of the story, which is that according to reporters at the Times, Apple has not been forced to backdoor their popular encrypted iMessage system. This flies in the face of some rumors to the contrary.

While there's not much new information in here, people on Twitter seem to have some renewed interest in how iMessage works; whether Apple could backdoor it if they wanted to; and whether the courts could force them to. The answers to those questions are respectively: "very well", "absolutely", and "do I look like a national security lawyer?"

So rather than tackle the last one, which nobody seems to know the answer to, I figure it would be informative to talk about the technical issues with iMessage (again). So here we go.

How does iMessage work?

Fundamentally the mantra of iMessage is "keep it simple, stupid". It's not really designed to be an encryption system as much as it is a text message system that happens to include encryption. As such, it's designed to take away most of the painful bits you expect from modern encryption software, and in the process it makes the crypto essentially invisible to the user. Unfortunately, this simplicity comes at some cost to security.

Let's start with the good: Apple's marketing material makes it clear that iMessage encryption is "end-to-end" and that decryption keys never leave the device. This claim is bolstered by their public security documentation as well as outside efforts to reverse-engineer the system. In iMessage, messages are encrypted with a combination of 1280-bit RSA public key encryption and 128-bit AES, and signed with ECDSA under a 256-bit NIST curve. It's honestly kind of ridiculous, but whatever. Let's call it good enough.
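For intuition, here is a minimal sketch of that general 'RSA envelope + symmetric cipher + signature' pattern, written with the pyca/cryptography library. It is emphatically not Apple's wire format: AES-GCM and a 2048-bit RSA key stand in for whatever iMessage does internally, and every name below is made up.

from os import urandom
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec, padding, rsa
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Keys that Apple's directory service would normally hand out / vouch for:
recipient_rsa = rsa.generate_private_key(public_exponent=65537, key_size=2048)
sender_ecdsa = ec.generate_private_key(ec.SECP256R1())

def send_message(plaintext: bytes):
    # Encrypt the message under a fresh AES-128 key...
    aes_key = AESGCM.generate_key(bit_length=128)
    nonce = urandom(12)
    body = AESGCM(aes_key).encrypt(nonce, plaintext, None)
    # ...wrap that key to the recipient's RSA public key...
    wrapped = recipient_rsa.public_key().encrypt(
        aes_key,
        padding.OAEP(mgf=padding.MGF1(hashes.SHA256()),
                     algorithm=hashes.SHA256(), label=None))
    # ...and sign the whole bundle with the sender's ECDSA (P-256) key.
    sig = sender_ecdsa.sign(wrapped + nonce + body, ec.ECDSA(hashes.SHA256()))
    return wrapped, nonce, body, sig

wrapped, nonce, body, sig = send_message(b"hello from the sketch")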

iMessage encryption in a nutshell boils down to this: I get your public key, you get my public key, I can send you messages encrypted to you, and you can be sure that they're authentic and really came from me. Everyone's happy.

But here's the wrinkle: where do those public keys come from?

Where do you get the keys?


Key request to Apple's server.
It's this detail that exposes the real weakness of iMessage. To make key distribution 'simple', Apple takes responsibility for handing out your friends' public keys. It does this using a proprietary key server that Apple owns and operates. Your iPhone requests keys from Apple using a connection that's TLS-encrypted, and employs some fancy cryptographic tokens. But fundamentally, it relies on the assumption that Apple is good, and is really going to give you the right keys for the person you want to talk to.

But this honesty is just an assumption. Since the key lookup is completely invisible to the user, there's nothing that forces Apple to be honest. They could, if inspired, give you a public key of their choosing, one that they hold the decryption key for. They could give you the FBI's key. They could give you Dwayne "The Rock" Johnson's key, though The Rock would presumably be very non-plussed by this.

Indeed it gets worse. Because iMessage is designed to support several devices attached to the same account, each query to the directory server can bring back many keys -- one for each of your devices. An attacker can simply add a device (or a fake 'ghost device') to Apple's key server, and senders will encrypt messages to that key along with the legitimate ones. This enables wiretapping, provided you can get Apple to help you out.
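Stripped of all the cryptography, the trust model looks something like the following toy sketch (placeholder strings everywhere; nothing here is Apple's actual protocol):

# Toy directory: account -> public keys of the devices enrolled on it.
directory = {"alice@example.com": ["alice-iphone-key", "alice-macbook-key"]}

def encrypt_for(device_key: str, message: str) -> str:
    # Stand-in for the real per-device encryption.
    return f"Enc[{device_key}]({message})"

def send_imessage(recipient: str, message: str):
    # The sender dutifully encrypts a copy for every key the server returns,
    # with no way to tell a legitimate device from one that was quietly added.
    return [encrypt_for(k, message) for k in directory[recipient]]

print(send_imessage("alice@example.com", "hi"))

# A passive wiretap is just one more entry in the list the server hands back:
directory["alice@example.com"].append("ghost-key-held-by-an-eavesdropper")
print(send_imessage("alice@example.com", "hi"))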

But why do you need Apple to help you out?

As described, this attack doesn't really require direct collaboration from Apple. In principle, the FBI could just guess the target's email password, or reset the password and add a new device all on their own. Even with a simple subpoena, Apple might be forced to hand over security questions and/or password hashes.

The real difficulty is caused by a final security feature in iMessage: when you add a new device, or modify the devices attached to your account, Apple's key server sends a notification to each of the devices already attached to the account. It's not obvious how this feature is implemented, but one thing is clear -- it seems likely that, at least in theory, Apple could shut it off if they needed to.* After all, this all comes down to code in the key server.

Fixing this problem seems hard. You could lock the key server in a giant cage, then throw away the key. But as long as Apple retains the ability to update their key server software, solving this problem seems fundamentally challenging. (Though not impossible -- I'll come back to this in a moment.)

Can governments force Apple to modify their key server?

It's not clear. While it seems pretty obvious that Apple could in theory substitute keys and thus enable eavesdropping, in practice it may require substantial changes to Apple's code. And while there are a few well-known cases in which the government has forced companies to turn over keys, changing the operation of a working system is a whole different ball of wax.

And iMessage is not just any working system. According to Apple, it handles several billion messages every day, and is fundamental to the operation of millions of iPhones. When you have a deployed system at that scale, the last thing you want to do is mess with it -- particularly if it involves crypto code that may not even be well understood by its creators. There's no amount of money you could pay me to be 'the guy who broke iMessage', even for an hour.

Any way you slice it, it's a risky operation. But for a real answer, you'll have to talk to a lawyer.

Why isn't key substitution a good solution to the 'escrow' debate?

Another perspective on iMessage -- one I've heard from some attorney friends -- is that key server tampering sounds like a pretty good compromise solution to the problem of creating a 'secure golden key' (AKA giving governments access to plaintext).

This view holds that key substitution allows only proactive eavesdropping: the government has to show up with a warrant before they can eavesdrop on a customer. They can't spy on everyone, and they can't go back and read your emails from last month. At the same time, most customers still get true 'end to end' encryption.

I see two problems with this view. First, tampering with the key server fundamentally betrays user trust, and undermines most of the guarantees offered by iMessage. Apple claims that they offer true end-to-end encryption that they can't read -- and that's reasonable in the threat model they've defined for themselves. The minute they start selectively substituting keys, that theory goes out the window. If you can substitute a few keys, why not all of them? In this world, Apple should expect requests from every Tom, Dick and Harry who wants access to plaintext, ranging from divorce lawyers to foreign governments.

A snapshot of my seven (!) currently enrolled iMessage
devices, courtesy Frederic Jacobs.
The second, more technical problem is that key substitution is relatively easy to detect. While Apple's protocols are an obfuscated mess, it is at least in theory possible for users to reverse-engineer them to view the raw public keys being transmitted -- and thus determine whether the key server is being honest. While most criminals are not this sophisticated, a few are. And if they aren't sophisticated, then tools can be built to make this relatively easy. (Indeed, people have already built such tools -- see my key registration profile at right.)

Thus key substitution represents at most a temporary solution to the 'government access' problem, and one that's fraught with peril for law enforcement, and probably disastrous for the corporations involved. It might seem tempting to head down this rabbit hole, but it's rabbits all the way down.

What can providers do to prevent key substitution attacks?

Signal's "key fingerprint" screen.

From a technical point of view, there are a number of things that providers can do to harden their key servers. One is to expose 'key fingerprints' to users who care, which would allow them to manually compare the keys they receive with the keys actually registered by other users. This approach is used by OpenWhisperSystems' Signal, as well as PGP. But even I acknowledge that this kind of stinks.

A more user-friendly approach is to deploy a variant of Certificate Transparency, which requires providers to publish a publicly verifiable proof that every public key they hand out is being transmitted to the whole world. This allows each client to check that the server is handing out the actual keys they registered -- and by implication, that every other user is seeing the same thing.
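Here is a bare-bones sketch of the Merkle-tree machinery that this kind of transparency builds on: the provider publishes a single root hash per epoch, and can then prove to any client that a particular (user, key) entry is included under that root. This is only the inclusion-proof core, not a full transparency or CONIKS protocol.

import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(leaves):
    # Builds a Merkle tree over the (user, key) leaves; returns every level.
    level = [H(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                    # duplicate the last node if odd
            level = level + [level[-1]]
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def inclusion_proof(levels, index):
    # Sibling hashes proving that leaf `index` sits under the published root.
    proof = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        proof.append((level[index ^ 1], index % 2))   # (sibling, am-I-the-right-child)
        index //= 2
    return proof

def verify(leaf, proof, root):
    h = H(leaf)
    for sibling, is_right in proof:
        h = H(sibling + h) if is_right else H(h + sibling)
    return h == root

# The provider publishes one root per epoch; each client checks its own entry.
leaves = [b"alice@example.com:KEY_A1", b"alice@example.com:KEY_A2",
          b"bob@example.com:KEY_B1"]
levels = build_tree(leaves)
root = levels[-1][0]
proof = inclusion_proof(levels, 2)
assert verify(leaves[2], proof, root)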

The most complete published variant of this is called CONIKS, and it was proposed by a group at Princeton, Stanford and the EFF (one of the more notable authors is Ed Felten, now Deputy U.S. Chief Technology Officer). CONIKS combines key transparency with a 'verification protocol' that allows clients to ensure that they aren't being sidelined and fed false information.

CONIKS isn't necessarily the only game in town when it comes to preventing key substitution attacks, but it represents a powerful existence proof that real defenses can be mounted. Even though Apple hasn't chosen to implement CONIKS, the fact that it's out there should be a strong disincentive for law enforcement to rely heavily on this approach.

So what next?

That's the real question. If we believe the New York Times, all is well -- for the moment. But not for the future. In the long term, law enforcement continues to ask for an approach that allows them to access the plaintext of encrypted messages. And Silicon Valley continues to find new ways to protect the confidentiality of their users' data, against a range of threats beginning in Washington and proceeding well beyond.

How this will pan out is anyone's guess. All we can say is that it will be messy.

Notes:

* How they would do this is really a question for Apple. The feature may involve the key server sending an explicit push message to each of the devices, in which case it would be easy to turn this off. Alternatively, the devices may periodically retrieve their own keys to see what Apple's server is sending out to the world, and alert the user when they see a new one. In the latter case, Apple could selectively transmit a doctored version of the key list to the device owner.

Sunday, August 16, 2015

The network is hostile

Yesterday the New York Times and ProPublica posted a lengthy investigation based on leaked NSA documents, outlining the extensive surveillance collaboration between AT&T and the U.S. government. This surveillance includes gems such as AT&T's assistance in tapping the main fiber connection supporting the United Nations, and that's only the start.

The usual Internet suspects are arguing about whether this is actually news. The answer is both yes and no, though I assume that the world at large will mostly shrug at this point. After all, we've learned so much about the NSA's operations at this point that we're all suffering from revelation-fatigue. It would take a lot to shock us now.

But this isn't what I want to talk about. Instead, the effect of this story was to inspire me to look back on the NSA leaks overall, to think about what they've taught us. And more importantly -- what they mean for the design of the Internet and our priorities as security engineers. That's what I'm going to ruminate about below.

The network is hostile

Anyone who has taken a network security class knows that the first rule of Internet security is that there is no Internet security. Indeed, this assumption is baked into the design of the Internet and most packet-switched networks -- systems where unknown third parties are responsible for handling and routing your data. There is no way to ensure that your packets will be routed as you want them, and there's absolutely no way to ensure that they won't be looked at.

Indeed, the implications of this were obvious as far back as ARPANET. If you connect from point A to point B, it was well known that your packets would traverse untrusted machines C, D and E in between. In the 1970s the only thing preserving the privacy of your data was a gentleman's agreement not to peek. If that wasn't good enough, the network engineers argued, you had to provide your own security between the endpoints themselves.

My take from the NSA revelations is that even though this point was 'obvious' and well-known, we've always felt it more intellectually than in our hearts. Even knowing the worst was possible, we still chose to believe that direct peering connections and leased lines from reputable providers like AT&T would make us safe. If nothing else, the NSA leaks have convincingly refuted this assumption.

We don't encrypt nearly enough

The most surprising lesson of the NSA stories is that 20 years after the development of SSL encryption, we're still sending vast amounts of valuable data in the clear.

Even as late as 2014, highly vulnerable client-to-server connections for services like Yahoo Mail were routinely transmitted in cleartext -- meaning that they weren't just vulnerable to the NSA, but also to everyone on your local wireless network. And web-based connections were the good news. Even if you carefully checked your browser connections for HTTPS usage, proprietary extensions and mobile services would happily transmit data such as your contact list in the clear. If you noticed and shut down all of these weaknesses, it still wasn't enough -- tech companies would naively transmit the same data through vulnerable, unencrypted inter-datacenter connections where the NSA could scoop them up yet again.

There is a view in our community that we're doing much better now, and to some extent we may be. But I'm less optimistic. From an attacker's point of view, the question is not how much we're encrypting, but rather, which valuable scraps we're not protecting. As long as we tolerate the existence of unencrypted protocols and services, the answer is still: way too much.

It's the metadata, stupid

Even if we, by some miracle, manage to achieve 100% encryption of communications content, we still haven't solved the whole problem. Unfortunately, today's protocols still leak a vast amount of useful information via session metadata. And we have no good strategy on the table to defend against it.

Examples of metadata leaked by today's protocols include protocol type, port number, and routing information such as source and destination addresses. It also includes traffic characteristics, session duration, and total communications bandwidth. Traffic analysis remains a particular problem: even knowing the size of the files requested by a TLS-protected browser connection can leak a vast amount of information about the user's browsing habits.
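A contrived example of how little it takes -- the page sizes and the record overhead below are made up, and real attacks model padding and framing far more carefully:

# Toy traffic analysis: the payload is encrypted, but its length usually isn't hidden.
page_sizes = {"/home": 48_213, "/admin": 1_902, "/salary-data.csv": 734_110}

def guess_page(observed_bytes: int, overhead: int = 29) -> str:
    # 'overhead' is a rough stand-in for TLS record framing.
    plaintext_len = observed_bytes - overhead
    return min(page_sizes, key=lambda p: abs(page_sizes[p] - plaintext_len))

print(guess_page(734_139))   # -> "/salary-data.csv"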

Absolutely none of this is news to security engineers. The problem is that there's so little we can do about it. Anonymity networks like Tor protect the identity of endpoints in a connection, but they do so at a huge cost in additional bandwidth and latency -- and they offer only limited protection in the face of a motivated global adversary. IPSec tunnels only kick the can to a different set of trusted components that themselves can be subverted.

'Full take' culture

Probably the most eye-opening fact of the intelligence leaks is the sheer volume of data that intelligence agencies are willing to collect. This is most famously exemplified by the U.S. bulk data collection and international call recording programs -- but for network engineers the more worrying incarnation is "full take" Internet collection devices like TEMPORA.

If we restrict our attention purely to the collection of such data -- rather than how it's accessed -- it appears that the limiting factors are almost exclusively technical in nature. In other words, the amount of data collected is simply a function of processing power, bandwidth and storage. And this is bad news for our future.

That's because while meaningful human communication bandwidth (emails, texts, Facebook posts, Snapchats) continues to increase substantially, storage and processing power increase faster. With some filtration, and no ubiquitous encryption, 'full take' is increasingly going to be the rule rather than the exception.

We've seen the future, and it's not American

Even if you're not inclined to view the NSA as an adversary -- and contrary to public perception, that view is not uniform even inside Silicon Valley -- the NSA is hardly the only intelligence agency capable of subverting the global communications network. Nations like China are increasingly gaining market share in telecommunications equipment and services, especially in developing parts of the world such as Africa and the Middle East.

While it's cheap to hold China out as some sort of boogeyman, it's significant that someday a large portion of the world's traffic will flow through networks controlled by governments that are, at least to some extent, hostile to the core values of Western democracies.

If you believe that this is the future, then the answer certainly won't involve legislation or politics. The NSA won't protect us through cyber-retaliation or whatever plan is on the table today. If you're concerned about the future, then the answer is to finally, truly believe our propaganda about network trust. We need to learn to build systems today that can survive such an environment. Failing that, we need to adjust to a very different world.

Monday, July 20, 2015

A history of backdoors

(credit)
The past several months have seen an almost eerie re-awakening of the 'exceptional access' debate -- also known as 'Crypto Wars'. For those just joining the debate, the TL;DR is that law enforcement wants software manufacturers to build wiretapping mechanisms into modern encrypted messaging systems. Software manufacturers, including Google and Apple, aren't very thrilled with that.

The funny thing about this debate is that we've had it before. It happened during the 1990s with the discussion around Clipper chip, and the outcome was not spectacular for the pro-'access' side. But not everyone agrees.

Take, for example, former NSA general counsel Stewart Baker, who has his own reading of history:
A good example is the media’s distorted history of NSA’s 1994 Clipper chip. That chip embodied the Clinton administration’s proposal for strong encryption that “escrowed” the encryption keys to allow government access with a warrant. ... The Clipper chip and its key escrow mechanism were heavily scrutinized by hostile technologists, and one, Matthew Blaze, discovered that it was possible with considerable effort to use the encryption offered by the chip while bypassing the mechanism that escrowed the key and thus guaranteed government access. ... In any event, nothing about Matt Blaze’s paper questioned the security being offered by the chip, as his paper candidly admitted.
...
The press has largely ignored Blaze’s caveat.  It doesn’t fit the anti-FBI narrative, which is that government access always creates new security holes. I don’t think it’s an accident that no one talks these days about what Matt Blaze actually found except to say that he discovered “security flaws” in Clipper.  This formulation allows the reader to (falsely) assume that Blaze’s research shows that government access always undermines security. 
It's not clear why Mr. Baker is focusing on Clipper, rather than the much more recent train wreck of NSA's 'export-grade crypto' access proposals. It's possible that Baker just isn't that familiar with the issue. Indeed, it's the almost proud absence of technological expertise on the pro-'government access' side that has made this debate so worrying.

But before we get to the more recent history, we should clarify a few things. Yes: the fact that Clipper -- a multi-million dollar, NSA designed technology -- emerged with fundamental flaws in its design is a big deal. It matters regardless of whether the exploit led to plaintext recovery or merely allowed criminals to misuse the technology in ways they weren't supposed to.

But Clipper is hardly the end of the story. In fact, Clipper is only one of several examples of 'government access' mechanisms that failed and blew back on us catastrophically. More recent examples have occurred as recently as this year with the FREAK and LogJam attacks on TLS, resulting in vulnerabilities that affected nearly 1/3 of secure websites -- including (embarrassingly) the FBI and NSA themselves. And these did undermine security.

With Mr. Baker's post as inspiration, I'm going to spend the rest of this post talking about how real-world government access proposals have fared in practice -- and how the actual record is worse than any technologist could have imagined at the time.

The Clipper chip


image: Travis Goodspeed (CC BY 2.0, via Wikimedia)
Clipper is the most famous of government access proposals. The chip was promoted as a ubiquitous hardware solution for voice encryption in the early 1990s -- coincidentally, right on the eve of a massive revolution in software-based encryption and network voice communications. In simple terms, this meant that technologically Clipper was already a bit of a dinosaur by the time it was proposed. 

Clipper was designed by the NSA, with key pieces of its design kept secret and hidden within tamper-resistant hardware. One major secret was the design of the Skipjack block cipher it used for encryption. All of this secrecy made it hard to evaluate the design, but the secrecy wasn’t simply the result of paranoia. Its purpose was to inhibit the development of unsanctioned Clipper-compatible devices that could bypass its primary selling point — an overt law enforcement backdoor.

The backdoor worked as follows. Each Clipper chip shipped with a unique identifier and unit key that was programmed by blowing fuses during manufacture. Upon negotiating a session key with another Clipper, the chip would transmit a 128-bit Law Enforcement Access Field (LEAF) that contained an encrypted version of the ID and session key, wrapped using the device's unit key. The government maintained a copy of each device's unit key, split and stored at two different sites.

To protect the government’s enormous investment in hardware and secret algorithms, the Clipper designers also incorporated an authentication mechanism: a 16-bit checksum incorporated into the LEAF, with the whole structure encrypted under a family key shared between all devices. This prevented a user from tampering with or destroying the LEAF checksum as it transited the wire -- any other compatible Clipper could decrypt and verify the checksum, then refuse the connection if it was invalid.


A simple way to visualize the Clipper design is to present it as three legs of a tripod, (badly) illustrated as follows:


The standout feature of Clipper's design is its essential fragility. If one leg of the tripod fails, the entire construction tumbles down around it. For example: if the algorithms and family keys became public, then any bad actor could build a software emulator that produced apparently valid but useless LEAFs. If tamper resistance failed, the family key and algorithm designs would leak out. And most critically: if the LEAF checksum failed to protect against on-the-wire modification, then all the rest would be a waste of money and time. Criminals could hack legitimate Clippers to interoperate without fear of interception.

In other words, everything had to work, or nothing made any sense at all. Moreover, since most of the design was secret, users were forced to trust in its security. One high-profile engineering failure would tend to undermine that confidence.

Which brings us to Matt Blaze's results. In a famous 1994 paper, Blaze looked specifically at the LEAF authentication mechanism, and outlined several techniques for bypassing it on real Clipper prototypes. These ranged from the 'collaborative' -- the sender omits the LEAF from its transmission, and the receiver reflects its own LEAF back into its device -- to the 'unidirectional', where a sender simply generates random garbage LEAFs until it finds one with a valid checksum. With only a 16-bit checksum, the latter technique requires on average 65,536 attempts, and the sender's own device can be used as an oracle to check the consistency of each candidate. Blaze was able to implement a system that did this in minutes — and potentially in seconds, with parallelization.
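To get a feel for why a 16-bit checksum is hopeless, here is a toy simulation of that unidirectional search. The keyed SHA-256 truncation below merely stands in for Clipper's real (secret) checksum, and in Blaze's attack the oracle was the sender's own chip rather than a function call -- but the expected 65,536 attempts carries over.

import os, hashlib

FAMILY_KEY = os.urandom(16)   # stand-in for the shared family key

def leaf_checksum_valid(leaf: bytes) -> bool:
    # Toy 16-bit check: a random LEAF passes with probability 2^-16,
    # just like the real thing.
    return hashlib.sha256(FAMILY_KEY + leaf).digest()[:2] == b"\x00\x00"

def forge_leaf():
    attempts = 0
    while True:
        attempts += 1
        candidate = os.urandom(16)            # a random 128-bit "LEAF"
        if leaf_checksum_valid(candidate):
            return candidate, attempts        # ~65,536 tries on average

leaf, tries = forge_leaf()
print(f"forged a checksum-valid LEAF after {tries} attempts")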

That was essentially the ballgame for Clipper.

And now we can meditate on both the accuracy and utter irrelevance of Mr. Baker’s point. It’s true that Blaze's findings didn't break the confidentiality of Clipper conversations, nor were the techniques themselves terribly practical. But none of that mattered. 

What did matter were the implications for the Clipper system as a whole. The flaws in authentication illustrated that the designers and implementers of Clipper had made elementary mistakes that fundamentally undermined the purpose of all those other, expensive design components. Without the confidence of users or law enforcement, there was no reason for Clipper to exist.

SSL/TLS Export ciphersuites: FREAK and LogJam

This would be the end of story if Clipper was the only 'government access' proposal to run off the road due to bad design and unintended consequences. Mr. Baker and Matt Blaze could call it a draw and go their separate ways. But of course, the story doesn't end with Clipper.

Mr. Baker doesn't mention this in his article, but we're still living with a much more pertinent example of a 'government access' system that failed catastrophically. Unlike Clipper, this failure really did have a devastating impact on the security of real encrypted connections. Indeed, it renders web browsing sessions completely transparent to a moderately clever attacker. Even worse, it affected hundreds of thousands of websites as recently as 2015.

The flaws I'm referring to stem from the U.S. government's pre-2000 promotion of 'export'-grade cryptography in the SSL and TLS protocols, which are used to secure web traffic and email all over the world. In order to export cryptography outside of the United States, the U.S. government required that web browsers and servers incorporate deliberately weakened ciphers that were (presumably) within the NSA's ability to access.

Unsurprisingly, while the export regulations were largely abandoned as a bad job in the late 1990s, the ciphersuites themselves live on in modern TLS implementations because that's what happens when you inter a broken thing into a widely-used standard. 

For the most part these weakened ciphers lay abandoned and ignored (but still active on many web servers) until this year, when researchers showed that it was possible to downgrade normal TLS connections to use export-grade ciphers. Ciphers that are, at this point, so weak that they can be broken in seconds on a single personal computer.
Logjam is still unpatched in Chrome/MacOS as of the date of this post.
At the high watermark in March of this year, more than one out of three websites were vulnerable to either FREAK or LogJam downgrade attacks. This included banks, e-commerce sites, and yes -- the NSA website and FBI tip reporting line. Hope you didn't care much about that last one.

Now you could argue that the export requirements weren't designed to facilitate law enforcement access. But that's just shifting the blame from one government agency to another. Worse, it invites us to consider the notion that the FBI is going to get cryptography right when the NSA didn't. This is not a conceptual framework you want to hang your policies on.

Conclusion

This may sound disingenuous, but the truth is that I sympathize with Mr. Baker. It's frustrating that we're so bad at building security systems in this day and age. It's maddening that we can't engineer crypto reliably even when we're trying our very best.

But that's the world we live in. It's a world where we know our code is broken, and a world where a single stupid Heartbleed or Shellshock can burn off millions of dollars in a few hours. These bugs exist, and not just the ones I listed. They exist right now as new flaws that we haven't discovered yet. Sooner or later maybe I'll get to write about them.

The idea of deliberately engineering weakened crypto is, quite frankly, terrifying to experts. It gives us the willies. We're not just afraid to try it. We have seen it tried -- in the examples I list above, and in still others -- and it's just failed terribly.

Friday, May 22, 2015

Attack of the week: Logjam

In case you haven't heard, there's a new SSL/TLS vulnerability making the rounds. Nicknamed Logjam, the new attack is 'special' in that it may admit complete decryption or hijacking of any TLS connection you make to an improperly configured web or mail server. Worse, there's at least circumstantial evidence that similar (and more powerful) attacks might already be in the toolkit of some state-level attackers such as the NSA.

This work is the result of an unusual collaboration between a fantastic group of co-authors spread all around the world, including institutions such as the University of Michigan, INRIA Paris-Rocquencourt, INRIA Paris-Nancy, Microsoft Research, Johns Hopkins and the University Of Pennsylvania. It's rare to see this level of collaboration between groups with so many different areas of expertise, and I hope to see a lot more like it. (Disclosure: I am one of the authors.)

The absolute best way to understand the Logjam result is to read the technical research paper. This post is mainly aimed at people who want a slightly less technical form. For those with even shorter attention spans, here's the TL;DR:
It appears that the Diffie-Hellman protocol, as currently deployed in SSL/TLS, may be vulnerable to a serious downgrade attack that restores it to 1990s "export" levels of security, and offers a practical "break" of the TLS protocol against poorly configured servers. Even worse, extrapolation of the attack requirements -- combined with evidence from the Snowden documents -- provides some reason to speculate that a similar attack could be leveraged against protocols (including TLS, IPSec/IKE and SSH) using 768- and 1024-bit Diffie-Hellman. 
I'm going to tackle this post in the usual 'fun' question-and-answer format I save for this sort of thing.
What is Diffie-Hellman and why should I care about TLS "export" ciphersuites?
Diffie-Hellman is probably the most famous public key cryptosystem ever invented. Publicly discovered by Whit Diffie and Martin Hellman in the late 1970s (and a few years earlier, in secret, by UK GCHQ), it allows two parties to negotiate a shared encryption key over a public connection.

Diffie-Hellman is used extensively in protocols such as SSL/TLS and IPSec, which rely on it to establish the symmetric keys that are used to transport data. To do this, both parties must agree on a set of parameters to use for the key exchange. In traditional ('mod p') Diffie-Hellman, these parameters consist of a large prime number p, as well as a 'generator' g. The two parties now exchange keys as shown below:

Classical Diffie-Hellman (source).
TLS supports several variants of Diffie-Hellman. The one we're interested in for this work is the 'ephemeral' non-elliptic ("DHE") protocol variant, which works in a manner that's nearly identical to the diagram above. The server takes the role of Alice, selecting (p, g, g^a mod p) and signing this tuple (and some nonces) using its long-term signing key. The client responds with g^b mod p, and the two sides then calculate a shared secret.
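
To make the exchange concrete, here's a minimal Python sketch of the classical 'mod p' exchange. The parameters are toy values chosen purely for illustration; a real deployment needs a prime of 2048 bits or more, which is precisely what this post is about.

    # A minimal sketch of classical ('mod p') Diffie-Hellman with toy parameters.
    import secrets

    p = 23   # public prime modulus (toy value, NOT secure)
    g = 5    # public generator

    a = secrets.randbelow(p - 2) + 1   # Alice's secret exponent
    b = secrets.randbelow(p - 2) + 1   # Bob's secret exponent

    A = pow(g, a, p)                   # Alice sends g^a mod p
    B = pow(g, b, p)                   # Bob sends g^b mod p

    # Each side combines its own secret with the other side's public value to
    # arrive at the same shared secret, g^(ab) mod p.
    assert pow(B, a, p) == pow(A, b, p)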

Just for fun, TLS also supports an obsolete 'export' variant of Diffie-Hellman. These export ciphersuites are a relic from the 1990s when it was illegal to ship strong encryption out of the country. What you need to know about "export DHE" is simple: it works identically to standard DHE, but limits the size of p to 512 bits. Oh yes, and it's still out there today. Because the Internet.
How do you attack Diffie-Hellman?
The best known attack against a correct Diffie-Hellman implementation involves capturing the value g^a mod p and solving to find the secret key a. The problem of finding this value is known as the discrete logarithm problem, and it's thought to be mathematically intractable, at least when Diffie-Hellman is implemented in cryptographically strong groups (e.g., when p is of size 2048 bits or more).

Unfortunately, the story changes dramatically when p is relatively small -- for example, 512 bits in length. Given a value g^a mod p for a 512-bit p, it should at least be possible to efficiently recover the secret a and read traffic on the connection.
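
To see why size matters, here's a toy brute-force discrete log in Python. It only terminates quickly because the prime is tiny; scaling this naive search to a 512-bit prime is hopeless, which is why real attacks rely on the Number Field Sieve discussed below.

    # Toy brute-force discrete log: given public (p, g) and A = g^a mod p,
    # recover the secret exponent a by trying every candidate in turn.
    def discrete_log(g, A, p):
        x = 1
        for a in range(1, p):
            x = (x * g) % p
            if x == A:
                return a
        return None

    p, g = 23, 5
    secret = 7
    assert discrete_log(g, pow(g, secret, p), p) == secret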
Most TLS servers don't use 512-bit primes, so who cares?
The good news here is that weak Diffie-Hellman parameters are almost never used purposely on the Internet. Only a trivial fraction of the SSL/TLS servers out there today will organically negotiate 512-bit Diffie-Hellman. For the most part these are crappy embedded devices such as routers and video-conferencing gateways.

However, there is a second class of servers that are capable of supporting 512-bit Diffie-Hellman when clients request it, using a special mode called the 'export DHE' ciphersuite. Disgustingly, these servers amount to about 8% of the Alexa top million sites (and a whopping 29% of SMTP/STARTTLS mail servers). Thankfully, most decent clients (AKA popular browsers) won't willingly negotiate 'export DHE', so this would also seem to be a dead end.

It isn't. 

ServerKeyExchange message (RFC 5246)
You see, before SSL/TLS peers can start engaging in all this fancy cryptography, they first need to decide which ciphers they're going to use. This is done through a negotiation process in which the client proposes some options (e.g., RSA, DHE, DHE-EXPORT), and the server picks one.

This all sounds simple enough. However, one of the early, well-known flaws in SSL/TLS is the protocol's failure to properly authenticate these 'negotiation' messages. In very early versions of SSL they were not authenticated at all. SSLv3 and TLS tacked on an authentication process -- but one that takes place only at the end of the handshake.*

This is particularly unfortunate given that TLS servers often have the ability to authenticate their messages using digital signatures, but don't really take advantage of this. For example, when two parties negotiate Diffie-Hellman, the parameters sent by the server are transmitted within a signed message called the ServerKeyExchange (shown at right). The signed portion of this message covers the parameters, but neglects to include any information about which ciphersuite the server thinks it's negotiating. If you remember that the only difference between DHE and DHE-EXPORT is the size of the parameters the server sends down, you might start to see the problem.

Here it is in a nutshell: if the server supports DHE-EXPORT, the attacker can 'edit' the negotiation messages sent by the client -- even if the client doesn't support export DHE -- replacing the client's list of supported ciphers with only export DHE. The server will in turn send back a signed 512-bit export-grade Diffie-Hellman tuple, which the client will blindly accept -- because it doesn't realize that the server is negotiating the export version of the ciphersuite. From its perspective this message looks just like 'standard' Diffie-Hellman with really crappy parameters.
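
Here's a purely conceptual Python sketch of that rewrite. It models the negotiation logic, not the real TLS message formats; the point it illustrates is that the server's signature covers the Diffie-Hellman parameters but says nothing about which ciphersuite produced them.

    # Conceptual model of the downgrade -- not real TLS messages, just the logic.
    def client_hello():
        # A modern client that offers no export suites at all.
        return {"ciphers": ["DHE_RSA_AES128_SHA", "RSA_AES128_SHA"]}

    def mitm_rewrite(hello):
        # The attacker replaces the client's cipher list with export DHE only.
        doctored = dict(hello)
        doctored["ciphers"] = ["DHE_EXPORT"]
        return doctored

    def server_response(hello):
        # An export-enabled server picks DHE_EXPORT and signs 512-bit parameters.
        # Crucially, the signed blob does not name the negotiated ciphersuite.
        suite = hello["ciphers"][0]
        p_bits = 512 if suite == "DHE_EXPORT" else 2048
        return {"signed_dh_params": {"p_bits": p_bits}}

    response = server_response(mitm_rewrite(client_hello()))
    # The client sees a valid signature over 'DHE with a weak 512-bit prime',
    # with no way to tell that the parameters were meant for the export suite.
    print(response)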

Overview of the Logjam active attack (source: paper).
All this tampering should run into a huge snag at the end of the handshake, when the client and server exchange Finished messages that include a MAC of the handshake transcript. At this point the client should learn that something funny is going on, i.e., that what it sent no longer matches what the server is seeing. However, the loophole is this: if the attacker can recover the Diffie-Hellman secret quickly -- before the handshake ends -- she can forge her own Finished messages. In that case the client and server will be none the wiser.
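
The sketch below is not the real TLS construction -- plain HMAC-SHA256 over a transcript string is standing in for the TLS PRF and Finished computation -- but it captures why the transcript check fails once the attacker knows the secret: she simply computes the MAC each side expects over that side's own view of the handshake.

    import hashlib
    import hmac

    # Stand-in for the TLS Finished computation; HMAC-SHA256 for illustration.
    def finished(master_secret: bytes, transcript: bytes) -> bytes:
        return hmac.new(master_secret, transcript, hashlib.sha256).digest()

    master_secret = b"recovered via the NFS descent step"  # attacker knows this
    client_view   = b"ClientHello offering DHE, RSA, ..."  # what the client sent
    server_view   = b"ClientHello offering DHE_EXPORT"     # what the server saw

    # The genuine Finished messages would not verify (the transcripts differ),
    # so the attacker drops them and substitutes MACs computed over the
    # transcript each side believes took place.
    to_client = finished(master_secret, client_view)
    to_server = finished(master_secret, server_view)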

The upshot is that executing this attack requires the ability to solve a 512-bit discrete logarithm before the client and server exchange Finished messages. That seems like a tall order.
Can you really solve a discrete logarithm before a TLS handshake times out?
In practice, the fastest route to solving the discrete logarithm in finite fields is via an algorithm called the Number Field Sieve (NFS). Using NFS to solve a single 512-bit discrete logarithm instance requires several core-years -- or about a week of wall-clock time given a few thousand cores -- which would seem to rule out solving discrete logs in real time.

However, there is a complication. In practice, NFS can actually be broken up into two different steps:
  1. Pre-computation (for a given prime p). This includes the process of polynomial selection, sieving, and linear algebra, all of which depend only on p. The output of this stage is a table for use in the second stage.
  2. Solving to find a (for a given g^a mod p). The final stage, called the descent, uses the table from the precomputation. This is the only part of the algorithm that actually involves the specific values g and g^a.
The important thing to know is that the first stage of the attack consumes the vast majority of the time, up to a full week on a large-scale compute cluster. The descent stage, on the other hand, requires only a few core-minutes. Thus the attack cost depends primarily on where the server gets its Diffie-Hellman parameters from. The best case for an attacker is when p is hard-coded into the server software and used across millions of machines. The worst case is when p is re-generated routinely by the server.
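
The following sketch shows why parameter re-use is the whole ballgame. The 'costs' are just labels loosely matching the 512-bit figures above, not measurements; the point is that the expensive stage is keyed only by p, so it can be cached and amortized across every connection that uses the same prime.

    # Illustrative model of NFS cost amortization.
    precomputed_tables = {}

    def break_connection(prime_id, work_log):
        if prime_id not in precomputed_tables:
            precomputed_tables[prime_id] = "sieving + linear algebra output"
            work_log.append("precomputation: ~1 week on a large cluster")
        work_log.append("descent: minutes per connection")

    log = []
    shared_prime = "apache_default_512bit_prime"   # hypothetical identifier
    break_connection(shared_prime, log)   # first target pays the big cost
    break_connection(shared_prime, log)   # every later target: descent only
    print(log)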

I'll let you guess what real TLS servers actually do.

In fact, large-scale Internet scans by the team at the University of Michigan show that most popular web server software tends to re-use a small number of primes across thousands of server instances. This is done because generating prime numbers is scary, so implementers default to using a hard-coded value or a config file supplied by their Linux distribution. The situation for export Diffie-Hellman is particularly awful, with only two (!) primes used across up to 92% of export-enabled Apache/mod_ssl sites.
Number of seconds to solve a 512-bit discrete log (source: paper).

The upshot of all of this is that about two weeks of pre-computation is sufficient to build a table that allows you to perform the downgrade against most export-enabled servers in just a few minutes (see the chart at right). This is fast enough that it can be done before the TLS connection timeout. Moreover, even if this is not fast enough, the connection can often be held open longer by using clever protocol tricks, such as sending TLS warning messages to reset the timeout clock.

Keep in mind that none of this shared prime craziness matters when you're using sufficiently large prime numbers (on the order of 2048 bits or more). It's only a practical issue when you're using small primes, like 512-bit, 768-bit or -- and here's a sticky one I'll come back to in a minute -- 1024-bit.
How do you fix the downgrade to export DHE?
The best and most obvious fix for this problem is to exterminate export ciphersuites from the Internet. Unfortunately, these awful configurations are the default in a number of server software packages (looking at you Postfix), and getting people to update their configurations is surprisingly difficult (see e.g., FREAK).

A simpler fix is to upgrade the major web browsers to resist the attack. The easy way to do this is to enforce a larger minimum size for received DHE keys. The problem here is that the fix itself causes some collateral damage -- it will break a small but significant fraction of lousy servers that organically negotiate (non-export) DHE with 512-bit keys.

The good news here is that the major browsers have decided to break the Internet (a little) rather than allow it to break them. Each has agreed to raise the minimum size limit to at least 768 bits, and some to a minimum of 1024 bits. It's still not perfect, since 1024-bit DHE may not be cryptographically sound against powerful attackers, but it does address the immediate export attack. In the longer term the question is whether to use larger negotiated DHE groups, or abandon DHE altogether and move to elliptic curves.
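
As a sketch of what that client-side mitigation amounts to (the 1024-bit floor below is the stricter of the two limits mentioned above):

    # Minimal sketch of the browser-side fix: reject any DHE handshake whose
    # prime falls below a hard floor, whatever ciphersuite was negotiated.
    MIN_DHE_PRIME_BITS = 1024   # some browsers chose a 768-bit floor instead

    def accept_dhe_parameters(p: int) -> bool:
        return p.bit_length() >= MIN_DHE_PRIME_BITS

    assert not accept_dhe_parameters(2**511)    # a 512-bit value is rejected
    assert accept_dhe_parameters(2**2047)       # a 2048-bit value is accepted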
What does this mean for larger parameter sizes?
The good news so far is that 512-bit Diffie-Hellman is only used by a fraction of the Internet, even when you account for active downgrade attacks. The vast majority of servers use Diffie-Hellman moduli of length at least 1024 bits. (The widespread use of 1024-bit moduli is largely due to a hard cap in older Java clients. Go away, Java.)

While 2048-bit moduli are generally believed to be outside of anyone's reach, 1024-bit DHE has long been considered to be at least within groping range of nation-state attackers. We've known this for years, of course, but the practical implications haven't been quite clear. This paper tries to shine some light on that, using Internet-wide measurements and software/hardware estimates.

If you recall from above, the most critical aspect of the NFS attack is the need to perform large amounts of pre-computation on a given Diffie-Hellman prime p, followed by a relatively short calculation to break any given connection that uses p. At the 512-bit size the pre-computation only requires about a week. The question then is, how much does it cost for a 1024-bit prime, and how common are shared primes?

While there's no exact way to know how much the 1024-bit attack would cost, the paper attempts to provide some extrapolations based on current knowledge. In software, the cost of the pre-computation seems quite high -- on the order of 35 million core-years. Making this happen for a given prime within a reasonable amount of time (say, one year) would appear to require billions of dollars of general-purpose computing equipment, assuming no algorithmic improvements. Even if we rule out such improvements, it's conceivable that this cost might be brought down to a few hundred million dollars using special-purpose hardware. This doesn't seem out of bounds when you consider leaked NSA cryptanalysis budgets.
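
Just to show where the 'billions' figure comes from, here's the back-of-the-envelope arithmetic. The dollars-per-core figure is my own illustrative assumption, not a number from the paper.

    # Rough arithmetic behind the software estimate above.
    CORE_YEARS_NEEDED = 35_000_000    # estimated precomputation for one 1024-bit p
    TARGET_WALL_CLOCK_YEARS = 1       # attacker wants the table within a year
    DOLLARS_PER_CORE = 100            # assumed cost of one commodity core

    cores = CORE_YEARS_NEEDED / TARGET_WALL_CLOCK_YEARS
    print(f"{cores:,.0f} cores running flat-out for a year")          # 35,000,000
    print(f"roughly ${cores * DOLLARS_PER_CORE:,.0f} in commodity CPUs")  # ~$3.5B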

What's interesting is that the descent stage, required to break a given Diffie-Hellman connection, is much faster. Based on some implementation experiments by the CADO-NFS team, it may be possible to break a Diffie-Hellman connection in as little as 30 core-days, with parallelization hugely reducing the wall-clock time. This might even make near-real-time decryption of Diffie-Hellman connections practical.
Is the NSA actually doing this?
So far all we've noted is that NFS pre-computation is at least potentially feasible when 1024-bit primes are re-used. That doesn't mean the NSA is actually doing any of it.

There is some evidence, however, that suggests the NSA has decryption capability that's at least consistent with such a break. This evidence comes from a series of Snowden documents published last winter in Der Spiegel. Together they describe a large-scale effort at NSA and GCHQ, capable of decrypting 'vast' amounts of Internet traffic, including IPSec, SSH and HTTPS connections.

NSA slide illustrating exploitation of IPSec encrypted traffic (source: Spiegel).
While the architecture described by the documents mentions attacks against many protocols, the bulk of the energy seems to be around the IPSec and IKE protocols, which are used to establish Virtual Private Networks (VPNs) between individuals and corporate networks such as financial institutions.

The nature of the NSA's exploit is never made clear in the documents, but the diagram at right gives a lot of the architectural details. The system involves collecting Internet Key Exchange (IKE) handshakes, transmitting them to the NSA's Cryptanalysis and Exploitation Services (CES) enclave, and feeding them into a decryption system that draws on substantial high-performance computing resources to process the intercepted exchanges. This is at least circumstantially consistent with Diffie-Hellman cryptanalysis.

Of course it's entirely possible that the attack is based on a bad random number generator, weak symmetric encryption, or any number of engineered backdoors. There are a few pieces of evidence that militate towards a Diffie-Hellman break, however:

  1. IPSec (or rather, the IKE key exchange) uses Diffie-Hellman for every single connection, meaning that a passive attacker can't decrypt the traffic without somehow defeating the key exchange -- although this doesn't rule out the other explanations.
  2. The IKE exchange is particularly vulnerable to pre-computation, since IKE uses a small number of standardized prime numbers called the Oakley groups, which are going on 17 years old now. Large-scale Internet scanning by the Michigan team shows that a majority of responding IPSec endpoints will gladly negotiate using Oakley Group 1 (768-bit) or Group 2 (1024-bit), even when the initiator offers better options.
  3. The NSA's exploit appears to require the entire IKE handshake as well as any pre-shared key (PSK). These are exactly the inputs you would need to derive the IKEv1 session keys after breaking the Diffie-Hellman exchange -- but they would not be needed if the break targeted only the symmetric cryptography.
  4. The documents effectively rule out malware as the explanation: they show that such malware ('TAO implants') exists, but that it's used to bypass the IKE handshake altogether -- a different capability from the passive decryption system described here.
I would stipulate that beyond the Internet measurements and computational analysis, this remains firmly in the category of 'crazy-eyed informed speculation'. But while we can't rule out other explanations, this speculation is certainly consistent with a hardware-optimized break of 768- and 1024-bit Diffie-Hellman, along with some collateral damage to SSH and related protocols.
So what next?
The paper gives a detailed set of recommendations on what to do about these downgrade attacks and (relatively) weak DHE groups. The website provides a step-by-step guide for server administrators. In short, probably the best long-term move is to switch to elliptic curves (ECDHE) as soon as possible. Failing this, clients and servers should enforce at least 2048-bit Diffie-Hellman across the Internet. If you can't do that, stop using common primes.
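
For the 'stop using common primes' part, here's one sketch of what that looks like in practice, assuming the pyca/cryptography Python package is available: generate a site-specific group once, rather than shipping a prime that millions of other servers also use.

    # Sketch: generate a fresh, site-specific 2048-bit Diffie-Hellman group
    # (assumes the pyca/cryptography package is installed). Generation is slow,
    # which is exactly why so many deployments fall back to shared hard-coded
    # primes -- but it only has to happen once per server.
    from cryptography.hazmat.primitives.asymmetric import dh

    parameters = dh.generate_parameters(generator=2, key_size=2048)
    print(parameters.parameter_numbers().p.bit_length())   # 2048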

Making this all happen on anything as complicated as the Internet will probably consume a few dozen person-lifetimes. But it's something we have to do, and will do, to make the Internet work properly.

Notes:

* There are reasons for this. Some SSL/TLS ciphersuites (such as the RSA encryption-based ciphersuites) don't use signatures within the protocol, so the only way to authenticate the handshake is to negotiate a ciphersuite, run the key exchange protocol, then use the resulting shared secret to authenticate the negotiation messages after the fact. But SSL/TLS DHE involves digital signatures, so it should be possible to achieve a stronger level of security than this. It's unfortunate that the protocol does not.