The strange story of “Extended Random”

Yesterday, David Benjamin posted a pretty esoteric note on the IETF’s TLS mailing list. At a superficial level, the post describes some seizure-inducingly boring flaws in older Canon printers. To most people that was a complete snooze. To me and some of my colleagues, however, it was like that scene in the X-Files where Mulder and Scully finally learn that aliens are real.

Those fossilized printers confirmed a theory we’d developed in 2014, but had been unable to prove: namely, the existence of a specific feature in RSA’s BSAFE TLS library called “Extended Random” — one that we believe to be evidence of a concerted effort by the NSA to backdoor U.S. cryptographic technology.

Before I get to the details, I want to caveat this post in two different ways. First, I’ve written about the topic of cryptographic backdoors way too much. In 2013, the Snowden revelations revealed the existence of a campaign to sabotage U.S. encryption systems. Since that time, cryptographers have spent thousands of hours identifying, documenting, and trying to convince people to care about these backdoors. We’re tired and we want to do more useful things.

The second caveat covers a problem with any discussion of cryptographic backdoors. Specifically, you never really get absolute proof. There’s always some innocent or coincidental explanation that could sort of fit the evidence — maybe it was all a stupid mistake. So you look for patterns of unlikely coincidences, and use Occam’s razor a lot. You don’t get a Snowden every day.

With all that said, let’s talk about Extended Random, and what this tells us about the NSA. First some background.

Dual_EC_DRBG and RSA BSAFE

To understand the context of this discovery, you need to know about a standard called Dual EC DRBG. This was a proposed random number generator that the NSA developed in the early 2000s. It was standardized by NIST in 2007, and later deployed in some important cryptographic products — though we didn’t know it at the time.

Dual EC has a major problem, which is that it likely contains a backdoor. This was pointed out in 2007 by Shumow and Ferguson, and effectively confirmed by the Snowden leaks in 2013. Drama ensued. NIST responded by pulling the standard. (For an explainer on the Dual EC backdoor, see here.)

Somewhere around this time the world learned that RSA Security had made Dual EC the default random number generator in their popular cryptographic library, which was called BSAFE. RSA hadn’t exactly kept this a secret, but it was such a bonkers thing to do that nobody (in the cryptographic community) had known. So for years RSA shipped their library with this crazy algorithm, which made its way into all sorts of commercial devices.

The RSA drama didn’t quite end there, however. In late 2013, Reuters reported that RSA had taken $10 million to backdoor their software. RSA sort of denies this. Or something. It’s not really clear.

Regardless of the intention, it’s known that RSA BSAFE did incorporate Dual EC. This could have been an innocent decision, of course, since Dual EC was a NIST standard. To shed some light on that question, in 2014 my colleagues and I decided to reverse-engineer the BSAFE library to see if the alleged backdoor in Dual EC was actually exploitable by an attacker like the NSA. We figured that specific engineering decisions made by the library designers could be informative in tipping the scales one way or the other.

It turns out they were.

Extended Random

In the course of reverse engineering the Java version of BSAFE, we discovered a funny inclusion. Specifically, we found that BSAFE supports a non-standard extension to the TLS protocol called “Extended Random”.

The Extended Random extension is an IETF Draft proposed by an NSA employee named Margaret Salter (at some point the head of NSA’s Information Assurance Directorate, which worked on “defensive” crypto for DoD) along with Eric Rescorla as a contractor. (Eric was very clearly hired to develop a decent proposal that wouldn’t hurt TLS, and would primarily be used on government machines. The NSA did not share their motivations with him.)

It’s important to note that Extended Random by itself does not introduce any cryptographic vulnerabilities. All it does is increase the amount of random data (“nonces”) used in a TLS protocol connection. This shouldn’t hurt TLS at all, and besides it was largely intended for U.S. government machines.

The only thing that’s interesting about Extended Random is what happens when that random data is generated using the Dual EC algorithm. Specifically, this extra data acts as “rocket fuel”, significantly increasing the efficiency of exploiting the Dual EC backdoor to decrypt TLS connections.
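
To make the “rocket fuel” intuition concrete, here is a back-of-envelope sketch in Python. The constants are my own illustrative assumptions (a P-256-style 32-byte x-coordinate, of which Dual EC reveals all but the top 16 bits), not measured figures. The point is simply that every extra nonce byte the attacker observes removes eight bits from the search space.

```python
# Back-of-envelope: candidate Dual EC output points an attacker must test,
# given how many bytes of one output block show up in a TLS nonce.
# All constants are illustrative assumptions for a P-256-style curve.

FIELD_BYTES = 32        # bytes in one x-coordinate
TRUNCATED_BITS = 16     # high bits Dual EC discards from each output block
BLOCK_BYTES = FIELD_BYTES - TRUNCATED_BITS // 8   # 30 observable bytes/block

def search_space(observed_bytes: int) -> int:
    """Points to brute-force when only `observed_bytes` of one
    output block are visible to the attacker."""
    hidden = BLOCK_BYTES - min(observed_bytes, BLOCK_BYTES)
    return 2 ** (8 * hidden + TRUNCATED_BITS)

print(search_space(28))   # standard 28-byte client random: 2**32 candidates
print(search_space(30))   # with Extended Random padding:   2**16 candidates
```

On this rough model, the extra nonce bytes cut the per-connection guessing work by a factor of about 2^16.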

In short, if you’re an agency like the NSA that’s trying to use Dual EC as a backdoor to intercept communications, you’re much better off with a system that uses both Dual EC DRBG and Extended Random. Since Extended Random was never standardized by the IETF, it shouldn’t be in any systems. In fact, to the best of our knowledge, BSAFE is the only system in the world that implements it.

In addition to Extended Random, we discovered a variety of features that, combined with the Dual EC backdoor, could make RSA BSAFE fairly easy to exploit. But Extended Random is by far the strangest and hardest to justify.

So where did this standard come from? For those who like technical mysteries, it turns out that Extended Random isn’t the only funny-smelling proposal the NSA made. It’s actually one of four failed IETF proposals made by NSA employees, or contractors who work closely with the NSA, all of which try to boost the amount of randomness in TLS. Thomas Ptacek has a mind-numbingly detailed discussion of these proposals and his view of their motivation in this post.

Oh my god I never thought spies could be so boring. What’s the new development?

We did find Extended Random in RSA BSAFE (a free version we downloaded from the Internet), but there was a fly in the ointment: it didn’t actually seem to be enabled. That is: the code was there but the switches to enable it were hard-coded to “off”.

This kind of put a wrench in our theory that RSA might have included Extended Random to make BSAFE connections more exploitable by the NSA. There might be some commercial version of BSAFE out there with this code active, but we were never able to find it or prove it existed. And even worse, it might appear only in some special “U.S. government only” version of BSAFE, which would tend to undermine the theory that there was something intentional about including this code — after all, why would the government spy on itself?

Which finally brings us to the news that appeared on the TLS mailing list the other day. It turns out that certain Canon printers are failing to respond properly to connections made using the new version of TLS (which is called 1.3), because they seem to have implemented an unauthorized TLS extension using the same number as an extension that TLS 1.3 needs in order to operate correctly. Here’s the relevant section of David’s post:

The web interface on some Canon printers breaks with 1.3-capable
ClientHello messages. We have purchased one and confirmed this with a
PIXMA MX492. User reports suggest that it also affects PIXMA MG3650
and MX495 models. It potentially affects a wide range of Canon
printers.

These printers use the RSA BSAFE library to implement TLS and this
library implements the extended_random extension and assigns it number
40. This collides with the key_share extension and causes 1.3-capable
handshakes to fail.
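
To see why the numbering collision is fatal, consider a toy extension parser. The wire format below is real TLS (each extension is a 16-bit type, a 16-bit length, then a body), but the dispatch tables and the two-byte body are invented for illustration:

```python
import struct

# Toy TLS extension parser. Extension wire format: 2-byte type,
# 2-byte length, then `length` bytes of body.
def parse_extensions(blob: bytes, table: dict):
    i = 0
    while i + 4 <= len(blob):
        ext_type, ext_len = struct.unpack_from(">HH", blob, i)
        yield table.get(ext_type, "unknown"), blob[i + 4 : i + 4 + ext_len]
        i += 4 + ext_len

# One extension with type 40 and a hypothetical two-byte body.
blob = struct.pack(">HH", 40, 2) + b"\x00\x17"

print(list(parse_extensions(blob, {40: "key_share"})))        # TLS 1.3's view
print(list(parse_extensions(blob, {40: "extended_random"})))  # BSAFE's view
```

Same bytes, two incompatible interpretations: the BSAFE peer tries to read a key_share body as extended_random data, and the handshake falls over.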

So in short, this news appears to demonstrate that commercial (non-free) versions of RSA BSAFE did deploy the Extended Random extension, and made it active within third-party commercial products. Moreover, they deployed it to machines — off-the-shelf commercial printers — that don’t seem to be reserved for any kind of special government use.

(If these turn out to be special Department of Defense printers, I will eat my words.)

Ironically, the printers are now the only thing that still exhibits the features of this (now deprecated) version of BSAFE. This is not because the NSA was targeting printers. Whatever devices they were targeting are probably gone by now. It’s because printer firmware tends to be obsolete and yet highly persistent. It’s like a remote pool buried beneath the Arctic Circle that preserves software species that would otherwise vanish from the Internet.

Which brings us to the moral of the story: not only are cryptographic backdoors a terrible idea, but they totally screw up the assigned numbering system for future versions of your protocol.

Actually no, that’s a pretty useless moral. Instead, let’s just say that you can deploy a cryptographic backdoor, but it’s awfully hard to control where it will end up.

Hopefully the last post I’ll ever write on Dual EC DRBG

I’ve been working on some other blog posts, including a conclusion of (or at least an installment in) this exciting series on zero knowledge proofs. That’s coming soon, but first I wanted to take a minute to, well, rant.

The subject of my rant is this fascinating letter authored by NSA cryptologist Michael Wertheimer in February’s Notices of the American Mathematical Society. Dr. Wertheimer is currently the Director of Research at NSA, and formerly held the position of Assistant Deputy Director and CTO of the Office of the Director of National Intelligence for Analysis.

In other words, this is a guy who should know what he’s talking about.

The subject of Dr. Wertheimer’s letter is near and dear to my heart: the alleged subversion of NIST’s standards for random number generation — a subversion that was long suspected and apparently confirmed by classified documents leaked by Edward Snowden. The specific algorithm in question is called Dual EC DRBG, and it very likely contains an NSA backdoor. Those who’ve read this blog should know that I think it’s as suspicious as a three dollar bill.

Reading Dr. Wertheimer’s letter, you might wonder what I’m so upset about. On the face of it, the letter appears to express regret. To quote (with my emphasis):

With hindsight, NSA should have ceased supporting the Dual_EC_DRBG algorithm immediately after security researchers discovered the potential for a trapdoor. In truth, I can think of no better way to describe our failure to drop support for the Dual_EC_DRBG algorithm as anything other than regrettable. The costs to the Defense Department to deploy a new algorithm were not an adequate reason to sustain our support for a questionable algorithm. Indeed, we support NIST’s April 2014 decision to remove the algorithm. Furthermore, we realize that our advocacy for the Dual_EC_DRBG casts suspicion on the broader body of work NSA has done to promote secure standards. 

I agree with all that. The trouble is that on closer examination, the letter doesn’t express regret for the inclusion of Dual EC DRBG in national standards. The transgression Dr. Wertheimer identifies is merely that NSA continued to support the algorithm after major questions were raised. That’s bizarre.

Even worse, Dr. Wertheimer reserves a substantial section of his letter for a defense of the decision to deploy Dual EC. It’s those points that I’d like to address in this post.

Let’s take them one at a time.

1: The Dual_EC_DRBG was one of four random number generators in the NIST standard; it is neither required nor the default.

It’s absolutely true that Dual EC was only one of four generators in the NIST standard. It was not required for implementers to use it, and in fact they’d be nuts to use it — given that overall it’s at least two orders of magnitude slower than the other proposed generators.

The bizarre thing is that people did indeed adopt Dual EC in major commercial software packages. Specifically, RSA Security included it as the default generator in their popular BSAFE software library. Much worse, there’s evidence that RSA was asked to do this by NSA, and were compensated for their compliance.

This is the danger with standards. Once NIST puts its seal on an algorithm, it’s considered “safe”. If the NSA came to a company and asked it to use some strange, non-standard algorithm, the request would be considered deeply suspicious by company and customers alike. But how can you refuse to use a standard if your biggest client asks you to? Apparently RSA couldn’t.

2: The NSA-generated elliptic curve points were necessary for accreditation of the Dual_EC_DRBG but only had to be implemented for actual use in certain DoD applications.

This is a somewhat misleading statement, one that really needs to be unpacked.

First, the original NSA proposal of Dual EC DRBG contained no option for alternate curve points. This is an important point, since it’s the selection of curve points that gives Dual EC its potential for a “back door”. By generating two default points (P, Q) in a specific way, the NSA may have been able to create a master key that would allow them to very efficiently decrypt SSL/TLS connections.

If you like conspiracy theories, here’s what NIST’s John Kelsey was told when he asked how the NSA’s points were generated:

[Figure omitted.]

In 2004-2005, several participants on the ANSI X9 tools committee pointed out the potential danger of this backdoor. One of them even went so far as to file a patent on using the idea to implement key escrow for SSL/TLS connections. (It doesn’t get more passive aggressive than that.)

In response to the discovery of such an obvious flaw, the ANSI X9 committee immediately stopped recommending the NSA’s points — and relegated them to be simply an option, one to be used by the niche set of government users who required them.

I’m only kidding! Actually the committee did no such thing.

Instead, at the NSA’s urging, the ANSI committee retained the original NSA points as the recommended parameters for the standard. It then added an optional procedure for generating alternative points. When NIST later adopted the generator in its SP800-90A standard, it mirrored the ANSI decision. But even worse, NIST didn’t even bother to publish the alternative point generation algorithm. To actually implement it, you’d need to go buy the (expensive) non-public-domain ANSI standard and figure out how to implement it yourself.

This is, to paraphrase Douglas Adams, the standards committee equivalent of putting the details in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of the Leopard’.

To the best of our knowledge, nobody has ever used ANSI’s alternative generation procedure in a single one of the many implementations of Dual EC DRBG in commercial software.  It’s not even clear how you could have used that procedure in a FIPS-certified product, since the FIPS evaluation process (conducted by CMVP) still requires you to test against the NSA-generated points.

3. The trapdoor concerns were openly studied by ANSI X9F1, NIST, and by the public in 2007. 

This statement has the benefit of being literally true, while also being pretty damned misleading.

It is true that in 2007 — after Dual EC had been standardized — two Microsoft researchers, Dan Shumow and Niels Ferguson, openly raised the alarm about Dual EC. The problem here is that the flaws in Dual EC were not first discovered in 2007. They were discovered much earlier in the standardization process and nobody ever heard about them.

As I noted above, the ANSI X9 committee detected the flaws in Dual EC as early as 2004, and in close consultation with NSA agreed to address them — in a manner that was highly beneficial to the NSA. But perhaps that’s understandable, given that the committee was anything but ‘open’.

In fact, this is an important aspect of the controversy that even NIST has criticized. The standardization of these algorithms was conducted through ANSI. And the closed ANSI committee consisted of representatives from a few select companies, NIST and the NSA. No public notice was given of the potential vulnerabilities discovered in the RNG. Moreover, a patent application that might have shone light on the backdoor was mired in NSA pre-publication review for over two years.

This timeline issue might seem academic, but bear this in mind: we now know that RSA Security began using the Dual EC DRBG random number generator in BSAFE — as the default, I remind you — way back in 2004. That means for three years this generator was widely deployed, yet serious concerns were not communicated to the public.

To state that the trapdoor concerns were ‘openly’ studied in 2007 is absolutely true. It’s just completely irrelevant.

In conclusion

I’m not a mathematician, but like anyone who works in a mathematical area, I find there are aspects of the discipline that I love. For me it’s the precision of mathematical statements, and the fact that the truth or falsity of a statement can — ideally — be evaluated from the statement itself, without resorting to differing opinions or understandings of the context.

While Dr. Wertheimer’s letter is hardly a mathematical work, it troubles me to see such confusing statements in a publication of the AMS. As a record of history, Dr. Wertheimer’s letter leaves much to be desired, and could easily lead people to the wrong understanding.

Given the stakes, we deserve a more exact accounting of what happened with Dual EC DRBG. I hope someday we’ll see that.

A few more notes on NSA random number generators

Last Friday, Joseph Menn from Reuters published an article claiming that RSA, the pioneering security firm and division of EMC, accepted $10 million to include the Dual EC random number generator as the default in their flagship BSAFE library. I’ve written a bit about Dual EC on this blog, so readers will know that I don’t think highly of it. For one thing, it’s a rotten choice for a commercial product, due to its lousy software performance. Much worse: it may introduce a practical backdoor into any system that uses it.

Given the numerous problems with Dual EC it’s baffling that RSA would select this particular generator as the default for their software. It’s particularly strange given that BSAFE is designed for use in Java-based and embedded systems where performance truly is at a premium. And none of this can be explained by the needs of a single RSA customer, since those needs could be satisfied merely by making BSAFE an option, rather than the default.

Of course there have been many people who had viewed RSA’s decision as potentially malicious. Unsupported rumors have been floating around since long before Reuters got involved. What’s new this time is that RSA insiders appear to be going on the record.

I suppose time will tell how the industry reacts to this news. In the interim I have a few small facts to add to the discussion, so I thought I’d sketch them out in this post.

#1: Dual_EC_DRBG’s ‘backdoor’ was known as of January 2005

It’s widely believed that the ‘vulnerability’ in Dual_EC was first identified by Microsoft employees Dan Shumow and Niels Ferguson in the summer of 2007. Tanja Lange (and nymble) recently tipped me off to the fact that this isn’t precisely true.

In point of fact, the possibility of a backdoor was known to at least some members of the ANSI X9.82 standardization committee as far back as January 2005. This surprising news comes via a patent application filed by Certicom employees Dan Brown and Scott Vanstone. The application claims a priority date of January 2005. Here’s the scary bit:

If P and Q are established in a security domain controlled by an administrator, and the entity who generates Q for the domain does so with knowledge of e (or indirectly via knowledge of d). The administrator will have an escrow key for every ECRNG that follows that standard. 

Escrow keys are known to have advantages in some contexts. They can provide a backup functionality. If a cryptographic key is lost, then data encrypted under that key is also lost. However, encryption keys are generally the output of random number generators. Therefore, if the ECRNG is used to generate the encryption key K, then it may be possible that the escrow key e can be used to recover the encryption key K. Escrow keys can provide other functionality, such as for use in a wiretap. In this case, trusted law enforcement agents may need to decrypt encrypted traffic of criminals, and to do this they may want to be able to use an escrow key to recover an encryption key.

For example, in the SSL and TLS protocols, which are used for securing web (HTTP) traffic, a client and server perform a handshake in which their first actions are to exchange random values sent in the clear.

The patent also describes a number of ways to close the backdoor in Dual_EC_DRBG. Indeed, it may be due to Brown and Vanstone that the NIST standard includes an alternative method to close the backdoor (by generating a random Q point).

The existence of this patent does not mean that Brown and Vanstone were responsible for Dual EC. In fact, the generator appears to be an NSA invention, and may date back to the early 2000s. What this patent demonstrates is that some members of the ANSI committee, of which RSA was also a member, had reason to at least suspect that Dual EC could be used to create a wiretapping backdoor. (Update: John Kelsey confirms this.) It would be curious to know how widely this information was shared, and whether anyone on the committee ever inquired as to the provenance of the default parameters.

#2. Dual_EC_DRBG is not really a NIST standard

This is hardly a secret, but it’s something that hasn’t been widely acknowledged in the press. Dual_EC_DRBG is generally viewed as a NIST standard, since it was published in NIST Special Publication 800-90. But that’s not the only place it appears, nor was it developed at NIST.

A complete history of Dual_EC_DRBG would begin with NSA’s drive to include it in the ANSI X9.82 DRBG standard, with a standardization process kicked off in the early 2000s. The draft ANSI standard includes Dual_EC_DRBG with all of the known parameters, along with several additional elliptic curve parameters that were not included in the NIST standards.

Members of the ANSI X9F1 Tool Standards and Guidelines Group, which wrote ANSI X9.82.

However you’ll also find Dual_EC_DRBG in the international standard ISO 18031. In other words, Dual EC was widely propagated in numerous standards long before NIST finalized it, and it’s hardly limited to US borders. The ANSI and ISO drafts are in some sense worse, since they don’t include a technique for generating your own Q parameter.

#3. Dual_EC_DRBG is not the only asymmetric random number generator in the ANSI and ISO standards

Cryptographers generally think of Dual EC as the only ‘public key’ random number generator to be widely standardized. We also point to NSA’s generation of the public parameters as evidence that Dual_EC may be compromised.

But in fact, the ANSI X9.82 and ISO standards each include a second generator based on public key cryptographic techniques. And like Dual EC, this one ships with a complete set of default parameters! The additional generator is based on an algorithm due to Micali and Schnorr, and relies for its security on assumptions related to the hardness of factoring large composite numbers. It requires an RSA-type modulus, several of which are conveniently provided in the specification.

Two default MS-DRBG moduli from the ISO 18031 specification.
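
For the curious, here is a minimal sketch of the Micali-Schnorr construction as I read the standards: each step raises the state to the power e modulo n, keeps the high-order bits as the next state, and emits the low-order bits as output. The parameters in the demo line are toy values of my own; the real standard fixes large moduli and specific bit lengths.

```python
# Minimal sketch of the Micali-Schnorr generator (MS-DRBG). Illustrative
# only: real parameters are thousands of bits, not the toys used below.

def ms_drbg(seed: int, e: int, n: int, out_bits: int, blocks: int):
    s = seed
    for _ in range(blocks):
        y = pow(s, e, n)                  # y = s^e mod n
        s = y >> out_bits                 # next state: high-order bits of y
        yield y & ((1 << out_bits) - 1)   # output: low-order bits of y

# Toy demo with a tiny (utterly insecure!) modulus n = 101 * 103.
print(list(ms_drbg(seed=123, e=3, n=101 * 103, out_bits=8, blocks=4)))
```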

There’s no reason to suspect that MS-DRBG is used by any real products, let alone that there’s a backdoor in the standard. In fact, there’s a curious fact about it: it’s not obvious from the public literature how one would attack the generator even if one knew the factorization of the n values above — though it seems intuitively likely that an attack does exist. Solving the problem, and finding a practical attack for known factorization, would be a fun project for an enthusiastic mathematician.

Since MS-DRBG comes from the same people who brought you Dual EC, if you are using it you might want to think twice.

RSA warns developers not to use RSA products

In today’s news of the weird, RSA (a division of EMC) has recommended that developers desist from using the (allegedly) ‘backdoored’ Dual_EC_DRBG random number generator — which happens to be the default in RSA’s BSafe cryptographic toolkit. Youch.

In case you’re missing the story here, Dual_EC_DRBG (which I wrote about yesterday) is the random number generator voted most likely to be backdoored by the NSA. The story here is that — despite many valid concerns about this generator — RSA went ahead and made it the default generator used for all cryptography in its flagship cryptography library. The implications for RSA and RSA-based products are staggering. In a modestly bad (but by no means worst) case, the NSA may be able to intercept SSL/TLS connections made by products implemented with BSafe.

So why would RSA pick Dual_EC as the default? You got me. Not only is Dual_EC hilariously slow — which has real performance implications — it was shown to be a just plain bad random number generator all the way back in 2006. By 2007, when Shumow and Ferguson raised the possibility of a backdoor in the specification, no sensible cryptographer would go near the thing.

And the killer is that RSA employs a number of highly distinguished cryptographers! It’s unlikely that they’d all miss the news about Dual_EC.

We can only speculate about the past. But here in the present we get to watch RSA’s CTO Sam Curry publicly defend RSA’s choices. I sort of feel bad for the guy. But let’s make fun of it anyway.

I’ll take his statement line by line (Sam’s words are the quoted lines):

“Plenty of other crypto functions (PBKDF2, bcrypt, scrypt) will iterate a hash 1000 times specifically to make it slower.”

Password hash functions are built deliberately slow to frustrate dictionary attacks. Making a random number generator slow is just dumb.

“At the time, elliptic curves were in vogue”

Say what?

“and hash-based RNG was under scrutiny.”

Nonsense. A single obsolete hash-based generator (FIPS 186) was under scrutiny — and fixed. The NIST SP800-90 draft in which Dual_EC appeared ALSO provided three perfectly nice non-backdoored generators: two based on hash functions and one based on AES. BSafe even implements some of them. Sam, this statement is just plain misleading.

“The hope was that elliptic curve techniques—based as they are on number theory—would not suffer many of the same weaknesses as other techniques (like the FIPS 186 SHA-1 generator) that were seen as negative”

Dual-EC suffers exactly the same sort of weaknesses as FIPS 186. Unlike the alternative generators in NIST SP800-90 it has a significant bias and really should not be used in production systems. RSA certainly had access to this information after the analyses were published in 2006.

“and Dual_EC_DRBG was an accepted and publicly scrutinized standard.”

And every bit of public scrutiny said the same thing: this thing is broken! Grab your children and run away!

“SP800-90 (which defines Dual EC DRBG) requires new features like continuous testing of the output, mandatory re-seeding,”

The exact same can be said for the hash-based and AES-based alternative generators you DIDN’T choose from SP800-90.

“optional prediction resistance and the ability to configure for different strengths.”

So did you take advantage of any of these options as part of the BSafe defaults? Why not? How about the very simple mitigations that NIST added to SP800-90A as a means to remove concerns that the generator might have a backdoor? Anyone?

There’s not too much else to say here. I guess the best way to put it is: this is all part of the process. First you find the disease. Then you see if you can cure it.

The Many Flaws of Dual_EC_DRBG

The Dual_EC_DRBG generator from NIST SP800-90A.

Update 9/19: RSA warns developers not to use the default Dual_EC_DRBG generator in BSAFE. Oh lord.

As a technical follow up to my previous post about the NSA’s war on crypto, I wanted to make a few specific points about standards. In particular I wanted to address the allegation that NSA inserted a backdoor into the Dual-EC pseudorandom number generator.

For those not following the story, Dual-EC is a pseudorandom number generator proposed by NIST for international use back in 2006. Just a few months later, Shumow and Ferguson made cryptographic history by pointing out that there might be an NSA backdoor in the algorithm. This possibility — fairly remarkable for an algorithm of this type — looked bad and smelled worse. If true, it spelled almost certain doom for anyone relying on Dual-EC to keep their system safe from spying eyes.

Now I should point out that much of this is ancient history. What is news today is the recent leak of classified documents that points a very emphatic finger towards Dual_EC, or rather, to an unnamed ‘2006 NIST standard’. The evidence that Dual-EC is this standard has now become so hard to ignore that NIST recently took the unprecedented step of warning implementers to avoid it altogether.

Better late than never.

In this post I’m going to try to explain the curious story of Dual-EC. While I’ll do my best to keep this discussion at a high and non-mathematical level, be forewarned that I’m probably going to fail at least at a couple of points. If you’re not in the mood for all that, here’s a short summary:

  • In 2005-2006 NIST and NSA released a pseudorandom number generator based on elliptic curve cryptography. They released this standard — with very little explanation — both in the US and abroad.
  • This RNG has serious flaws even when judged simply as a random number generator. The presence of such obvious bugs was mysterious to cryptographers.
  • In 2007 a pair of Microsoft researchers pointed out that these vulnerabilities combined to produce a perfect storm, which — together with some knowledge that only NIST/NSA might have — opened a perfect backdoor into the random number generator itself.
  • This backdoor may allow the NSA to break nearly any cryptographic system that uses it.

If you’re still with me, strap in. Here goes the long version.

Dual-EC

For a good summary on the history of Dual-EC-DRBG, see this 2007 post by Bruce Schneier. Here I’ll just give the highlights.

Back in 2004-5, NIST decided to address a longstanding weakness of the FIPS standards, namely, the limited number of approved pseudorandom bit generator algorithms (PRGs, or ‘DRBGs’ in NIST parlance) available to implementers. This was actually a bit of an issue for FIPS developers, since the existing random number generators had some known design weaknesses.*

NIST’s answer to this problem was Special Publication 800-90, parts of which were later wrapped up into the international standard ISO 18031. The NIST pub added four new generators to the FIPS canon. None of these algorithms is a true random number generator in the sense that they collect physical entropy. Instead, what they do is process the (short) output of a true random number generator — like the one in Linux — conditioning and stretching this ‘seed’ into a large number of random-looking bits you can use to get things done.** This is particularly important for FIPS-certified cryptographic modules, since the FIPS 140-2 standards typically require you to use a DRBG as a kind of ‘post-processing’ — even when you have a decent hardware generator.
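
As a simplified illustration of that conditioning-and-stretching job, here is a toy seed stretcher built from a plain hash counter. To be clear, this is not the SP800-90 Hash_DRBG, which adds instantiation, reseed counters, and backtracking resistance; it just shows the basic shape of the task:

```python
import hashlib

# Toy 'seed stretcher': expand a short true-random seed into many
# pseudorandom bytes by hashing seed || counter. Illustration only.
def stretch(seed: bytes, n_bytes: int) -> bytes:
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        out += hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(out[:n_bytes])

keystream = stretch(b"32 bytes from a hardware RNG....", 1024)
```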

The first three SP800-90 proposals used standard symmetric components like hash functions and block ciphers. Dual_EC_DRBG was the odd one out, since it employed the sort of mathematics typically used to construct public-key cryptosystems. This had some immediate consequences for the generator: Dual-EC is slow in a way that its cousins aren’t. Up to a thousand times slower.

Now before you panic about this, the inefficiency of Dual_EC is not necessarily one of its flaws! Indeed, the inclusion of an algebraic generator actually makes a certain amount of sense. The academic literature includes a distinguished history of provably secure PRGs based on number-theoretic assumptions, and it certainly didn’t hurt to consider one such construction for standardization. Most developers would probably use the faster symmetric alternatives, but perhaps a small number would prefer the added confidence of a provably-secure construction.

Unfortunately, here is where NIST ran into their first problem with Dual_EC.

Flaw #1: Dual-EC has no security proof

Let me spell this out as clearly as I can. In the course of proposing this complex and slow new PRG where the only damn reason you’d ever use the thing is for its security reduction, NIST forgot to provide one. This is like selling someone a Mercedes and forgetting to attach the hood ornament.

I’d like to say this fact alone should have damned Dual_EC, but sadly this is par for the course for NIST — which treats security proofs like those cool Japanese cookies you can never find. In other words, a fancy, exotic luxury. Indeed, NIST has a nasty habit of dumping proof-writing work onto outside academics, often after the standard has been finalized and implemented in products everywhere.

So when NIST put forward its first draft of SP800-90 in 2005, academic cryptographers were left to analyze it from scratch. Which, to their great credit, they were quite successful at.

The first thing reviewers noticed is that Dual-EC follows a known design paradigm — it’s a weird variant of an elliptic curve linear congruential generator. However they also noticed that NIST had made some odd rookie mistakes.

Now here we will have to get slightly wonky — though I will keep mathematics to a minimum. (I promise it will all come together in the end!) Constructions like Dual-EC have basically two stages:

1. A stage that generates a series of pseudorandom elliptic curve points. An elliptic curve point is a pair (x, y) that satisfies an elliptic curve equation. In general, both x and y are elements of a finite field, which for our purposes means they’re just large integers.***

The main operation of the PRNG is to apply mathematical operations to points on the elliptic curve, in order to generate new points that are pseudorandom — i.e., are indistinguishable from random points in some subgroup.

And the good news is that Dual-EC seems to do this first part beautifully! In fact Brown and Gjøsteen even proved that this part of the generator is sound provided that the Decisional Diffie-Hellman problem is hard in the specific elliptic curve subgroup. This is a well studied hardness assumption so we can probably feel pretty confident in this proof.

2. Extract pseudorandom bits from the generated EC points. While the points generated by Dual-EC may be pseudorandom, that doesn’t mean the specific (x, y) integer pairs are random bitstrings. For one thing, ‘x’ and ‘y’ are not really bitstrings at all; they’re integers less than some prime number. Moreover, most integer pairs don’t satisfy the curve equation or lie in the right subgroup. Hence you can’t just output the raw x or y values and expect them to make good pseudorandom bits.

Thus the second phase of the generator is to ‘extract’ some (but not all) of the bits from the EC points. Traditional literature designs do all sorts of things here — including hashing the point or dropping up to half of the bits of the x-coordinate. Dual-EC does something much simpler: it grabs the x coordinate, throws away the most significant 16-18 bits, and outputs the rest.

In 2006, first Gjøsteen and later Schoenmakers and Sidorenko took a close look at Dual-EC and independently came up with a surprising result:

Flaw #2: Dual-EC outputs too many bits

Unlike those previous EC PRGs which output anywhere from 2/3 to half of the bits from the x-coordinate, Dual-EC outputs nearly the entire thing.

This is good for efficiency, but unfortunately it also gives Dual-EC a bias. Due to some quirks in the mathematics of the field operations, an attacker can now predict the next bits of Dual-EC output with a fairly small — but non-trivial — success probability, in the range of 0.1%. While this number may seem small to non-cryptographers, it’s basically a hanging offense for a cryptographic random number generator where probability of predicting a future bit should be many orders of magnitude lower.

What’s just plain baffling is that this flaw ever saw the light of day. After all, the specification was developed by bright people at NIST — in collaboration with NSA. Either of those groups should easily have discovered a bug like this, especially since this issue had been previously studied. Indeed, within a few months of public release, two separate groups of academic cryptographers found it, and were able to implement an attack using standard PC equipment.

So in summary, the bias is mysterious and it seems to be very much an ‘own-goal’ on the NSA’s part. Why in the world would they release so much information from each EC point? It’s hard to say, but a bit more investigation reveals some interesting consequences:

Flaw #3: You can guess the original EC point from looking at the output bits

By itself this isn’t really a flaw, but will turn out to be interesting in just a minute.

Since Dual-EC outputs so many bits from the x-coordinate of each point — all but the most significant 16 bits — it’s relatively easy to guess the original source point by simply brute-forcing the missing 16 bits and solving the elliptic curve equation for y. (This is all high-school algebra, I swear!)

While this process probably won’t uniquely identify the original (x, y), it’ll give you a modestly sized list of candidates. Moreover with only 16 missing bits the search can be done quickly even on a desktop computer. Had Dual_EC thrown away more bits of the x-coordinate, this search would not have been feasible at all.
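
Here is roughly what that search looks like, sketched in Python. I’m assuming sympy’s sqrt_mod for modular square roots; a, b, and p stand for the curve parameters from the spec, and the code is illustrative rather than optimized:

```python
from sympy.ntheory import sqrt_mod

# Sketch: recover candidate output points from one Dual EC output block.
# `low240` holds the 240 observed bits of the x-coordinate; a, b, p are
# the curve parameters (y^2 = x^3 + ax + b mod p).
def candidate_points(low240: int, a: int, b: int, p: int):
    for top in range(1 << 16):           # brute-force the truncated bits
        x = (top << 240) | low240
        if x >= p:
            continue
        rhs = (x * x * x + a * x + b) % p
        y = sqrt_mod(rhs, p)             # None if rhs has no square root
        if y is not None:
            yield (x, y)
            yield (x, p - y)             # the other root is also a candidate
```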

So what does this mean? In general, recovering the EC point shouldn’t actually be a huge problem. In theory it could lead to a weakness — say predicting future outputs — but in a proper design you would still have to solve a discrete logarithm instance for each and every point in order to predict the next bytes output by the generator.

And here is where things get interesting.

Flaw #4: If you know a certain property about the Dual_EC parameters, and can recover an output point, you can predict all subsequent outputs of the generator

Did I tell you this would get interesting in a minute? I totally did.

The next piece of our puzzle was discovered by Microsoft researchers Dan Shumow and Niels Ferguson, and announced at the CRYPTO 2007 rump session. I think this result can best be described via the totally intuitive diagram below. (Don’t worry, I’ll explain it!)

Annotated diagram from Shumow-Ferguson presentation (CRYPTO 2007).
Colorful elements were added by yours truly. Thick green arrows mean ‘this part is
easy to reverse’. Thick red arrows should mean the opposite. Unless you’re the NSA.

The Dual-EC generator consists of two stages: a portion that generates the output bits (right) and a part that updates the internal state (left).

Starting from the “r_i” value (circled, center) and heading right, the bit generation part first computes the output point using the function “r_i * Q” — where Q is an elliptic curve point defined in the parameters — then truncates 16 bits off its x-coordinate to get the raw generator output. The “*” operator here describes elliptic point multiplication, which is a complex operation that should be relatively hard to invert.

Note that everything after the point multiplication should be easy to invert and recover from the output, as we discussed in the previous section.

Every time the generator produces one block of output bits, it also updates its internal state. This is designed to prevent attacks where someone compromises the internal values of a working generator, then uses this value to wind the generator backwards and guess past outputs. Starting again from the circled “r_i” value, the generator now heads upwards and computes the point “r_i * P” where P is a different elliptic curve point also described in the parameters. It then does some other stuff.

The theory here is that P and Q should be random points, and thus it should be difficult to find “r_i * P” used for state update even if you know the output point “r_i * Q” — which I stress you do know, because it’s easy to find. Going from one point to the other requires you to know a relationship between P and Q, which you shouldn’t actually know since they’re supposed to be random values. The difficulty of this is indicated by the thick red arrow.

Looks totally kosher to me. (Source: NIST SP800-90A)
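
In code, one tick of the generator looks something like the sketch below. The scalar_mult(k, T) helper, returning the point k*T as an (x, y) pair, is assumed rather than implemented here; this follows the structure of the diagram rather than the letter of the NIST spec:

```python
# One Dual EC iteration, following the Shumow-Ferguson diagram.
# scalar_mult(k, T) is an assumed helper returning the point k*T.

def dual_ec_step(r_i: int, P, Q, scalar_mult):
    out_x = scalar_mult(r_i, Q)[0]      # output point: x(r_i * Q)
    output = out_x & ((1 << 240) - 1)   # drop the top 16 of 256 bits
    r_next = scalar_mult(r_i, P)[0]     # state update: x(r_i * P)
    return r_next, output
```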

There is, however, one tiny little exception to this rule. What if P and Q aren’t entirely random values? What if you chose them yourself specifically so you’d know the mathematical relationship between the two points?

In this case it turns out you can easily compute the next PRG state after recovering a single output point (from 32 bytes of RNG output). This means you can follow the equations through and predict the next output. And the next output after that. And on forever and forever.****
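
Concretely: suppose whoever generated the parameters kept a secret scalar d with P = d*Q (one way of phrasing the relationship). Then the state-update value falls straight out of a recovered output point, as in this sketch, which reuses the hypothetical scalar_mult helper from above:

```python
# The trapdoor computation, assuming knowledge of d such that P = d*Q.
# A is an output point r_i * Q recovered from ~30 bytes of generator
# output (see the candidate search sketched earlier). Then:
#
#   d * A  =  d * (r_i * Q)  =  r_i * (d * Q)  =  r_i * P
#
# which is exactly the value the generator uses to update its state.

def predict_state(A, d, scalar_mult):
    return scalar_mult(d, A)[0]   # x(r_i * P): the next internal state
```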

This is a huge deal in the case of SSL/TLS, for example. If I use the Dual-EC PRG to generate the “Client Random” nonce transmitted in the beginning of an SSL connection, then the NSA will be able to predict the “Pre-Master” secret that I’m going to generate during the RSA handshake. Given this information the connection is now a cleartext read. This is not good.

So now you should all be asking the most important question of all: how the hell did the NSA generate the specific P and Q values recommended in Appendix A of Dual-EC-DRBG? And do they know the relationship that allows them to run this attack? All of which brings us to:

Flaw #5: Nobody knows where the recommended parameters came from

And if you think that’s problematic, welcome to the club.

But why? And where is Dual-EC used?

The ten million dollar question of Dual-EC is why the NSA would stick such an obviously backdoored algorithm into an important standard. Keep in mind that cryptographers found the major (bias) vulnerabilities almost immediately after Dual-EC shipped. The possibility of a ‘backdoor’ was announced in summer 2007. Who would still use it?

A few people have gone through the list of CMVP-evaluated products and found that the answer is: quite a few people would. Most certify Dual-EC simply because it’s implemented in OpenSSL-FIPS, and they happen to use that library. But at least one provider certifies it exclusively. Yuck.

Hardcoded constants from the OpenSSL-FIPS
implementation of Dual_EC_DRBG. Recognize ’em?

It’s worth keeping in mind that NIST standards carry a lot of weight — even those that might have a backdoor. Folks who aren’t keeping up on the latest crypto results could still innocently use the thing, either by accident or (less innocently) because the government asked them to. Even if they don’t use it, they might include the code in their product — say through the inclusion of OpenSSL-FIPS or MS Crypto API — which means it’s just a function call away from being surreptitiously activated.

Which is why people need to stop including Dual-EC immediately. We have no idea what it’s for, but it needs to go away. Now.

So what about the curves?

The last point I want to make is that the vulnerabilities in Dual-EC have precisely nothing to do with the specifics of the NIST standard elliptic curves themselves. The ‘back door’ in Dual-EC comes exclusively from the relationship between P and Q — the latter of which is published only in the Dual-EC specification. The attack can work even if you don’t use the NIST pseudorandom curves.

However, the revelations about NIST and the NSA certainly make it worth our time to ask whether these curves themselves are somehow weak. The best answer to that question is: we don’t know. Others have observed that NIST’s process for generating the curves leaves a lot to be desired. But including some kind of hypothetical backdoor would be a horrible, horrific idea — one that would almost certainly blow back at us.

You’d think people with common sense would realize this. Unfortunately we can’t count on that anymore.

Thanks to Tanja Lange for her assistance proofing this post. Any errors in the text are entirely mine.

Notes:

* My recollection of this period is hazy, but prior to SP800-90 the two most common FIPS DRBGs in production were (1) the SHA1-based DSA generator of FIPS 186-2 and (2) ANSI X9.31. The DSA generator was a special-purpose generator based on SHA1, and was really designed just for that purpose. ANSI X9.31 used block ciphers, but suffered from some more subtle weaknesses it retained from the earlier X9.17 generator. These were pointed out by Kelsey, Schneier, Wagner and Hall.

** This is actually a requirement of the FIPS 140-2 specification. Since FIPS does not approve any true random number generators, it instead mandates that you run your true RNG output through a DRBG (PRNG) first. The only exception is if your true RNG has been approved ‘for classified use’.

*** Specifically, x and y are integers in the range 0 to p-1 where p is a large prime number. A point is a pair (x, y) such that y^2 = x^3 + ax + b mod p. The values a and b are defined as part of the curve parameters.

**** The process of predicting future outputs involves a few guesses, since you don’t know the exact output point (and had to guess at the missing 16 bits), but you can easily reduce this to a small set of candidates — then it’s just a question of looking at a few more bits of RNG output until you guess the right one.