The Dual_EC_DRBG generator from NIST SP 800-90A.
As a technical follow-up to my previous post about the NSA's war on crypto, I wanted to make a few specific points about standards. In particular, I wanted to address the allegation that the NSA inserted a backdoor into the Dual_EC pseudorandom number generator.
For those not following the story, Dual_EC is a pseudorandom number generator proposed by NIST for international use back in 2006. Just a few months later, Shumow and Ferguson made cryptographic history by pointing out that there might be an NSA backdoor in the algorithm. This possibility, fairly remarkable for an algorithm of this type, looked bad and smelled worse. If true, it spelled almost certain doom for anyone relying on Dual_EC to keep their system safe from spying eyes.
Now I should point out that much of this is ancient history. What is news today is the recent leak of classified documents that points a very emphatic finger towards Dual_EC, or rather, towards an unnamed '2006 NIST standard'. The evidence that Dual_EC is this standard has now become so hard to ignore that NIST recently took the unprecedented step of warning implementers to avoid it altogether.
Better late than never.
In this post I'm going to try to explain the curious story of Dual_EC. While I'll do my best to keep this discussion at a high and non-mathematical level, be forewarned that I'm probably going to fail at least at a couple of points. If you're not in the mood for all that, here's a short summary:
- In 2005-2006, NIST and the NSA released a pseudorandom number generator based on elliptic curve cryptography. They released this standard, with very little explanation, both in the US and abroad.
- This RNG has serious design flaws, including a measurable output bias, that keep it from even being a good RNG. The presence of such obvious bugs was mysterious to cryptographers.
- In 2007, a pair of Microsoft researchers pointed out that these vulnerabilities combined to produce a perfect storm, which, together with some knowledge that only NIST/NSA might have, opened a perfect backdoor into the random number generator itself.
- This backdoor may allow the NSA to break nearly any cryptographic system that uses it.
Dual_EC
For a good summary of the history of Dual_EC_DRBG, see this 2007 post by Bruce Schneier. Here I'll just give the highlights.
Back in 2004-5, NIST decided to address a long-standing weakness of the FIPS standards: the limited number of approved pseudorandom bit generator algorithms (PRGs, or 'DRBGs' in NIST parlance) available to implementers. This was actually a bit of an issue for FIPS developers, since the existing random number generators had some known design weaknesses.*
NIST's answer to this problem was Special Publication 800-90, parts of which were later wrapped up into the international standard ISO 18031. The NIST pub added four new generators to the FIPS canon. None of these algorithms is a true random number generator in the sense that they collect physical entropy. Instead, what they do is process the (short) output of a true random number generator, like the one in Linux, conditioning and stretching this 'seed' into a large number of random-looking bits you can use to get things done.** This is particularly important for FIPS-certified cryptographic modules, since the FIPS 140-2 standards typically require you to use a DRBG as a kind of 'post-processing', even when you have a decent hardware generator.
The first three SP 800-90 proposals used standard symmetric components like hash functions and block ciphers. Dual_EC_DRBG was the odd one out, since it employed mathematics (elliptic curves) that are more typically used to construct public-key cryptosystems. This had some immediate consequences for the generator: Dual_EC is slow in a way that its cousins aren't. Up to a thousand times slower.
Now before you panic about this, the inefficiency of Dual_EC is not necessarily one of its flaws! Indeed, the inclusion of an algebraic generator actually makes a certain amount of sense. The academic literature includes a distinguished history of provably secure PRGs based on number-theoretic assumptions, and it certainly didn't hurt to consider one such construction for standardization. Most developers would probably use the faster symmetric alternatives, but perhaps a small number would prefer the added confidence of a provably-secure construction.
Unfortunately, here is where NIST ran into their first problem with Dual_EC.
Flaw #1: Dual_EC has no security proof.

Let me spell this out as clearly as I can. In the course of proposing this complex and slow new PRG, where the only damn reason you'd ever use the thing is for its security reduction, NIST forgot to provide one. This is like selling someone a Mercedes and forgetting to attach the hood ornament.
I'd like to say this fact alone should have damned Dual_EC, but sadly this is par for the course for NIST, which treats security proofs like those cool Japanese cookies you can never find: a fancy, exotic luxury. Indeed, NIST has a nasty habit of dumping proof-writing work onto outside academics, often after the standard has been finalized and implemented in products everywhere.
So when NIST put forward its first draft of SP 800-90 in 2005, academic cryptographers were left to analyze it from scratch. Which, to their great credit, they did quite successfully.
The first thing reviewers noticed is that Dual_EC follows a known design paradigm: it's a weird variant of an elliptic curve linear congruential generator. However, they also noticed that NIST had made some odd rookie mistakes.
Now here we will have to get slightly wonky, though I will keep the mathematics to a minimum. (I promise it will all come together in the end!) Constructions like Dual_EC have basically two stages:
- A stage that generates a series of pseudorandom elliptic curve points. Just like on the graph at right, an elliptic curve point is a pair (x, y) that satisfies an elliptic curve equation. In general, both x and y are elements of a finite field, which for our purposes means they're just large integers.***
The main operation of the PRNG is to apply mathematical operations to points on the elliptic curve, in order to generate new points that are pseudorandom, i.e., indistinguishable from random points in some subgroup.
And the good news is that Dual_EC seems to do this first part beautifully! In fact, Brown and Gjøsteen even proved that this part of the generator is sound provided that the Decisional Diffie-Hellman problem is hard in the specific elliptic curve subgroup. This is a well-studied hardness assumption, so we can probably feel pretty confident in this proof.
- A stage that extracts pseudorandom bits from the generated EC points. While the points generated by Dual_EC may be pseudorandom, that doesn't mean the specific (x, y) integer pairs are random bitstrings. For one thing, 'x' and 'y' are not really bitstrings at all: they're integers less than some prime number, and most integer pairs don't even correspond to valid points on the curve, let alone points in the right subgroup. Hence you can't just output the raw x or y values and expect them to make good pseudorandom bits.
Thus the second phase of the generator is to 'extract' some (but not all) of the bits from the EC points. Traditional designs in the literature do all sorts of things here, including hashing the point or dropping up to half of the bits of the x-coordinate. Dual_EC does something much simpler: it grabs the x-coordinate, throws away the most significant 16-18 bits, and outputs the rest.
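As a rough sketch (assuming a 256-bit x-coordinate, as with the P-256 curve, and 16 dropped bits; the function name is mine, not the spec's), the extraction step amounts to nothing more than a bitmask:

```python
# Hedged sketch of Dual_EC's extraction step: keep all but the
# most significant bits of the x-coordinate. The 256/16 sizes
# match the P-256 case; this is an illustration, not spec code.
def extract_output(x_coord: int, field_bits: int = 256, dropped: int = 16) -> int:
    """Return the low (field_bits - dropped) bits of x_coord."""
    return x_coord & ((1 << (field_bits - dropped)) - 1)

# A value whose only set bits are the top 16 truncates to zero.
assert extract_output(0xFFFF << 240) == 0
```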
Flaw #2: Dual_EC outputs too many bits.

Unlike those previous EC PRGs, which output anywhere from half to 2/3 of the bits of the x-coordinate, Dual_EC outputs nearly the entire thing.
This is good for efficiency, but unfortunately it also gives Dual_EC a bias. Due to some quirks in the mathematics of the field operations, an attacker can now predict the next bits of Dual_EC output with a fairly small, but non-trivial, success probability, in the range of 0.1%. While this number may seem small to non-cryptographers, it's basically a hanging offense for a cryptographic random number generator, where the probability of predicting a future bit should be many orders of magnitude lower.
What's just plain baffling is that this flaw ever saw the light of day. After all, the specification was developed by bright people at NIST, in collaboration with the NSA. Either of those groups should easily have discovered a bug like this, especially since the issue had been studied before. Indeed, within a few months of public release, two separate groups of academic cryptographers found it, and were able to implement an attack using standard PC equipment.
So in summary, the bias is mysterious and it seems to be very much an 'own-goal' on the NSA's part. Why in the world would they release so much information from each EC point? It's hard to say, but a bit more investigation reveals some interesting consequences:
Flaw #3: You can guess the original EC point from looking at the output bits.

By itself this isn't really a flaw, but it will turn out to be interesting in just a minute.
Since Dual_EC outputs so many bits from the x-coordinate of each point (all but the most significant 16 bits), it's relatively easy to guess the original source point by simply brute-forcing the missing 16 bits and solving the elliptic curve equation for y. (This is all high-school algebra, I swear!)
While this process probably won't uniquely identify the original (x, y), it'll give you a modestly sized list of candidates. Moreover, with only 16 missing bits the search can be done quickly even on a desktop computer. Had Dual_EC thrown away more bits of the x-coordinate, this search would not have been feasible at all.
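To see how cheap this search is, here's a toy sketch on a tiny textbook curve (y² = x³ + x + 1 over GF(23), with a 2-bit truncation standing in for the real 16 bits; all sizes are shrunk purely for illustration):

```python
# Toy sketch of recovering hidden EC point candidates from a
# truncated x-coordinate. Curve: y^2 = x^3 + x + 1 over GF(23)
# (textbook parameters, NOT the real P-256 curve).

P_MOD, A, B = 23, 1, 1
FIELD_BITS, DROPPED = 5, 2   # 23 < 2^5; pretend the top 2 bits were discarded

def recover_candidates(truncated_x):
    """Brute-force the missing high bits and keep the x values
    that actually lie on the curve (i.e., the RHS is a square)."""
    candidates = []
    for guess in range(1 << DROPPED):
        x = (guess << (FIELD_BITS - DROPPED)) | truncated_x
        if x >= P_MOD:
            continue
        rhs = (x * x * x + A * x + B) % P_MOD
        # Euler's criterion: rhs is a quadratic residue iff
        # rhs^((p-1)/2) == 1 (or rhs == 0).
        if rhs == 0 or pow(rhs, (P_MOD - 1) // 2, P_MOD) == 1:
            # Since p = 23 = 3 (mod 4), a square root is rhs^((p+1)/4).
            y = pow(rhs, (P_MOD + 1) // 4, P_MOD)
            candidates.append((x, y))
    return candidates

# The point (3, 10) is on the curve; its truncated x is 3 & 0b111 = 3.
cands = recover_candidates(3 & ((1 << (FIELD_BITS - DROPPED)) - 1))
assert any(x == 3 for (x, y) in cands)   # the true point is among the candidates
```

With a real 16-bit truncation the loop simply runs 2^16 times, which is why a desktop is more than enough.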
So what does this mean? In general, recovering the EC point shouldn't actually be a huge problem. In theory it could lead to a weakness, say, predicting future outputs, but in a proper design you would still have to solve a discrete logarithm instance for each and every point in order to predict the next bytes output by the generator.
And here is where things get interesting.
Flaw #4: If you know a certain property of the Dual_EC parameters, and can recover an output point, you can predict all subsequent outputs of the generator.

Did I tell you this would get interesting in a minute? I totally did.
The next piece of our puzzle was discovered by Microsoft researchers Dan Shumow and Niels Ferguson, and announced at the CRYPTO 2007 rump session. I think this result can best be described via the totally intuitive diagram below. (Don't worry, I'll explain it!)

Starting from the "r_i" value (circled, center) and heading right, the bit-generation part first computes the output point using the function "r_i * Q", where Q is an elliptic curve point defined in the parameters, then truncates 16 bits off its x-coordinate to get the raw generator output. The "*" operator here denotes elliptic curve point multiplication, which is a complex operation that should be relatively hard to invert.
Note that everything after the point multiplication should be easy to invert and recover from the output, as we discussed in the previous section.
Every time the generator produces one block of output bits, it also updates its internal state. This is designed to prevent attacks where someone compromises the internal values of a working generator, then uses them to wind the generator backwards and guess past outputs. Starting again from the circled "r_i" value, the generator now heads upwards and computes the point "r_i * P", where P is a different elliptic curve point also described in the parameters. It then takes the x-coordinate of that point as the new state value.
The theory here is that P and Q should be random points, and thus it should be difficult to find the state-update value "r_i * P" even if you know the output point "r_i * Q", which, I stress, you do know, because it's easy to recover. Going from one point to the other requires you to know a relationship between P and Q, which you shouldn't know, since they're supposed to be random values. The difficulty of this step is indicated by the thick red arrow.
Looks totally kosher to me. (Source: NIST SP 800-90A)
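The two arms of the diagram can be sketched in a few lines of code. This is a toy model on the textbook curve y² = x³ + x + 1 over GF(23), not the real P-256 parameters; the points P and Q below are arbitrary stand-ins, and the truncation step is omitted at this tiny scale:

```python
# Minimal sketch of one Dual_EC round on a toy curve (y^2 = x^3 + x + 1
# over GF(23)); illustration only, not the standardized parameters.

P_MOD, A = 23, 1  # field modulus and curve coefficient a

def ec_add(p1, p2):
    """Affine elliptic curve point addition; None = point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def ec_mul(k, pt):
    """Double-and-add scalar multiplication."""
    acc = None
    while k:
        if k & 1:
            acc = ec_add(acc, pt)
        pt = ec_add(pt, pt)
        k >>= 1
    return acc

Q = (3, 10)   # output point (arbitrary choice on the toy curve)
P = (7, 12)   # state-update point (arbitrary choice)

def dual_ec_round(state):
    """One round: emit x(state * Q), advance to x(state * P)."""
    output = ec_mul(state, Q)[0]      # real Dual_EC truncates this value
    new_state = ec_mul(state, P)[0]
    return output, new_state

out1, s2 = dual_ec_round(7)   # seed state 7
out2, s3 = dual_ec_round(s2)
```

The security story of the diagram is that, given `out1`, nothing should connect you to `s2` unless you can relate P and Q.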
There is, however, one tiny little exception to this rule. What if P and Q aren't entirely random values? What if you chose them yourself specifically so you'd know the mathematical relationship between the two points?
In this case it turns out you can easily compute the next PRG state after recovering a single output point (from as little as 32 bytes of RNG output). This means you can follow the equations through and predict the next output. And the next output after that. And so on, forever.****
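Here's a toy demonstration of exactly this attack, again on the small textbook curve y² = x³ + x + 1 over GF(23) (nothing here uses the real NIST parameters; all names are mine). The 'saboteur' picks Q and a secret e, publishes P = e*Q, and can then turn one recovered output point into every future output:

```python
# Toy demonstration of the Dual_EC backdoor on the curve
# y^2 = x^3 + x + 1 over GF(23). Illustration only.

P_MOD, A = 23, 1

def ec_add(p1, p2):
    """Affine elliptic curve point addition; None = point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def ec_mul(k, pt):
    """Double-and-add scalar multiplication."""
    acc = None
    while k:
        if k & 1:
            acc = ec_add(acc, pt)
        pt = ec_add(pt, pt)
        k >>= 1
    return acc

Q = (3, 10)
e = 5                 # the backdoor secret
P = ec_mul(e, Q)      # published "random" point, actually P = e*Q

def dual_ec_round(state):
    output_point = ec_mul(state, Q)       # recoverable from the output
    new_state = ec_mul(state, P)[0]
    return output_point, new_state

# An honest user runs two rounds from a secret seed.
r1, s2 = dual_ec_round(6)
r2, _ = dual_ec_round(s2)

# The attacker sees only r1 (recovered from the truncated output)
# and knows e. Since state*P = state*(e*Q) = e*(state*Q), the
# next state is just x(e * r1):
predicted_s2 = ec_mul(e, r1)[0]
predicted_r2 = ec_mul(predicted_s2, Q)

assert predicted_s2 == s2     # internal state recovered
assert predicted_r2 == r2     # next output predicted
```

The "thick red arrow" collapses entirely: one scalar multiplication by e replaces an intractable discrete logarithm.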
This is a huge deal in the case of SSL/TLS, for example. If I use the Dual_EC PRG to generate the "Client Random" nonce transmitted at the beginning of an SSL connection, then the NSA will be able to predict the "Pre-Master" secret that I'm going to generate during the RSA handshake. Given this information, the connection is now a cleartext read. This is not good.
So now you should all be asking the most important question of all: how the hell did the NSA generate the specific P and Q values recommended in Appendix A of Dual_EC_DRBG? And do they know a relationship that allows them to run this attack? All of which brings us to:
Flaw #5: Nobody knows where the recommended parameters came from.

And if you think that's problematic, welcome to the club.
But why? And where is Dual_EC used?
The ten-million-dollar question of Dual_EC is why the NSA would stick such an obviously backdoored algorithm into an important standard. Keep in mind that cryptographers found the major (bias) vulnerabilities almost immediately after Dual_EC shipped. The possibility of a 'backdoor' was announced in summer 2007. Who would still use it?
A few people have gone through the list of CMVP-evaluated products and found that the answer is: quite a few people would. Most certify Dual_EC simply because it's implemented in OpenSSL-FIPS, and they happen to use that library. But at least one provider certifies it exclusively. Yuck.
Hardcoded constants from the OpenSSL-FIPS implementation of Dual_EC_DRBG. Recognize 'em?
Which is why people need to stop shipping Dual_EC immediately. We have no idea what it's for, but it needs to go away. Now.
So what about the curves?
The last point I want to make is that the vulnerabilities in Dual_EC have precisely nothing to do with the specifics of the NIST standard elliptic curves themselves. The 'backdoor' in Dual_EC comes exclusively from the relationship between P and Q, the latter of which is published only in the Dual_EC specification. The attack can work even if you don't use the NIST pseudorandom curves.
However, the revelations about NIST and the NSA certainly make it worth our time to ask whether these curves themselves are somehow weak. The best answer to that question is: we don't know. Others have observed that NIST's process for generating the curves leaves a lot to be desired. But including some kind of hypothetical backdoor would be a horrible, horrific idea, one that would almost certainly blow back at us.
You'd think people with common sense would realize this. Unfortunately we can't count on that anymore.
Thanks to Tanja Lange for her assistance proofing this post. Any errors in the text are entirely mine.
Notes:
* My recollection of this period is hazy, but prior to SP 800-90 the two most common FIPS DRBGs in production were (1) the SHA-1-based DSA generator of FIPS 186-2 and (2) ANSI X9.31. The DSA generator was a special-purpose generator based on SHA-1, and was really designed just for that purpose. ANSI X9.31 used block ciphers, but suffered from some more subtle weaknesses it retained from the earlier X9.17 generator. These were pointed out by Kelsey, Schneier, Wagner and Hall.
** This is actually a requirement of the FIPS 140-2 specification. Since FIPS does not approve any true random number generators, it instead mandates that you run your true RNG output through a DRBG (PRNG) first. The only exception is if your true RNG has been approved 'for classified use'.
*** Specifically, x and y are integers in the range 0 to p-1, where p is a large prime number. A point is a pair (x, y) such that y^2 = x^3 + ax + b (mod p). The values a and b are defined as part of the curve parameters.
**** The process of predicting future outputs involves a few guesses, since you don't know the exact output point (and had to guess at the missing 16 bits), but you can easily reduce this to a small set of candidates. Then it's just a question of looking at a few more bits of RNG output until you guess the right one.
I have a degree from ITT Technical Institute. I was taught that the U.S. government mandates that no encryption standard can surpass theirs, and that they have to be able to crack all encryption standards in play.
I guess a new trend in Information Technology is going to emerge: specifically, a heightened increase in national IT security and pride. Brazil has already announced plans for a new dedicated backbone bypassing America. So much for globalization.
The US government tried to mandate backdoors in everything by law in the '90s and lost due to public outcry. What they couldn't mandate by law, they have ever since resorted to getting by subterfuge. They will never listen to what the people want; they are against the people.
And reducing the USA-centricness of the internet isn't killing globalization, it's enhancing it. The whole definition of "globalization" excludes there being one massive central authority controlling everything like the US has been doing! Why do people have an "I must be in charge to ensure harmony by force, otherwise there's chaos" attitude to everything? Open Source has taught us that that is not the case; it's just an excuse that dictators use to keep power.
I have a degree from the Blaine School of Cosmetology and I would love for you to state an actually credible source for that.
DeleteHey Javier,
A quick question: how is Brazil being able to communicate with Europe without being MITM'd by the USA bad for globalization?
It skips America because America is hostile, it's just common sense. Connections still would be able to reach America, or go through America if that is the most convenient path.
I wonder if Brazil's choice to bypass the US will help them. The point of arrival will probably be an NSA-cooperative state. We have all learned that the NSA and the other secret services are so damned intertwined that it is impossible to transfer packets around the globe without finally hitting one of their tapping points. And with IPv6 we're lost. Profiling is 50% of their information value: fully compatible with the wartime UK Y-stations, which could only log the 'gibberish' from the Germans (Enigma traffic) but were very, very good at profiling the radio networks. At the start of the war, that was actually half the job. Look at the analogy. Amazing, isn't it?
www.cryptomuseum.com
You, Mr. Green, and Lange, Schneier, Bernstein, etc. became famous because of the hero whistleblower Mr. Snowden and the evil NSA. Your lives must have changed for the better. Keep it up. In math and cryptographers we trust.
Thank you, that was just the detailed analysis of the issue I was hoping to eventually find here :)
ReplyDeleteI'm wondering about one thing: Assuming that somebody chose P and Q specifically for their mathematical relationship (i.e., they own the "secret key" to the DRBG), would they also be able to "rewind" the DRBG and recover previous outputs, or does this concern future outputs only?
From 'just' knowing log_Q P you only get the future outputs. Getting past ones requires computing discrete logarithms, and so far we don't know any way that this mathematical problem is easier on the NIST curves.
Thank you for the very informative and interesting read :)
Maybe I got it wrong, but:
- The EC points P and Q are defined in the Dual_EC standard's appendix and are recommendations (?)
- They are basically the seed for the algorithm
So, without real knowledge of crypto, I would naively assume that, if we just use EC points other than the predefined P and Q, we could bypass the NSA-knows-the-EC-point-relationship problem and just use the basic algorithm of the standard.
Or is the problem one level deeper, at the EC itself? Is the EC fundamentally corrupted?
MK
No, you're absolutely right. Replacing "Q" with one generated randomly *will* kill the backdoor. Later versions of the spec include an (optional) appendix describing how to do this. However, I've looked at a couple of open implementations, including OpenSSL, and they still use the default parameters.
As I said above, the backdoor in Dual_EC has nothing to do with corruption of the elliptic curves themselves. But there's nothing wrong with an excess of caution right now.
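[For the curious: here's a hedged sketch of what "generating your own Q" might look like in practice: hash a public seed to a curve point by try-and-increment, so nobody, including you, knows its discrete log relative to P. This mirrors the spirit of the spec's optional appendix rather than its exact procedure, and uses a toy curve y² = x³ + x + 1 over GF(23) instead of the real parameters. -ed]

```python
import hashlib

# Hedged sketch: derive a "nothing up my sleeve" point Q by hashing
# a public seed to the curve (try-and-increment). Toy curve only.
p, a, b = 23, 1, 1

def hash_to_point(seed: bytes):
    counter = 0
    while True:
        h = hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        x = int.from_bytes(h, "big") % p
        rhs = (x * x * x + a * x + b) % p
        # Euler's criterion: accept x only if rhs is a square mod p.
        if rhs == 0 or pow(rhs, (p - 1) // 2, p) == 1:
            y = pow(rhs, (p + 1) // 4, p)  # valid since p = 3 (mod 4)
            return (x, y)
        counter += 1  # otherwise bump the counter and retry

Q = hash_to_point(b"my verifiably generated point")
x, y = Q
assert (y * y - (x * x * x + a * x + b)) % p == 0   # Q lies on the curve
```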
Ok, thank you for the clarification :)
MK
If the problem is in just two numbers, P and Q, why hasn't somebody generated a new set of numbers and explained to everyone how they did it? Seems like the guys from OpenSSL could have done this. Why not?
You can generate your own "Q" value. But it's not the default in most implementations.
This isn't about common sense; it's about corruption and lies.
ReplyDeleteThe reason why tou would use the NIST (read NSA) mandated EC Points is the reason you would use any NIST (read NSA) approved algorithm : It has supposedly been vetted by experts cryptographers and are A (not necessarily THE) good choice. Implementing DUALEC exactly as NIST specified requires programming skills while changing the points requires some mild understanding of what a Elliptic Curve is.
And last but not least, I'm not sure you can get a product certified if you actually chose parameters other than the ones that were specified by NIST (read, say-it-with-me, NSA).
Thanks for the post, Dr. Green. In the future I hope you tackle the question of the NIST elliptic curve generation "controversy" and let us know if there is truly anything fishy with that whole process.
It's sad, but this whole Snowden debacle is going to cause a lot of distrust in NIST from now on. Perhaps that's a good thing, as it might result in new findings from cryptographers who go back and really analyze these standards again.
Assume there are more obscure hacks hiding behind the obvious fix. In the intelligence biz, it's called a 'modified limited hangout': the ability to say "oh, yes, you caught us dead to rights" while distracting you from deeper wrongness.
I never fully understood what my teacher in a cryptography 101 lesson at university 20 years ago meant by: be suspicious of algorithms whose design principles are not published. I simply couldn't imagine how a backdoor could be planted into an algorithm. Now I can!
ReplyDeleteThanks for making this understandable to someone who is dealing with security in everyday practice, but who is not a mathematician. :)
Purely for the entertainment value to the audience here, I offer that it occurred to me that the suspect P-Q pair could have been a test case provided by the NSA, along the lines of: "Given how the algorithm is supposed to work, if we corrupt the P-Q pair by making them non-random, using a specific mathematical relationship between them, then the algorithm should be provably insecure. Demonstrating this should increase confidence that the correctly implemented algorithm is secure." Then what happened is that some arrogant scientist at NIST (full disclosure: I was formerly a NIST employee, and the terms of my departure still burn like a fire in the pit of my stomach) conveniently "forgot" to put the correct ones in the standard, or did it on purpose, since "anyone of modest skill in cryptography will detect the problem and come up with their own P-Q pair correctly; anyone who doesn't deserves what they get." There are, in my estimation, people that arrogant employed by NIST.
ReplyDeleteThere’s no word to describe such a great masterpiece. You made such an interesting piece to read, giving every subject enlightenment for us to gain knowledge and information Porcelain Veneers
I would appreciate your advice on how to publish a freeware encryption program I constructed.
Please reply to DavidLNewton@comcast.net
Some of the comments ask why not just generate your own points. The naive answer is, "Sure, you can do that." NIST SP 800-90A even discusses that as an option. However, if you want to have your device or code validated as FIPS 140-2 compliant, it has to pass some known-answer tests. (You feed in a supplied value as though it were entropy, and check that the first so many output blocks match the known answers.) And those tests are (I think) dependent on P and Q being as published in NIST SP 800-90A.
So the short answer is "yes, you can", but the catch is that you will have to have some sort of option to test with the published points and then use your own, and you will have to convince the validation lab that this is all okay.
The "I think" is because the published examples are on the NIST web site, but that is down due to the government shutdown, so I can't go look.