Thursday, February 19, 2015

How to paint yourself into a corner (Lenovo edition)

The information security news today is all about Lenovo’s default installation of a piece of adware called “Superfish” on a number of laptops shipped before February 2015. The Superfish system is essentially a tiny TLS/SSL “man in the middle” proxy that attacks secure connections by making them insecure — so that the proxy can insert ads in order to, oh, I don’t know, let’s just let Lenovo tell it:
“To be clear, Superfish comes with Lenovo consumer products only and is a technology that helps users find and discover products visually,” the representative continued. “The technology instantly analyses images on the web and presents identical and similar product offers that may have lower prices, helping users search for images without knowing exactly what an item is called or how to describe it in a typical text-based search engine.”
Whatever.

The problem here is not just that this is a lousy idea. It’s that Lenovo used the same certificate on every single Laptop it shipped with Superfish. And since the proxy software also requires the corresponding private key to decrypt and modify your web sessions, that private key was also shipped on every laptop. It took all of a day for a number of researchers to find that key and turn themselves into Lenovo-eating interception proxies. This sucks for Lenovo users.

If you’re a Lenovo owner in the affected time period, go to this site to find out if you’re vulnerable and (hopefully) what to do about it. But this isn't what I want to talk about in this post.

Instead, what I’d like to discuss is some of the options for large-scale automated fixes to this kind of vulnerability. It’s quite possible that Lenovo will do this by themselves — pushing an automated patch to all of their customers to remove the product — but I'm not holding my breath. If Lenovo does not do this, there are roughly three options:
  1. Lenovo users live with this and/or manually patch. If the patch requires manual effort, I’d estimate it’ll be applied to about 30% of Lenovo laptops. Beware: the current uninstall package does not remove the certificate from the root store!
  2. Microsoft drops the bomb. Microsoft has a nuclear option themselves in terms of cleaning up nasty software — they can use the Windows Update mechanism or (less universally) the Windows Defender tool to remove spyware/adware. Unfortunately not everyone uses Defender, and Microsoft is probably loath to push out updates like this without massive testing and a lot of advice from the lawyers.
  3. Google and Mozilla fix internally. This seems like a more promising option. Google Chrome in particular is well known for quickly pushing out security updates that revoke keys, add public key pins, and generally make your browsing experience more secure.
It seems unlikely that #1 and #2 will happen anytime soon, so the final option looks initially like the most promising. Unfortunately it's not that easy. To understand why, I'm going to sum up some reasoning given to me (on Twitter) by a couple of members of the Chrome security team.

The obvious solution to fixing things at the Browser level is to have Chrome and/or Mozilla push out an update to their browsers that simply revokes the Superfish certificate. There's plenty of precedent for that, and since the private key is now out in the world, anyone can use it to build their own interception proxy. Sadly, this won't work! If Google does this, they'll instantly break every Lenovo laptop with Superfish still installed and running. That's not nice, or smart business for Google.

A more promising option is to have Chrome at least throw up a warning whenever a vulnerable Lenovo user visits a page that's obviously been compromised by a Superfish certificate. This would include most (secure) sites any Superfish-enabled Lenovo user visits -- which would be annoying -- and just a few pages for those users who have uninstalled Superfish but still have the certificate in their list of trusted roots.

This seems much nicer, but runs into two problems. First, someone has to write this code -- and in a hurry, because attacks may begin happening immediately. Second, what action item are these warnings going to give people? Manually uninstalling certificates is hard, and until a very nice tool becomes available a warning will just be an irritation for most users.

One option for Google is to find a way to deal with these issues systemically -- that is, provide an option for their browser to tunnel traffic through some alternative (secure) protocol to a proxy, where it can then go securely to its location without being molested by Superfish attackers of any flavor. This would obviously require consent by the user -- nobody wants their traffic being routed through Google otherwise. But it's at least technically feasible.

Google even has an extension for Android/iOS that works something like this: it's a compressing proxy extension that you can install in Chrome. It will shrink your traffic down and send it to a proxy (presumably at Google). Unfortunately this proxy won't work even if it was available for Windows machines -- because Superfish will likely just intercept its connections too :(

So that's out too, and with it the last obvious idea I have for dealing with this in a clean, automated way. Hopefully the Google team will keep going until they find a better solution.

The moral of this story, if you choose to take one, is that you should never compromise security for the sake of a few bucks -- because security is so terribly, awfully difficult to get back. 

Wednesday, February 18, 2015

Another update on the Truecrypt audit

There's a story on Hacker News asking what the hell is going on with the Truecrypt audit. I think that's a fair question, since we have been awfully quiet lately. To everyone who donated to the project, first accept my apologies for the slow pace. I want to promise you that we're not spending your money on tropical vacations (as appealing as that would be). In this post I'd like to offer you some news, including an explanation of why this has moved slowly.

For those of you who don't know what the Truecrypt audit is: in late 2013 Kenn White, myself, and a group of advisors started a project to undertake a crowdfunded audit of the Truecrypt disk encryption program. To the best of my knowledge, this is the first time anyone's tried this. The motivation for the audit is that lots of people use Truecrypt and depend on it for their security and safety -- yet the authors of the program are anonymous and somewhat mysterious to boot. Being anonymous and mysterious is not a crime, but it still seemed like a nice idea to take a look at their code.

We had an amazing response, collecting upwards of $70,000 in donations from a huge and diverse group of donors. We then went ahead and retained iSEC Partners to evaluate the bootloader and other vulnerability-prone areas of Truecrypt. The initial report was published here.

That initial effort was Part 1 of a two-part project. The second -- and much more challenging part -- involves a detailed look at the cryptography of Truecrypt, ranging from the symmetric encryption to the random number generator. We had some nice plans for this, and were well on our way to implementing them. (More on those in a second.)

Then in late Spring of 2014, something bizarre happened. The Truecrypt developers pulled the plug on the entire product -- in their typical, mysterious way.

This threw our plans for a loop. We had been planning a crowdsourced audit to be run by Thomas Ptacek and some others. However in the wake of TC pulling the plug, there were questions. Was this a good use of folks' time and resources? What about applying those resources to the new 'Truecrypt forks' that have sprung up (or are being developed?) There were a few other wrinkles as well, which Thomas talks about here -- although he takes on too much of the blame.

It took us a while to recover from this and come up with a plan B that works within our budget and makes sense. We're now implementing this. A few weeks ago we signed a contract with the newly formed NCC Group's Cryptography Services practice (which grew out of iSEC, Matasano and Intrepidus Group). The project will evaluate the original Truecrypt 7.1a which serves as a baseline for the newer forks, and it will begin shortly. However to minimize price -- and make your donations stretch farther -- we allowed the start date to be a bit flexible, which is why we don't have results yet.

In our copious spare time we've also been looking manually at some portions of the code, including the Truecrypt RNG and other parts of the cryptographic implementation. This will hopefully complement the NCC/iSEC work and offer a bit more confidence in the implementation.

I don't really have much more to say -- except to thank all of the donors for their contributions and their patience. This project has been a bit slower than any of us would like, but results are coming. Personally, my hope is that they'll be completely boring.

Tuesday, February 10, 2015

How do we pay for privacy?

(source)
If you haven't read Julia Angwin's excellent profile of GnuPG's lead developer Werner Koch, now would be a great time to check it out. Koch, who single-handedly wrote GnuPG in 1997, has been doggedly maintaining the codebase ever since -- and not getting paid very well for it. Despite good intentions on all sides, Koch has been essentially going broke.

The news is not all bad. In response to Angwin's piece, 'the Internet' rallied to GnuPG's aid. So far individual and corporate donors have coughed up over EU 200,000 to pay Koch and even hire him some help. It looks like GnuPG is saved, and so are its users — for the moment.

But is this model really sustainable? I'm pretty skeptical.

Sooner or later this money will run out. And next time this happens, the Internet might not have a quarter million in spare change to fork over. In fact, that's already the normal state of affairs for most privacy tools -- software ranging from GPGTools to OTR -- most of which subsist on meager donations and volunteer time. The scary part is that thousands of people depend on these tools for their privacy and even their physical safety.

This raises a question: how can we support the long-term development and maintenance of privacy tools? It turns out that nobody really knows the answer to this -- but a few people are groping towards a solution. In this (entirely non-technical) post I'm going to talk a bit about what people are doing today -- and how we might do better in the future.

NB: I should mention that most of the smart ideas in this post come from Meredith Whittaker, who leads Google’s Open Source Research Group, and helped found Simply Secure. The dumb ideas are all mine.

How we support privacy tools today

If you're developing, or are interested in developing privacy tools in 2015, there are a handful of funding sources for you to choose from. They include:

  1. Self-funding. The vast majority of privacy tools come from engineers working in their spare time. This is a great way to develop prototypes and simple tools, but it tends not to be very sustainable -- particularly when developers have to choose between family and code maintenance.
  2. Donations, charges and crowd-funding. A few major projects pull in some cash through donations, but this seems to work well only during a catastrophe or when the spotlight is on a particular project. GnuPG, for example, made a ton of money following their recent publicity, but before that they averaged a minuscule $20k per year — and this is for one of the most widely used privacy tools on the Internet! Projects like GPG Tools recently began charging for their software, which may work a bit better. Unfortunately this is anathema for young projects that rely on network effect for their success.
  3. Industry grants. From time to time major corporations give out modest chunks of money to tool developers, particularly when those companies use tools internally. Case in point: Google Stripe and Facebook just gave $50k each to GnuPG, and the Linux Foundation Core Infrastructure Initiative (essentially an industry funding group*) kicked in an additional $60k. Unfortunately this money is tragically difficult to come by -- for the average developer it might as well not exist.
  4. Government and NGOs. Perhaps the most promising source of money comes from the U.S. government, and a set of NGOs that have sprung up to disburse it. The State Dept. directly funds the Tor Project, and the government also provides block grants to groups such as the Open Technology Fund (via Radio Free Asia) and the -- confusingly similar -- Open Technology Institute (at New America Foundation). OTF in particular has done a phenomenal job at funding both development and audit of privacy tools.
  5. Internal industry funding. Once in a blue moon a company proposes to internally develop a privacy tool like Google/Yahoo End-to-End. I’ll believe this works when I see it.
  6. Academic research funding. A few academic tools have managed to slip into this space, most notably OTR and Tor. But this model is awfully hard to sustain, mostly because academics don't get paid to do things. We get paid to teach and write papers. It's hard to sustain software development this way.
  7. Bitcoin wallet theft. This is mostly a joke.

Of these funding sources, the U.S. government is by far the heaviest hitter -- responsible for funding many well-known projects such as Tor and TextSecure. While I tend to think any money is good money in the hands of right people, I should point out that this view is not universally shared. In part this is because we, admittedly, don’t have much of a process to validate code and convince non-experts that this process isn’t producing compromised code. 

As Jillian York points out, US government funding also comes with some political baggage, and sadly, tends to attract more than its fair share of paranoia.

Developers need more than just money!

If you give a starving privacy tool a million bucks, you get a well-fed privacy tool. Unfortunately it may not actually be a better privacy tool. That's not because people are trying to waste your cash. It turns out that software development, crypto, and security are just plain hard.

So yes, people need to eat -- that's a baseline. But beyond that what developers also need are things like expert guidance, security audits, analysis tools, and collaboration with other developers. They also really need help with the hardest problem in computer science, which is turning a pile of code into a product that people want to use.

A few groups like OTF (in case you're not getting the hint, I really like them*) are trying to help with some of this. They fund professional code audits through groups like iSEC Partners. They fund usability resources and provide communications help. They host regular meetings where project members can get together and dork out about how to handle encrypted spam. A friend calls this a 'dog park for developers who haven't been outside for a while.' This sounds silly, but it really, really helps.

Beyond those things, there's still an awful problem of coordinating all of this technical stuff so that auditing results adhere to some consistent standard and produce knowledge that gets retained within the organization, as well as seeing that tools get proper academic analysis where necessary. And finally, there are useful services such as connecting developers with UI designers and helping them to turn their tools into real, usable products.

A few groups have done well at this all on their own. Moxie's Open Whisper Systems not only launched a popular messaging app, but managed to get TextSecure (the protocol) into 600 million WhatsApp clients. Unfortunately this kind of success doesn't come easy to people and requires a lot of assistance. Institutions can really help.

How can we do better?

There are a lot of answers to this question. But since this is a blog post, let's swing for the fences. What's really needed is a privacy incubator. A place that provides both funding (or at least, guides funding) as well as in-house technical staff and researchers, non-technical help such a communications, infrastructure, a great advisory board, and access to tools and engineers.

In essence this center would combine all the best parts of NGOs, academic institutions, and corporate research into one center. It would help with projects ranging from research to development, and would also provide infrastructure for developers -- helping to keep them from re-inventing the wheel with each new idea, and perhaps even helping projects to merge when one has strengths. Connecting them with corporations who could conceivably deploy their tool.

This organization could also provide resources ranging from legal advice to marketing, two areas that software developers are notoriously bad at. It might even provide important, but miscellaneous resources healthcare.
Please keep in mind I'm not advocating the creation of an entirely new organization -- god forbid, we have enough of those already (the XKCD cartoon at right comes to mind). Instead, the goal should be to identify organizations that are already working and either connect that, or build up their capabilities with a large infusion of cash.

Anyway, we can all dream. But this is a dream that would actually make some difference.

So will this happen?

I guess it depends on the will and the money. It also depends on us: that is, on the willingness of the technically focused privacy/security community to accept that many of the elements we need to succeed are outside of our personal realm of expertise, and we need help with them.

Friends of mine also keep telling me that there are major philanthropic organizations out there looking to make a difference in this area. I'm still waiting to see it happen, but wheels turn slowly. One thing I can tell you: it wouldn't take much to do better than what we have today.

* Full disclosure: I'm on the advisory board for Linux Foundation's Core Infrastructure Initiative. I also once sat in on an advisory board meeting for OTF. Nobody paid me -- but they did feed me lunch.

Wednesday, January 14, 2015

Hopefully the last post I'll ever write on Dual EC DRBG

I've been working on some other blog posts, including a conclusion of (or at least an installment in) this exciting series on zero knowledge proofs. That's coming soon, but first I wanted to take a minute to, well, rant.

The subject of my rant is this fascinating letter authored by NSA cryptologist Michael Wertheimer in February's Notices of the American Mathematical Society. Dr. Wertheimer is currently the Director of Research at NSA, and formerly held the position of Assistant Deputy Director and CTO of the Office of the Director of National Intelligence for Analysis.

In other words, this is a guy who should know what he's talking about.

The subject of Dr. Wertheimer's letter is near and dear to my heart: the alleged subversion of NIST's standards for random number generation -- a subversion that was long suspected and apparently confirmed by classified documents leaked by Edward Snowden. The specific algorithm in question is called Dual EC DRBG, and it very likely contains an NSA backdoor. Those who've read this blog should know that I think it's as suspicious as a three dollar bill.

Reading Dr. Wertheimer's letter, you might wonder what I'm so upset about. On the face of it, the letter appears to express regret. To quote (with my emphasis):
With hindsight, NSA should have ceased supporting the Dual_EC_DRBG algorithm immediately after security researchers discovered the potential for a trapdoor. In truth, I can think of no better way to describe our failure to drop support for the Dual_EC_DRBG algorithm as anything other than regrettable. The costs to the Defense Department to deploy a new algorithm were not an adequate reason to sustain our support for a questionable algorithm. Indeed, we support NIST’s April 2014 decision to remove the algorithm. Furthermore, we realize that our advocacy for the Dual_EC_DRBG casts suspicion on the broader body of work NSA has done to promote secure standards. 
I agree with all that. The trouble is that on closer examination, the letter doesn't express regret for the inclusion of Dual EC DRBG in national standards. The transgression Dr. Wertheimer identifies is merely that NSA continued to support the algorithm after major questions were raised. That's bizarre.

Even worse, Dr. Wertheimer reserves a substantial section of his letter for a defense of the decision to deploy Dual EC. It's those points that I'd like to address in this post.

Let's take them one at a time.
1: The Dual_EC_DRBG was one of four random number generators in the NIST standard; it is neither required nor the default.
It's absolutely true that Dual EC was only one of four generators in the NIST standard. It was not required for implementers to use it, and in fact they'd be nuts to use it -- given that overall it's at least two orders of magnitude slower than the other proposed generators.

The bizarre thing is that people did indeed adopt Dual EC in major commercial software packages. Specifically, RSA Security included it as the default generator in their popular BSAFE software library. Much worse, there's evidence that RSA was asked to do this by NSA, and were compensated for their compliance.

This is the danger with standards. Once NIST puts its seal on an algorithm, it's considered "safe". If the NSA came to a company and asked it to use some strange, non-standard algorithm, the request would be considered deeply suspicious by company and customers alike. But how can you refuse to use a standard if your biggest client asks you to? Apparently RSA couldn't.
2: The NSA-generated elliptic curve points were necessary for accreditation of the Dual_EC_DRBG but only had to be implemented for actual use in certain DoD applications.
This is a somewhat misleading statement, one that really needs to be unpacked.

First, the original NSA proposal of Dual EC DRBG contained no option for alternate curve points. This is an important point, since its the selection of curve points that give Dual EC its potential for a "back door". By generating two default points (P, Q) in a specific way, the NSA may have been able to create a master key that would allow them to very efficiently decrypt SSL/TLS connections.

If you like conspiracy theories, here's what NIST's John Kelsey was told when he asked how the NSA's points were generated:



In 2004-2005, several participants on the ANSI X9 tools committee pointed out the potential danger of this backdoor. One of them even went so far as to file a patent on using the idea to implement key escrow for SSL/TLS connections. (It doesn't get more passive aggressive than that.)

In response to the discovery of such an obvious flaw, the ANSI X9 committee immediately stopped recommending the NSA's points -- and relegated them to be simply an option, one to be used by the niche set of government users who required them.

I'm only kidding! Actually the committee did no such thing.

Instead, at the NSA's urging, the ANSI committee retained the original NSA points as the recommended parameters for the standard. It then added an optional procedure for generating alternative points. When NIST later adopted the generator in its SP800-90A standard, it mirrored the ANSI decision. But even worse, NIST didn't even bother to publish the alternative point generation algorithm. To actually implement it, you'd need to go buy the (expensive) non-public-domain ANSI standard and figure it out to implement it yourself:


This is, to paraphrase Douglas Adams, the standards committee equivalent of putting the details in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying 'Beware of the Leopard'.

To the best of our knowledge, nobody has ever used ANSI's alternative generation procedure in a single one of the many implementations of Dual EC DRBG in commercial software.  It's not even clear how you could have used that procedure in a FIPS-certified product, since the FIPS evaluation process (conducted by CMVP) still requires you to test against the NSA-generated points.
3. The trapdoor concerns were openly studied by ANSI X9F1, NIST, and by the public in 2007. 
This statement has the benefit of being literally true, while also being pretty damned misleading.

It is true that in 2007 -- after Dual EC had been standardized -- two Microsoft researchers, Dan Shumow and Neils Ferguson openly raised the alarm about Dual EC. The problem here is that the flaws in Dual EC were not first discovered in 2007. They were discovered much earlier in the standardization process and nobody ever heard about them.

As I noted above, the ANSI X9 committee detected the flaws in Dual EC as early as 2004, and in close consultation with NSA agreed to address them -- in a manner that was highly beneficial to the NSA. But perhaps that's understandable, given that the committee was anything but 'open'.

In fact, this is an important aspect of the controversy that even NIST has criticized. The standardization of these algorithms was conducted through ANSI. And the closed ANSI committee consisted of representatives from a few select companies, NIST and the NSA. No public notice was given of the potential vulnerabilities discovered in the RNG. Moreover, a patent application that might have shone light on the backdoor was mired in NSA pre-publication review for over two years.

This timeline issue might seem academic, but bear this in mind: we now know that RSA Security began using the Dual EC DRBG random number generator in BSAFE -- as the default, I remind you -- way back in 2004. That means for three years this generator was widely deployed, yet serious concerns were not communicated to the public.

To state that the trapdoor concerns were 'openly' studied in 2007 is absolutely true. It's just completely irrelevant.

In conclusion

I'm not a mathematician, but like anyone who works in a mathematical area, I find there are aspects of the discipline that I love. For me it's the precision of mathematical statements, and the fact that the truth or falsity of a statement can -- ideally -- be evaluated from the statement itself, without resorting to differing opinions or understandings of the context.

While Dr. Wertheimer's letter is hardly a mathematical work, it troubles me to see such confusing statements in a publication of the AMS. As a record of history, Dr. Wertheimer's letter leaves much to be desired, and could easily lead people to the wrong understanding.

Given the stakes, we deserve a more exact accounting of what happened with Dual EC DRBG. I hope someday we'll see that.

Monday, December 29, 2014

On the new Snowden documents

If you don't follow NSA news obsessively, you might have missed yesterday’s massive Snowden document dump from Der Spiegel. The documents provide a great deal of insight into how the NSA breaks our cryptographic systems. I was very lightly involved in looking at some of this material, so I'm glad to see that it's been published.

Unfortunately with so much material, it can be a bit hard to separate the signal from the noise. In this post I’m going to try to do that a little bit -- point out the bits that I think are interesting, the parts that are old news, and the things we should keep an eye on.

Background

Those who read this blog will know that I’ve been wondering for a long time how NSA works its way around our encryption. This isn't an academic matter, since it affects just about everyone who uses technology today.

What we've learned since 2013 is that NSA and its partners hoover up vast amounts of Internet traffic from fiber links around the world. Most of this data is plaintext and therefore easy to intercept. But at least some of it is encrypted -- typically protected by protocols such as SSL/TLS or IPsec.

Conventional wisdom pre-Snowden told us that the increasing use of encryption ought to have shut the agencies out of this data trove. Yet the documents we’ve seen so far indicate just the opposite. Instead, the NSA and GCHQ have somehow been harvesting massive amounts of SSL/TLS and IPSEC traffic, and appear to be making inroads into other technologies such as Tor as well.

How are they doing this? To repeat an old observation, there are basically three ways to crack an encrypted connection:
  1. Go after the mathematics. This is expensive and unlikely to work well against modern encryption algorithms (with a few exceptions). The leaked documents give very little evidence of such mathematical breaks — though a bit more on this below.
  2. Go after the implementation. The new documents confirm a previously-reported and aggressive effort to undermine commercial cryptographic implementations. They also provide context for how important this type of sabotage is to the NSA.
  3. Steal the keys. Of course, the easiest way to attack any cryptosystem is simply to steal the keys. Yesterday we received a bit more evidence that this is happening.
I can’t possibly spend time on everything that’s covered by these documents — you should go read them yourself — so below I’m just going to focus on the highlights.

Not so Good Will Hunting

First, the disappointing part. The NSA may be the largest employer of cryptologic mathematicians in the United States, but — if the new story is any indication — those guys really aren’t pulling their weight.

In fact, the only significant piece of cryptanalytic news in the entire stack comes is a 2008 undergraduate research project looking at AES. Sadly, this is about as unexciting as it sounds -- in fact it appears to be nothing more than a summer project by a visiting student. More interesting is the context it gives around the NSA’s efforts to break block ciphers such as AES, including the NSA's view of the difficulty of such cryptanalysis, and confirmation that NSA has some ‘in-house techniques’. 


Additionally, the documents include significant evidence that NSA has difficulty decrypting certain types of traffic, including Truecrypt, PGP/GPG, Tor and ZRTP from implementations such as RedPhone. Since these protocols share many of the same underlying cryptographic algorithms — RSA, Diffie-Hellman, ECDH and AES — some are presenting this as evidence that those primitives are cryptographically strong.

As with the AES note above, this ‘good news’ should also be taken with a grain of salt. With a small number of exceptions, it seems increasingly obvious that the Snowden documents are geared towards NSA’s analysts and operations staff. In fact, many of the systems actually seem aimed at protecting knowledge of NSA's cryptanalytic capabilities from NSA's own operational staff (and other Five Eyes partners). As an analyst, it's quite possible you'll never learn why a given intercept was successfully decrypted.

To put this a bit more succinctly: the lack of cryptanalytic red meat in these documents may not truly be representative of the NSA’s capabilities. It may simply be an artifact of Edward Snowden's clearances at the time he left the NSA.

Tor

One of the most surprising aspects of the Snowden documents — to those of us in the security research community anyway — is the NSA’s relative ineptitude when it comes to de-anonymizing users of the Tor anonymous communications network.

The reason for our surprise is twofold. First, Tor was never really designed to stand up against a global passive adversary — that is, an attacker who taps a huge number of communications links. If there’s one thing we’ve learned from the Snowden leaks, the NSA (plus GCHQ) is the very definition of the term. In theory at least, Tor should be a relatively easy target for the agency.

The real surprise, though, is that despite this huge signals intelligence advantage, the NSA has barely even tested their ability to de-anonymize users. In fact, this leak provides the first concrete evidence that NSA is experimenting with traffic confirmation attacks to find the source of Tor connections. Even more surprising, their techniques are relatively naive, even when compared to what’s going on in the ‘research’ community.


This doesn’t mean you should view Tor as secure against the NSA. It seems very obvious that the agency has identified Tor as a high-profile target, and we know they have the resources to make much more headway against the network. The real surprise is that they haven’t tried harder. Maybe they're trying now.

SSL/TLS and IPSEC

A few months ago I wrote a long post speculating about how the NSA breaks SSL/TLS. Because it’s increasingly clear that the NSA does break these protocols, and at relatively large scale.

The new documents don’t tell us much we didn’t already know, but they do confirm the basic outlines of the attack. The first portion requires endpoints around the world that are capable of performing the raw decryption of SSL/TLS sessions provided they know the session keys. The second is a separate infrastructure located on US soil that can recover those session keys when needed.

All of the real magic happens within the key recovery infrastructure. These documents provide the first evidence that a major attack strategy for NSA/GCHQ involves key databases containing the private keys for major sites. For the RSA key exchange ciphersuites of TLS, a single private key is sufficient to recover vast amounts of session traffic — in real time or even after the fact.


The interesting question is how the NSA gets those private keys. The easiest answer may be the least technical. A different Snowden leak shows gives some reason to believe that the NSA may have relationships with employees at specific named U.S. entities, and may even operate personnel “under cover”. This would certainly be one way to build a key database.



But even without the James Bond aspect of this, there’s every reason to believe that NSA has other means to exfiltrate RSA keys from operators. During the period in question, we know of at least one vulnerability (Heartbleed) that could have been used to extract private keys from software TLS implementations. There are still other, unreported vulnerabilities that could be used today.

Pretty much everything I said about SSL/TLS also applies to VPN protocols, with the additional detail that many VPNs use broken protocols and relatively poorly-secured pre-shared secrets that can in some cases be brute-forced. The NSA seems positively gleeful about this.

Open Source packages: Redphone, Truecrypt, PGP and OTR

The documents provide at least circumstantial evidence that some open source encryption technologies may thwart NSA surveillance. These include Truecrypt, ZRTP implementations such as RedPhone, PGP implementations, and Off the Record messaging. These packages have a few commonalities:
  1. They’re all open source, and relatively well studied by researchers.
  2. They’re not used at terribly wide scale (as compared to e.g., SSL or VPNs)
  3. They all work on an end-to-end basis and don’t involve service providers, software distributers, or other infrastructure that could be corrupted or attacked.
What’s at least as interesting is which packages are not included on this list. Major corporate encryption protocols such as iMessage make no appearance in these documents, despite the fact that they ostensibly provide end-to-end encryption. This may be nothing. But given all we know about NSA’s access to providers, this is definitely worrying.

A note on the ethics of the leak

Before I finish, it's worth addressing one major issue with this reporting: are we, as citizens, entitled to this information? Would we be safer keeping it all under wraps? And is this all 'activist nonsense'?

This story, more than some others, skates close to a line. I think it's worth talking about why this information is important.

To sum up a complicated issue, we live in a world where targeted surveillance is probably necessary and inevitable. The evidence so far indicates that NSA is very good at this kind of work, despite some notable failures in actually executing on the intelligence it produces.

Unfortunately, the documents released so far also show that a great deal of NSA/GCHQ surveillance is not targeted at all. Vast amounts of data are scooped up indiscriminately, in the hope that some of it will someday prove useful. Worse, the NSA has decided that bulk surveillance justifies its efforts to undermine many of the security technologies that protect our own information systems. The President's own hand-picked review council has strongly recommended this practice be stopped, but their advice has -- to all appearances -- been disregarded. These are matters that are worthy of debate, but this debate hasn't happened.

Unfortunate if we can't enact changes to fix these problems, technology is probably about all that's left. Over the next few years encryption technologies are going to be widely deployed, not only by individuals but also by corporations desperately trying to reassure overseas customers who doubt the integrity of US technology.

In that world, it's important to know what works and doesn't work. Insofar as this story tells us that, it makes us all better off. 

Thursday, November 27, 2014

Zero Knowledge Proofs: An illustrated primer

One of the best things about modern cryptography is the beautiful terminology. You could start any number of punk bands (or Tumblrs) named after cryptography terms like 'hard-core predicate', 'trapdoor function', ' or 'impossible differential cryptanalysis'. And of course, I haven't even mentioned the one term that surpasses all of these. That term is 'zero knowledge'.

In fact, the term 'zero knowledge' is so appealing that it leads to problems. People misuse it, assuming that zero knowledge must be synonymous with 'really, really secure'. Hence it gets tacked onto all kinds of stuff -- like encryption systems and anonymity networks -- that really have nothing to do with true zero knowledge protocols.

This all serves to underscore a point: zero-knowledge proofs are one of the most powerful tools cryptographers have ever devised. But unfortunately they're also relatively poorly understood. In this series of posts I'm going try to give a (mostly) non-mathematical description of what ZK proofs are, and what makes them so special. In this post and the next I'll talk about some of the ZK protocols we actually use.

Origins of Zero Knowledge

The notion of 'zero knowledge' was first proposed in the 1980s by MIT researchers Shafi Goldwasser, Silvio Micali and Charles Rackoff. These researchers were working on problems related to interactive proof systems, theoretical systems where a first party (called a 'Prover') exchanges messages with a second party ('Verifier') to convince the Verifier that some mathematical statement is true.*

Prior to Goldwasser et al., most work in this area focused the soundness of the proof system. That is, it considered the case where a malicious Prover attempts to 'trick' a Verifier into believing a false statement. What Goldwasser, Micali and Rackoff did was to turn this problem on its head. Instead of worrying only about the Prover, they asked: what happens if you don't trust the Verifier? 

The specific concern they raised was information leakage. Concretely, they asked, how much extra information is the Verifier going to learn during the course of this proof, beyond the mere fact that the statement is true?

It's important to note that this is not simply of theoretical interest. There are real, practical applications where this kind of thing matters.

Here's one: imagine that a real-world client wishes to log into a web server using a password. The standard 'real world' approach to this problem involves storing a hashed version of the password on the server. The login can thus be viewed as a sort of 'proof' that a given password hash is the output of a hash function on some password -- and more to the point, that the client actually knows the password.

Most real systems implement this 'proof' in the absolute worst possible way. The client simply transmits the original password to the server, which re-computes the password hash and compares it to the stored value. The problem here is obvious: at the conclusion of the protocol, the server has learned my cleartext password. Modern password hygiene therefore involves a good deal of praying that servers aren't compromised.

What Goldwasser, Micali and Rackoff proposed was a new hope for conducting such proofs. If fully realized, zero knowledge proofs would allow us to prove statements like the one above, while provably revealing no information beyond the single bit of information corresponding to 'this statement is true'.

A 'real world' example

So far this discussion has been pretty abstract. To make things a bit more concrete, let's go ahead and give a 'real' example of a (slightly insane) zero knowledge protocol.

For the purposes of this example, I'd like you to imagine that I'm a telecom magnate in the process of deploying a new cellular communications network. My network structure is represented by the graph below. Each vertex in this graph represents a cellular radio tower, and the connecting lines (edges) indicate locations where two cells overlap, meaning that their transmissions are likely to interfere with each other.

This overlap is problematic, since it means that signals from adjacent towers are likely to scramble reception. Fortunately my network design allows me to configure each tower to one of three different frequency bands to avoid such interference.

Thus the challenge in deploying my network is to assign frequency bands to the towers such that no two overlapping cells share the same frequencies. If we use colors to represent the frequency bands, we can quickly work out one solution to the problem:


Of course, many of you will notice that what I'm describing here is simply an instance of the famous theory problem called the graph three-coloring problem. You might also know that what makes this problem interesting is that, for some graphs, it can be quite hard to find a solution, or even to determine if a solution exists. In fact, graph three-coloring -- specifically, the decision problem of whether a given graph supports a solution with three colors -- is known to be in the complexity class NP-complete.

It goes without saying that the toy example above is easy to solve by hand. But what if it wasn't? For example, imagine that my cellular network was very large and complex, so much so that the computing power at my disposal was not sufficient to find a solution. In this instance, it would be desirable to outsource the problem to someone else who has plenty of computing power. For example, I might hire my friends at Google to solve it for me on spec.

But this leads to a problem.

Suppose that Google devotes a large percentage of their computing infrastructure to searching for a valid coloring for my graph. I'm certainly not going to pay them until I know that they really have such a coloring. At the same time, Google isn't going to give me a copy of their solution until I've paid up. We'll wind up at an impasse.

In real life there's probably a common-sense answer to this dilemma, one that involves lawyers and escrow accounts. But this is not a blog about real life, it's a blog about cryptography. And if you've ever read a crypto paper, you'll understand that the right way to solve this problem is to dream up an absolutely crazy technical solution.

A crazy technical solution (with hats!)

The engineers at Google consult with Silvio Micali at MIT, who in consultation with his colleagues Oded Goldreich and Avi Wigderson, comes up with the following clever protocol -- one so elegant that it doesn't even require any computers. All it requires is a large warehouse, lots of crayons, and plenty of paper. Oh yes, and a whole bunch of hats.**

Here's how it works.

First I will enter the warehouse, cover the floor with paper, and draw a blank representation of my cell network graph. Then I'll exit the warehouse. Google can now enter enter, shuffle a collection of three crayons to pick a random assignment of the three agreed-upon crayon colors (red/blue/purple, as in the example above), and color in the graph in with their solution. Note that it doesn't matter which specific crayons they use, only that the coloring is valid.

Before leaving the warehouse, Google covers up each of the vertices with a hat. When I come back in, this is what I'll see:


Obviously this approach protects Google's secret coloring perfectly. But it doesn't help me at all. For all I know, Google might have filled in the graph with a random, invalid solution. They might not even have colored the graph at all.

To address my valid concerns, Google now gives me an opportunity to 'challenge' their solution to the graph coloring. I'm allowed to pick -- at random -- a single 'edge' of this graph (that is, one line between two adjacent hats). Google will then remove the two corresponding hats, revealing a small portion of their solution:

Notice that there are two outcomes to my experiment:
  1. If the two revealed vertices are the same color (or aren't colored in at all!) then I definitely know that Google is lying to me. Clearly I'm not going to pay Google a cent.
  2. If the two revealed vertices are different colors, then Google might not be lying to me.
Hopefully the first proposition is obvious. The second one requires a bit more consideration. The problem is that even after our experiment, Google could still be lying to me -- after all, I only looked under two of the hats. If there are E different edges in the graph, then Google could fill in an invalid solution and still get away with it most of the time. Specifically, after one test they could succeed in cheating me with probability up to (E-1)/E (which for a 1,000 edge graph works out to 99.9% of the time).

Fortunately Google has an answer to this. We'll just run the protocol again!

We put down fresh paper with a new, blank copy of the graph. Google now picks a new (random) shuffle of the three crayons. Next they fill in the graph with a valid solution, but using the new random ordering of the three colors.

The hats go back on. I come back in and repeat the challenge process, picking a new random edge. Once again the logic above applies. Only this time if all goes well, I should now be slightly more confident that Google is telling me the truth. That's because in order to cheat me, Google would have had to get lucky twice in a row. That can happen -- but it happens with relatively lower probability. The chance that Google fools me twice in a row is now (E-1)/E * (E-1)/(or about 99.8% probability for our 1,000 edge example above).

Fortunately we don't have to stop at two challenges. In fact, we can keep trying this over and over again until I'm confident that Google is probably telling me the truth.

But don't take my word for it. Thanks to some neat Javascript, you can go try it yourself.

Note that I'll never be perfectly certain that Google is being honest -- there's always going to be a tiny probability that they're cheating me. But after a large number of iterations (E^2, as it happens) I can eventually raise my confidence to the point where Google can only cheat me with negligible probability -- low enough that for all practical purposes it's not worth worrying about. And then I'll be able to safely hand Google my money.

What you need to believe is that Google is also protected. Even if I try to learn something about their solution by keeping notes between protocol runs, it shouldn't matter. I'm foiled by Google's decision to randomize their color choices between each iteration. The limited information I obtain does me no good, and there's no way for me to link the data I learn between interactions.

What makes it 'zero knowledge'?

I've claimed to you that this protocol leaks no information about Google's solution. But don't let me get away with this! The first rule of modern cryptography is never to trust people who claim such things without proof.

Goldwasser, Micali and Rackoff proposed three following properties that every zero-knowledge protocol must satisfy. Stated informally, they are:
  1. Completeness. If Google is telling the truth, then they will eventually convince me (at least with high probability).
  2. Soundness. Google can only convince me if they're actually telling the truth. 
  3. Zero-knowledgeness. (Yes it's really called this.) I don't learn anything else about Google's solution.
We've already discussed the argument for completeness. The protocol will eventually convince me (with a negligible error probability), provided we run it enough times. Soundness is also pretty easy to show here. If Google ever tries to cheat me, I will detect their treachery with overwhelming probability.

The hard part here is the 'zero knowledgeness' property. To do this, we need to conduct a very strange thought experiment.

A thought experiment (with time machines)

First, let's start with a crazy hypothetical. Imagine that Google's engineers aren't quite as capable as people make them out to be. They work on this problem for weeks and weeks, but they never manage to come up with a solution. With twelve hours to go until showtime, the Googlers get desperate. They decide to trick me into thinking they have a coloring for the graph, even though they don't.

Their idea is to sneak into the GoogleX workshop and borrow Google's prototype time machine. Initially the plan is to travel backwards a few years and use the extra working time to take another crack at solving the problem. Unfortunately it turns out that, like most Google prototypes, the time machine has some limitations. Most critically: it's only capable of going backwards in time four and a half minutes.

So using the time machine to manufacture more working time is out. But still, it turns out that even this very limited technology can still be used to trick me.
I don't really know what's going on here
but it seemed apropos.

The plan is diabolically simple. Since Google doesn't actually know a valid coloring for the graph, they'll simply color the paper with a bunch of random colors, then put the hats on. If by sheer luck, I challenge them on a pair of vertices that happen to be different colors, everyone will heave a sigh of relief and we'll continue with the protocol. So far so good.

Inevitably, though, I'm going to pull off a pair of hats and discover two vertices of the same color. In the normal protocol, Google would now be totally busted. And this is where the time machine comes in. Whenever Google finds themselves in this awkward situation, they simply fix it. That is, a designated Googler pulls a switch, 'rewinds' time about four minutes, and the Google team recolors the graph with a completely new random solution. Now they let time roll forward and try again.

In effect, the time machine allows Google to 'repair' any accidents that happen during their bogus protocol execution, which makes the experience look totally legitimate to me. Since bad challenge results will occur only 1/3 of the time, the expected runtime of the protocol (from Google's perspective) is only moderately greater than the time it takes to run the honest protocol. From my perspective I don't even know that the extra time machine trips are happening.

This last point is the most important. In fact, from my perspective, being unaware that the time machine is in the picture, the resulting interaction is exactly the same as the real thing. It's statistically identical. And yet it's worth pointing out again that in the time machine version, Google has absolutely no information about how to color the graph.

What the hell is the point of this?

What we've just shown is an example of a simulation. Note that in a world where time runs only forward and nobody can trick me with a time machine, the hat-based protocol is correct and sound, meaning that after E^2 rounds I should be convinced (with all but negligible probability) that the graph really is colorable and that Google is putting valid inputs into the protocol.

What we've just shown is that if time doesn't run only forward -- specifically, if Google can 'rewind' my view of time -- then they can fake a valid protocol run even if they have no information at all about the actual graph coloring.

From my perspective, what's the difference between the two protocol transcripts? When we consider the statistical distribution of the two, there's no difference at all. Both convey exactly the same amount of useful information.

Believe it or not, this proves something very important.

Specifically, assume that I (the Verifier) have some strategy that 'extracts' useful information about Google's coloring after observing an execution of the honest protocol. Then my strategy should work equally well in the case where I'm being fooled with a time machine. The protocol runs are, from my perspective, statistically identical. I physically cannot tell the difference.

Thus if the amount of information I can extract is identical in the 'real experiment' and the 'time machine experiment', yet the amount of information Google puts into the 'time machine' experiment is exactly zero -- then this implies that even in the real world the protocol must not leak any useful information.

Thus it remains only to show that computer scientists have time machines. We do! (It's a well-kept secret.)

Getting rid of the hats (and time machines)

Of course we don't actually want to run a protocol with hats. And even Google (probably?) doesn't have a literal time machine.

To tie things together, we first need to bring our protocol into the digital world. This requires that we construct the digital equivalent of a 'hat': something that both hides a digital value, while simultaneously 'binding' (or 'committing') the maker to it, so she can't change her mind after the fact.

Fortunately we have a perfect tool for this application. It's called a digital commitment scheme. A commitment scheme allows one party to 'commit' to a given message while keeping it secret, and then later 'open' the resulting commitment to reveal what's inside. They can be built out of various ingredients, including (strong) cryptographic hash functions.******

Given a commitment scheme, we now have all the ingredients we need to run the zero knowledge protocol electronically. The Prover first encodes its vertex colorings as a set of digital messages (for example, the numbers 0, 1, 2), then generates digital commitments to each one. These commitments get sent over to the Verifier. When the Verifier challenges on an edge, the Prover simply reveals the opening values for the commitments corresponding to the two vertices.

So we've managed to eliminate the hats. But how do we prove that this protocol is zero knowledge?

Fortunately now that we're in the digital world, we no longer need a real time machine to prove things about this protocol. A key trick is to specify in our setting that the protocol is not going to be run between two people, but rather between two different computer programs (or, to be more formal, probabilistic Turing machines.)

What we can now prove is the following theorem: if you could ever come up with a computer program (for the Verifier) that extracts useful information after participating in a run of the protocol, then it would be possible to use a 'time machine' on that program in order to make it extract the same amount of useful information from a 'fake' run of the protocol where the Prover doesn't put in any information to begin with.

And since we're now talking about computer programs, it should be obvious that rewinding time isn't such an extraordinary feat at all. In fact, we rewind computer programs all the time. For example, consider using virtual machine software with a snapshot capability.

Example of rewinding through VM snapshots. An initial VM is played forward, rewound to an
 initial snapshot, then execution is forked to a new path. 
Even if you don't have fancy virtual machine software, any computer program can be 'rewound' to an earlier state, simply by starting the program over again from the beginning and feeding it exactly the same inputs. Provided that the inputs -- including all random numbers -- are fixed, the program will always follow the same execution path. Thus you can rewind a program just by running it from the start and 'forking' its execution when it reaches some desired point.

Ultimately what we get is the following theorem. If there exists any Verifier computer program that successfully extracts information by interactively running this protocol with some Prover, then we can simply use the rewinding trick on that program to commit to a random solution, then 'trick' the Verifier by rewinding its execution whenever we can't answer its challenge correctly. The same logic holds as we gave above: if such a Verifier succeeds in extracting information after running the real protocol, then it should be able to extract the same amount of information from the simulated, rewinding-based protocol. But since there's no information going into the simulated protocol, there's no information to extract. Thus the information the Verifier can extract must always be zero.

Ok, so what does this all mean?

So let's recap. We know that the protocol is complete and sound, based on our analysis above. The soundness argument holds in any situation where we know that nobody is fiddling with time -- that is, the Verifier is running normally and nobody is rewinding its execution.

At the same time, the protocol is also zero knowledge. To prove this, we showed that any Verifier program that succeeds in extracting information must also be able to extract information from a protocol run where rewinding is used and no information is available in the first place. Which leads to an obvious contradiction, and tells us that the protocol can't leak information in either situation.

There's an important benefit to all this. Since it's trivial for anyone to 'fake' a protocol transcript, even after Google proves to me that they have a solution, I can't re-play a recording of the protocol transcript to prove anything to anyone else (say, a judge). That's because the judge would have no guarantee that the video was recorded honestly, and that I didn't simply edit in the same way Google might have done using the time machine. This means that protocol transcripts themselves contain no information. The protocol is only meaningful if I myself participated, and I can be sure that it happened in real time.

Proofs for all of NP!

If you've made it this far, I'm pretty sure you're ready for the big news. Which is that 3-coloring cellphone networks isn't all that interesting of a problem -- at least, not in and of itself.

The really interesting thing about the 3-coloring problem is that it's in the class NP-complete. To put this informally, the wonderful thing about such problems is that any other problem in the class NP can be translated into an instance of that problem.

In a single stroke, this result -- due to Goldreich, Micali and Wigderson -- proves that 'efficient' ZK proofs exists for a vast class of useful statements, many of which are way more interesting than assigning frequencies to cellular networks. You simply find a statement (in NP) that you wish to prove, such as our hash function example from above, then translate it into an instance of the 3-coloring problem. At that point you simply run the digital version of the hat protocol.

In summary, and next time

Of course, actually running this protocol for interesting statements would be an insanely silly thing for anyone to do, since the cost of doing so would include the total size of the original statement and witness, plus the reduction cost to convert it into a graph, plus the E^2 protocol rounds you'd have to conduct in order to convince someone that the proof is valid. Theoretically this is 'efficient', since the total cost of the proof would be polynomial in the input size, but in practice it would be anything but.

So what we've shown so far is that such proofs are possible. It remains for us to actually find proofs that are practical enough for real-world use.

In the next post I'll talk about some of those -- specifically, the efficient proofs that we use for various useful statements. I'll give some examples (from real applications) where these things have been used. Also at reader request: I'll also talk about why I dislike SRP so much.

Notes:

* Formally, the goal of an interactive proof is to convince the Verifier that a particular string belongs to some language. Typically the Prover is very powerful (unbounded), but the Verifier is limited in computation.

** This example is based on the original solution of Goldwasser, Micali and Rackoff, and the teaching example using hats is based on an explanation by Silvio Micali. I take credit only for the silly mistakes.

****** A simple example of a commitment can be built using a hash function. To commit to the value "x" simply generate some (suitably long) string of random numbers, which we'll call 'salt', and output the commitment C = Hash(salt || x). To open the commitment, you simply reveal 'x' and 'salt'. Anyone can check that the original commitment is valid by recomputing the hash. This is secure under some (moderately strong) assumptions about the function itself.