Why the FBI can’t get your browsing history from Apple iCloud (and other scary stories)

It’s not every day that I wake up thinking about how people back up their web browsers. Mostly this is because I don’t feel the need to back up any aspect of my browsing. Some people lovingly maintain huge libraries of bookmarks and use fancy online services to organize them. I pay for one of those because I aspire to be that kind of person, but I’ve never been organized enough to use it.

In fact, the only thing I want from my browser is for my history to please go away, preferably as quickly as possible. My browser is a part of my brain, and backing my thoughts up to a cloud provider is the most invasive thing I can imagine. Plus, I’m constantly imagining how I’ll explain specific searches to the FBI.

All of these thoughts are apropos of a Twitter thread I saw last night from Justin Schuh, the Engineering Director on Chrome Security & Privacy at Google, which explains why “browser sync” features (across several platforms) can’t provide end-to-end encryption by default.

This thread sent me down a rabbit hole that ended in a series of highly-scientific Twitter polls and frantic scouring of various providers’ documentation. Because while on the one hand Justin’s statement is mostly true, it’s also a bit wrong. Specifically, I learned that Apple really seems to have solved this problem. More interestingly, the specific way that Apple has addressed this problem highlights some strange assumptions that make this whole area unnecessarily messy.

This munging of expectations also helps to explain why “browser sync” features and the related security tradeoffs seem so alien and horrible to me, while other folks think these are an absolute necessity for survival.

Let’s start with the basics.

What is cloud-based browser “sync”, and how secure is it?

Most web browsers (and operating systems with a built-in browser) incorporate some means of “synchronizing” browsing history and bookmarks. By starting with this terminology we’ve already put ourselves on the back foot, since “synchronize” munges together three slightly different concepts:

  1. Synchronizing content across devices. Where, for example, you have a phone, a laptop and a tablet all active and in occasional use, and want your data to propagate from one to the others.
  2. Backing up your content. Wherein you lose all your device(s) and need to recover this data onto a fresh clean device.
  3. Logging into random computers. If you switch computers regularly (for example, back when we worked in offices) then you might want to be able to quickly download your data from the cloud.

(Note that the third case is kind of weird. It might be a subcase of #1 if you have another device that’s active and can send you the data. It might be a subcase of #2. I hate this one and am sending it to live on a farm upstate.)

You might ask why I call these concepts “very different” when they all seem quite similar. The answer is that I’m thinking about a very specific question: namely, how hard is it to end-to-end encrypt this data so that the cloud provider can’t read it? The answer is different between (at least) the first two cases.

If what we really want to do is synchronize your data across many active devices, then the crypto problem is relatively easy. The devices generate public keys and register them with your cloud provider, and then each one simply encrypts relevant content to the others. Apple has (I believe) begun to implement this across their device ecosystem.
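
To see why the multi-device case is relatively easy, here’s a minimal sketch of the idea, assuming the PyNaCl library and emphatically not Apple’s (or anyone’s) actual sync protocol: each device keeps a private key, publishes the public half, and new history entries get sealed to every peer device, so the cloud only ever stores ciphertext.

```python
# Minimal sketch of "encrypt each entry to every registered device" sync.
# Assumes the PyNaCl package (pip install pynacl); an illustration of the
# concept, not any real vendor's protocol.

from nacl.public import PrivateKey, SealedBox

class Device:
    def __init__(self, name):
        self.name = name
        self._sk = PrivateKey.generate()        # never leaves the device
        self.public_key = self._sk.public_key   # registered with the cloud

    def decrypt(self, ciphertext):
        return SealedBox(self._sk).decrypt(ciphertext)

def sync_entry(entry: bytes, peers):
    """Cloud stores one ciphertext per peer device; it can't read any of them."""
    return {d.name: SealedBox(d.public_key).encrypt(entry) for d in peers}

phone, laptop, tablet = Device("phone"), Device("laptop"), Device("tablet")

# The phone pushes a new history entry to its peers via the cloud.
entry = b"https://example.com/ visited at 10:32"
blobs = sync_entry(entry, [laptop, tablet])

assert laptop.decrypt(blobs["laptop"]) == entry
assert tablet.decrypt(blobs["tablet"]) == entry
```

The genuinely hard part in practice is authenticating the device list itself, so that the provider (or an attacker) can’t quietly register an extra “device” and receive copies of everything.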

If what we want is cloud backup, however, then the problem is much more challenging. Since the base assumption is that the device(s) might get lost, we can’t store decryption keys there. We could encrypt the data under the user’s device passcode or something, but most users choose terrible passcodes that are trivially subject to dictionary attacks. Services like Apple iCloud and Google (Android) have begun to deploy trusted hardware in their data centers to mitigate this: these “Hardware Security Modules” (HSMs) store encryption keys for each user, and only allow a limited number of password guesses before they wipe the keys forever. This keeps providers and hackers out of your stuff. Yay!
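
Here’s a toy simulation of that HSM policy, assuming the Python cryptography package; the class and its limits are invented for illustration and bear no resemblance to any vendor’s real design. The point is just the shape of the guarantee: the backup key is wrapped under a passcode-derived key, and too many wrong guesses destroy it forever.

```python
# Toy simulation of the HSM escrow policy described above: the backup key is
# wrapped under a passcode-derived key, and a guess counter destroys it after
# too many failures. Invented for illustration; assumes the `cryptography` package.

import os
from cryptography.hazmat.primitives.kdf.scrypt import Scrypt
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def _derive(passcode: bytes, salt: bytes) -> bytes:
    return Scrypt(salt=salt, length=32, n=2**15, r=8, p=1).derive(passcode)

class ToyEscrowHSM:
    MAX_GUESSES = 10

    def __init__(self, passcode: bytes, backup_key: bytes):
        self._salt = os.urandom(16)
        self._nonce = os.urandom(12)
        self._wrapped = AESGCM(_derive(passcode, self._salt)).encrypt(
            self._nonce, backup_key, None)
        self._guesses_left = self.MAX_GUESSES

    def recover(self, passcode: bytes) -> bytes:
        if self._wrapped is None:
            raise RuntimeError("escrowed key has been destroyed")
        self._guesses_left -= 1
        try:
            key = AESGCM(_derive(passcode, self._salt)).decrypt(
                self._nonce, self._wrapped, None)
        except Exception:
            if self._guesses_left <= 0:
                self._wrapped = None      # too many failures: wipe forever
            raise ValueError("wrong passcode")
        self._guesses_left = self.MAX_GUESSES   # correct guess: reset the counter
        return key
```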

Except: not yay! Because, as Justin points out (and here I’m paraphrasing in my own words) users are the absolute worst. Not only do they choose lousy passcodes, but they constantly forget them. And when they forget their passcode and can’t get their backups, do they blame themselves? Of course not! They blame Justin. Or rather, they complain loudly to their cloud backup providers.

While this might sound like an extreme characterization, remember: when you have a billion users, the extreme ones will show up quite a bit.

The consequence of this, argues Justin, is that most cloud backup services don’t use default end-to-end encryption for browser synchronization, and hence your bookmarks and in this case your browsing history will be stored at your provider in plaintext. Justin’s point is that this decision flows from the typical user’s expectations and is not something providers have much discretion about.

And if that means your browsing history happens to get data-mined, well: the spice must flow.

Except none of this is quite true, thanks to Apple!

The interesting thing about this explanation is that it’s not quite true. I was inclined to believe it, until I went spelunking through the Apple iCloud security docs and found that Apple does things slightly differently.

(Note that I don’t mean to blame Justin for not knowing this. The problem here is that Apple absolutely sucks at communicating their security features to an audience that isn’t obsessed with reading their technical documentation. My students and I happen to be obsessive, and sometimes it pays dividends.)

What I learned from my exploration (and here I pray the documentation is accurate) is that Apple actually does seem to provide end-to-end encryption for browser data. Or more specifically: they provide end-to-end encryption for browser history data starting as of iOS 13.

More concretely, Apple claims that this data is protected “with a passcode”, and that “nobody else but you can read this data.” Presumably this means Apple is using their iCloud Keychain HSMs to store the necessary keys, in a way that Apple itself can’t access.

What’s interesting about the Apple decision is that it appears to explicitly separate browsing history and bookmarks, rather than lumping them into a single take-it-or-leave-it package. Apple doesn’t claim to provide any end-to-end encryption guarantees whatsoever for bookmarks: presumably someone who resets your iCloud account password can get those. But your browsing history is protected in a way that even Apple won’t be able to access, in case the FBI show up with a subpoena.

That seems like a big deal and I’m surprised that it’s gotten so little attention.

Why should browser history be lumped together with bookmarks?

This question gets at the heart of why I think browser synchronization is such an alien concept. From my perspective, browsing history is an incredibly sensitive and personal thing that I don’t want anywhere. Bookmarks, if I actually used them, would be the sort of thing I’d want to preserve.

I can see the case for keeping history on my local devices. It makes autocomplete faster, and it’s nice to find that page I browsed yesterday. I can see the case for (securely) synchronizing history across my active devices. But backing it up to the cloud in case my devices all get stolen? Come on. This is like the difference between backing up my photo library, and attaching a GoPro to my head while I’m using the bathroom.

(And Google’s “sync” service only stores 90 days of history, so it isn’t even a long-term backup.)

One cynical answer to this question is: these two very different forms of data are lumped together because one of them — browser history — is extremely valuable for advertising companies. The other one is valuable to consumers. So lumping them together gets consumers to hand over the sweet, sweet data in exchange for something they want. This might sound critical, but on the other hand, we’re just describing the financial incentive that we know drives most of today’s Internet.

A less cynical answer is that consumers really want to preserve their browsing history. When I asked on Twitter, a bunch of tech folks noted that they use their browsing history as an ad-hoc bookmarking system. This all seemed to make some sense, and so maybe there’s just something I don’t get about browser history.

However, the important thing to keep in mind here is that just because you do this doesn’t mean it should drive a few billion people’s security posture. The implication of prioritizing the availability of browser history backups (as a default) is that vast numbers of people will essentially have their entire history uploaded to the cloud, where it can be accessed by hackers, police and surveillance agencies.

Apple seems to have made a different calculation: not that history isn’t valuable, but that it isn’t a good idea to hold the detailed browser history of a billion human beings in a place where any two-bit police agency or hacker can access it. I have a very hard time faulting them in that.

And if that means a few users get upset, that seems like a good tradeoff to me.

The future of Ransomware

This is kind of a funny post for me to write, since it involves speculating about ransomware, a very destructive type of software — and possibly offering some (very impractical) suggestions on how it might be improved in the future. It goes without saying that there are some real downsides to this kind of speculation. Nonetheless, I’m going ahead on the theory that it’s usually better to talk and think about the bad things that might happen to you — before you meet them on the street and they steal your lunch money.

On the other hand, just as there’s a part of every karate master that secretly wants to go out and beat up a bar full of people, there’s a part of every security professional that looks at our current generation of attackers and thinks: why can’t you people just be a bit more imaginative?! And wonders whether, if our attackers were just a little more creative, people would actually pay attention to securing their system before the bad stuff happens.

And ransomware is definitely a bad thing. According to the FBI it sucks up $1 billion/year in payments alone, and some unimaginably larger amount in remediation costs. This despite the fact that many ransomware packages truly suck, and individual ransomware developers get routinely pwned due to making stupid cryptographic errors. If this strategy is working so well today, the question we should be asking ourselves is: how much worse could it get?

So that’s what I’m going to muse about now. A few (cryptographic) ways that it might.

Some of these ideas are the result of collaboration with my students Ian Miers, Gabe Kaptchuk and Christina Garman. They range from the obvious to the foolish to the whimsical, and I would be utterly amazed if any of them really do happen. So please don’t take this post too seriously. It’s all just fun.

Quick background: ransomware today

The amazing thing about ransomware is that something so simple could turn out to be such a problem. Modern ransomware consists of malware that infects your computer and then goes about doing something nasty: it encrypts every file it can get its hands on. This typically includes local files as well as network shares that can be reached from the infected machine.

Once your data has been encrypted, your options aren’t great. If you’re lucky enough to have a recent backup, you can purge the infected machine and restore. Otherwise you’re faced with a devil’s bargain: learn to live without that data, or pay the bastards.

If you choose to pay up, there are all sorts of different procedures. However most break down into the following three steps:

  1. When the ransomware encrypts your files, it generates a secret key file and stores it on your computer.
  2. You upload that file (or data string) to your attackers along with a Bitcoin payment.
  3. They process the result with their secrets and send you a decryption key.

If you’re lucky, and your attackers are still paying attention (or haven’t screwed up the crypto beyond recognition) you get back a decryption key or a tool you can use to undo the encryption on your files. The whole thing is very businesslike. Indeed, recent platforms will allegedly offer you a discount if you “recommend” it to your friends (that is, infect them) — just like Lyft!

The problem of course, is that nothing in this process guarantees that your attacker will give you that decryption key. They might be scammers. They might not have the secret anymore. They might get tracked down and arrested. Or they might get nervous and bail, taking your precious data and your payment with them. This uncertainty makes ransomware payments inherently risky — and worse, it’s the victims who mostly suffer for it.

Perhaps it would be nice if we could make that work better.

Verifiable key delivery using smart contracts

Most modern ransomware employs a cryptocurrency like Bitcoin to enable the payments that make the ransom possible. This is perhaps not the strongest argument for systems like Bitcoin — and yet it seems unlikely that Bitcoin is going away anytime soon. If we can’t solve the problem of Bitcoin, maybe it’s possible to use Bitcoin to make “more reliable” ransomware.

Recall that following a ransomware infection, there’s a possibility that you’ll pay the ransom and get nothing in return. Fundamentally there’s very little you can do about this. A conscientious ransomware developer might in theory offer a “proof of life” — that is, offer to decrypt a few files at random in order to prove their bona fides. But even if they bother with all the risk and interaction of doing this, there’s still no guarantee that they’ll bother to deliver the hostage alive.

An obvious approach to this problem is to make ransomware payments conditional. Rather than sending off your payment and hoping for the best, victims could use cryptocurrency features to ensure that ransomware operators can’t get paid unless they deliver a key. Specifically, a ransomware developer could easily perform payment via a smart contract script (in a system like Ethereum) that guarantees the following property:

This payment will be delivered to the ransomware operator if and only if the ransomware author unlocks it — by posting the ransomware decryption key to the same blockchain.

The basic primitive needed for this is called a Zero Knowledge Contingent Payment. This idea was proposed by Greg Maxwell and demonstrated by Sean Bowe of the ZCash team.**** The rough idea is to set the decryption key to be some pre-image k for some public hash value K that the ransomware generates and leaves on your system. It’s relatively easy to imagine a smart contract that allows payment if and only if the payee can post the input k such that K=SHA256(k). This could easily be written in Ethereum, and almost certainly has an analog for Bitcoin script.
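
For illustration, here’s a plain-Python sketch of the release condition such a hashlock contract enforces; it isn’t Ethereum or Bitcoin script, and the class name is invented for this post.

```python
# Plain-Python sketch of the release condition a hashlock contract enforces:
# escrowed funds move to the payee if and only if they post a preimage k
# with SHA256(k) == K. Not Ethereum or Bitcoin script; just the logic.

import hashlib

class HashlockEscrow:
    def __init__(self, K: bytes, amount: int, payee: str):
        self.K = K                  # hash published by the key holder up front
        self.amount = amount        # funds locked by the payer (the victim)
        self.payee = payee
        self.revealed_k = None      # once posted, k is public "on-chain"
        self.paid = False

    def claim(self, k: bytes) -> bool:
        """Payee posts k; payment clears iff SHA256(k) == K."""
        if self.paid or hashlib.sha256(k).digest() != self.K:
            return False
        self.revealed_k = k         # the payer (and everyone else) learns k
        self.paid = True
        return True

# Note: this only enforces "k hashes to K", not "k decrypts your files".
```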

The challenge here, of course, is to prove that k is actually a decryption key for your files, and that the files contain valid data. There are a handful of different ways to tackle this problem. One is to use complex zero-knowledge proof techniques (like zkSNARKs or ZKBoo) to make the necessary proofs non-interactively. But this is painful, and frankly above the level of most ransomware developers — who are still struggling with basic RSA.

An alternative approach is to use several such K challenges in combination with the “proof of life” idea. The ransomware operator would prove her bona fides by decrypting a small, randomly selected subset of files before the victim issues payment. The operator could still “fake” the encryption — or lose the decryption key — but she would be exposed with reasonable probability before money changed hands.

“Autonomous” ransomware

Of course, the problem with “verifiable” ransomware is: what ransomware developer would bother with this nonsense?

While the ability to verify decryption might conceivably improve customer satisfaction, it’s not clear that it would really offer that much value to ransomware developers. At the same time, it would definitely add a lot of nasty complexity to their software.

Instead of pursuing ideas that offer developers no obvious upside, ransomware designers presumably will pursue ideas that offer them some real benefits. And that brings us to an idea whose time has (hopefully) not quite come yet. The idea itself is simple:

Make ransomware that doesn’t require operators.

Recall that in the final step of the ransom process, the ransomware operator must deliver a decryption key to the victim. This step is the most fraught for operators, since it requires them to manage keys and respond to queries on the Internet. Wouldn’t it be better for operators if they could eliminate this step altogether?

Of course, accomplishing this seems to require a trustworthy third party — or better, a form of ransomware that can decrypt itself when the victim makes a Bitcoin payment. This last idea seems fundamentally contradictory. The decryption keys would have to live on the victim’s device, and the victim owns that device. If you tried that, then the victim could presumably just hack the secrets out and decrypt the ransomware without paying.

But what if the victim couldn’t hack their own machine?

This isn’t a crazy idea. In fact, it’s exactly the premise that’s envisioned by a new class of trusted execution environments, including Intel’s SGX and ARM TrustZone. These systems — which are built into the latest generation of many processors — allow users to instantiate “secure enclaves”: software environments whose code and data are isolated from everything else on the machine, including (in SGX’s case) the operating system itself. That means the secrets an enclave holds are very hard to pry out.

Hypothetically, after infecting your computer a piece of ransomware could generate and store its decryption key inside of a secure enclave. This enclave could be programmed to release the key only on presentation of a valid Bitcoin payment to a designated address.

The beauty of this approach is that no third party even needs to verify the payment. Bitcoin payments themselves consist of a publicly-verifiable transaction embedded in a series of “blocks”, each containing an expensive computational “proof of work“. In principle, after paying the ransom the victim could present the SGX enclave with a fragment of the blockchain, and the enclave could verify it all by itself — freeing the ransomware of the need to interact with third parties. If the blockchain fragment exhibited sufficient hashpower along with a valid payment to a specific address, the enclave would release the decryption key.*
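
To make that concrete, here is a rough sketch of the SPV-style check an enclave could run over a chain of Bitcoin block headers. It’s only a sketch with invented function names, and it deliberately ignores Merkle inclusion of the payment transaction and difficulty-adjustment rules (see footnote * below).

```python
# Rough sketch of an SPV-style check over a chain of 80-byte Bitcoin block
# headers: verify each header meets its own difficulty target, verify the
# headers link together, and add up the implied work. Simplified: it ignores
# Merkle inclusion of the payment transaction and difficulty-adjustment rules.

import hashlib

def dsha256(b: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def target_from_bits(bits: int) -> int:
    # Bitcoin's compact difficulty encoding: high byte is the exponent.
    exponent, mantissa = bits >> 24, bits & 0xFFFFFF
    return mantissa << (8 * (exponent - 3))

def enough_work(headers, min_work: int) -> bool:
    total, prev_hash = 0, None
    for h in headers:
        if len(h) != 80:
            return False
        if prev_hash is not None and h[4:36] != prev_hash:
            return False                        # headers don't link up
        target = target_from_bits(int.from_bytes(h[72:76], "little"))
        block_hash = dsha256(h)
        if int.from_bytes(block_hash, "little") > target:
            return False                        # proof of work doesn't check out
        total += (1 << 256) // (target + 1)     # work implied by this header
        prev_hash = block_hash
    return total >= min_work    # forging this much work should cost more than the ransom
```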

The good news is that Intel and ARM have devoted serious resources to preventing this sort of unauthorized access. SGX developers must obtain a code signing certificate from Intel before they can make production-ready SGX enclaves, and it seems unlikely that Intel would partner up with a ransomware operation. Thus a ransomware operator would likely have to (1) steal a signing key from a legitimate Intel-certified developer, or (2) find an exploitable vulnerability in another developer’s enclave.**, ***

This all seems sort of unlikely, and that appears to block most of the threat — for now. Assuming companies like Intel and Qualcomm don’t screw things up, and have a good plan for revoking enclaves (uh oh), this is not very likely to be a big threat.

Of course, in the long run developers might not need Intel SGX at all. An even more speculative concern is that developments in the field of cryptographic obfuscation will provide a software-only alternative means to implement this type of ransomware. This would eliminate the need for a dependency like SGX altogether, allowing the ransomware to do its work with no hardware at all.

At present such techniques are far north of practical, keep getting broken, and might not work at all. But cryptographic researchers keep trying! I guess the lesson is that it’s not all roses if they succeed.

Ransomware Skynet

Since I’m already this far into what reads like a Peyote-fueled rant, let’s see if we can stretch the bounds of credibility just a little bit farther. If ransomware can become partially autonomous — i.e., do part of its job without the need for human masters — what would it mean for it to become fully autonomous? In other words, what if we got rid of the rest of the human equation?

I come from the future to encrypt C:\Documents

Ransomware with the ability to enforce payments would provide a potent funding source for another type of autonomous agent: a Decentralized Autonomous Organization (DAO). These systems are “corporations” that consist entirely of code that runs on a consensus network like Ethereum. They’re driven by rules, and are capable of both receiving and transmitting funds without (direct) instruction from human beings.

At least in theory it might be possible to develop a DAO that’s funded entirely by ransomware payments — and in turn mindlessly contracts real human beings to develop better ransomware, deploy it against human targets, and… rinse repeat. It’s unlikely that such a system would be stable in the long run — humans are clever and good at destroying dumb things — but it might get a good run. Who knows? Maybe this is how the Rampant Orphan Botnet Ecologies get started.

(I hope it goes without saying that I’m mostly not being serious about this part. Even though it would be totally awesome in a horrible sort of way.)

In conclusion

This hasn’t been a terribly serious post, although it was fun to write. The truth is that as a defender, watching your attackers fiddle around is pretty much the most depressing thing ever. Sometimes you have to break the monotony a bit.

But insofar as there is a serious core to this post, it’s that ransomware currently is using only a tiny fraction of the capabilities available to it. Secure execution technologies in particular represent a giant footgun just waiting to go off if manufacturers get things only a little bit wrong.

Hopefully they won’t, no matter how entertaining it might be.

Notes:

* This technique is similar to SPV verification. Of course, it would also be possible for a victim to “forge” a blockchain fragment without paying the ransom. However, the cost of this could easily be tuned to significantly exceed the cost of paying the ransom. There are also many issues I’m glossing over here like difficulty adjustments and the possibility of amortizing the forgery over many different victims. But thinking about that stuff is a drag, and this is all for fun, right?

** Of course, if malware can exploit such a vulnerability in another developer’s enclave to achieve code execution for “ransomware”, then the victim could presumably exploit the same vulnerability to make the ransomware spit out its key without a payment. So this strategy seems self-limiting — unless the ransomware developers find a bug that can be “repaired” by changing some immutable state held by the enclave. That seems like a long shot. And no, SGX does not allow you to “seal” data to the current state of the enclave’s RAM image.

*** In theory, Intel or an ARM manufacturer could also revoke the enclave’s signing certificate. However, the current SGX specification doesn’t explain how such a revocation strategy should work. I assume this will be more prominent in future specifications.

**** The original version of this post didn’t credit Greg and Sean properly, because I honestly didn’t make the connection that I was describing the right primitive. Neat!

The crypto dream

Arvind Narayanan just gave a fascinating talk at Princeton’s Center for Information Technology Policy entitled ‘What Happened to the Crypto Dream?‘. That link is to the video, which unfortunately you’ll actually have to watch — I have yet to find a transcript.

From the moment I heard the title of Arvind’s talk I was interested, since it asks an important question that I wanted a chance to answer. Specifically: what happened to the golden future of crypto? You know, the future that folks like Phil Zimmermann offered us — the one that would have powered the utopias of Neal Stephenson and the dystopias of William Gibson (or do I have that backwards?). This was the future where cryptography fundamentally altered the nature of society and communications and set us free in new and exciting ways.

That future never quite arrived. Oh, mind you, the technology did — right on schedule. We’re living in a world where it’s possible to visit Tokyo without ever leaving your bed, and where governments go to war with software rather than tanks. Yet in some ways the real future is more Stephen King than William Gibson. The plane landed; nobody was on board.

So what did happen to the crypto dream?

Arvind gives us a bunch of great answers, and for the most part I agree with him. But we differ in a few places too. Most importantly, Arvind is a Princeton scholar who has been known to toss out terms like ‘technological determinism’. Me, I’m just an engineer. What I want to know is: where did we screw it all up? And how do we make it right?

The premise, explained

Once upon a time most important human transactions were done face-to-face, and in those transactions we enjoyed at least the promise that our communications would be private. Then everything changed. First came the telephones and telegraphs, and then computer networks. As our friends and colleagues spread farther apart geographically, we eagerly moved our personal communications to these new electronic networks. Networks that, for all their many blessings, are anything but private.

People affected by telephonic surveillance (1998-2011). Source: ACLU

Some people didn’t like this. They pointed out that our new electronic communications were a double-edged sword, and were costing us protections that our ancestors had fought for. And a very few decided to do something about it. If technological advances could damage our privacy, they reasoned, then perhaps the same advances could help us gain it back.

Technically, the dream was born from the confluence of three separate technologies. The first was the PC, which brought computing into our living room. The second was the sudden and widespread availability of computer networking: first BBSes, then WANs like GTE Telenet, and then the Internet. Most critically, the dream was fueled by the coincidental rise of scientific, industrial cryptography, starting with the publication of the Data Encryption Standard and continuing through the development of technologies like public-key cryptography.

By the 1990s, the conditions were in place for a privacy renaissance. For the first time in history, the average person had access to encryption technology that was light years beyond what most governments had known before. The flagbearer of this revolution was Philip Zimmermann and his Pretty Good Privacy (PGP), which brought strong encryption to millions. Sure, by modern standards PGP 1.0 was a terrible flaming piece of crap. But it was a miraculous piece of crap. And it quickly got better. If we just hung in there, the dream told us, the future would bring us further miracles, things like perfect cryptographic anonymity and untraceable electronic cash.

It’s worth pointing out that the ‘dream’ owes a lot of its power to government itself. Congress and the NSA boosted it by doing what they do best — freaking out. This was the era of export regulations and 40-bit keys and Clipper chips and proposals for mandatory backdoors in all crypto software. Nothing says ‘clueless old dinosaurs’ like the image of Phil Zimmermann being searched at the border — for copies of a program you could download off the Internet!

And so we all held a million key signing parties and overlooked a few glaring problems with the software we were using. After all, these would be resolved. Once we convinced the masses to come along with us, the future would be encrypted and paid for with e-Cash spent via untraceable electronic networks on software that would encrypt itself when you were done using it. The world would never be the same.

Obviously none of this actually happened.

If you sent an email today — or texted, or made a phone call — chances are that your communication was just as insecure as it would have been in 1990. Maybe less so. It probably went through a large service provider who snarfed up the cleartext, stuffed it into an advertising algorithm, then dropped it into a long term data store where it will reside for the next three years. It’s hard to be private under these circumstances.

Cryptography is still everywhere, but unfortunately it’s grown up and lost its ideals. I don’t remember the last time I bothered to send someone a GPG email — and do people actually have key signing parties anymore?

There are a few amazingly bright spots in this dull landscape — I’ll get to them in due course — but for the most part crypto just hasn’t lived up to its privacy billing. The question is why? What went so terribly wrong? In the rest of this post I’ll try to give a few of my answers to this question.

Problem #1: Crypto software is too damned hard to use.

Cryptographers are good at cryptography. Software developers are good at writing code. Very few of either camp are good at making things easy to use. In fact, usability is a surprisingly hard nut to crack across all areas of software design, since it’s one of a few places where 99% is just not good enough. This is why Apple and Samsung sell a zillion phones every year, and it’s why the ‘year of Linux on the Desktop‘ always seems to be a few years away.

Security products are without a doubt the worst products for usability, mostly because your user is also the enemy. If your user can’t work a smartphone, she might not be able to make calls. But if she screws up with a security product, she could get pwned.

Back in 1999 — in one of the few usability studies we have in this area — Alma Whitten and J.D. Tygar sat down and tried to get non-experts to use PGP, a program that experts generally thought of as being highly user friendly. Needless to say the results were not impressive. And as fun as it is to chuckle at the people involved (like the guy who revoked his key and left the revocation message on his hard drive) the participants weren’t idiots. They were making the same sort of mistakes everyone makes with software, just with potentially more serious consequences.

And no, this isn’t just a software design problem. Even if you’re a wizard with interfaces, key management turns out to be just plain hard. And worse, your brilliant idea for making it easier will probably also make you more vulnerable. Where products have ‘succeeded’ in marketing end-to-end encryption, they’ve usually done so by making radical compromises that undermine the purpose of the entire exercise.

Think Hushmail, where the crypto client was delivered from a (potentially) untrustworthy server. Or S/MIME email certificates, which are typically generated in a way that could expose the private key to the CA. And of course, there’s Skype, which operates its own user-friendly CA, one that can potentially pwn you in a heartbeat.

Problem #2: Snake-oil cryptography has cluttered the space.

As Arvind points out, most people don’t really understand the limitations of cryptography. This goes for people who rely on it for their business (can’t tell you how many times I’ve explained this stuff to DRM vendors.) It goes double for the average user.

The problem is that when cryptography does get used, it’s often applied in dangerous, stupid and pointless ways. And yet people don’t know this. So bad products get equal (or greater) billing than good ones, and the market lacks the necessary information to provide a sorting function. This is a mess, since cryptography — when treated as a cure-all with magical properties — can actually make us less secure than we might otherwise be.

Take VPN services, for example. These propose to secure you from all kinds of threats, up to and including totalitarian governments. But the vast majority of commercial VPN providers do terribly insecure things, like use a fixed shared-secret across all users. Data encryption systems are another big offender. These are largely purchased to satisfy regulatory requirements, and buying one can get you off the hook for all kinds of bad behavior: regulations often excuse breaches as long as you encrypt your data — in some way — before you leave it in a taxi. The details are often overlooked.

With so much weak cryptography out there, it’s awfully hard to distinguish a good system. Moreover, the good system will probably be harder to use. How do you convince people that there’s a difference?

Problem #3: You can’t make money selling privacy.

As I’ll explain in a minute, this one isn’t entirely true. Yet of all the answers in this post I tend to believe that it’s also the one with the most explanatory power.

Here’s the thing: developing cryptographic technology isn’t cheap, and it isn’t fast. It takes time, love, and a community of dedicated developers. But more importantly, it requires subject matter experts. These people often have families and kids, and kids can’t eat dreams. This means you need a way to pay them. (The parents, that is. You can usually exploit the kids as free labor and call it an ‘internship’.)

Across the board, commercialization of privacy technologies has been something of a bust. David Chaum gave it a shot with his anonymous electronic cash company. Didn’t work. Hushmail had a good run. These guys are giving it a shot right now — and I wish them enormous luck. But I’m not sure how they’re going to make people pay for it.

In fact, when you look at the most successful privacy technologies — things like PGP or Tor or Bitcoin  — you notice that these are the exceptions that prove the rule. Tor was developed with US military funding and continues to exist thanks to generous NGO and government donations. PGP was a labor of love. Bitcoin is… well, I mean, nobody really understands what Bitcoin is. But it’s unique and not likely to be repeated.

I could think of at least two privacy technologies that would be wonderful to have right now, and yet implementing them would be impossible without millions in seed funding. And where would you recover that money? I can’t quite figure it out. Maybe Kickstarter is the answer to this sort of thing, but I’ll have to let someone else prove it to me.

Problem #4: It doesn’t matter anyway. You’re using software, and you’re screwed.

Some of the best days of the crypto dream were spent fighting government agencies that legitimately believed that crypto software would render them powerless. We generally pat ourselves on the back for ‘winning’ this fight (although in point of fact, export regulations still exist). But it’s more accurate to say that governments decided to walk away.

With hindsight it’s pretty obvious that they got the better end of the deal. It’s now legal to obtain strong crypto software, but the proportion of (non-criminal) people who actually do this is quite small. Worse, governments have a trump card that can circumvent the best cryptographic algorithm. No, it’s not a giant machine that can crack AES. It’s the fact that you’re implementing the damned thing in software. And software vulnerabilities will overcome all but the most paranoid users, provided that the number of people worth tracking is small enough.

Arvind points this out in his talk, and refers to a wonderful talk by Jonathan Zittrain called ‘The End of Crypto’ — in which Jonathan points out how serious the problem is. Moreover, he notes that we’re increasingly losing control of our devices (thanks to the walled garden model), and argues that such control is a pre-condition for secure communications. This may be true, but let me play devil’s advocate: the following chart shows a price list for 0days in commercial software. You tell me which ones the government has the hardest time breaking into.

Estimated price list for 0-days in various software products. (Source: Forbes)

Whatever the details, it seems increasingly unlikely that we’re going to live the dream while using the software we use today. And sadly nobody seems to have much of an answer for that.

Problem #5: The whole premise of this post is wrong — the dream is alive!

Of course, another possibility is that the whole concept is just mistaken. Maybe the dream did arrive and we were all just looking the other way.

Sure, GPG adoption may be negligible, and yes, most crypto products are a disaster. Yet with a few clicks I can get on a user-friendly (and secure!) anonymous communications network, where my web connections will be routed via an onion of encrypted tunnels to a machine on the other side of the planet. Once there I can pay for things using a pseudonymous electronic cash service that bases its currency on nothing more than the price of a GPU.

If secure communications is what I’m after, I can communicate through OTR or RedPhone or one of a dozen encrypted chat programs that’ve recently arrived on the scene. And as long as I’m not stupid, there’s surprisingly little that anyone can do about any of it.

In conclusion

This has been a very non-technical post, and that’s ok — you don’t have to get deeply technical in order to answer this particular question. (In fact, this is one place where I slightly disagree with Arvind, who also brings up the efficiency of certain advanced technologies like Secure Multiparty Computation. I truly don’t think this is a story about efficiency, because we have lots of efficient privacy protocols, and people still don’t use them.)

What I do know is that there’s so much we can do now, and there are so many promising technologies that have now reached maturity and are begging to be deployed. These include better techniques for anonymizing networks, adding real privacy to currencies like Bitcoin, and providing user authentication that actually works. The crypto dream can still live. Maybe all we need is a new generation of people to dream it.

Four theories on the cryptography of Star Trek

“I’m sorry Captain. They rotated by fourteen.”

Over on ZDNet they’re asking why cybersecurity is like Star Trek. I think this is the wrong question. A better one is: why is cybersecurity so bad on Star Trek?

Please don’t take this the wrong way. I’m a huge Trek fan. I’ve watched every episode ever made, and I’d do it again if I had time. Even the Holodeck ones.

But I also teach computer security, and specifically, cryptography. Which is ruining the show for me! How can I buy into a universe where the protagonists have starships, transporters and dorky positronic robots, but still can’t encrypt an email to save their lives? The Trek crew has never encountered an encryption scheme that didn’t crack like an egg when faced with an ‘adaptive algorithm’ (whatever that is), or — worse — just a dude doing math in his head.

But there’s no reason to take my word for this. Thanks to the miracle of searchable Star Trek, you can see for yourself.

Cryptographers deserve better. Viewers deserve better. And while I can’t fix bad screenwriting, I can try to retcon us an explanation. And that will be the subject of this post: four scientifically credible explanations why 24th century crypto could legitimately be so awful.

Theory #1: A quantum leap

One answer to the mystery of Trek’s bad crypto is so obvious it’s mundane. It’s the 24th century, and of course all the computers are quantum. Everyone knows that quantum computers are super-duper-powerful, and would blow through traditional encryption like a knife through butter.

But not so fast! As I’ve written before on this blog, quantum computers are actually quite limited in what (we think) they can do. This even goes for quantum computers enhanced with bio-neural gel packs, whatever the hell those are.

Specifically: while QCs are very good at solving certain number-theoretic problems — including the ones that power RSA and most public-key encryption schemes — theorists don’t believe that they can efficiently solve NP-complete problems, which should still leave an opening for complexity-theoretic crypto to thrive in the 24th century. And yet we never hear about this in Trek.

Of course it’s always possible that the theorists are wrong. But quantum computers still don’t explain why Spock can apparently crack encryption codes in his head. (And no, ‘Vulcans are really good at math’ is not a theory.)

Theory #2: It’s the warp drive, stupid  

If there’s a single technology that makes the Star Trek universe different from ours, it’s the Warp drive. And this tees up our next theory:

Could it be that there’s a conflict between faster-than-light travel and secure cryptography? Could Zefram Cochrane have done in crypto?

Shockingly, there might actually be something to this. Exhibit A is this paper by Scott Aaronson and John Watrous — two honest-to-god complexity theorists — on the implications of a physical structure called a ‘closed timelike curve’ (CTC) and what would happen if you used one to go back in time and kill your grandfather.

Aaronson and Watrous aren’t really interested in killing anyone. What they’re interested in is paradoxes, and particularly, what it means if the Universe resolves paradoxes. It turns out that this resolution power has huge implications for computing.

It seems that computers with access to paradox-resolving time travel would be dramatically more powerful than any of the computers we can envision today, regardless of whether they’re quantum or classical. In fact, CTC-enhanced computers would be powerful enough to efficiently solve problems in the complexity class PSPACE. This would utterly doom the type of complexity-theoretic crypto we rely on today.

But this still leaves a question: does the Warp drive necessarily imply the existence of CTCs?

One clue comes from Einstein’s special theory of relativity, which suggests that faster-than-light travel would imply a violation of causality. For those without the physics background: Star Trek IV.

Theory #3: Complexity theory is dead

Do you remember the episode in Deep Space Nine where O’Brien and Bashir discussed the latest developments in Ferengi computer science? How about the episode that took place at a Vulcan complexity theory conference? No, I don’t either. These things never happened.

This all by itself is suspicious. Trek characters could waste hours blabbering about subspace fields or trying to convince Data he’s a real boy. But something as central as the computers that run their ship and keep them alive? Not a peep, not even in a “TECH” scene.

It’s almost as though by the end of the 24th century, complexity theory has fallen off of the list of things people care about. Which brings me to my next theory:

In the Star Trek Universe, P = NP.

In one sense this would be huge and mostly great news for computer scientists. But it would be a disaster for the efficient (complexity-theoretic) encryption we use on a daily basis. For things like RSA and AES to be truly secure, we require the existence of ‘one-way functions‘. And those can only exist if P does not equal NP (P != NP).

Fortunately for cryptography, most computer scientists are convinced that P != NP. They just haven’t been able to prove it. The most recent attempt was made by Vinay Deolalikar of HP Labs, and his proof foundered on subtleties just like every one before it. This means the problem is still open, and technically could go either way.

If P did turn out to be equal to NP, it’s conceivable that result would look exactly like Star Trek! A few algorithms could still be quite difficult to break (i.e., the attacks would have huge polynomial runtimes). But maybe not. People might instead fall back on obscurity to overcome the mathematical impossibility of building strong complexity-theoretic encryption. One-time pads would still work, of course, and quantum key distribution might allow for point-to-point transmission. Everything else would become a massive joke.

Now, this theory still doesn’t explain the ‘breaking crypto in your head’ thing, or why it takes like six hours to change the Enterprise’s command codes. But it would go a long way to repairing the damage wrought by years of bad scriptwriting.

Theory #4: The Stallman effect

Live long and publish your source.

This last theory is the most mindbending. It’s also not mine (I ripped it off from Chris Long).

To get a fix on it, you first have to think about this Federation we hold so dear. Here we have a society where the cost of making something is simply the marginal cost of replicating a copy. Money isn’t necessary, and people are free to devote themselves to activities that are fun, after spending the necessary ten hours a week on required tasks such as legislation, family counseling, robot repair and asteroid prospecting.

Does any of this sound familiar to you? Yes. The Federation was founded on the teachings of Richard M. Stallman.

A society based on the teachings of RMS can’t possibly get security right. To such a society, security is simply a tool that prevents you from accessing the full capabilities of your computer (sorry: replicator). How could we expect serious crypto in a society that worships the legacy of RMS?

A minor problem with this theory is that it doesn’t explain why bad cryptography crosses species lines: even the Romulans have terrible encryption. Of course, the Romulans have frigging cloaking devices and still haven’t managed to wipe us out. So maybe we can just chalk that one up to incompetence.

In conclusion

I admit that there’s only so far you can go with all of this. At a certain point you have to give in and admit that the Trek screenwriters don’t know encryption from a Chronoton field. And honestly, what they’ve done with cryptography is nothing compared to what they’ve done to physics, electronics, and historical drama.

And please don’t get me started on the Holodeck. Can’t they just fit that thing with an OFF switch?

Still, if nothing else, this post has given me another forum to bitch about my favorite grievance: bad cryptography in movies and TV. And a chance to remind Hollywood (should any representatives be reading) that I am ready and willing to help you with your cryptographic script writing problems for a very reasonable fee. Just don’t expect anyone to do crypto in their head.

If wishes were horses then beggars would ride… a Pwnie!

In case you’re wondering about the title above, I ripped it off from an old Belle Waring post on the subject of wishful thinking in politics. Her point (inspired by this Calvin & Hobbes strip) is that whenever you find yourself imagining how things should be in your preferred view of the world, there’s no reason to limit yourself. Just go for broke!

Take a non-political example. You want Facebook to offer free services, protect your privacy and shield you from targeted advertising? No problem! And while you’re wishing for those things, why not wish for a pony too!

This principle can also apply to engineering. As proof of this, I offer up a long conversation I just had with Dan Kaminsky on Twitter, related to the superiority of DNSSEC vs. application-layer solutions to securing email in transit. Short summary: Dan thinks DNSSEC will solve this problem, all we have to do is get it deployed everywhere in a reasonable amount of time. I agree with Dan. And while we’re daydreaming, ponies! (See how it works?)

The impetus for this discussion was a blog post by Tom Ritter, who points out that mail servers are horrendously insecure when it comes to shipping mail around the Internet. If you know your mailservers, there are basically two kinds of SMTP connection. MUA-to-MTA connections, where your mail client (say, Apple Mail) sends messages to an SMTP server such as GMail. And MTA-to-MTA connections, where Gmail sends the message on to a destination server (e.g., Hotmail).

Now, we’ve gotten pretty good at securing the client-to-server connections. In practice, this is usually done with TLS, using some standard server certificate that you can pick up from a reputable company like Trustwave. This works fine, since you know the location of your outgoing mail server and can request that the connection always use TLS.

Unfortunately, we truly suck at handling the next leg of the communication, where the email is shipped on to the next MTA, e.g., from Google to Hotmail. In many cases this connection isn’t encrypted at all, making it vulnerable to large-scale snooping. But even when it is secured, it’s not secured very well.

What Tom points out is that many mail servers (e.g., Gmail’s) don’t actually bother to use a valid TLS certificate on their MTA servers. Since this is the case, most email senders are configured to accept the bogus certs anyway, because everyone has basically given up on the system.

We could probably fix the certs, but you see, it doesn’t matter! That’s because finding that next MTA depends on a DNS lookup, and (standard) DNS is fundamentally insecure. A truly nasty MITM attacker can spoof the MX record returned, causing your server to believe that mail.evilempire.com is the appropriate server to receive Hotmail messages. And of course, mail.evilempire.com could have a perfectly legitimate certificate for its domain — which means you’d be hosed even if you checked it.

(It’s worth pointing out that the current X.509 certificate infrastructure does not have a universally-accepted field equivalent to “I am an authorized mailserver for Gmail”. It only covers hostnames.)
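
To make the failure mode concrete, here’s a sketch of what a “careful” sending MTA does today, assuming the dnspython package; the function name is made up for this post. Notice that the certificate check at the end verifies whatever hostname the unauthenticated MX lookup handed us, which is precisely the problem.

```python
# Sketch of what a "careful" sending MTA does today. Assumes the dnspython
# package (pip install dnspython); deliver_probe is a made-up function name.
# The certificate check at the end validates whatever hostname the
# unauthenticated MX lookup returned, which is exactly the weak link.

import smtplib
import ssl
import dns.resolver

def deliver_probe(recipient_domain: str) -> None:
    # Step 1: plain DNS lookup of the MX record. An on-path attacker can
    # spoof this answer and point us at mail.evilempire.com.
    answers = dns.resolver.resolve(recipient_domain, "MX")
    best = sorted(answers, key=lambda r: r.preference)[0]
    mx_host = str(best.exchange).rstrip(".")

    # Step 2: connect and upgrade to TLS. The certificate is validated
    # against mx_host, i.e. against whatever DNS just told us.
    ctx = ssl.create_default_context()
    with smtplib.SMTP(mx_host, 25, timeout=30) as smtp:
        smtp.starttls(context=ctx)   # raises if the cert doesn't match mx_host
        print("Negotiated TLS with", mx_host)

# deliver_probe("example.com")
```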

The question is, then, what to do about this?

There are at least three options. One is that we just suck it up and assume that email’s insecure anyway. If people want more security, they should end-to-end encrypt their mail with something like GPG. I’m totally sympathetic to this view, but I also recognize that almost nobody encrypts their email with GPG. Since we already support TLS on the MTA-to-MTA connection, perhaps we should be doing it securely?

The second view is that we fix this using some chimera of X.509 extensions and public-key pinning. In this proposal (inspired by an email from Adam Langley***), we’d slightly extend X.509 implementations to recognize an “I am a mailserver for Hotmail.com” field, we’d get CAs to sign these, and we’d install them on Hotmail’s servers. Of course, we’ll have to patch mailservers like Sendmail to actually check for these certs, and we’d have to be backwards compatible with servers that don’t support TLS at all — meaning we’d need to pin public keys to prevent downgrade attacks. It’s not a very elegant solution at all.
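
The pinning half of that proposal is at least easy to sketch: keep a pre-distributed hash of the receiving domain’s public key (SPKI) and refuse to proceed on a mismatch. The snippet below, using the Python cryptography package with a placeholder pin table, only shows the shape of the check; distributing and rotating the pins is the actual hard problem.

```python
# Sketch of the pin check only. The pin table is a placeholder; real systems
# would need a way to distribute and rotate pins. Assumes the `cryptography`
# package; call pin_ok() with the peer certificate obtained after STARTTLS.

import hashlib
from cryptography import x509
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

PINS = {
    # hypothetical SPKI SHA-256 pin for hotmail.com's mailservers (placeholder)
    "hotmail.com": "0" * 64,
}

def spki_sha256(der_cert: bytes) -> str:
    cert = x509.load_der_x509_certificate(der_cert)
    spki = cert.public_key().public_bytes(
        Encoding.DER, PublicFormat.SubjectPublicKeyInfo)
    return hashlib.sha256(spki).hexdigest()

def pin_ok(der_cert: bytes, recipient_domain: str) -> bool:
    """True only if the peer's public key matches the pre-distributed pin."""
    expected = PINS.get(recipient_domain)
    return expected is not None and spki_sha256(der_cert) == expected
```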

The third, ‘elegant’ view is that we handle all of our problems using DNSSEC. After all, this is what DNSSEC was built for! To send an email to Hotmail, my server just queries DNS for Hotmail.com, and (through the magic of public-key cryptography) it gets back a signed response guaranteeing that the MX record is legitimate. And a pony too!

But of course, back in the real world this is not at all what will happen, for one very simple reason:

  • Many mail services do not support DNSSEC for their mail servers.
  • Many operating systems do not support DNSSEC resolution (e.g., Windows 8).

Ok, that’s two reasons, and I could probably add a few more. But at the end of the day DNSSEC is the pony with the long golden mane, the one your daughter wants you to buy her — right after you buy her the turtle and the fish and the little dog named Sparkles.

So maybe you think that I’m being totally unfair. If people truly want MTA-to-MTA connections to be secure, they’ll deploy DNSSEC and proper certificates. We’ll all modify Sendmail/Qmail/Etc. to validate that the DNS resolution is ‘secured’ and they’ll proceed appropriately. If DNS does not resolve securely, they’ll… they’ll…

Well, what will they do? It seems like there are only two answers to that question. Number one: when DNSSEC resolution fails (perhaps because it’s not supported by Hotmail), then the sender fails open. It accepts whatever insecure DNS response it can get, and we stick with the current broken approach. At least your email gets there.

Alternatively, we modify Sendmail to fail closed. When it fails to accomplish a secure DNS resolution, the email does not go out. This is the security-nerd approach.

Let me tell you something: Nobody likes the security-nerd approach.

As long as DNSSEC remains at pitiful levels of deployment, it’s not going to do much at all. Email providers probably won’t add DNSSEC support for MX records, since very few mailservers will be expecting it. Mailserver applications won’t employ (default, mandatory) DNSSEC checking because none of the providers will offer it! Around and around we’ll go.

But this isn’t just a case against DNSSEC, it relates to a larger security architecture question. Namely: when you’re designing a security system, beware of relying on things that are outside of your application boundary.

What I mean by this is that your application (or transport layer implementation, etc.) should not be dependent on security infrastructure you don’t control. If possible, it should try to achieve its security goals all by itself, making as few assumptions about the system it’s running on as possible, unless those assumptions are rock solid. If there’s any doubt that these assumptions will hold in practice, your system needs a credible fallback plan other than “ah, screw it” or “let’s be insecure”.

The dependence on a secure DNS infrastructure is one example of this boundary-breaking problem, but it’s not the only one.

For example, imagine that you’re writing an application that depends on having a strong source of pseudo-random numbers. Of course you could get these from /dev/random (or /dev/urandom), but what if someone runs your application on a system that either (a) doesn’t have these, or (b) has them, but they’re terrible?
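
One defensible answer is to fail closed rather than silently degrade. A tiny sketch (mine, not from any particular codebase):

```python
# Sketch of failing closed on the randomness dependency: os.urandom raises
# NotImplementedError when the OS has no cryptographic source at all. Case (b)
# above, a source that exists but is terrible, can't be detected from inside
# the application, which is rather the point of this section.

import os

def require_strong_randomness(n: int = 32) -> bytes:
    try:
        return os.urandom(n)
    except NotImplementedError as exc:
        raise SystemExit(
            "No OS-level cryptographic randomness available; refusing to "
            "generate keys with a weaker source.") from exc
```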

Now sometimes you make the choice to absorb a dependency, and you just move on. Maybe DNSSEC is a reasonable thing to rely on. But before you make that decision, you should also remember that it’s not just a question of your assumptions breaking down. There are human elements to think of.

Remember that you may understand your design, but if you’re going to be successful at all, someday you’ll have to turn it over to someone else. Possibly many someones. And those people need to understand the fullness of your design, meaning that if you made eight different assumptions, some of which occur outside of your control, then your new colleagues won’t just have eight different ways to screw things up. They’ll have 2^8 or n^8 different ways to screw it up.

You as the developer have a duty to yourself and to those future colleagues to keep your assumptions right there where you can keep an eye on them, unless you’re absolutely positively certain that you can rely on them holding in the future.

Anyway, that’s enough on this subject. There’s probably room on this blog for a longer and more detailed post about how DNSSEC works, and I’d like to write that post at some point. Dan obviously knows a lot more of the nitty-gritty than I do, and I’m (sort of) sympathetic to his position that DNSSEC is awesome. But that’s a post for the future. Not the DNSSEC-distant future, hopefully!

Notes:

* Dan informs me that he already has a Pwnie, hence the title.
** Thomas Ptacek also has a much more convincing (old) case against DNSSEC.
*** In fairness to Adam, I think his point was something like ‘this proposal is batsh*t insane’, maybe we should just stick with option (1). I may be wrong.

The future of electronic currency

Crypto is fantastic for many things, but those who read this blog know that I have a particular fascination for its privacy applications. Specifically, what interests me are the ways we can use cryptography to transact (securely) online, without revealing what we’re doing, or who we’re doing it with.

This is particularly relevant, since today we’re in the middle of an unprecedented social and technological experiment: moving our entire economy out of metal and paper and into the ‘net. I’ve already had to explain to my four-year old what newspapers are; I imagine he’ll have a similar experience when his children ask why people once carried funny pieces of paper around in their wallet.

Between credit and debit cards, EFT, online banking and NFC, it seems like the days of cash are numbered. Unfortunately, all is not sunshine and roses. The combination of easy-to-search electronic records and big data seems like a death-knell for our individual privacy. Cryptography holds the promise to get some of that privacy back, if we want it.

In this post I’m going to take a very quick look at a few privacy-preserving ‘e-cash’ technologies that might help us do just that. (And yes, I’ll also talk about Bitcoin.)

Why don’t we have electronic cash today?

The simple answer is that we already have electronic money; we just don’t call it that. Right now in its mainstream incarnation, it takes the form of little plastic cards that we carry around in our wallets. If you live in a developed nation and aren’t too particular about tipping valets, you can pretty much survive without ever touching hard currency.

The problem is that credit and debit cards are not cash. They’re very good for money transfers, but they have two specific limitations: first, they require you to access an online payment network. This means that they lose their usefulness at exactly the moment when you need them most: typically, after a disaster has wiped out or severely limited your connectivity (e.g., most hurricanes in Florida, NYC the morning after 9/11, etc).

Secondly, funds transfer systems offer none of the privacy advantages of real cash. This is probably by (government) preference: untraceable cash lends itself to unsavory activities, stuff like drug dealing, arms purchases and tax evasion. Our modern banking system doesn’t necessarily stop these activities, but it’s a godsend for law enforcement: just about every transaction can be traced down to the $0.01. (And even if you aren’t a drug dealer, there are still plenty of folks who’ll pay good money for a copy of your spending history, just so they can sell you stuff.)

The genesis of private e-cash

David Chaum

Credit for the invention of true, privacy-preserving electronic cash generally goes to David Chaum. Chaum proposed his ideas in a series of papers throughout the 1980s, then made a fortune providing the world with untraceable electronic cash.

Well, actually, the statement above is not quite accurate. According to legend, Chaum turned down lucrative offers from major credit card companies in favor of starting his own e-cash venture. I don’t need to tell you how the story ends — you’ve probably already noticed that your wallet isn’t full of untraceable electronic dollars (and if it is: I’m sorry.)

There’s an important lesson here, which is that getting people to adopt electronic cash requires a lot more than just technology. Fortunately, the failure of e-cash has a silver lining, at least for the field of cryptography: Chaum went on to pioneer anonymous electronic voting and a whole mess of other useful stuff.

Like many e-cash systems since, Chaum’s earliest paper on e-cash proposed the use of digital ‘coins’, each of some fixed denomination (say, $1). A coin was simply a unique serial number, generated by the holder and digitally signed using a private key known only to the bank. When a user ‘spends’ a coin, the merchant can verify the signature and ‘deposit’ the coin with the bank — which will reject any coin that’s already been spent.

(Of course, this doesn’t prevent the merchant from re-spending your hard earned money. To deal with this, the user can replace that serial number with a freshly-generated public key for a signature scheme. The bank will sign the public key, then the user can provide the merchant with the public key — signed by the bank — and use the corresponding signing key to sign the merchant’s name and transaction info.)

However you do it, the system as described has a crucial missing element: it’s not private. The bank knows which serial numbers it signs for you, and also knows where they’re being spent. This provides a linkage between you and, say, that anarchist bookstore where you’re blowing your cash.

To address this, Chaum replaced the signing process with a novel blind signature protocol. Blind signature is exactly what it sounds like: a way for the bank to sign a message without actually seeing it. Using this technology, the user could make up a serial number and not tell the bank; the blind signature protocol would provide the necessary signature. Even if the bank was trying to track the coins, it wouldn’t be able to link them to the user.

Chaum even provided a nice real-world analogy for his idea: place a document inside of an envelope along with a sheet of carbon paper, then let the bank sign the outside of the envelope, conveying the signature through and onto the document. This doesn’t literally describe how blind signatures work, but the real cryptographic constructions aren’t that much more complicated: you can readily obtain blind versions of RSA, DSA and the Schnorr/Elgamal signatures without (mostly) breaking a sweat (see this footnote for details).
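
To make the blinding trick a bit more concrete, here’s a toy sketch of a Chaum-style blind signature over textbook RSA, written in Python. The key, blinding factor and serial number are made-up toy values (and utterly insecure); the point is only that the bank signs a value it never sees, and the unblinded result still verifies as an ordinary RSA signature.

    import hashlib
    from math import gcd

    # "Bank" key: classic toy RSA parameters, nothing like a real 2048-bit key
    p, q = 61, 53
    n = p * q                              # public modulus
    e = 17                                 # public exponent
    d = pow(e, -1, (p - 1) * (q - 1))      # bank's private signing exponent

    def H(msg: bytes) -> int:
        """Hash a coin's serial number down to an integer mod n."""
        return int.from_bytes(hashlib.sha256(msg).digest(), "big") % n

    # User: pick a secret serial number and blind it with a random factor r
    serial = b"coin-serial-0001"
    r = 7
    assert gcd(r, n) == 1
    blinded = (H(serial) * pow(r, e, n)) % n

    # Bank: sign the blinded value; it never learns the serial number
    blind_sig = pow(blinded, d, n)

    # User: strip off the blinding factor to recover a normal signature
    sig = (blind_sig * pow(r, -1, n)) % n

    # Merchant/bank: verify like any ordinary RSA signature
    assert pow(sig, e, n) == H(serial)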

The double-spending problem and going offline

Digital signatures do one thing very well: they prevent unauthorized users from issuing their own coins. Unfortunately they don’t prevent a second serious problem: users who copy legitimate coins.

Copying is where electronic cash really differs from its physical equivalent. Real money is hard to copy — by design. If it wasn’t, we wouldn’t use it. When people get too clever at copying it, we even send men with guns to shut them down.

Electronic coins are very different. It’s almost impossible to work with data without copying it; from long-term storage to RAM, from RAM to the processor cache, from one computer to another over a network. Electronic coins must be copied, and this fundamentally changes the nature of the problem. The boogeyman here is ‘double spending’, where a user tries to spend the same valid coin with many different merchants. Left unchecked, double-spending does more than screw over a merchant. It can totally debase the currency supply, making coins almost impossible for merchants to trust.

Chaum’s original solution dealt with double-spenders by requiring the bank to be online, so users could immediately deposit their coins — and make sure they were fresh. This works great, but it’s damn hard to handle in a system that works offline, i.e., without a live network connection. Indeed, offline spending is the big problem that most e-cash solutions have tried to tackle.

There are two basic solutions to the offline problem. Neither is perfect. They are:

  • Use trusted hardware. Force users to store their coins inside some piece of bank-trusted (and tamper-resistant) hardware, such as a cryptographic smartcard. The hardware can enforce correct behavior, and prevent users from learning the actual coin values.
  • Revoke double-spenders’ anonymity. Alternatively, it’s possible to build e-cash systems that retain the users’ anonymity when they participate honestly, but immediately revokes their anonymity when they cheat (i.e., double-spend the same coin).

Although these solutions are elegant, they also kind of suck. This is because neither is really sufficient to deal with the magnitude of the double-spending problem.

To understand what I’m talking about, consider the following scam: I withdraw $10,000 from the bank, then spend each of my coins with 1,000 different offline merchants. At the end of the day, I’ve potentially walked away with $10,000,000 in merchandise (assuming it’s portable) before anyone realizes what I’ve done. That’s a lot of dough for a single scam.

In fact, it’s enough dough that it would justify some serious investment in hardware reverse-engineering, which makes it hard to find cost-effective hardware that’s sufficient to handle the threat. Finding the owner of the coin isn’t much of a deterrent either — most likely you’ll just find some guy in Illinois who had his wallet stolen.

That doesn’t mean these approaches are useless: in fact, they’re very useful in certain circumstances, particularly if used in combination with an online bank. Moreover the problem of revealing a user’s identity (on double-spend) is an interesting one. There are several schemes that do this, including one by Chaum, Fiat and Naor, and a later (very elegant) scheme by Stefan Brands. (For a bit more about these schemes, see this footnote.)

Compact wallets and beyond

There have been quite a few developments over the past few years, but none are as dramatic as the original schemes. Still, they’re pretty cool.

One scheme that deserves a few words is the ‘Compact e-Cash‘ system of Camenisch, Hohenberger and Lysyanskaya. This system is nice because users can store millions of e-coins in a relatively small format, but also because it uses lots of neat crypto — including signatures with efficient protocols and zero-knowledge proofs.

At a very high level, when a user withdraws n coins from the bank in this system, the bank provides the user with a digital signature on the following values: the user’s public key, the number of coins n withdrawn, and a secret seed value seed that’s generated cooperatively by the bank and the user.

The bank learns the number of coins and user’s public key, but only the user learns seed. To spend the ith coin in the wallet, the user generates a ‘serial number’ SN = F(seed, i), where F is some pseudo-random function. The user also provides a non-interactive zero-knowledge proof that (a) 0 < i < n, (b) SN is correctly formed, and (c) she has a signature on seed from the bank (among other things). This zero-knowledge proof is a beautiful thing, because it does not leak any information beyond these statements, and can’t even be linked back to the user’s key in the event that she loses it. The online bank records each serial number it sees, ensuring that no coin will ever be spent twice.
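
To give a flavor of just the serial-number bookkeeping, here’s a rough Python sketch with HMAC-SHA256 standing in for the PRF F. The bank’s signature and the zero-knowledge proofs (the actual hard parts, and the source of the anonymity) are omitted entirely; all this shows is how one seed yields a stream of unlinkable-looking serial numbers, and how an online bank catches double-spends.

    import hashlib
    import hmac
    import os

    class Wallet:
        def __init__(self, n_coins: int):
            self.n = n_coins
            self.seed = os.urandom(32)   # in the real scheme, generated jointly with the bank
            self.next_coin = 0

        def spend(self) -> bytes:
            if self.next_coin >= self.n:
                raise RuntimeError("wallet is empty")
            i = self.next_coin
            self.next_coin += 1
            # SN = F(seed, i): looks random to anyone who doesn't know the seed
            return hmac.new(self.seed, i.to_bytes(8, "big"), hashlib.sha256).digest()

    class Bank:
        def __init__(self):
            self.seen = set()

        def deposit(self, serial: bytes) -> bool:
            if serial in self.seen:
                return False             # double-spend detected
            self.seen.add(serial)
            return True

    wallet, bank = Wallet(n_coins=1000), Bank()
    sn = wallet.spend()
    assert bank.deposit(sn) is True
    assert bank.deposit(sn) is False     # the same serial number can't be deposited twice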

This may seem pretty complicated, but the basic lesson is that we can do lots of neat things with these technologies. We can even build coins that can be spent k times for some arbitrary k, only revealing your identity if they’re used more times than that; this turns out to be useful for anonymous login applications, where users want to access a resource a fixed number of times, but don’t want anyone counting their accesses.

Unfortunately, we haven’t managed to build any of this stuff and deploy it in a practical setting.

Bitcoin

Which brings us to the one widely-deployed, practical electronic cash system in the world today. What about Bitcoin?

I’m a big fan of Bitcoin (from a technical perspective), but it has a few limitations that make Bitcoins a little bit less private than real e-cash should be.

Despite the name, Bitcoin doesn’t really deal with ‘coins’: it’s actually a transaction network. Users generate blocks of a certain value then transfer quantities of currency using ECDSA public keys as identifiers. The core innovation in Bitcoin is a distributed public bulletin-board (the ‘block-chain’) that records every transaction in Bitcoin’s history. This history lets you check that any given chunk of currency has a valid pedigree.

While the Bitcoin block-chain is essential to security, it’s also Bitcoin’s privacy achilles heel. Since every transaction is public — and widely disseminated — there’s no hiding that it took place. To make up for this, Bitcoin offers pseudonymity: your public key isn’t tied to your identity in any way, and indeed, you can make as many of them as you want. You can even transfer your coins from one key to another.

Now, I’m not really complaining about this. But it should be noted that pseudonymity is to anonymity what sugar-free chocolates are to the real thing. While I don’t know of anyone who’s actively looking to de-anonymize Bitcoin transactions (scratch that, Zooko points out that some people are), there has been plenty of work on extracting (or ‘re-identifying’) pseudonymized data sets. If you don’t believe me, see this work by Narayanan and Shmatikov on de-anonymizing social network data, or this one that does the same thing for the Netflix prize dataset. And those are just two of many examples.

Many knowledgeable Bitcoin users know this, and some have even developed Bitcoin ‘mixers’ that stir up large pools of Bitcoin from different users, in the hopes that this will obfuscate the transaction history. This sounds promising, but has a lot of problems — starting with the fact that few seem to be actually online as I write this post.* Even if one was available, you’d basically be placing your privacy trust into the hands of one party who could totally screw you. (A large, distributed system like Tor could do the job, but none seems to be on the horizon). Finally, you’d need a lot of transaction volume to stay safe.

At the same time, it seems difficult to shoehorn the e-cash techniques from the previous sections into Bitcoin, because those systems rely on a centralized bank, and also assume that coins are used only once. Bitcoin has no center, and coins are used over and over again forever as they move from user to user. Any anonymous coin solution would have to break this linkage, which seems fundamentally at odds with the Bitcoin design. (Of course that doesn’t mean it isn’t possible! ;)**

In summary

This has hardly been an exhaustive summary of how e-cash works, but hopefully it gives you a flavor of the problem, along with a few pointers for further reading.

I should say that I don’t live the most interesting life, and about the only embarrassing thing you’ll see on my credit cards is the amount of money we waste on Diet Coke (which is totally sick). Still, this isn’t about me. As our society moves away from dirty, messy cash and into clean — and traceable — electronic transactions, I really do worry that we’re losing something important. Something fundamental.

This isn’t about avoiding marketers, or making it easier for people to have affairs. Privacy is something humans value instinctively even when we don’t have that much to hide. It’s the reason we have curtains on our windows. We may let go of our privacy today when we don’t realize what we’re losing, but at some point we will realize the costs of this convenience. The only question is when that will be, and what we’ll do about it when the day comes.

Notes:

* I was wrong about this: a commenter points out that Bitcoin Fog is up and running as a Tor hidden service. I’ve never tried this, so I don’t know how well it works. My conclusion still stands: mixing works well if you trust the BF operators and there’s enough transaction volume to truly mix your spending. We shouldn’t have to trust anyone.

** Some people have tried though: for example, OpenCoin tries to add Chaum-style cash to Bitcoin. Ditto Open Transactions. From what I can tell, these protocols still do Chaumian cash ‘old style’: that is, they require a trusted ‘bank’, or ‘issuer’ and don’t actually integrate into the distributed trust framework of Bitcoin. Still very nice work. h/t commenters and Stephen Gornick (who also fixed a few typos).

iCloud: Who holds the key?

Ars Technica brings us today’s shocking privacy news: ‘Apple holds the master decryption key when it comes to iCloud security, privacy‘. Oh my.

The story is definitely worth a read, though it may leave you shaking your head a bit. Ars’s quoted security experts make some good points, but they do it in a strange way — and they propose some awfully questionable fixes.

But maybe I’m too picky. To be honest, I didn’t realize that there was even a question about who controlled the encryption key to iCloud storage. Of course Apple does — for obvious technical reasons that I’ll explain below. You don’t need to parse Apple’s Terms of Service to figure this out, which is the odd path that Ars’s experts have chosen:

In particular, Zdziarski cited particular clauses of iCloud Terms and Conditions that state that Apple can “pre-screen, move, refuse, modify and/or remove Content at any time” if the content is deemed “objectionable” or otherwise in violation of the terms of service. Furthermore, Apple can “access, use, preserve and/or disclose your Account information and Content to law enforcement authorities” whenever required or permitted by law.

Well, fine, but so what — Apple’s lawyers would put stuff like this into their ToS even if they couldn’t access your encrypted content. This is what lawyers do. These phrases don’t prove that Apple can access your encrypted files (although, I remind you, they absolutely can), any more than Apple’s patent application for a 3D LIDAR camera ‘proves’ that you’re going to get one in your iPhone 5.

Without quite realizing what I was doing, I managed to get myself into a long Twitter-argument about all this with the Founder & Editor-in-Chief of Ars, a gentleman named Ken Fisher. I really didn’t mean to criticize the article that much, since it basically arrives at the right conclusions — albeit with a lot of nonsense along the way.

Since there seems to be some interest in this, I suppose it’s worth a few words. This may very well be the least ‘technical’ post I’ve ever written on this blog, so apologies if I’m saying stuff that seems a little obvious. Let’s do it anyway.

The mud puddle test

You don’t have to dig through Apple’s ToS to determine how they store their encryption keys. There’s a much simpler approach that I call the ‘mud puddle test’:

  1. First, drop your device(s) in a mud puddle.
  2. Next, slip in said puddle and crack yourself on the head. When you regain consciousness you’ll be perfectly fine, but won’t for the life of you be able to recall your device passwords or keys.
  3. Now try to get your cloud data back.

Did you succeed? If so, you’re screwed. Or to be a bit less dramatic, I should say: your cloud provider has access to your ‘encrypted’ data, as does the government if they want it, as does any rogue employee who knows their way around your provider’s internal policy checks.

And it goes without saying: so does every random attacker who can guess your recovery information or compromise your provider’s servers.

Now I realize that the mud puddle test doesn’t sound simple, and of course I don’t recommend that anyone literally do this — head injuries are no fun at all. It’s just a thought experiment, or in the extreme case, something you can ‘simulate’ if you’re willing to tell your provider a few white lies.

But you don’t need to simulate it in Apple’s case, because it turns out that iCloud is explicitly designed to survive the mud puddle test. We know this thanks to two iCloud features. These are (1) the ability to ‘restore’ your iCloud backups to a brand new device, using only your iCloud password, and (2) the ‘iForgot’ service, which lets you recover your iCloud password by answering a few personal questions.

Since you can lose your device, the key isn’t hiding there. And since you can forget your password, it isn’t based on that. Ergo, your iCloud data is not encrypted end-to-end, not even using your password as a key (or if it is, then Apple has your password on file, and can recover it from your security questions.) (Update: see Jonathan Zdziarski’s comments at the end of this post.)

You wanna make something of it?

No! It’s perfectly reasonable for a consumer cloud storage provider to design a system that emphasizes recoverability over security. Apple’s customers are far more likely to lose their password/iPhone than they are to be the subject of a National Security Letter or data breach (hopefully, anyway).

Moreover, I doubt your median iPhone user even realizes what they have in the cloud. The iOS ‘Backup’ service doesn’t advertise what it ships to Apple (though there’s every reason to believe that backed up data includes stuff like email, passwords, personal notes, and those naked photos you took.) But if people don’t think about what they have to lose, they don’t ask to secure it. And if they don’t ask, they’re not going to receive.

My only issue is that we have to have this discussion in the first place. That is, I wish that companies like Apple could just come right out and warn their users: ‘We have access to all your data, we do bulk-encrypt it, but it’s still available to us and to law enforcement whenever necessary’. Instead we have to reverse-engineer it by inference, or by parsing through Apple’s ToS. That shouldn’t be necessary.

But can’t we fix this with Public-Key Encryption/Quantum Cryptography/ESP/Magical Unicorns?

No, you really can’t. And this is where the Ars Technica experts go a little off the rails. Their proposed solution is to use public-key encryption to make things better. Now this is actually a great solution, and I have no objections to it. It just won’t make things better.

To be fair, let’s hear it in their own words:

First, cloud services should use asymmetric public key encryption. “With asymmetric encryption, the privacy and identity of each individual user” is better protected, Gulri said, because it uses one “public” key to encrypt data before being sent to the server, and uses another, “private” key to decrypt data pulled from the server. Assuming no one but the end user has access to that private key, then no one but the user—not Apple, not Google, not the government, and not hackers—could decrypt and see the data.

I’ve added the boldface because it’s kind of an important assumption.

To make a long story short, there are two types of encryption scheme. Symmetric encryption algorithms have a single secret key that is used for both encryption and decryption. The key can be generated randomly, or it can be derived from a password. What matters is that if you’re sending data to someone else, then both you and the receiver need to share the same key.

Asymmetric, or public-key encryption has two keys, one ‘public key’ for encryption, and one secret key for decryption. This makes it much easier to send encrypted data to another person, since you only need their public key, and that isn’t sensitive at all.

But here’s the thing: the difference between these approaches is only related to how you encrypt the data. If you plan to decrypt the data — that is, if you ever plan to use it — you still need a secret key. And that secret key is secret, even if you’re using a public-key encryption scheme.

Which brings us to the real problem with all encrypted storage schemes: someone needs to hold the secret decryption key. Apple has made the decision that consumers are not in the best position to do this. If they were willing to allow consumers to hold their decryption keys, it wouldn’t really matter whether they were using symmetric or public-key encryption.
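
Here’s a small illustration of that point using the pyca ‘cryptography’ package (the key size and plaintext are just placeholders). The public key can be handed to anyone, including the cloud provider, and anyone can encrypt a backup with it; but reading the backup requires the private key, and whoever holds that key is the party that actually controls the data.

    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import padding, rsa

    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = private_key.public_key()      # safe to give to the cloud provider

    oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)

    backup = public_key.encrypt(b"my browser history", oaep)

    # Decryption needs the private key. If the provider stores this key so it
    # can 'recover' your account, the asymmetry above bought you nothing.
    print(private_key.decrypt(backup, oaep))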

So what is the alternative?

Well, for a consumer-focused system, maybe there really isn’t one. Ultimately people back up their data because they’re afraid of losing their devices, which cuts against the idea of storing encryption keys inside of devices.

You could take the PGP approach and back up your decryption keys to some other location (your PC, for example, or a USB stick). But this hasn’t proven extremely popular with the general public, because it’s awkward — and sometimes insecure.

Alternatively, you could use a password to derive the encryption/decryption keys. This approach works fine if your users pick decent passwords (although they mostly won’t), and if they promise not to forget them. But of course, the convenience of Apple’s “iForgot” service indicates that Apple isn’t banking on users remembering their passwords. So that’s probably out too.
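
For completeness, here’s what the password-derived-key approach looks like with nothing but the Python standard library (the salt, iteration count and password below are illustrative). The salient property is that the provider never holds the key, which is exactly why an ‘iForgot’-style reset can’t exist in such a design.

    import hashlib
    import os

    password = b"correct horse battery staple"
    salt = os.urandom(16)      # stored alongside the encrypted backup; not secret

    # Slow, salted key derivation: the point is to make password guessing expensive
    key = hashlib.pbkdf2_hmac("sha256", password, salt, 600_000)
    print(len(key), "byte encryption key derived from the password")

    # Forget the password and that's that: there is nothing on the server
    # for a recovery service to hand back to you.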

In the long run, the answer for non-technical users is probably just to hope that Apple takes good care of your data, and to hope you’re never implicated in a crime. Otherwise you’re mostly out of luck. For tech-savvy users, don’t use iCloud and do try to find a better service that’s willing to take its chances on you as the manager of your own keys.

In summary

I haven’t said anything in this post that you couldn’t find in Chapter 1 of an ‘Intro to Computer Security’ textbook, or a high-level article on Wikipedia. But these are important issues, and there seems to be confusion about them.

The problem is that the general tech-using public seems to think that cryptography is a magical elixir that can fix all problems. Companies — sometimes quite innocently — market ‘encryption’ to convince people that they’re secure, when in fact they’re really not. Sooner or later people will figure this out and things will change, or they won’t and things won’t. Either way it’ll be an interesting ride.

Update 4/4: Jonathan Zdziarski tweets to say my ‘mud puddle’ theory is busted: since the iForgot service requires you to provide your birthdate and answer a ‘security question’, he points out that this data could be used as an alternative password, which could encrypt your iCloud password/keys — protecting them even from Apple itself.

The problem with his theory is that security answers don’t really make very good keys, since (for most users) they’re not that unpredictable. Apple could brute-force their way through every likely “hometown” or “favorite sport” in a few seconds. Zdziarski suggests that Apple might employ a deliberately slow key derivation function to make these attacks less feasible, and I suppose I agree with him in theory. But only in theory. Neither Zdziarski nor I actually believe that Apple does any of this.

Why Antisec matters

A couple of weeks ago the FBI announced the arrest of five members of the hacking group LulzSec. We now know that these arrests were facilitated by ‘Anonymous’ leader* “Sabu“, who, according to court documents, was arrested and ‘turned’ in June of 2011. He spent the next few months working with the FBI to collect evidence against other members of the group.

This revelation is pretty shocking, if only because Anonymous and Lulz were so productive while under FBI leadership. Their most notable accomplishment during this period was the compromise of Intelligence analysis firm Stratfor — culminating in that firm’s (rather embarrassing) email getting strewn across the Internet.

This caps off a fascinating couple of years for our field, and gives us a nice opportunity to take stock. I’m neither a hacker nor a policeman, so I’m not going to spend much time on the why or the how. Instead, the question that interests me is: what impact have Lulz and Anonymous had on security as an industry?

Computer security as a bad joke

To understand where I’m coming from, it helps to give a little personal background. When I first told my mentor that I was planning to go back to grad school for security, he was aghast. This was a terrible idea, he told me. The reality, in his opinion, was that security was nothing like Cryptonomicon. It wasn’t a developed field. We were years away from serious, meaningful attacks, let alone real technologies that could deal with them.

This seemed totally wrong to me. After all, wasn’t the security industry doing a bazillion dollars of sales every year? Of course people took it seriously. So I politely disregarded his advice and marched off to grad school — full of piss and vinegar and idealism. All of which lasted until approximately one hour after I arrived on the floor of the RSA trade show. Here I learned that (a) my mentor was a lot smarter than I realized, and (b) idealism doesn’t get you far in this industry.

Do you remember the first time you met a famous person, and found out they were nothing like the character you admired? That was RSA for me. Here I learned that all of the things I was studying in grad school, our industry was studying too. And from that knowledge they were producing a concoction that was almost, but not quite, entirely unlike security.

Don’t get me wrong, it was a rollicking good time. Vast sums of money changed hands. Boxes were purchased, installed, even occasionally used. Mostly these devices were full of hot air and failed promises, but nobody really cared, because after all: security was kind of a joke anyway. Unless you were a top financial services company or (maybe) the DoD, you only really spent money on it because someone was forcing you to (usually for compliance reasons). And when management is making you spend money, buying glossy products is a very effective way to convince them that you’re doing a good job.

Ok, ok, you think I’m exaggerating. Fair enough. So let me prove it to you. Allow me to illustrate my point with a single, successful product, one which I encountered early on in my career. The product that comes to mind is the Whale Communications “e-Gap“, which addressed a pressing issue in systems security, namely: the need to put an “air gap” between your sensitive computers and the dangerous Internet.

Now, this used to be done (inexpensively) by simply removing the network cable. Whale’s contribution was to point out a major flaw in the old approach: once you ‘gap’ a computer, it no longer has access to the Internet!

Hence the e-Gap, which consisted of a memory unit and several electronic switches. These switches were configured such that the memory could be connected only to the Internet or to your LAN, but never to both at the same time (seriously, it gives me shivers). When data arrived at one network port, the device would load up with application data, then flip ‘safely’ to the other network to disgorge its payload. Isolation achieved! Air. Gap.

(A few pedants — damn them — will try to tell you that the e-Gap is a very expensive version of an Ethernet cable. Whale had a ready answer to this, full of convincing hokum about TCP headers and bad network stacks. But really, this was all beside the point: it created a freaking air gap around your network! This apparently convinced Microsoft, who later acquired Whale for five times the GDP of Ecuador.)

Now I don’t mean to sound too harsh. Not all security was a joke. There were plenty of solid companies doing good work, and many, many dedicated security pros who kept it from all falling apart.

But there are only so many people who actually know about security, and as human beings these people are hard to market. To soak up all that cybersecurity dough you needed a product, and to sell that product you needed marketing and sales. And with nobody actually testing vendors’ claims, we eventually wound up with the same situation you get in any computing market: people buying garbage because the booth babes were pretty.**

Lulz, Anonymous and Antisec

I don’t remember when I first heard the term ‘Antisec’, but I do remember what went through my mind at the time: either this is a practical joke, or we’d better harden our servers.

Originally Antisec referred to the ‘Antisec manifesto‘, a document that basically declared war on the computer security industry. The term was too good to be so limited, so LulzSec/Anonymous quickly snarfed it up to refer to their hacking operation (or maybe just part of it, who knows). Wherever the term came from, it basically had one meaning: let’s go f*** stuff up on the Internet.

Since (per my explanation above) network security was pretty much a joke at this point, this didn’t look like too much of a stretch.

And so a few isolated griefing incidents gradually evolved into serious hacking. It’s hard to say where it really got rolling, but to my eyes the first serious casualty of the era was HBGary Federal, who — to be completely honest — were kind of asking for it. (Ok, I don’t mean that. Nobody deserves to be hacked, but certainly if you’re shopping around a plan to ‘target’ journalists and civilians you’d better have some damned good security.)

In case you’re not familiar with the rest of the story, you can get a taste of it here and here. In most cases Lulz/Anonymous simply DDoSed or defaced websites, but in other cases they went after email, user accounts, passwords, credit cards, the whole enchilada. Most of these ‘operations’ left such a mess that it’s hard to say for sure which actually belonged to Anonymous, which were criminal hacks, and which (the most common case) were a little of each.

The bad

So with the background out of the way, let’s get down to the real question of this post. What has all of this hacking meant for the security industry?

Well, obviously, one big problem is that it’s making us (security folks) look like a bunch of morons. I mean, we’ve spent the last N years developing secure products and trying to convince people that if they just followed our advice they’d be safe. Yet when it comes down to it, a bunch of guys on the Internet are walking right through it.

This is because for the most part, networks are built on software, and software is crap. You can’t fix software problems by buying boxes, any more than, say, buying cookies will fix your health and diet issues. The real challenge for industry is getting security into the software development process itself — or, even better, acknowledging that we never will, and finding a better way to do things. But this is expensive, painful, and boring. More to the point, it means you can’t outsource your software development to the lowest bidder anymore.

Security folks mostly don’t even try to address this. It’s just too hard. When I ask my software security friends why their field is so terrible (usually because they’re giving me crap about crypto), they basically look at me like I’m from Mars. The classic answer comes from my friend Charlie Miller, who has a pretty firm view of what is, and isn’t his responsibility:

I’m not a software developer, I just break software! If they did it right, I’d be out of a job.

So this is a problem. But beyond bad software, there’s just a lot of rampant unseriousness in the security industry. The best (recent) example comes from RSA, who apparently forgot that their SecurID product was actually important, and decided to make the master secret database accessible from a single compromised Windows workstation. The result of this ineptitude was a series of no-joking-around breaches of US Defense Contractors.

While this has nothing to do with Anonymous, it goes some of the way to explaining why they’ve had such an easy time these past two years.

The good

Fortunately there’s something of a silver lining to this dark cloud. And that is, for once, people finally seem to be taking security seriously. Sort of. Not enough of them, and maybe not in the ways that matter (i.e., building better consumer products). But at least institutionally there seems to be a push away from the absolutely stupid.

There’s also been (to my eyes) a renewed interest in data-at-rest encryption, a business that’s never really taken off despite its obvious advantages. This doesn’t mean that people are buying good encryption products (encrypted hard drives come to mind), but at least there’s movement.

To some extent this is because there’s finally something to be scared of. Executives can massage data theft incidents, and payment processors can treat breaches as a cost of doing business, but there’s one thing that no manager will ever stop worrying about. And that is: having their confidential email uploaded to a convenient, searchable web platform for the whole world to see.

The ugly 

The last point is that Antisec has finally drawn some real attention to the elephant in the room, namely, the fact that corporations are very bad at preventing targeted breaches. And that’s important because targeted breaches are happening all the time. Corporations mostly don’t know it, or worse, prefer not to admit it.

The ‘service’ that Antisec has provided to the world is simply their willingness to brag. This gives us a few high-profile incidents that aren’t in stealth mode. Take them seriously, since my guess is that for every one of these, there are ten other incidents that we never hear about.***

In Summary

Let me be utterly clear about one thing: none of what I’ve written above should be taken as an endorsement of Lulz, Anonymous, or the illegal defacement of websites. Among many other activities, Anonymous is accused of griefing the public forums of the Epilepsy Foundation of America in an attempt to cause seizures among its readers. Stay classy, guys.

What I am trying to point out is that something changed a couple of years ago when these groups started operating. It’s made a difference. And it will continue to make a difference, provided that firms don’t become complacent again.

So in retrospect, was my mentor right about the field of information security? I’d say the jury’s still out. Things are moving fast, and they’re certainly interesting enough. I guess we’ll just have to wait and see where it all goes. In the meantime I can content myself with the fact that I didn’t take his alternative advice — to go study Machine Learning. After all, what in the world was I ever going to do with that?

Notes:

* Yes, there are no leaders. Blah blah blah.

** I apologize here for being totally rude and politically incorrect. I wish it wasn’t true.

*** Of course this is entirely speculation. Caveat Emptor.

The Internet is broken: could we please fix it?

Ok, this is a little embarrassing and I hate having to admit it publicly. But I can’t hold it in any longer: I think I’m becoming an Internet activist.

This is upsetting to me, since ‘activist’ is the last thing I ever thought I’d be. I have friends who live to make trouble for big corporations on the Internet, and while I admire their chutzpah (and results!), they’ve always made me a little embarrassed. Even when I agree with their cause, I still have an urge to follow along, cleaning up the mess and apologizing on behalf of all the ‘reasonable’ folks on the Internet.

But every man has a breaking point, and the proximate cause of mine is Trustwave. Or rather, the news that Trustwave — an important CA and pillar of the Internet — took it upon themselves to sell a subordinate root cert to some (still unknown) client, for the express purpose of eavesdropping on TLS connections. In other words, undermining the very trust assumptions that make the Internet secure.

This kind of behavior is absolutely, unquestionably out of bounds for a trusted CA, and certainly deserves a response — a stronger one than it’s gotten. But the really frightening news is twofold:

  1. There’s reason to believe that other (possibly bigger) CAs are engaged in the same practice.
  2. To the best of my knowledge, only one browser vendor has taken a public stand on this issue, and that vendor isn’t gaining market share.

The good news is that the MITM revelation is exactly the sort of kick we’ve needed to improve the CA system. And even better, some very bright people are already thinking about it. The rest of this post will review the problem and talk about some of the potential solutions.

Certificates 101

For those of you who know the TLS protocol (and how certificates work), the following explanation is completely gratuitous. Feel free to skip it. If you don’t know — or don’t understand the problem — I’m going to take a minute to give some quick background.

TLS (formerly SSL) is probably the best-known security protocol on the Internet. Most people are familiar with TLS for its use in https — secure web — but it’s also used to protect email in transit, software updates, and a whole mess of other stuff you don’t even think about.

TLS protects your traffic by encrypting it with a strong symmetric key algorithm like AES or RC4. Unfortunately, this type of cryptography only works when the communicating parties share a key. Since you probably don’t share keys with most of the web servers on the Internet, TLS provides you with a wonderful means to do so: a public-key key agreement protocol.

I could spend a lot of time talking about this, but for our purposes, all you need to understand is this: when I visit https://gmail.com, Google’s server will send me a public key. If this key really belongs to Google, then everything is great: we can both derive a secure communication key, even if our attacker Mallory is eavesdropping on the whole conversation.

If, on the other hand, Mallory can intercept and modify our communications, the game is very different. In this case, she can overwrite Gmail’s key with her own public key. The result: I end up sharing a symmetric key with her! The worst part is that I probably won’t know this has happened: clever Mallory can make her own connection to Gmail and silently pass my traffic through — while reading every word. This scenario is called a Man in the Middle (MITM) Attack.

MITM attack. Alice is your grandfather, Bob is BankofAmerica.com, and Mallory establishes connections with both. (Wikipedia/CC license)

MITM attacks are older than the hills. Fortunately TLS has built-in protections to thwart them. Instead of transmitting a naked public key, the Gmail server wraps its key in a certificate; this is a simple file that embeds both the key and some identifying information, like “gmail.com”. The certificate is digitally signed by someone very trustworthy: one of a few dozen Certificate Authorities (CA) that your browser knows and trusts. These include companies like Verisign, and (yes) Trustwave.

TLS clients (e.g., web browsers) carry the verification keys for a huge number of CAs. When a certificate comes in, they can verify its signature to ensure that it’s legit. This approach works very well, under one very important assumption: namely, Mallory won’t be able to get a signed certificate on a domain she doesn’t own.
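
You can watch this machinery work using Python’s standard library, which (like a browser) relies on a set of trusted root CAs provided by the platform. The hostname below is just an example; the connection only succeeds because some CA in that root set vouched for the certificate the server presented.

    import socket
    import ssl

    ctx = ssl.create_default_context()              # loads the platform's trusted CA roots
    with socket.create_connection(("gmail.com", 443)) as sock:
        with ctx.wrap_socket(sock, server_hostname="gmail.com") as tls:
            cert = tls.getpeercert()                # only available after CA verification succeeds
            print("subject:", cert["subject"])
            print("issuer: ", cert["issuer"])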

What’s wrong with the CA model?

The real problem with the CA model is that every root CA has the power to sign any domain, which completely unravels the security of TLS. So far the industry has policed itself using the Macaroni Grill model: If a CA screws up too badly, they face being removed from the ‘trusted’ list of major TLS clients. In principle this should keep people in line, since it’s the nuclear option for a CA — essentially shutting down their business.

Unfortunately while this sounds good it’s tricky to implement in practice. That’s because:

  1. It assumes that browser vendors are willing to go nuclear on their colleagues at the CAs.
  2. It assumes that browser vendors can go nuclear on a major CA, knowing that the blowback might very well hurt their product. (Imagine that your browser unilaterally stopped accepting Verisign certs. What would you do?)
  3. It assumes that someone will catch misbehaving CAs in the first place.

What’s fascinating about the Trustwave brouhaha is that it’s finally giving us some visibility into how well these assumptions play out in the real world.

So what happened with Trustwave?

In late January of this year, Trustwave made a cryptic update to their CA policy. When people started asking about it, they responded with a carefully-worded post on the company blog. When you cut through the business-speak, here’s what it says:

We sold the right to generate certificates — on any domain name, regardless of whether it belongs to one of our clients or not — and packed this magical capability into a box. We rented this box to a corporate client for the express purpose of running Man-in-the-Middle attacks to eavesdrop on their employees’ TLS-secured connections. At no point did we stop to consider how damaging this kind of practice was, nor did we worry unduly about its potential impact on our business — since quite frankly, we didn’t believe it would have any.

I don’t know which part is worse. That a company whose entire business is based on trust — on the idea that people will believe them when they say a certificate is legit — would think they could get away with selling a tool to make fraudulent certificates. Or that they’re probably right.

But this isn’t the worst of it. There’s reason to believe that Trustwave isn’t alone in this practice. In fact, if we’re to believe the rumors, Trustwave is only noteworthy in that they stopped. Other CAs may still be up to their ears in it.

And so this finally brings us to the important part of this post: what’s being done, and what can we do to make sure that it never happens again?

Option 1: Rely on the browser vendors

What’s particularly disturbing about the Trustwave fiasco is the response it’s gotten from the various browser manufacturers.

So far exactly one organization has taken a strong stand against this practice. The Mozilla foundation (makers of Firefox) recently sent a strongly-worded letter to all of their root CAs — demanding that they disclose whether such MITM certificates exist, and that they shut them down forthwith. With about 20% browser share (depending on who’s counting), Mozilla has the means to enforce this. Assuming the vendors are honest, and assuming Mozilla carries through on its promise. And assuming that Mozilla browser-share doesn’t fall any further.

That’s the good news. Less cheerful is the deafening silence from Apple, Microsoft and Google. These vendors control most of the remaining browser market, and to the best of my knowledge they’ve said nothing at all about the practice. Publicly, anyway. It’s possible that they’re working the issue privately; if so, more power to them. But in the absence of some evidence, I find it hard to take this on faith.

Option 2: Sunlight is the best disinfectant

The Trustwave fiasco exposes two basic problems with the CA model: (1) any CA can claim ownership of any domain, and (2) there’s no easy way to know which domains a CA has put its stamp on.

This last is very much by CA preference: CAs don’t want to reveal their doings, on the theory that it would harm their business. I can see where they’re coming from (especially if their business includes selling MITM certs!) Unfortunately, allowing CAs to operate without oversight is one of those quaint practices (like clicking on links sent by strangers) that made sense in a more innocent time, but no longer has much of a place in our world.

Merkle tree (Wikipedia/CC)

Ben Laurie and Adam Langley feel the same way, and they’ve developed a plan to do something about it. The basic idea is this:

  1. Every new certificate should be published in a public audit log. This log will be open to the world, which means that everyone can scan for illegal entries (i.e., their own domain appearing in somebody else’s certificate.)
  2. Anytime a web server hands out a certificate, it must prove that the certificate is contained in the list.

The beautiful thing is that this proof can be conducted relatively efficiently using a Merkle hash tree. The resulting proofs are quite short (log(N) hashes, where N is the total number of certificates). Browsers will need to obtain the current tree root, which requires either (a) periodic scanning of the tree, or (b) some degree of trust in an authority who will periodically distribute signed root nodes.
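
Here’s a toy version of such an inclusion proof in Python, assuming SHA-256 and a power-of-two number of certificates. A real audit log would handle odd-sized trees and distinguish leaf hashes from interior-node hashes, but the log(N)-sized proof works just as sketched.

    import hashlib

    def H(data: bytes) -> bytes:
        return hashlib.sha256(data).digest()

    def build_tree(leaves):
        """Return every level of the Merkle tree, leaf hashes first, root last."""
        levels = [[H(leaf) for leaf in leaves]]
        while len(levels[-1]) > 1:
            prev = levels[-1]
            levels.append([H(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
        return levels

    def prove(levels, index):
        """Collect the log(N) sibling hashes needed to prove inclusion of leaf `index`."""
        proof = []
        for level in levels[:-1]:
            sibling = index ^ 1
            proof.append((level[sibling], sibling < index))   # (hash, sibling-is-on-the-left)
            index //= 2
        return proof

    def verify(leaf, proof, root):
        node = H(leaf)
        for sibling, sibling_is_left in proof:
            node = H(sibling + node) if sibling_is_left else H(node + sibling)
        return node == root

    certs = [b"cert: gmail.com", b"cert: example.org", b"cert: bank.com", b"cert: blog.net"]
    levels = build_tree(certs)
    root = levels[-1][0]                            # what browsers would need to obtain
    proof = prove(levels, index=2)
    assert verify(b"cert: bank.com", proof, root)   # short proof, no full list required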

Along the same lines, the EFF has a similar proposal called the Sovereign Keys Project. SKP also proposes a public log, but places stronger requirements on what it takes to get into the log. It’s quite likely that in the long run these projects will merge, or give birth to something even better.

Option 3: Eternal vigilance

The problem with SKP and the Laurie/Langley proposal is that both require changes to the CA infrastructure. Someone will need to construct these audit logs; servers will have to start shipping hash proofs. Both can be incrementally deployed, but will only be effective once deployment reaches a certain level.

Another option is to dispense with this machinery altogether, and deal with rogue CAs today by subjecting them to constant, unwavering surveillance. This is the approach taken by CMU’s Perspectives plugin and by Moxie Marlinspike’s Convergence.

The core idea behind both of these systems is to use ‘network perspectives’ to determine whether the certificate you’re receiving is the same certificate that everyone else is. This helps to avoid MITMs, since presumably the attacker can only be in the ‘middle’ of so many network paths. To accomplish this, both systems deploy servers called Notaries — run on a volunteer basis — which you can call up whenever you receive an unknown certificate. They’ll compare your version of the cert to what they see from the same server, and help you ring the alarm if there’s a mismatch.

A limitation of this approach is privacy; these Notary servers obviously learn quite a bit about the sites you visit. Convergence extends the Perspectives plugin to address some of these issues, but fundamentally there’s no free lunch here. If you’re querying some external party, you’re leaking information.

One solution to this problem is to dispense with online notary queries altogether, and just ask people to carry a list of legitimate certificates with them. If we assume that there are 4 million active certificates in the world, we could easily fit them into a < 40MB Bloom filter. This would allow us to determine whether a cert is ‘on the list’ without making an online query. Of course, this requires someone to compile and maintain such a list. Fortunately there are folks already doing this, including the EFF’s SSL Observatory project.
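
The back-of-the-envelope math holds up, too. Here’s a quick sizing check plus a toy Bloom filter in Python; the 4 million certificates and the one-in-a-billion false-positive rate are assumptions, and a production filter would use a cheaper hash than repeated SHA-256.

    import hashlib
    import math

    n = 4_000_000                                  # assumed number of active certificates
    p = 1e-9                                       # target false-positive rate
    bits = -n * math.log(p) / (math.log(2) ** 2)
    print(f"{bits / 8 / 2**20:.1f} MiB")           # roughly 20 MiB, comfortably under 40MB

    class Bloom:
        def __init__(self, m_bits: int, k: int):
            self.m, self.k = m_bits, k
            self.bits = bytearray(m_bits // 8 + 1)

        def _positions(self, item: bytes):
            for i in range(self.k):
                h = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
                yield int.from_bytes(h, "big") % self.m

        def add(self, item: bytes):
            for pos in self._positions(item):
                self.bits[pos // 8] |= 1 << (pos % 8)

        def __contains__(self, item: bytes) -> bool:
            return all(self.bits[pos // 8] & (1 << (pos % 8))
                       for pos in self._positions(item))

    known_certs = Bloom(m_bits=1_000_003, k=30)
    known_certs.add(b"sha256 fingerprint of a legitimate cert")
    assert b"sha256 fingerprint of a legitimate cert" in known_certs
    assert b"fingerprint of a cert nobody has seen" not in known_certs   # almost certainly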

Option 4: The hypothetical

The existence of these proposals is definitely heartening. It means that people are taking this seriously, and there’s an active technical discussion on how to make things better.

Since we’re in this mode, let me mention a few other things that could make a big difference in detecting exploits. For one thing, it would be awfully nice if web servers had a way to see things through their clients’ eyes. One obvious way to do this is through script: use Javascript to view the current server certificate, and report the details back to the server.

Of course this isn’t perfect — a clever MITM could strip the Javascript or tamper with it. Still, obfuscation is a heck of a lot easier than de-obfuscation, and it’s unlikely that a single attacker is going to win an arms race against a variety of sites.

Unfortunately, this idea has to be relegated to the ‘could be, should be’ dustbin, mostly because Javascript doesn’t have access to the current certificate info. I don’t really see the reason for this, and I sure hope that it changes in the future.

Option 5: The long arm of the law

I suppose the last option — perhaps the least popular — is just to treat CAs the same way that you’d treat any important, trustworthy organization in the real world. That means: you cheat, you pay the penalty. Just as we shouldn’t tolerate Bank of America knowingly opening a credit line in the name of a non-customer, we shouldn’t tolerate a CA doing the same.

Option 6: Vigilante justice

Ok, I’m only kidding about this one, cowboy. You can shut down that LOIC download right now.

In summary

I don’t know that there’s a magical vaccine that will make the CA system secure, but I’ve come to believe that the current approach is not working. It’s not just examples like Trustwave, which (some might argue) is a relatively limited type of abuse. It’s that the Trustwave revelation comes in addition to a steady drumbeat of news about stolen keys, illegitimately-obtained certificates, and various other abuses.

While dealing with these problems might not be easy, what’s shocking is how easy it would be to at least detect and expose the abuses at the core of it — if various people agreed that this was a worthy goal. I do hope that people start taking this stuff seriously, mostly because being a radical is hard, hard work. I’m just not cut out for it.

Bad movie cryptography, ‘Swordfish’ edition

Hackers are paler than the general public. Also, they use gel.

I was just working on an honest-to-god technical post when I thought: here’s an idea, let’s illustrate this point with a reference to the classic bad-security movie ‘Swordfish‘. What a terrible mistake.

In searching for a link I turned up what purports to be Skip Woods’ original shooting script. And now I’m not going to get any work done until I get this off my chest: holy &#^$*&# crap the cryptography in that movie is way worse than I thought it was. 

I know, I know, it’s a ten year old movie and it’s all been said before. So many times that it’s not even shooting fish in a barrel anymore, it’s more like shooting frozen fish in a barrel.

There isn’t much crypto in the movie. But what there is, whew… If you consider a modified Pritchard scale where the X axis is ‘refers to a technology that could actually exist‘ and the Y axis is ‘doesn’t make me want to stab myself’, Skip Woods has veered substantially into negative territory.

I know most people will say something like ‘Duh’ or ‘It’s swordfish!’ or ‘What do you expect from a movie where a guy breaks a password while John Travolta holds a gun to his head and Halle Berry fiddles around in his lap.’ And yes, I realize that this happens. But that stuff actually doesn’t trouble me so much.

What does bother me is that the DoD system he breaks into uses 128-bit RSA encryption. Does anyone really think that the NSA would validate that? And then there’s this exchange (emphasis mine):

                            GABRIEL
                  Here's the deal. I need a worm,
                  Stanley. A hydra, actually. A
                  multi-headed worm to break an
                  encryption and then sniff out
                  latent digital footprints
                  throughout an encrypted network.

                                STANLEY
                  What kind of cypher?

                                GABRIEL
                  Vernam encryption.

                                STANLEY
                  A Vernam's impossible. Its key
                  code is destroyed upon
                  implementation. Not to mention
                  being a true 128 bit encryption.

                                GABRIEL
                  Actually, we're talking 512 bit.

Ok, I don’t know about the stuff at the beginning — but the rest is serious. We’re not going after a mere Vernam One-Time Pad, which would just be impossible to break. Instead we’re going after the Big Kahuna, the true 128-bit unbreakable Vernam One-Time Pad. No, wait, that’s too easy. To do this right, we’re gonna have to break the full 512-bit unbreakable Vernam One-Time Pad, which is at least 2^384 times as unbreakable as the regular unbreakable kind. Get Halle back in here!
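
For anyone keeping score at home, here is the entirety of Vernam encryption, sketched in Python: XOR the message with a truly random pad of the same length. There’s no ‘128 bit’ or ‘512 bit’ about it, and if the pad is random, kept secret and never reused, no worm (multi-headed or otherwise) is going to break it.

    import os

    message = b"ATTACK THE DOD MAINFRAME AT DAWN"
    pad = os.urandom(len(message))            # the key is as long as the message

    ciphertext = bytes(m ^ k for m, k in zip(message, pad))
    recovered = bytes(c ^ k for c, k in zip(ciphertext, pad))

    assert recovered == message               # decryption is the very same XOR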

What kills me is that if you squint a little some of this technical jargon kind of makes sense. This can only mean one thing: Skip Woods brought in a technical advisor. But having done so, he obviously took the advice he was given and let it fly prettily out the windows of his Mercedes on the way home. Then he wrote what he wanted to write. Who needs an unbreakable cipher when we can have an unbreakable cipher with a frickin’ 128, er, 512 bit key!

I thought this post would be cathartic, but the truth is I just feel dirty now. Where will this end? Will I find myself criticizing Mercury Rising and Star Trek? The thing is, I like movies, even bad ones. I don’t ask for realism. I just have limits.

And Swordfish is a bridge too far. If you’re a Hollywood type and you need someone to vet your scripts, I’ll do it. Cheap. I won’t leave you all hung up in painful details — if your plot requirements have the main character breaking cryptography in his head, I’ll find a way to make it work. But it won’t be a One-Time Pad and it sure as hell won’t be 128-bit RSA. It will be *ahem* realistic.