On cellular encryption

If you’re interested in technology/privacy issues then you probably heard last week’s big news out of the Boston Marathon case. It comes by way of former FBI agent Tim Clemente, who insists that our government routinely records all domestic phone calls.

Clemente’s claim generated lots of healthy skepticism. This isn’t because the project is technically infeasible (the numbers mostly add up), or because there’s no precedent for warrantless wiretapping. To me the most convincing objection was simple: it’d be hard to keep secret.* Mostly for boring phone company reasons.

But this led to another interesting discussion. What if we forget local phone eavesdropping and focus on an ‘easier’ problem: tapping only cellular phone calls.
Cellular eavesdropping seems a lot more tractable, if only because mobile calls are conducted on a broadcast channel. That means you can wiretap with almost no carrier involvement. In fact there’s circumstancial evidence that this already happening — just by different parties than you’d think. According to a new book by reporters Marc Ambinder and Dave Brown:

The FBI has quietly removed from several Washington, D.C.–area cell phone towers, transmitters that fed all data to wire rooms at foreign embassies.

This raises a few questions: once you’ve tapped someone’s cellular signals, what do you do with the data? Isn’t it all encrypted? And how easy would it be for the US or some foreign government to lay their hands on the plaintext?

All of this which serves as a wonderful excuse to noodle about the state of modern cellular encryption. Be warned that this is not going to be a short post! For those who don’t like long articles, here’s the TL;DR: cellular encryption is a whole lot worse than you think.

GSM

GSM is the granddaddy of all digital cellular protocols, and it remains of the most popular protocols in the world. One thing that makes GSM special is its call encryption capability: the protocol is designed to encrypt all calls in between the handset and the local tower.

Call encryption is facilitated by a long-term secret key (call it K) that’s stored within the tamper-resistant SIM card in your GSM phone. Your carrier also has a copy of this key. When your GSM phone connects to a tower, the parties execute the following authentication and key agreement protocol:

GSM authentication and key agreement protocol (source). MS represents the ‘mobile station’ (phone) and
HLR is the ‘home location register’, a central database. The MS and HLR combine a long-term secret Ki
with a random nonce RAND to create the shared communication key Kc. A3 and A8 are
typically implemented using the COMP128 function.
The interaction above serves two purposes: first, both the phone and carrier (HLR) derive a pair of values that will be used for authentication and key agreement. This includes a session key Kc as well as a short authentication token SRES that the phone will use to authenticate itself to the tower. Derivation is performed using two functions (A3 and A8) that accept the long-term secret K and with a random nonce RAND.
There are a handful of well-known problems with the GSM security protocols. In no particular order, they are:
  1. Lack of tower authentication. GSM phones authenticate to the tower, but the tower doesn’t authenticate back. This means that anyone can create a ‘fake’ tower that your phone will connect to. The major problem here is that in GSM, the tower gets to pick the encryption algorithm! That means your attacker can simply turn encryption off (by setting encryption ‘algorithm’ A5/0) and simply route the cleartext data itself.In theory your phone is supposed to alert you to this kind of attack, but the SIM chip contains a bit that can de-active the warning. And (as researcher Chris Paget discovered) carriers often set this bit.
  2. Bad key derivation algorithms. The GSM ciphers were developed using the ‘make stuff up and pray nobody sees it‘ school of cryptographic algorithm design. This is a bad approach, since it’s incredibly hard to keep algorithms secret — and when they do leak, they tend to break badly. This was the case for the original A3/A8 algorithms, which are both implemented using single function called COMP128-1. Unfortunately COMP128 turns out to be seriously broken — to the point where you can clone a user’s SIM key in as few as 8 queries.
  3. Bad encryption algorithms. Fortunately it’s easy to replace COMP128-1 by swapping out your SIM card. Unfortunately the situation is much worse for GSM’s A5/1 call encryption cipher, which is embedded in the hardware of most handsets and tower equipment. A5/1 was leaked around the same time as COMP128 and rapidly succumbed to a series of increasingly powerful attacks. Today it’s possible to conduct an efficient known-plaintext attack on A5/1 using a series of rainbow tables you can obtain from BitTorrent. The upshot is that most A5/1 calls can now be decrypted on a high-end PC.
  4. Terrible encryption algorithms. But it’s actually worse than that. GSM phones support an ‘export weakened‘ variant called A5/2, which is so weak you can break it in real time. The worst part is that A5/2 uses the same key as A5/1 — which means an active attacker (see #1 above) can briefly activate the A5/2 mode, crack to recover the encryption key, then switch back to A5/1 with a known key. This is much faster than attacking A5/1 directly, and allows eavesdroppers to intercept incoming phone calls from a legitimate (A5/1 supporting) tower.
Alleged ‘Stingray’ tracking device
mounted on an SUV (source).

Another unfortunate aspect of the GSM protocol is that you don’t need to attack the crypto to do useful things. For example, if all you want to do is determine which devices area in an area, you simply present yourself as a valid tower — and see which phones connect to you (by sending their IMSI values). This is the approach taken by IMSI-catchers like Stingray.

Now the ‘good news’ here is that attacks (1), (2) and (4) require active involvement by the attacker. This means they have to be after you specifically and — at least in principal — they’re detectable if you know what to look for. (You don’t.) However, the weaknesses in A5/1 are a whole different kettle of fish. They permit decryption of even passively recorded A5/1 GSM calls (in real time, or after the fact) even to an attacker with modest resources.

3G/4G/LTE

A valid response to the points above is to note that GSM is nearly 30 years old. You probably wouldn’t blame today’s Ford execs for the crash performance of a 1982 Ford Escort, and similarly you shouldn’t hold the GSM designers responsible for a 1980s protocol — even if billions of people still rely on it.

Overview of the 3G AKA protocol. Look familiar?
The great news is that modern phones often support the improved ‘3G’ (e.g., UMTS) or LTE standards. These offer a bundle of improvements that substantially improve security over the original GSM. These can be summed up as follows:
  1. Mutual authentication. The 3G protocols use a new ‘Authentication and Key Agreement‘ (AKA) protocol, which adds mutual authentication to the tower connection. To validate that the phone is speaking to a legitimate tower, the carrier now computes a MAC that the phone can verify before initiating a connection. This prevents many of the uglier protocol attacks that plagued GSM.
  2. Better authentication algorithms. The session keys and authentication tags are still computed using proprietary algorithms — now called f1-f5, f5* — but the algorithms are purportedly much stronger. Since their design is carrier-specific it’s not easy to say exactly how they work. However this 3GPP recommendation indicates that they might be based on a block cipher like AES.
  3. Better encryption. Call encryption in 3G uses a proprietary block cipher called KASUMI. KASUMI is based off of a Mitsubishi proposal called MISTY1, which was heavy customized to make it faster in cellular hardware.

The biggest source of concern for 3G/LTE is that you may not be using it. Most phones are programmed to gracefully ‘fail over’ to GSM when a 3G/4G connection seems unavailable. Active attackers exploit this feature to implement a rollback attack — jamming 3G/4G connections, and thus re-activating all of the GSM attacks described above.

A more subtle concern is the weakness of the KASUMI cipher. Unfortunately KASUMI seems much weaker than the original MISTY1 algorithm — so much weaker that in 2010 Dunkelman, Keller and Shamir were able to implement a related-key attack that recovered a full 128 bit call key in just under two hours!

Now before you panic, you should know that executing this attack requires a substantial amount of data, all of which must all be encrypted under highly unrealistic attack conditions. Still, it’s interesting to note that the same attack fails completely when applied to the original MISTY1 design.

Top: generation of keys (CK, IK) and authentication tags AUTN, XRES using the functions f1-f5. Bottom: one proposed implementation of those functions using a 128-bit block cipher Ek. This could be AES (source).
And yes, it looks complicated to me, too.

What if you have already the keys?

 The encryption flaws above seem pretty significant, and they really are — if you’re a private eavesdropper or a foreign government. For US national security organizations they’re probably beside the point. It seems unlikely that the NSA would have to ‘break’ any crypto at all.

This is because both the GSM and AKA protocols lack an important property known as forward secrecy. What this means is that if I can record an encrypted call, and later obtain the long-term key K for that phone, then I can still reliably decrypt the whole communication — even months or years later. (Protocols such as Diffie-Hellman or ECMQV prevent this.) Worse, for cellular conversations I can do it even if I only have one half (the tower side) of the communication channel.

Unfortunately I don’t think you have to be a conspiracy theorist to suppose that the carriers’ key databases are probably stored somewhere in Ft. Meade. Thus to the truly paranoid: stop talking on cellphones.

In conclusion

Sometimes conspiracy theories can be fun. Sometimes they’re downright creepy.

For me this discussion veers towards the ‘creepy’. Not so much because I think the NSA really is tapping all our cellphones (I suspect they just read our Facebook). Rather, the creepiness is in seeing just how vulnerable our privacy infrastructure is, even to people who are far less capable than the NSA.

As a result, your data isn’t just available to nation states: it’s also potentially available to that goofball neighbor who bought an IMSI-catcher off the Internet. Or to your business competitor. Or even to that one girl who finally got GnuRadio to compile.

We’re rapidly approaching a technological crossroads, one where we need to decide if we’re going to keep trusting others to protect our data — or if we’re going to take charge of it and treat carriers as the dumb and insecure pipes that they are. I suspect that we’re well on our way towards this world — and sadly, so does the FBI. It’ll be interesting to see how things play out.

Notes:

The argument is that recording all local calls in real time is a bigger job than just splitting major trunks or re-routing a few specific targets. It would seem to require significant changes to the US phone infrastructure. This is particularly true at the local office, where you’d need to upgrade and provision thousands of devices to record all calls passing through them. Building this capability is possible, but would require a lot of effort — not to mention the complicity of hundreds (or thousands) of collaborators at the local telcos. And that means a whole lot of people keeping a very big secret. People aren’t so good at that.

Zerocoin: making Bitcoin anonymous

Wow, what the heck is going on with Bitcoin?zerocoin

When I started this post, the value of a single bitcoin had surged upwards of $250. It’s corrected a bit since then (down $100 or so), but it’s pretty clear that we live in a very different world than we did two weeks ago.

And I’m not sure I really like this world. It’s a world where I have to listen to CNBC reporters try to understand Bitcoin. Ouch. I think we can all agree that we were better off before this happened.

The explosion of interest in Bitcoin is both wonderful and terrible. It’s wonderful because Bitcoin is an amazing technical innovation — the first decentralized electronic currency to actually make something of itself. It’s terrible because Bitcoin has some technical rough edges that really need to be filed off before we start using it for anything.

The rough edge that particularly interests me is user privacy. Or rather, Bitcoin’s troubling lack of it.

In this post I’m going to describe a new piece of research out of my lab at Johns Hopkins that provides one potential solution to this problem. This is joint work led by my hardworking students Ian Miers and Christina Garman, along with my colleague Avi Rubin. Our proposal is called Zerocoin, and we’ll be presenting it at this year’s IEEE S&P.

For those who just want the TL;DR, here it is:

Zerocoin is a new cryptographic extension to Bitcoin that (if adopted) would bring true cryptographic anonymity to Bitcoin. It works at the protocol level and doesn’t require new trusted parties or services. With some engineering, it might (someday) turn Bitcoin into a completely untraceable, anonymous electronic currency.

In the rest of the post I’m going to explain Zerocoin, what it can do for Bitcoin, and how far away that ‘someday‘ might be. This is going to be a long, wonky post, so I won’t be offended if you stop here and take my word that it’s all true.

For everyone else, strap in. I need to start with some background.

Bitcoin in 300 words

Before I get to Zerocoin I need to give the world’s shortest explanation of how Bitcoin works. (See here for a slightly less terrible explanation.)

At its heart, Bitcoin is a transaction network with a distributed public ledger. Transactions are files that contains messages like “User X transfers 3 bitcoins to user Y” and “User Y transfers 2.5 of those bitcoins to user Z”. Users aren’t identified by name. Instead, their identities are public keys for a digital signature scheme.* This allows users to sign their transactions, and makes it very difficult to forge them.

Now none of this stuff is really new. What makes Bitcoin special is the way it maintains the transaction ledger. Rather than storing the whole thing on a single computer, the ledger — called a block chain — is massively replicated and updated by a swarm of mutually distrustful parties running in a peer-to-peer network.

To make this work, nodes pull transactions off of a peer-to-peer broadcast network, then compete for the opportunity to tack them on the end of the chain. To keep one party from dominating this process (and posting bad transations), competition is enforced by making the parties solve hard mathematical problems called ‘proofs of work‘. The integrity of the block chain is enforced using hash chaining, which makes it very difficult to change history.

Now the block chain is fascinating, and if you’re interested in the gory details you should by all means see here. This post mostly isn’t going to get into it. For now all you need to know is that the block chain works like a global ledger. It’s easy to add (valid) transactions at the end, but it’s astonishingly difficult to tamper with the transactions that are already there.

So what’s the problem?

The block chain is Bitcoin’s greatest strength. Unfortunately from a privacy perspective, it’s also the currency’s greatest weakness.

This is because the block chain contains a record of every single Bitcoin transaction that’s ever been conducted. Due to the way Bitcoin works, this information can’t be limited to just a few trustworthy parties, since there are no trusted parties. This means all of your transactions are conducted in public.

Illustration of a Bitcoin block chain. Each transaction is tied to the one that precedes it.
The transaction at far left is almost certainly a drug deal.

In a sense this makes Bitcoin less private than cash, and even worse than credit cards. If you choose to engage in sensitive transactions on Bitcoin, you should be aware that a record will be preserved for all eternity. Spend with care.

Now some will say this is unfair, since Bitcoin users are not identified by name — the only identifier associated with your transactions is your public key. Moreover, you can make as many public keys as you’d like. In other words, Bitcoin offers privacy through pseudonymity, which some argue is almost as good as the real thing.

But don’t get too comfortable. Already several academic works have succeeded in de-anonymizing Bitcoin transactions. And this work is just getting started. You see, there’s an entire subfield of computer science that can roughly be described as ‘pulling information out of things that look exactly like the Bitcoin transaction graph’, and while these researchers haven’t done much to Bitcoin yet — that’s only because they’re still fighting over the grant money. We will see more.

If you’re a Bitcoin user who values your privacy, this should worry you. The worst part is that right now your options are somewhat limited. Roughly speaking, they are:

Just be careful. Generate lots of public keys and make sure your client software is extremely careful not to use them in ways that could tie one to another (e.g., getting ‘change’ from one key sent to another). This seems to be the major privacy thrust of the current Bitcoin development effort, and we’re all waiting to see how it pans out.

Use a laundry. For the more paranoid, there are services called ‘laundries that take in bitcoins from a whole bunch of users, mix them up and shuffle them back out. In theory this makes it hard to track your money. Unfortunately, laundries suffer from a few problems. First, they only work well if lots of people are using them, and today’s laundries have relatively low volume. More importantly, you’re entirely dependent on the honesty and goodwill of the laundry itself. A dishonest (or hacked) laundry can steal your coins, or even trace its inputs and outputs — which could completely undermine your privacy.

Use a Chaumian e-cash system. On the wonkier side, there have been attempts to implement real anonymous cryptographic e-Cash for Bitcoin. I’ve written about these systems before and while I think they’re neat, the existing schemes (from Chaum on forward) have one critical flaw: they all rely on a central ‘bank’ to issue and redeem e-Cash tokens. The need for this bank has been a major stumbling block in getting these systems up and running, and it’s almost unworkable for Bitcoin — since trusted parties are antithetical to Bitcoin’s decentralized nature.

In short, the current solutions aren’t perfect. It would be awfully nice if we had something better. Something with the power of cryptographic e-Cash, but without the need to change Bitcoin’s network model. And this is where Zerocoin comes in.

Zerocoin

Zerocoin is not intended as a replacement for Bitcoin. It’s actually a separate anonymous currency that’s designed to live side-by-side with Bitcoin on the same block chain. Zerocoins are fully exchangeable on a one-to-one basis with bitcoins, which means (in principle) you can use them with existing merchants.

Zerocoins themselves can be thought of literally as coins. They’re issued in a fixed denomination (for example, 1 BTC), and any user can purchase a zerocoin in exchange for the correct quantity of bitcoin. This purchase is done by placing a special new ‘Zerocoin Mint’ transaction onto the block chain.

Once a Mint transaction has been accepted by the Bitcoin peers, the same user can later redeem her zerocoin back into bitcoins. She simply embeds a (preferably new) destination Bitcoin address into a ‘Zerocoin Spend’ transaction, then sends it into the network. If the transaction checks out, the Bitcoin peers will treat it just like a normal Bitcoin transfer — meaning that she’ll receive the full bitcoin value of the coin (minus transaction fees) at the destination address.

Now you’re probably wondering what any of this has to do with privacy. To explain that, I need to give you one more piece of information:

Aside from educated guesswork, there’s no way to link a Zerocoin Mint transaction to the Zerocoin Spend transaction that redeems it.

Redeeming a zerocoin gives you a completely different set of bitcoins than the ones you used to purchase it. In fact, you can think of Zerocoin like the world’s biggest laundry — one that can handle millions of users, has no trusted party, and can’t be compromised. Once as user converts her bitcoins into zerocoins, it’s very hard to determine where she took them back out. Their funds are mixed up with all of the other users who also created zerocoins. And that’s a pretty powerful guarantee.

Illustration of a Bitcoin/Zerocoin block chain. A user transforms bitcoins into a zerocoin,
then (at some unspecified later point) ‘Spends’ it to redeem the bitcoins. The linkage between Mint
and Spend (dotted line) cannot be determined from the block chain data.

The key to the whole process is to make it all work at the protocol level — meaning, without adding new trusted parties. And doing that is the goal of Zerocoin.

How does it work?

Zerocoin uses a combination of digital commitments, one-way accumulators and zero-knowledge proofs, and some extensions to the existing Bitcoin protocol. It also shares some similarities to a previous work by Sander and Ta-Shma. For the details, you can see our paper. Here I’m going to try to give a very high-level intuition that avoids the muck.

The key idea in Zerocoin is that each coin commits to (read: encrypts) a random serial number. These coins are easy to create — all you need to do is pick the serial number and run a fast commitment algorithm to wrap this up in a coin. The commitment works like encryption, in that the resulting coin completely hides the serial number . At the same time this coin ‘binds’ you to the number you’ve chosen. The serial number is secret, and it stays with you.

To ‘Mint’ the new coin you post it to the network along with a standard Bitcoin transaction containing enough (normal) bitcoins to ‘pay for’ it. The Mint transaction adds some new messages to the Bitcoin protocol, but fundamentally there’s no magic here. The Bitcoin network will accept the transaction into the block chain as long as the input bitcoins check out.**

The Zerocoin ‘Spend’ transaction is a little bit more complicated. To redeem your zerocoin, you first create a new transaction that contains the coin’s serial number (remember that you kept it secret after you made the coin). You also attach a zero-knowledge proof of the following two statements:

  1. You previously posted a valid zerocoin on the block chain.
  2. This particular zerocoin contained the serial number you put in your transaction.
The key to making this all work is that zero-knowledge proof. What you need to know about these is that anyone can verify such a proof, and she’ll be absolutely convinced that you’re telling the truth about these statements. At the same time, the proof reveals absolutely no other information (hence the ‘zero’ knowledge).
This means anyone who sees your Spend transaction will be convinced that you really did previously Mint a zerocoin, and that it contained the serial number you just revealed. They can then check the block chain to make sure that particular serial number has never been Spent before. At the same time, the zero knowledge property ensures that they they have absolutely no idea which zerocoin you’re actually spending. The number of such coins could easily run into the millions.
All of this leads us to one final question: where do your bitcoins go after you Mint a zerocoin, and how do you get them back when you Spend?
The simple answer is that they don’t go anywhere at all. The bitcoins used in a ‘Mint’ transaction just sit there on the block chain. The Zerocoin protocol semantics require that nobody can access those coins again except by publishing a valid Zerocoin ‘Spend’. When you publish a Spend, the protocol allows you to ‘claim’ any of the previously-committed bitcoins — regardless of who posted them. In other words, you Mint with one set of bitcoins, and you leave with someone else’s.

When will Zerocoin be available?

For those looking to use Zerocoin tomorrow, I would advise patience. We’ve written a proof-of-concept implementation that extends the C++ bitcoind client to support Zerocoin, and we’ll be releasing a cleaned up version of our code when we present the paper in May. (Update: available here.)

But before you get excited, I need to point out some pretty serious caveats.

First of all, Zerocoin is not cheap. Our current zero-knowledge proof averages around 40KB, and take nearly two seconds to verify. By the standards of advanced crypto primitives this is fantastic. At the same time, it poses some pretty serious engineering challenges — not least of which is: where do you store all these proofs?

This probably isn’t the end of the world. For one thing, it seems likely that we’ll be able to reduce the size and cost of verifying the proof, and we think that even the current proof could be made to work with some careful engineering. Still, Zerocoin as currently construed is probably not going to go online anytime soon. But some version of Zerocoin might be ready in the near future.

Another problem with Zerocoin is the difficulty of incrementally deploying it. Supporting the new Mint and Spend functionality requires changes to every Bitcoin client. That’s a big deal, and it’s unlikely that the Bitcoin folks are going to accept a unilateral protocol change without some serious pushback. But even this isn’t a dealbreaker: it should be possible to start Zerocoin off using some training wheels — using a trusted central party to assist with the process, until enough Bitcoin clients trust it and are willing to support it natively.

In fact, one of the biggest barriers to adoption is human beings themselves. As complicated as Bitcoin is, you can explain the crypto even to non-experts. This makes people happy. Unfortunately Zerocoin is a different animal. It will take time to convince people that these new techniques are safe. We hope to be there when it happens.

In conclusion

I realize that this blog post has run slightly longer than usual. Thanks for sticking with me!

As regular readers of this blog know, I have a passion for anything that gets interesting crypto out into the world. Bitcoin is a great example of this. It would be wonderful if this gave us an opportunity to do even more interesting things. Perfectly untraceable e-Cash would definitely fit that bill.

But even if we don’t get there, the fact that reputable computer science conferences are accepting papers about ‘that crazy Bitcoin thing’ tells you a lot about how much it’s grown up. In the long run, this is good news for everyone.

Notes:

* There are a lot of simplifications in here. Identities are the hash of your public key. The block chain is really computed using a Merkle tree for efficiency reasons. The peer-to-peer network isn’t quite a broadcast network. Did I mention you should read the paper?

** Note that for those who ‘know’ their Bitcoin, you can think of the Zerocoin as a piece of extra data that gets added to a Bitcoin transaction. The inputs are still standard bitcoins. There’s just no output. Instead, the transaction contains the coin data.