# Attack of the week: Voice calls in LTE

I haven’t written an “attack of the week” post in a while, and it’s been bumming me out. This is not because there’s been a lack of attacks, but mostly because there hasn’t been an attack on something sufficiently widely-used that it can rouse me out of my blogging torpor.

But today brings a beautiful attack called ReVoLTE, on a set of protocols that I particularly love to see get broken: namely, cellular protocols. And specifically, the (voice over) LTE standards. I’m excited about these particular protocols — and this new attack — because it’s so rare to see actual cellular protocols and implementations get broken. This is mostly because these standards are developed in smoke-filled rooms and written up in 12,000 page documents that researchers never have the energy to deal with. Moreover, implementing the attacks requires researchers to mess with gnarly radio protocols.

And so, serious cryptographic vulnerabilities can spread all over the world, presumably only exploited by governments, before a researcher actually takes a look at them. But every now and then there’s an exception, and today’s attack is one of them.

The attack itself is by David Rupprecht, Katharina Kohls, Thorsten Holz, and Christina Pöpper at RUB and NYU Abu Dhabi. It’s a lovely key re-installation attack on a voice protocol that you’re probably already using, assuming you’re one of the older generation who still make phone calls using a cellular phone.

### What is LTE, and what is VoLTE?

The basis for our modern cellular telephony standards began in Europe back in the 1980s, with a standard known as Global System for Mobile. GSM was the first major digital cellular telephony standard, and it introduced a number of revolutionary features such as the use of encryption to protect phone calls. Early GSM was designed primarily for voice communications, although data could be sent over the air at some expense.

As data became more central to cellular communications, the Long Term Evolution (LTE) standards were devised to streamline this type of communication. LTE builds on a group of older standards such as GSM, EDGE and HSPA to make data communication much faster. There’s a lot of branding and misbranding in this area, but the TL;DR is that LTE is a data communications system that serves as a bridge between older packet data protocols and future 5G cellular data technologies.

Of course, history tells us that once you have enough (IP) bandwidth, concepts like “voice” and “data” start to blur together. The same is true with modern cellular protocols. To make this transition smoother, the LTE standards define Voice-over-LTE (VoLTE), which is an IP-based standard for transmitting voice calls directly over the data plane of the LTE system, bypassing the circuit-switched portion of the cellular network entirely. As with standard VoIP calls, VoLTE calls can be terminated by the cellular provider and connected to the normal phone network. Or (as is increasingly common) they can be routed directly from one cellular customer to another, even across providers.

Like standard VoIP, VoLTE is based on two popular IP-based protocols: Session Initiation Protocol (SIP) for call establishment, and Real Time Transport Protocol (which should be called RTTP but is actually called RTP) to actually handle voice data. VoLTE also adds some additional bandwidth optimizations, such as header compression.

### Ok, what does this have to do with encryption?

Like GSM before it, LTE has a standard set of cryptographic protocols for encrypting packets while they travel over the air. These are mainly designed to protect your data while it travels between your handset (called the “User Equipment”, or UE) and the cellular tower (or wherever your provider decides to terminate the connection.) This is because cellular providers view outside eavesdroppers as the enemy, not themselves. Obviously.

(However, the fact that VoLTE connections can occur directly between customers on different provider networks does mean that the VoLTE protocol itself has some additional and optional encryption protocols that can happen at higher network layers. These aren’t relevant to the current paper except insofar as they could screw things up. We’ll talk about them briefly further below.)

Historical GSM encryption had many weaknesses: bad ciphers, protocols where only the handset authenticated itself to the tower (meaning an attacker could impersonate a tower, giving rise to the “Stingray“) and so on. LTE fixed many of the obvious bugs, while keeping a lot of the same structure.

Let’s start with the encryption itself. Assuming key establishment has already happened — and we’ll talk about that in just a minute — each data packet is encrypted using a stream cipher mode using some cipher called “EEA” (which in practice can be implemented with things like AES). The encryption mechanism is basically CTR-mode, as shown below:

Since the encryption algorithm itself (EEA) can be implemented using a strong cipher like AES, it’s unlikely that there’s any direct attack on the cipher itself, as there was back in the GSM days. However, even with a strong cipher, it’s obvious that this encryption scheme is a giant footgun waiting to go off.

Specifically: the LTE standard uses an (unauthenticated) stream cipher with a mode that will be devastatingly vulnerable should the counter — and other inputs, such as ‘bearer’ and ‘direction’ — ever be re-used. In modern parlance the term for this concept is “nonce re-use attack“, but the potential risks here are not modern. They’re well-known and ancient, going back to the days of hair-metal and even disco.

In fairness, the LTE standards says “don’t re-use these counters, please“. But the LTE standards are also like 7,000 pages long, and anyway, this is like begging toddlers not to play with a gun. Inevitably, they’re going to do that and terrible things will happen. In this case, the discharging gun is a keystream re-use attack in which two different confidential messages get XORed with the same keystream bytes. This is known to be utterly devastating for message confidentiality.

### So what’s ReVoLTE?

The ReVoLTE attack paper points out that, indeed, this highly vulnerable encryption construction is in fact, misused by real equipment in the wild. Specifically, the authors analyze actual VoLTE calls made using commercial equipment, and show that they can exploit something called a “key re-installation attack”. (Much credit for the basic observation goes to Raza and Lu, who first pointed out the potential vulnerability. But the ReVoLTE research turns it into a practical attack.)

Let me give a quick overview of the attack here, although you should really read the paper.

You might assume that once LTE sets up a packet data connection, voice-over-LTE is just a question of routing voice packets over that connection alongside all of your other data traffic. In other words, VoLTE would be a concept that exists only above Layer 2. This isn’t precisely the case.

In fact, LTE’s data link layer introduces the concept of a “bearer“. Bearers are separate session identifiers that differentiate various kinds of packet traffic. Normal Internet traffic (your Twitter and Snapchat) goes over one bearer. SIP signalling for VoIP goes over another, and voice traffic packets are handled on a third. I don’t have much insight into the RF and network routing mechanisms of LTE, but I presume this is done because LTE networks want to enable quality of service mechanisms to ensure that these different packet flows are treated with different priority levels: i.e., your crummy TCP connections to Facebook can be prioritized at a lower level than your real-time voice calls.

This isn’t exactly a problem, but it raises an issue. Keys for LTE encryption are derived separately each time a new “bearer” is set up. In principle this should happen afresh each time you make a new phone call. This would result in a different encryption key for each call, thus eliminating the possibility that the same key will be re-used to encrypt two different sets of call packets. Indeed, the LTE standard says something like “you should use a different key each time you set up a new bearer to handle a new phone call.” But that doesn’t mean it happens.

In fact, in real implementations, two different calls that happen in close temporal proximity will end up using the exact same key — despite the fact that new (identically-named) bearers are configured between them. The only practical change that happens between those calls is that the encryption counter will reset back to zero. In the literature, this is sometimes called a key reinstallation attack. One can argue that this is basically an implementation error, although in this case the risks seem largely set up by the standard itself.

In practice, this attack leads to keystream re-use where an attacker can obtain the encrypted packets $C_1 = M_1 \oplus KS$ and $C_2 = M_2 \oplus KS$, which allows her to compute $C_1 \oplus C_2 = M_1 \oplus M_2$. Even better, if the attacker knows one of $M_1$ or $M_2$, she can immediately recover the other. This gives her a strong incentive to know one of the two plaintexts.

This brings us to the complete and most powerful attack scenario. Consider an attacker who can eavesdrop the radio connection between a target phone and the cellular tower, and who somehow gets “lucky enough” to record two different calls where the second happens immediately subsequent to the other. Now imagine she can somehow can guess the plaintext contents of one of the calls. In this eventuality, our attacker can completely decrypt the first call, using a simple XOR evaluation between the two sets of packets.

And of course, as it happens — luck has nothing to do with it. Since phones are designed to receive calls, an attacker who can eavesdrop that first call will be able to initiate a second call at exactly moment the first call ends. This second call, should it re-use the same encryption key with a counter set back to zero, will enable plaintext recovery. Even better, since our attacker actually controls the data in the second call, she may be able to recover the contents of the first one — pending a whole lot of implementation-specific details all breaking in her favor.

Here’s a picture of the overall attack, taken from the paper:

### So does the attack actually work?

At one level, this is really the entire question for the ReVoLTE paper. All of the ideas above sound great in theory, but they leave a ton of questions. Such as:

1. Is it feasible for (academic researchers) to actually sniff VoLTE connections?
2. Do real LTE systems actually re-install keys?
3. Can you actually initiate that second call quickly and reliably enough to make a handset and tower re-use a key?
4. Even if systems do re-install keys, can you actually know the digital plaintext of the second call — given that things like codecs and transcoding may totally change the (bitwise) contents of that second call, even if you have access to the “bits” flowing out of your attacker phone?

The ReVoLTE paper answers several of these questions in the affirmative. The authors are able to use a commercial software-defined radio downlink sniffer called Airscope in order to eavesdrop the downlink side of a VoLTE call. (As is typical with academic research, I expect that simply getting hold of the software and figuring out how to work it took months off some poor graduate students’ lives.)

In order for key re-use to happen, the researchers discovered that a second call has to occur very rapidly after the termination of the first one, but not too rapidly — about ten seconds for the providers they experimented with. Fortunately, it doesn’t really matter if the target picks the call up within that time — the “ringing”, i.e., SIP communication itself causes the provider to re-use the same key.

Many of the gnarliest issues thus revolve around issue (4), obtaining all of the plaintext bits for the attacker-initiated call. This is because a lot of things can happen to your plaintext as it travels from your attacker handset out to the victim’s phone and through the cellular network. These include nastiness such as transcoding of encoded audio data, which makes the audio sound the same but totally changes the binary representation of the audio. LTE networks also use RTP header compression that can substantially change big portions of the RTP packet.

Finally, packets sent by the attacker need to roughly line up with packets that happened in the first phone call. This can be problematic, as silent patches in a phone call result in shorter messages (called comfort noise), which may not overlap well with the original call.

The “real world attack” section of the paper is worth reading for all the details. It addresses many of the above concerns — specifically, the authors find that some codecs are not transcoded, and that roughly 89% of the binary representation of the target call can be recovered, for at least two European providers that the attackers tested.

This is an astonishingly high level of success, and frankly much better than I anticipated when I started the paper.

### So what can we do to fix this?

The immediate answer to this question is straightforward: since the vulnerability is a key re-use (re-installation) attack, just fix this attack. Make sure to derive a new key for each phone call, and never allow your packet counter to reset back to zero with the same key. Problem solved!

Or maybe not. Getting this right will require upgrading a lot of equipment, and frankly the fix itself isn’t terribly robust. It would be nice if standards could find a cleaner way to implement their encryption modes that isn’t instantly and catastrophically vulnerable to these nonce-reuse issues.

One possible direction is to use modes of encryption where nonce-misuse doesn’t result in catastrophic outcomes. This might be too expensive for some current hardware, but it’s certainly a direction that designers should be thinking about for the future, particular as the 5G standards are about to take over the world.

This new result also raises a general question about why the same damned attacks keep cropping up in standard after standard, many of which use very similar designs and protocols. At a certain point when you’ve had this same key re-installation issue happen in multiple widely-deployed protocols such as WPA2, maybe it’s time to make your specifications and testing procedures more robust to it? Stop treating implementers as thoughtful partners who will pay attention to your warnings, and treat them as (inadvertent) adversaries who are inevitably going to implement everything incorrectly.

Or alternatively, we can do what the Facebooks and Apples of the world are increasingly doing: make voice call encryption happen at a higher level of the OSI network stack, without relying on cellular equipment manufacturers to get anything right. We could even promote end-to-end encryption of voice calls, as WhatsApp and Signal and FaceTime do, assuming the US government would just stop trying to trip us up. Then (with the exception of some metadata) many of these problems would go away. This solution is particularly pertinent in a world where governments aren’t even sure if they trust their equipment providers.

Alternatively, we could just do what our kids have already done: and just stop answering those annoying voice calls altogether.

# Attack of the week: searchable encryption and the ever-expanding leakage function

A few days ago I had the pleasure of hosting Kenny Paterson, who braved snow and historic cold (by Baltimore standards) to come talk to us about encrypted databases.

Kenny’s newest result is with first authors Paul Grubbs, Marie-Sarah Lacharité and Brice Minaud (let’s call it GLMP). It isn’t so much about building encrypted databases, as it is about the risks of building them badly. And — for reasons I will get into shortly — there have been a lot of badly-constructed encrypted database schemes going around. What GLMP point out is that this weakness isn’t so much a knock against the authors of those schemes, but rather, an indication that they may just be trying to do the impossible.

Hopefully this is a good enough start to get you drawn in. Which is excellent, because I’m going to need to give you a lot of background.

### What’s an “encrypted” database, and why are they a problem?

Databases (both relational and otherwise) are a pretty important part of the computing experience. Modern systems make vast use of databases and their accompanying query technology in order to power just about every software application we depend on.

Because these databases often contain sensitive information, there has been a strong push to secure that data. A key goal is to encrypt the contents of the database, so that a malicious database operator (or a hacker) can’t get access to it if they compromise a single machine. If we lived in a world where security was all that mattered, the encryption part would be pretty easy: database records are, after all, just blobs of data — and we know how to encrypt those. So we could generate a cryptographic key on our local machine, encrypt the data before we upload it to a vulnerable database server, and just keep that key locally on our client computer.

Voila: we’re safe against a database hack!

The problem with this approach is that encrypting the database records leaves us with a database full of opaque, unreadable encrypted junk. Since we have the decryption key on our client, we can decrypt and read those records after we’ve downloaded them. But this approach completely disables one of the most useful features of modern databases: the ability for the database server itself to search (or query) the database for specific records, so that the client doesn’t have to.

Unfortunately, standard encryption borks search capability pretty badly. If I want to search a database for, say, employees whose salary is between $50,000 and$100,000, my database is helpless: all it sees is row after row of encrypted gibberish. In the worst case, the client will have to download all of the data rows and search them itself — yuck.

This has led to much wailing and gnashing of teeth in the database community. As a result, many cryptographers (and a distressing number of non-cryptographers) have tried to fix the problem with “fancier” crypto. This has not gone very well.

It would take me a hundred years to detail all of various solutions that have been put forward. But let me just hit a few of the high points:

• Some proposals have suggested using deterministic encryption to encrypt database records. Deterministic encryption ensures that a given plaintext will always encrypt to a single ciphertext value, at least for a given key. This enables exact-match queries: a client can simply encrypt the exact value (“John Smith”) that it’s searching for, and ask the database to identify encrypted rows that match it.
• Of course, exact-match queries don’t support more powerful features. Most databases also need to support range queries. One approach to this is something called order revealing encryption (or its weaker sibling, order preserving encryption). These do exactly what they say they do: they allow the database to compare two encrypted records to determine which plaintext is greater than the other.
• Some people have proposed to use trusted hardware to solve these problems in a “simpler” way, but as we like to say in cryptography: if we actually had trusted hardware, nobody would pay our salaries. And, speaking more seriously, even hardware might not stop the leakage-based attacks discussed below.

This summary barely scratches the surface of this problem, and frankly you don’t need to know all the details for the purpose of this blog post.

What you do need to know is that each of the above proposals entails has some degree of “leakage”. Namely, if I’m an attacker who is able to compromise the database, both to see its contents and to see how it responds when you (a legitimate user) makes a query, then I can learn something about the data being queried.

### What some examples of leakage, and what’s a leakage function?

Leakage is a (nearly) unavoidable byproduct of an encrypted database that supports queries. It can happen when the attacker simply looks at the encrypted data, as she might if she was able to dump the contents of your database and post them on the dark web. But a more powerful type of leakage occurs when the attacker is able to compromise your database server and observe the query interaction between legitimate client(s) and your database.

Take deterministic encryption, for instance.

Deterministic encryption has the very useful, but also unpleasant feature that the same plaintext will always encrypt to the same ciphertext. This leads to very obvious types of leakage, in the sense that an attacker can see repeated records in the dataset itself. Extending this to the active setting, if a legitimate client queries on a specific encrypted value, the attacker can see exactly which records match the attacker’s encrypted value. She can see how often each value occurs, which gives and indication of what value it might be (e.g., the last name “Smith” is more common than “Azriel”.) All of these vectors leak valuable information to an attacker.

Other systems leak more. Order-preserving encryption leaks the exact order of a list of underlying records, because it causes the resulting ciphertexts to have the same order. This is great for searching and sorting, but unfortunately it leaks tons of useful information to an attacker. Indeed, researchers have shown that, in real datasets, an ordering can be combined with knowledge about the record distribution in order to (approximately) reconstruct the contents of an encrypted database.

Fancier order-revealing encryption schemes aren’t quite so careless with your confidentiality: they enable the legitimate client to perform range queries, but without leaking the full ordering so trivially. This approach can leak less information: but a persistent attacker will still learn some data from observing a query and its response — at a minimum, she will learn which rows constitute the response to a query, since the database must pack up the matching records and send them over to the client.

If you’re having trouble visualizing what this last type of leakage might look like, here’s a picture that shows what an attacker might see when a user queries an unencrypted database vs. what the attacker might see with a really “good” encrypted database that supports range queries:

So the TL;DR here is that many encrypted database schemes have some sort of “leakage”, and this leakage can potentially reveal information about (a) what a client is querying on, and (b) what data is in the actual database.

### But surely cryptographers don’t build leaky schemes?

Sometimes the perfect is the enemy of the good.

Cryptographers could spend a million years stressing themselves to death over the practical impact of different types of leakage. They could also try to do things perfectly using expensive techniques like fully-homomorphic encryption and oblivious RAM — but the results would be highly inefficient. So a common view in the field is researchers should do the very best we can, and then carefully explain to users what the risks are.

For example, a real database system might provide the following guarantee:

“Records are opaque. If the user queries for all records BETWEEN some hidden values X AND Y then all the database will learn is the row numbers of the records that match this range, and nothing else.”

This is a pretty awesome guarantee, particularly if you can formalize it and prove that a scheme achieves it. And indeed, this is something that researchers have tried to do. The formalized description is typically achieved by defining something called a leakage function. It might not be possible to prove that a scheme is absolutely private, but we can prove that it only leaks as much as the leakage function allows.

Now, I may be overdoing this slightly, but I want to be very clear about this next part:

Proving your encrypted database protocol is secure with respect to a specific leakage function does not mean it is safe to use in practice. What it means is that you are punting that question to the application developer, who is presumed to know how this leakage will affect their dataset and their security needs. Your leakage function and proof simply tell the app developer what information your scheme is (provably) going to protect, and what it won’t.

The obvious problem with this approach is that application developers probably don’t have any idea what’s safe to use either. Helping them to figure this out is one goal of this new GLMP paper and its related work.

### So what leaks from these schemes?

GLMP don’t look at a specific encryption scheme. Rather, they ask a more general question: let’s imagine that we can only see that a legitimate user has made a range query — but not what the actual queried range values are. Further, let’s assume we can also see which records the database returns for that query, but not their actual values.

How much does just this information tell us about the contents of the database?

You can see that this is a very limited amount of leakage. Indeed, it is possibly the least amount of leakage you could imagine for any system that supports range queries, and is also efficient. So in one sense, you could say authors are asking a different and much more important question: are any of these encrypted databases actually secure?

### Can you give me a simple, illuminating example?

Let’s say I’m an attacker who has compromised a database, and observes the following two range queries/results from a legitimate client:

Query 1: SELECT * FROM Salaries BETWEEN ⚙️ and 🕹    Result 1: (rows 1, 3, 5)
Query 2: SELECT * FROM Salaries BETWEEN 😨 and 🎱    Result 2: (rows 1, 43, 3, 5)

Here I’m using the emoji to illustrate that an attacker can’t see the actual values submitted within the range queries — those are protected by the scheme — nor can she see the actual values of the result rows, since the fancy encryption scheme hides all this stuff. All the attacker sees is that a range query came in, and some specific rows were scooped up off disk after running the fancy search protocol.

So what can the attacker learn from the above queries? Surprisingly: quite a bit.

At very minimum, the attacker learns that Query 2 returned all of the same records as Query 1. Thus the range of the latter query clearly somewhat overlaps with the range of the former.  There is an additional record (row 43) that is not within the range of Query 1. That tells us that row 43 must must be either the “next” greater or smaller record than each of rows (1, 3, 5). That’s useful information.

Get enough useful information, it turns out that it starts to add up. In 2016, Kellaris, Kollios, Nissim and O’Neill showed that if you know the distribution of the query range endpoints — for example, if you assumed that they were uniformly random — then you can get more than just the order of records. You can reconstruct the exact value of every record in the database.

This result is statistical in nature. If I know that the queries are uniformly random, then I can model how often a given value (say, Age=34 out of a range 1-120) should be responsive to a given random query results. By counting the actual occurrences of a specific row after many such queries, I can guess which rows correlate to specific record values. The more queries I see, the more certain I can be.The Kellaris et al. paper shows that this takes $O(N^4~log~N)$ queries, where is the number of possible values your data can take on (e.g., the ages of your employees, ranging between 1 and 100 would give N=100.) This is assuming an arbitrary dataset. The results get much better if the database is “dense”, meaning every possible value occurs once.

In practice the Kellaris et al. results mean that database fields with small domains (like ages) could be quickly reconstructed after observing a reasonable number of queries from a legitimate user, albeit one who likes to query everything randomly.

The main bright spot in this research —- at least up until recently — was that many types of data have much larger domains. If you’re dealing with salary data ranging from, say, $1 to$200,000, then N=200,000 and this dominant $N^4$ tends to make Kellaris et al. attacks impractical, simply because they’ll take too long. Similarly, data like employee last names (encoded as a form that can be sorted and range-queries) gives you even vaster domains like $N=26^{12}$, say, and so perhaps we could pleasantly ignore these results and spend our time on more amusing engagements.

### I bet we can’t ignore these results, can we?

Indeed, it seems that we can’t. The reason we can’t sit on our laurels and hope for an attacker to die of old age recovering large-domain data sets is due to something called approximate database reconstruction, or $\epsilon-$ADR.

The setting here is the same: an attacker sits and watches an attacker make (uniformly random) range queries. The critical difference is that this attacker isn’t trying to get every database record back at its exact value: she’s willing to tolerate some degree of error, up to an additive $\epsilon N$. For example, if I’m trying to recover employee salaries, I don’t need them to be exact: getting them within 1% or 5% is probably good enough for my purposes. Similarly, reconstructing nearly all of the letters in your last name probably lets me guess the rest, especially if I know the distribution of common last names.

Which finally brings us to this new GLMP paper, which puts $\epsilon-$ADR on steroids. What it shows is that the same setting, if one is willing to “sacrifice” a few of the highest and lowest values in the database, an attacker can reconstruct nearly the full database in a much smaller (asymptotic) number of queries, specifically: $O(\epsilon^{-2} log~\epsilon^{-1})$ queries, where $\epsilon$ is the error parameter.

The important thing to notice about these results is that the value N has dropped out of the equation. The only term that’s left is the error term $\epsilon$. That means these results are “scale-free”, and (asymptotically, at least), they work just as well for small values of N as large ones, and large databases and small ones. This is really remarkable.

### Big-O notation doesn’t do anything for me: what does this even mean?

Big-O notation is beloved by computer scientists, but potentially meaningless in practice. There could be huge constants in these terms that render these attacks completely impractical. Besides, weird equations involving epsilon characters are impossible for humans to understand.

Sometimes the easiest way to understand a theoretical result is to plug some actual numbers in and see what happens. GLMP were kind enough to do this for us, by first generating several random databases — each containing 1,000 records, for different values of N. They then ran their recovery algorithm against a simulated batch of random range queries to see what the actual error rate looked like as the query count increased.

Here are their results:

Even after just 100 queries, the error in the dataset has been hugely reduced, and after 500 queries the contents of the database — excluding the tails — can be recovered with only about a 1-2% error rate.

Moreover, these experimental results illustrate the fact that recovery works at many scales: that is, they work nearly as well for very different values of N, ranging from 100 to 100,000. This means that the only variable you really need to think about as an attacker is: how close do I need my reconstruction to be? This is probably not very good news for any real data set.

### How do these techniques actually work?

The answer is both very straightforward and deeply complex. The straightforward part is simple; the complex part requires an understanding of Vapnik-Chervonenkis learning theory (VC-theory) which is beyond the scope of this blog post, but is explained in the paper.

At the very highest level the recovery approach is similar to what’s been done in the past: using response probabilities to obtain record values. This paper does it much more efficiently and approximately, using some fancy learning theory results while making a few assumptions.

At the highest level: we are going to assume that the range queries are made on random endpoints ranging from 1 to N. This is a big assumption, and more on it later! Yet with just this knowledge in hand, we learn quite a bit. For example: we can compute the probability that a potential record value (say, the specific salary $34,234) is going to be sent back, provided we know the total value lies in the range 1-N (say, we know all salaries are between$1 and $200,000). If we draw the resulting probability curve in freehand, it might look something like the chart below. This isn’t actually to scale or (probably) even accurate, but it illustrates a key point: by the nature of (random) range queries, records near the center are going to have a higher overall chance of being responsive to any given query, since the “center” values are more frequently covered by random ranges, and records near the extreme high- and low values will be chosen less frequently. The high-level goal of database reconstruction is to match the observed response rate for a given row (say, row 41) to the number of responses we’d expect see for different specific concrete values in the range. Clearly the accuracy of this approach is going to depend on the number of queries you, the attacker, can observe — more is better. And since the response rates are lower at the highest and lowest values, it will take more queries to guess outlying data values. You might also notice that there is one major pitfall here. Since the graph above is symmetric around its midpoint, the expected response rate will be the same for a record at .25*N and a record at .75*N — that is, a$50,000 salary will be responsive to random queries at precisely same rate as a $150,000 salary. So even if you get every database row pegged precisely to its response rate, your results might still be “flipped” horizontally around the midpoint. Usually this isn’t the end of the world, because databases aren’t normally full of unstructured random data — high salaries will be less common than low salaries in most organizations, for example, so you can probably figure out the ordering based on that assumption. But this last “bit” of information is technically not guaranteed to come back, minus some assumptions about the data set. Thus, the recovery algorithm breaks down into two steps: first, observe the response rate for each record as random range queries arrive. For each record that responds to such a query, try to solve for a concrete value that minimizes the difference between the expected response rate on that value, and the observed rate. The probability estimation can be made more efficient (eliminating a quadratic term) by assuming that there is at least one record in the database within the range .2N-.3N (or .7N-.8N, due to symmetry). Using this “anchor” record requires a mild assumption about the database contents. What remains is to show that the resulting attack is efficient. You can do this by simply implementing it — as illustrated by the charts above. Or you can prove that it’s efficient. The GLMP paper uses some very heavy statistical machinery to do the latter. Specifically, they make use of a result from Vapnik-Chervonenkis learning theory (VC-theory), which shows that the bound can be derived from something called the VC-dimension (which is a small number, in this case) and is unrelated to the actual value of N. That proof forms the bulk of the result, but the empirical results are also pretty good. ### Is there anything else in the paper? Yes. It gets worse. There’s so much in this paper that I cannot possibly include it all here without risking carpal tunnel and boredom, and all of it is bad news for the field of encrypted databases. The biggest additional result is one that shows that if all you want is an approximate ordering of the database rows, then you can do this efficiently using something called a PQ tree. Asymptotically, this requires $O(\epsilon^{-1} log~\epsilon^{-1})$ queries, and experimentally the results are again even better than one would expect. What’s even more important about this ordering result is that it works independently of the query distribution. That is: we do not need to have random range queries in order for this to work: it works reasonably well regardless of how the client puts its queries together (up to a point). Even better, the authors show that this ordering, along with some knowledge of the underlying database distribution — for example, let’s say we know that it consists of U.S. citizen last names — can also be used to obtain approximate database reconstruction. Oy vey! And there’s still even more: • The authors show how to obtain even more efficient database recovery in a setting where the query range values are known to the attacker, using PAC learning. This is a more generous setting than previous work, but it could be realistic in some cases. • Finally, they extend this result to prefix and suffix queries, as well as range queries, and show that they can run their attacks on a dataset from the Fraternal Order of Police, obtaining record recovery in a few hundred queries. In short: this is all really bad for the field of encrypted databases. ### So what do we do about this? I don’t know. Ignore these results? Fake our own deaths and move into a submarine? In all seriousness: database encryption has been a controversial subject in our field. I wish I could say that there’s been an actual debate, but it’s more that different researchers have fallen into different camps, and nobody has really had the data to make their position in a compelling way. There have actually been some very personal arguments made about it. The schools of thought are as follows: The first holds that any kind of database encryption is better than storing records in plaintext and we should stop demanding things be perfect, when the alternative is a world of constant data breaches and sadness. To me this is a supportable position, given that the current attack model for plaintext databases is something like “copy the database files, or just run a local SELECT * query”, and the threat model for an encrypted database is “gain persistence on the server and run sophisticated statistical attacks.” Most attackers are pretty lazy, so even a weak system is probably better than nothing. The countervailing school of thought has two points: sometimes the good is much worse than the perfect, particularly if it gives application developers an outsized degree of confidence of the security that their encryption system is going to provide them. If even the best encryption protocol is only throwing a tiny roadblock in the attacker’s way, why risk this at all? Just let the database community come up with some kind of ROT13 encryption that everyone knows to be crap and stop throwing good research time into a problem that has no good solution. I don’t really know who is right in this debate. I’m just glad to see we’re getting closer to having it. # Was the Efail disclosure horribly screwed up? TL;DR. No. Or keep reading if you want. On Monday a team of researchers from Münster, RUB and NXP disclosed serious cryptographic vulnerabilities in a number of encrypted email clients. The flaws, which go by the cute vulnerability name of “Efail”, potentially allow an attacker to decrypt S/MIME or PGP-encrypted email with only minimal user interaction. By the standards of cryptographic vulnerabilities, this is about as bad as things get. In short: if an attacker can intercept and alter an encrypted email — say, by sending you a new (altered) copy, or modifying a copy stored on your mail server — they can cause many GUI-based email clients to send the full plaintext of the email to an attacker controlled-server. Even worse, most of the basic problems that cause this flaw have been known for years, and yet remain in clients. The big (and largely under-reported) story of EFail is the way it affects S/MIME. That “corporate” email protocol is simultaneously (1) hated by the general crypto community because it’s awful and has a slash in its name, and yet (2) is probably the most widely-used email encryption protocol in the corporate world. The table at the right — excerpted from the paper — gives you a flavor of how Efail affects S/MIME clients. TL;DR it affects them very badly. Efail also happens to affect a smaller, but non-trivial number of OpenPGP-compatible clients. As one might expect (if one has spent time around PGP-loving folks) the disclosure of these vulnerabilities has created something of a backlash on HN, and among people who make and love OpenPGP clients. Mostly for reasons that aren’t very defensible. So rather than write about fun things — like the creation of CFB and CBC gadgets — today, I’m going to write about something much less exciting: the problem of vulnerability disclosure in ecosystems like PGP. And how bad reactions to disclosure can hurt us all. ### How Efail was disclosed to the PGP community Putting together a comprehensive timeline of the Efail disclosure process would probably be a boring, time-intensive project. Fortunately Thomas Ptacek loves boring and time-intensive projects, and has already done this for us. Briefly, the first Efail disclosures to vendors began last October, more than 200 days prior to the agreed publication date. The authors notified a large number of vulnerable PGP GUI clients, and also notified the GnuPG project (on which many of these projects depend) by February at the latest. From what I can tell every major vendor agreed to make some kind of patch. GnuPG decided that it wasn’t their fault, and basically stopped corresponding. All parties agreed not to publicly discuss the vulnerability until an agreed date in April, which was later pushed back to May 15. The researchers also notified the EFF and some journalists under embargo, but none of them leaked anything. On May 14 someone dumped the bug onto a mailing list. So the EFF posted a notice about the vulnerability (which we’ll discuss a bit more below), and the researchers put up a website. That’s pretty much the whole story. There are three basic accusations going around about the Efail disclosure. They can be summarized as (1) maintaining embargoes in coordinated disclosures is really hard, (2) the EFF disclosure “unfairly” made this sound like a serious vulnerability “when it isn’t”, and (3) everything was already patched anyway so what’s the big deal. ### Disclosures are hard; particularly coordinated ones I’ve been involved in two disclosures of flaws in open encryption protocols. (Both were TLS issues.) Each one poses an impossible dilemma. You need to simultaneously (a) make sure every vendor has as much advance notice as possible, so they can patch their software. But at the same time (b) you need to avoid telling literally anyone, because nothing on the Internet stays secret. At some point you’ll notify some FOSS project that uses an open development mailing list or ticket server, and the whole problem will leak out into the open. Disclosing bugs that affect PGP is particularly fraught. That’s because there’s no such thing as “PGP”. What we have instead is a large and distributed community that revolves around the OpenPGP protocol. The pillar of this community is the GnuPG project, which maintains the core GnuPG tool and libraries that many clients rely on. Then there are a variety of niche GUI-based clients and email plugin projects. Finally, there are commercial vendors like Apple and Microsoft. (Who are mostly involved in the S/MIME side of things, and may reluctantly allow PGP plugins.) Then, of course there are thousands of end-users, who will generally fail to update their software unless something really bad and newsworthy happens. The obvious solution to the disclosure problem to use a staged disclosure. You notify the big commercial vendors first, since that’s where most of the affected users are. Then you work your way down the “long tail” of open source projects, knowing that inevitably the embargo could break and everyone will have to patch in a hurry. And you keep in mind that no matter what happens, everyone will blame you for screwing up the disclosure. For the PGP issues in Efail, the big client vendors are Mozilla (Thunderbird), Microsoft (Outlook) and maybe Apple (Mail). The very next obvious choice would be to patch the GnuPG tool so that it no longer spits out unauthenticated plaintext, which is the root of many of the problems in Efail. The Efail team appears to have pursued exactly this approach for the client-side vulnerabilities. Sadly, the GnuPG team made the decision that it’s not their job to pre-emptively address problems that they view as ‘clients misusing the GnuPG API’ (my paraphrase), even when that misuse appears to be rampant across many of the clients that use their tool. And so the most obvious fix for one part of the problem was not available. This is probably the most unfortunate part of the Efail story, because in this case GnuPG is very much at fault. Their API does something that directly violates cryptographic best practices — namely, releasing unauthenticated plaintext prior to producing an error message. And while this could be understood as a reasonable API design at design time, continuing to support this API even as clients routinely misuse it has now led to flaws across the ecosystem. The refusal of GnuPG to take a leadership role in preemptively safeguarding these vulnerabilities both increases the difficulty of disclosing these flaws, and increases the probability of future issues. ### So what went wrong with the Efail disclosure? Despite what you may have heard, given the complexity of this disclosure, very little went wrong. The main issues people have raised seem to have to do with the contents of an EFF post. And with some really bad communications from Robert J. Hansen at the Enigmail (and GnuPG) project. The EFF post. The Efail researchers chose to use the Electronic Frontier Foundation as their main source for announcing the existence of the vulnerability to the privacy community. This hardly seems unreasonable, because the EFF is generally considered a trusted broker, and speaks to the right community (at least here in the US). The EFF post doesn’t give many details, nor does it give a list of affected (or patched) clients. It does give two pretty mild recommendations: 1. Temporarily disable or uninstall your existing clients until you’ve checked that they’re patched. 2. Maybe consider using a more modern cryptosystem like Signal, at least until you know that your PGP client is safe again. This naturally led to a huge freakout by many in the PGP community. Some folks, including vendors, have misrepresented the EFF post as essentially pushing people to “permanently” uninstall PGP, which will “put lives at risk” because presumably these users (whose lives are at risk, remember) will immediately fall back to sending incriminating information via plaintext emails — rather than temporarily switching their communications to one of several modern, well-studied secure messengers, or just not emailing for a few hours. In case you think I’m exaggerating about this, here’s one reaction from ProtonMail: The most reasonable criticism I’ve heard of the EFF post is that it doesn’t give many details about which clients are patched, and which are vulnerable. This could presumably give someone the impression that this vulnerability is still present in their email client, and thus would cause them to feel less than secure in using it. I have to be honest that to me that sounds like a really good outcome. The problem with Efail is that it doesn’t matter if your client is secure. The Efail vulnerability could affect you if even a single one of your communication partners is using an insecure client. So needless to say I’m not very sympathetic to the reaction around the EFF post. If you can’t be sure whether your client is secure, you probably should feel insecure. Bad communications from GnuPG and Enigmail. On the date of the disclosure, anyone looking for accurate information about security from two major projects — GnuPG and Enigmail — would not have been able to find it. They wouldn’t have found it because developers from both Enigmail and GnuPG were on mailing lists and Twitter claiming that they had never heard of Efail, and hadn’t been notified by the researchers. Needless to say, these allegations took off around the Internet, sometimes in place of real information that could have helped users (like, whether either project had patched.) It goes without saying that neither allegation was actually true. In fact, both project members soon checked with their fellow developers (and their memories) and found out that they’d both been given months of notice by the researchers, and that Enigmail had even developed a patch. (However, it turned out that even this patch may not perfectly address the issue, and the community is still working to figure out exactly what still needs to be done.) This is an understandable mistake, perhaps. But it sure is a bad one. ### PGP is bad technology and it’s making a bad community Now that I’ve made it clear that neither the researchers nor the EFF is out to get the PGP community, let me put on my mask and horns and tell you why someone should be. I’ve written extensively about PGP on this blog, but in the past I’ve written mostly from a technical point of view about the problems with PGP. But what’s really problematic about PGP is not just the cryptography; it’s the story it tells about path dependence and how software communities work. The fact of the matter is that OpenPGP is not really a cryptography project. That is, it’s not held together by cryptography. It’s held together by backwards-compatibility and (increasingly) a kind of an obsession with the idea of PGP as an end in and of itself, rather than as a means to actually make end-users more secure. Let’s face it, as a protocol, PGP/OpenPGP is just not what we’d develop if we started over today. It was formed over the years out of mostly experimental parts, which were in turn replaced, bandaged and repaired — and then worked into numerous implementations, which all had to be insanely flexible and yet compatible with one another. The result is bad, and most of the software implementing it is worse. It’s the equivalent of a beloved antique sports car, where the electrical system is totally shot, but it still drives. You know, the kind of car where the owner has to install a hand-switch so he can turn the reverse lights on manually whenever he wants to pull out of a parking space. If PGP went away, I estimate it would take the security community less than a year to entirely replace (the key bits of) the standard with something much better and modern. It would have modern crypto and authentication, and maybe even extensions for future post-quantum future security. It would be simple. Many bright new people would get involved to help write the inevitable Rust, Go and Javascript clients and libraries. Unfortunately for us all, (Open)PGP does exist. And that means that even fancy greenfield email projects feel like they need to support OpenPGP, or at least some subset of it. This in turn perpetuates the PGP myth, and causes other clients to use it. And as a direct result, even if some clients re-implement OpenPGP from scratch, other clients will end up using tools like GnuPG which will support unauthenticated encryption with bad APIs. And the cycle will go round and around, like a spaceship stuck near the event horizon of a black hole. And as the standard perpetuates itself, largely for the sake of being a standard, it will fail to attract new security people. It will turn away exactly the type of people who should be working on these tools. Those people will go off and build encryption systems in a totally different area, or they’ll get into cryptocurrency. And — with some exceptions — the people who work in the community will increasingly work in that community because they’re supporting PGP, and not because they’re trying to seek out the best security technologies for their users. And the serious (email) users of PGP will be using it because they like the idea of using PGP better than they like using an actual, secure email standard. And as things get worse, and fail to develop, people who work on it will become more dogmatic about its importance, because it’s something threatened and not a real security protocol that anyone’s using. To me that’s where PGP is going today, and that is why the community has such a hard time motivating itself to take these vulnerabilities seriously, and instead reacts defensively. Maybe that’s a random, depressing way to end a post. But that’s the story I see in OpenPGP. And it makes me really sad. # A few notes on Medsec and St. Jude Medical In Fall 2016 I was invited to come to Miami as part of a team that independently validated some alleged flaws in implantable cardiac devices manufactured by St. Jude Medical (now part of Abbott Labs). These flaws were discovered by a company called MedSec. The story got a lot of traction in the press at the time, primarily due to the fact that a hedge fund called Muddy Waters took a large short position on SJM stock as a result of these findings. SJM subsequently sued both parties for defamation. The FDA later issued a recall for many of the devices. Due in part to the legal dispute (still ongoing!), I never had the opportunity to write about what happened down in Miami, and I thought that was a shame: because it’s really interesting. So I’m belatedly putting up this post, which talks a bit MedSec’s findings, and implantable device security in general. By the way: “we” in this case refers to a team of subject matter experts hired by Bishop Fox, and retained by legal counsel for Muddy Waters investments. I won’t name the other team members here because some might not want to be troubled by this now, but they did most of the work — and their names can be found in this public expert report (as can all the technical findings in this post.) Quick disclaimers: this post is my own, and any mistakes or inaccuracies in it are mine and mine alone. I’m not a doctor so holy cow this isn’t medical advice. Many of the flaws in this post have since been patched by SJM/Abbot. I was paid for my time and travel by Bishop Fox for a few days in 2016, but I haven’t worked for them since. I didn’t ask anyone for permission to post this, because it’s all public information. ### A quick primer on implantable cardiac devices Implantable cardiac devices are tiny computers that can be surgically installed inside a patient’s body. Each device contains a battery and a set of electrical leads that can be surgically attached to the patient’s heart muscle. When people think about these devices, they’re probably most familiar with the cardiac pacemaker. Pacemakers issue small electrical shocks to ensure that the heart beats at an appropriate rate. However, the pacemaker is actually one of the least powerful implantable devices. A much more powerful type of device is the Implantable Cardioverter-Defibrillator (ICD). These devices are implanted in patients who have a serious risk of spontaneously entering a dangerous state in which their heart ceases to pump blood effectively. The ICD continuously monitors the patient’s heart rhythm to identify when the patient’s heart has entered this condition, and applies a series of increasingly powerful shocks to the heart muscle to restore effective heart function. Unlike pacemakers, ICDs can issue shocks of several hundred volts or more, and can both stop and restart a patient’s normal heart rhythm. Like most computers, implantable devices can communicate with other computers. To avoid the need for external data ports – which would mean a break in the patient’s skin – these devices communicate via either a long-range radio frequency (“RF”) or a near-field inductive coupling (“EM”) communication channel, or both. Healthcare providers use a specialized hospital device called a Programmer to update therapeutic settings on the device (e.g., program the device, turn therapy off). Using the Programmer, providers can manually issue commands that cause an ICD to shock the patient’s heart. One command, called a “T-Wave shock” (or “Shock-on-T”) can be used by healthcare providers to deliberately induce ventrical fibrillation. This capability is used after a device is implanted, in order to test the device and verify it’s functioning properly. Because the Programmer is a powerful tool – one that could cause harm if misused – it’s generally deployed in a physician office or hospital setting. Moreover, device manufacturers may employ special precautions to prevent spurious commands from being accepted by an implantable device. For example: 1. Some devices require that all Programmer commands be received over a short-range communication channel, such as the inductive (EM) channel. This limits the communication range to several centimeters. 2. Other devices require that a short-range inductive (EM) wand must be used to initiate a session between the Programmer and a particular implantable device. The device will only accept long-range RF commands sent by the Programmer after this interaction, and then only for a limited period of time. From a computer security perspective, both of these approaches have a common feature: using either approach requires some form of close-proximity physical interaction with the patient before the implantable device will accept (potentially harmful) commands via the long-range RF channel. Even if a malicious party steals a Programmer from a hospital, she may still need to physically approach the patient – at a distance limited to perhaps centimeters – before she can use the Programmer to issue commands that might harm the patient. In addition to the Programmer, most implantable manufacturers also produce some form of “telemedicine” device. These devices aren’t intended to deliver commands like cardiac shocks. Instead, they exist to provide remote patient monitoring from the patient’s home. Telematics devices use RF or inductive (EM) communications to interrogate the implantable device in order to obtain episode history, usually at night when the patient is asleep. The resulting data is uploaded to a server (via telephone or cellular modem) where it can be accessed by healthcare providers. ### What can go wrong? Before we get into specific vulnerabilities in implantable devices, it’s worth asking a very basic question. From a security perspective, what should we even be worried about? There are a number of answers to this question. For example, an attacker might abuse implantable device systems or infrastructure to recover confidential patient data (known as PHI). Obviously this would be bad, and manufacturers should design against it. But the loss of patient information is, quite frankly, kind of the least of your worries. A much scarier possibility is that an attacker might attempt to harm patients. This could be as simple as turning off therapy, leaving the patient to deal with their underlying condition. On the much scarier end of the spectrum, an ICD attacker could find a way to deliberately issue dangerous shocks that could stop a patient’s heart from functioning properly. Now let me be clear: this isn’t not what you’d call a high probability attack. Most people aren’t going to be targeted by sophisticated technical assassins. The concerning thing about this the impact of such an attack is significantly terrifying that we should probably be concerned about it. Indeed, some high-profile individuals have already taken precautions against it. The real nightmare scenario is a mass attack in which a single resourceful attacker targets thousands of individuals simultaneously — perhaps by compromising a manufacturer’s back-end infrastructure — and threatens to harm them all at the same time. While this might seem unlikely, we’ve already seen attackers systematically target hospitals with ransomware. So this isn’t entirely without precedent. ### Securing device interaction physically The real challenge in securing an implantable device is that too much security could hurt you. As tempting as it might be to lard these devices up with security features like passwords and digital certificates, doctors need to be able to access them. Sometimes in a hurry. This is a big deal. If you’re in a remote emergency room or hospital, the last thing you want is some complex security protocol making it hard to disable your device or issue a required shock. This means we can forget about complex PKI and revocation lists. Nobody is going to have time to remember a password. Even merely complicated procedures are out — you can’t afford to have them slow down treatment. At the same time, these devices obviously must perform some sort of authentication: otherwise anyone with the right kind of RF transmitter could program them — via RF, from a distance. This is exactly what you want to prevent. Many manufacturers have adopted an approach that cut through this knot. The basic idea is to require physical proximity before someone can issue commands to your device. Specifically, before anyone can issue a shock command (even via a long-range RF channel) they must — at least briefly — make close physical contact with the patient. This proximity be enforced in a variety of ways. If you remember, I mentioned above that most devices have a short-range inductive coupling (“EM”) communications channel. These short-range channels seem ideal for establishing a “pairing” between a Programmer and an implantable device — via a specialized wand. Once the channel is established, of course, it’s possible to switch over to long-range RF communications. This isn’t a perfect solution, but it has a lot going for it: someone could still harm you, but they would have to at least get a transmitter within a few inches of your chest before doing so. Moreover, you can potentially disable harmful commands from an entire class of device (like telemedecine monitoring devices) simply by leaving off the wand. ### St. Jude Medical and MedSec So given this background, what did St. Jude Medical do? All of the details are discussed in a full expert report published by Bishop Fox. In this post we I’ll focus on the most serious of MedSec’s claims, which can be expressed as follows: Using only the hardware contained within a “Merlin @Home” telematics device, it was possible to disable therapy and issue high-power “shock” commands to an ICD from a distance, and without first physically interacting with the implantable device at close range. This vulnerability had several implications: 1. The existence of this vulnerability implies that – through a relatively simple process of “rooting” and installing software on a Merlin @Home device – a malicious attacker could create a device capable of issuing harmful shock commands to installed SJM ICD devices at a distance. This is particularly worrying given that Merlin @Home devices are widely deployed in patients’ homes and can be purchased on eBay for prices under$30. While it might conceivably be possible to physically secure and track the location of all PCS Programmer devices, it seems challenging to physically track the much larger fleet of Merlin @Home devices.
2. More critically, it implies that St. Jude Medical implantable devices do not enforce a close physical interaction (e.g., via an EM wand or other mechanism) prior to accepting commands that have the potential to harm or even kill patients. This may be a deliberate design decision on St. Jude Medical’s part. Alternatively, it could be an oversight. In either case, this design flaw increases the risk to patients by allowing for the possibility that remote attackers might be able to cause patient harm solely via the long-range RF channel.
3. If it is possible – using software modifications only – to issue shock commands from the Merlin @Home device, then patients with an ICD may be vulnerable in the hypothetical event that their Merlin @Home device becomes remotely compromised by an attacker. Such a compromise might be accomplished remotely via a network attack on a single patient’s Merlin @Home device. Alternatively, a compromise might be accomplished at large scale through a compromise of St. Jude Medical’s server infrastructure.

We stress that the final scenario is strictly hypothetical. MedSec did not allege a specific vulnerability that allows for the remote compromise of Merlin @Home devices or SJM infrastructure. However, from the perspective of software and network security design, these attacks are one of the potential implications of a design that permits telematics devices to send such commands to an implantable device. It is important to stress that none of these attacks would be possible if St. Jude Medical’s design prohibited the implantable from accepting therapeutic commands from the Merlin @Home device (e.g., by requiring close physical interaction via the EM wand, or by somehow authenticating the provenance of commands and restricting critical commands to be sent by the Programmer only).

### Validating MedSec’s claim

To validate MedSec’s claim, we examined their methodology from start to finish. This methodology included extracting and decompiling Java-based software from a single PCS Programmer; accessing a Merlin @Home device to obtain a root shell via the JTAG port; and installing a new package of custom software written by MedSec onto a used Merlin @Home device.

We then observed MedSec issue a series of commands to an ICD device using a Merlin @Home device that had been customized (via software) as described above. We used the Programmer to verify that these commands were successfully received by the implantable device, and physically confirmed that MedSec had induced shocks by attaching a multimeter to the leads on the implantable device.

Finally, we reproduced MedSec’s claims by opening the case of a second Merlin @Home device (after verifying that the tape was intact over the screw holes), obtaining a shell by connecting a laptop computer to the JTAG port, and installing MedSec’s software on the device. We were then able to issue commands to the ICD from a distance of several feet. This process took us less than three hours in total, and required only inexpensive tools and a laptop computer.

### What are the technical details of the attack?

Simply reproducing a claim is only part of the validation process. To verify MedSec’s claims we also needed to understand why the attack described above was successful. Specifically, we were interested in identifying the security design issues that make it possible for a Merlin @Home device to successfully issue commands that are not intended to be issued from this type of device. The answer to this question is quite technical, and involves the specific way that SJM implantable devices verify commands before accepting them.

MedSec described to us the operation of SJM’s command protocol as part of their demonstration. They also provided us with Java JAR executable code files taken from the hard drive of the PCS Programmer. These files, which are not obfuscated and can easily be “decompiled” into clear source code, contain the software responsible for implementing the Programmer-to-Device communications protocol.

By examining the SJM Programmer code, we verified that Programmer commands are authenticated through the inclusion of a three-byte (24 bit) “authentication tag” that must be present and correct within each command message received by the implantable device. If this tag is not correct, the device will refuse to accept the command.

From a cryptographic perspective, 24 bits is a surprisingly short value for an important authentication field. However, we note that even this relatively short tag might be sufficient to prevent forgery of command messages – provided the tag ws calculated using a secure cryptographic function (e.g., a Message Authentication Code) using a fresh secret key that cannot be predicted by an the attacker.

Based on MedSec’s demonstration, and on our analysis of the Programmer code, it appears that SJM does not use the above approach to generate authentication tags. Instead, SJM authenticates the Programmer to the implantable with the assistance of a “key table” that is hard-coded within the Java code within the Programmer. At minimum, any party who obtains the (non-obfuscated) Java code from a legitimate SJM Programmer can gain the ability to calculate the correct authentication tags needed to produce viable commands – without any need to use the Programmer itself.

Moreover, MedSec determined – and successfully demonstrated – that there exists a “Universal Key”, i.e., a fixed three-byte authentication tag, that can be used in place of the calculated authentication tag. We identified this value in the Java code provided by MedSec, and verified that it was sufficient to issue shock commands from a Merlin @Home to an implantable device.

While these issues alone are sufficient to defeat the command authentication mechanism used by SJM implantable devices, we also analyzed the specific function that is used by SJM to generate the three-byte authentication tag.  To our surprise, SJM does not appear to use a standard cryptographic function to compute this tag. Instead, they use an unusual and apparently “homebrewed” cryptographic algorithm for the purpose.

Specifically, the PCS Programmer Java code contains a series of hard-coded 32-bit RSA public keys. To issue a command, the implantable device sends a value to the Programmer. This value is then “encrypted” by the Programmer using one of the RSA public keys, and the resulting output is truncated to produce a 24-bit output tag.

The above is not a standard cryptographic protocol, and quite frankly it is difficult to see what St. Jude Medical is trying to accomplish using this technique. From a cryptographic perspective it has several problems:

1. The RSA public keys used by the PCS Programmers are 32 bits long. Normal RSA keys are expected to be a minimum of 1024 bits in length. Some estimates predict that a 1024-bit RSA key can be factored (and thus rendered insecure) in approximately one year using a powerful network of supercomputers. Based on experimentation, we were able to factor the SJM public keys in less than one second on a laptop computer.
2. Even if the RSA keys were of an appropriate length, the SJM protocol does not make use of the corresponding RSA secret keys. Thus the authentication tag is not an RSA signature, nor does it use RSA in any way that we are familiar with.
3. As noted above, since there is no shared session key established between the specific implantable device and the Programmer, the only shared secret available to both parties is contained within the Programmer’s Java code. Thus any party who extracts the Java code from a PCS Programmer will be able to transmit valid commands to any SJM implantable device.

Our best interpretation of this design is that the calculation is intended as a form of “security by obscurity”, based on the assumption that an attacker will not be able to reverse engineer the protocol. Unfortunately, this approach is rarely successful when used in security systems. In this case, the system is fundamentally fragile – due to the fact that code for computing the correct authentication tag is likely available in easily-decompiled Java bytecode on each St. Jude Medical Programmer device. If this code is ever extracted and published, all St. Jude Medical devices become vulnerable to command forgery.

### How to remediate these attacks?

To reiterate, the fundamental security concerns with these St. Jude Medical devices (as of 2016) appeared to be problems of design. These were:

1. SJM implantable devices did not require close physical interaction prior to accepting commands (allegedly) sent by the Programmer.
2. SJM did not incorporate a strong cryptographic authentication mechanism in its RF protocol to verify that commands are truly sent by the Programmer.
3. Even if the previous issue was addressed, St. Jude did not appear to have an infrastructure for securely exchanging shared cryptographic keys between a legitimate Programmer and an implantable device.

There are various ways to remediate these issues. One approach is to require St. Jude implantable devices to exchange a secret key with the Programmer through a close-range interaction involving the Programmer’s EM wand. A second approach would be to use a magnetic sensor to verify the presence of a magnet on the device, prior to accepting Programmer commands. Other solutions are also possible. I haven’t reviewed the solution SJM ultimately adopted in their software patches, and I don’t know how many users patched.

### Conclusion

Implantable devices offer a number of unique security challenges. It’s naturally hard to get these things right. At the same time, it’s important that vendors take these issues seriously, and spend the time to get cryptographic authentication mechanisms right — because once deployed, these devices are very hard to repair, and the cost of a mistake is extremely high.

# Attack of the Week: Group Messaging in WhatsApp and Signal

If you’ve read this blog before, you know that secure messaging is one of my favorite topics. However, recently I’ve been a bit disappointed. My sadness comes from the fact that lately these systems have been getting too damned good. That is, I was starting to believe that most of the interesting problems had finally been solved.

If nothing else, today’s post helped disabuse me of that notion.

This result comes from a new paper by Rösler, Mainka and Schwenk from Ruhr-Universität Bochum (affectionately known as “RUB”). The RUB paper paper takes a close look at the problem of group messaging, and finds that while messengers may be doing fine with normal (pairwise) messaging, group messaging is still kind of a hack.

If all you want is the TL;DR, here’s the headline finding: due to flaws in both Signal and WhatsApp (which I single out because I use them), it’s theoretically possible for strangers to add themselves to an encrypted group chat. However, the caveat is that these attacks are extremely difficult to pull off in practice, so nobody needs to panic. But both issues are very avoidable, and tend to undermine the logic of having an end-to-end encryption protocol in the first place. (Wired also has a good article.)

First, some background.

### How do end-to-end encryption and group chats work?

In recent years we’ve seen plenty of evidence that centralized messaging servers aren’t a very good place to store confidential information. The good news is: we’re not stuck with them. One of the most promising advances in the area of secure communications has been the recent widespread deployment of end-to-end (e2e) encrypted messaging protocols.

At a high level, e2e messaging protocols are simple: rather than sending plaintext to a server — where it can be stolen or read — the individual endpoints (typically smartphones) encrypt all of the data using keys that the server doesn’t possess. The server has a much more limited role, moving and storing only meaningless ciphertext. With plenty of caveats, this means a corrupt server shouldn’t be able to eavesdrop on the communications.

In pairwise communications (i.e., Alice communicates with only Bob) this encryption is conducted using a mix of public-key and symmetric key algorithms. One of the most popular mechanisms is the Signal protocol, which is used by Signal and WhatsApp (notable for having 1.3 billion users!) I won’t discuss the details of the Signal protocol here, except to say that it’s complicated, but it works pretty well.

A fly in the ointment is that the standard Signal protocol doesn’t work quite as well for group messaging, primarily because it’s not optimized for broadcasting messages to many users.

To handle that popular case, both WhatsApp and Signal use a small hack. It works like this: each group member generates a single “group key” that this member will use to encrypt all of her messages to everyone else in the group. When a new member joins, everyone who is already in the group needs to send a copy of their group key to the new member (using the normal Signal pairwise encryption protocol). This greatly simplifies the operation of group chats, while ensuring that they’re still end-to-end encrypted.

### How do members know when to add a new user to their chat?

Here is where things get problematic.

From a UX perspective, the idea is that only one person actually initiates the adding of a new group member. This person is called the “administrator”. This administrator is the only human being who should actually do anything — yet, her one click must cause some automated action on the part of every other group members’ devices. That is, in response to the administrator’s trigger, all devices in the group chat must send their keys to this new group member.

(In Signal, every group member is an administrator. In WhatsApp it’s just a subset of the members.)

The trigger is implemented using a special kind of message called (unimaginatively) a “group management message”. When I, as an administrator, add Tom to a group, my phone sends a group management message to all the existing group members. This instructs them to send their keys to Tom — and to notify the members visually so that they know Tom is now part of the group. Obviously this should only happen if I really did add Tom, and not if some outsider (like that sneaky bastard Tom himself!) tries to add Tom.

And this is where things get problematic.

### Ok, what’s the problem?

According to the RUB paper, both Signal and WhatsApp fail to properly authenticate group management messages.

The upshot is that, at least in theory, this makes it possible for an unauthorized person — not a group administrator, possibly not even a member of the group — to add someone to your group chat.

The issues here are slightly different between Signal and WhatsApp. To paraphrase Tolstoy, every working implementation is alike, but every broken one is broken in its own way. And WhatsApp’s implementation is somewhat worse than Signal. Here I’ll break them down.

Signal. Signal takes a pragmatic (and reasonable) approach to group management. In Signal, every group member is considered an administrator — which means that any member can add a new member. Thus if I’m a member of a group, I can add a new member by sending a group management message to every other member. These messages are sent encrypted via the normal (pairwise) Signal protocol.

The group management message contains the “group ID” (a long, unpredictable number), along with the identity of the person I’m adding. Because messages are sent using the Signal (pairwise) protocol, they should be implicitly authenticated as coming from me — because authenticity is a property that the pairwise Signal protocol already offers. So far, this all sounds pretty good.

The problem that the RUB researchers discovered through testing, is that while the Signal protocol does authenticate that the group management comes from me, it doesn’t actually check that I am a member of the group — and thus authorized to add the new user!

In short, if this finding is correct, it turns out that any random Signal user in the world can you send a message of the form “Add Mallory to the Group 8374294372934722942947”, and (if you happen to belong to that group) your app will go ahead and try to do it.

The good news is that in Signal the attack is very difficult to execute. The reason is that in order to add someone to your group, I need to know the group ID. Since the group ID is a random 128-bit number (and is never revealed to non-group-members or even the server**) that pretty much blocks the attack. The main exception to this is former group members, who already know the group ID — and can now add themselves back to the group with impunity.

(And for the record, while the group ID may block the attack, it really seems like a lucky break — like falling out of a building and landing on a street awning. There’s no reason the app should process group management messages from random strangers.)

So that’s the good news. The bad news is that WhatsApp is a bit worse.

WhatsApp. WhatsApp uses a slightly different approach for its group chat. Unlike Signal, the WhatsApp server plays a significant role in group management, which means that it determines who is an administrator and thus authorized to send group management messages.

Additionally, group management messages are not end-to-end encrypted or signed. They’re sent to and from the WhatsApp server using transport encryption, but not the actual Signal protocol.

When an administrator wishes to add a member to a group, it sends a message to the server identifying the group and the member to add. The server then checks that the user is authorized to administer that group, and (if so), it sends a message to every member of the group indicating that they should add that user.

The flaw here is obvious: since the group management messages are not signed by the administrator, a malicious WhatsApp server can add any user it wants into the group. This means the privacy of your end-to-end encrypted group chat is only guaranteed if you actually trust the WhatsApp server.

This undermines the entire purpose of end-to-end encryption.

### But this is silly. Don’t we trust the WhatsApp server? And what about visual notifications?

One perfectly reasonable response is that exploiting this vulnerability requires a compromise of the WhatsApp server (or legal compulsion, perhaps). This seems fairly unlikely.

And yet, the entire point of end-to-end encryption is to remove the server from the trusted computing base. We haven’t entirely achieved this yet, thanks to things like key servers. But we are making progress. This bug is a step back, and it’s one a sophisticated attacker potentially could exploit.

A second obvious objection to these issues is that adding a new group member results in a visual notification to each group member. However, it’s not entirely clear that these messages are very effective. In general they’re relatively easy to miss. So these are meaningful bugs, and things that should be fixed.

### How do you fix this?

The great thing about these bugs is that they’re both eminently fixable.

The RUB paper points out some obvious countermeasures. In Signal, just make sure that the group management messages come from a legitimate member of the group. In WhatsApp, make sure that the group management messages are signed by an administrator.*

Obviously fixes like this are a bit complex to roll out, but none of these should be killers.

### Is there anything else in the paper?

Oh yes, there’s quite a bit more. But none of it is quite as dramatic. For one thing, it’s possible for attackers to block message acknowledgements in group chats, which means that different group members could potentially see very different versions of the chat. There are also several cases where forward secrecy can be interrupted. There’s also some nice analysis of Threema, if you’re interested.

### I need a lesson. What’s the moral of this story?

The biggest lesson is that protocol specifications are never enough. Both WhatsApp and Signal (to an extent) have detailed protocol specifications that talk quite a bit about the cryptography used in their systems. And yet the issues reported in the RUB paper not obvious from reading these summaries. I certainly didn’t know about them.

In practice, these problems were only found through testing.

So the main lesson here is: test, test, test. This is a strong argument in favor of open-source applications and frameworks that can interact with private-garden services like Signal and WhatsApp. It lets us see what the systems are getting right and getting wrong.

The second lesson — and a very old one — is that cryptography is only half the battle. There’s no point in building the most secure encryption protocol in the world if someone can simply instruct your client to send your keys to Mallory. The greatest lesson of all time is that real cryptosystems are always broken this way — and almost never through the fancy cryptographic attacks we love to write about.

Notes:

* The challenge here is that since WhatsApp itself determines who the administrators are, this isn’t quite so simple. But at very least you can ensure that someone in the group was responsible for the addition.

** According to the paper, the Signal group IDs are always sent encrypted between group members and are never revealed to the Signal server. Indeed, group chat messages look exactly like pairwise chats, as far as the server is concerned. This means only current or former group members should know the group ID.

# Attack of the week: DUHK

Before we get started, fair warning: this is going to be a post about a fairly absurd (but non-trivial!) attack on cryptographic systems. But that’s ok, because it’s based on a fairly absurd vulnerability.

This work comes from Nadia Heninger, Shaanan Cohney and myself, and follows up on some work we’ve been doing to look into the security of pseudorandom number generation in deployed cryptographic devices. We made a “fun” web page about it and came up with a silly logo. But since this affects something like 25,000 deployed Fortinet devices, the whole thing is actually kind of depressing.

The paper is called “Practical state recovery attacks against legacy RNG implementation“, and it attacks an old vulnerability in a pseudorandom number generator called ANSI X9.31, which is used in a lot of government certified products. The TL;DR is that this ANSI generator really sucks, and is easy to misuse. Worse, when it’s misused — as it has been — some very bad things can happen to the cryptography that relies on it.

First, some background.

### What is an ANSI, and why should I care?

A pseudorandom number generator (PRG) is a deterministic algorithm designed to “stretch” a short random seed into a large number of apparently random numbers. These algorithms are used ubiquitously in cryptographic software to supply all of the random bits that our protocols demand.

PRGs are so important, in fact, that the U.S. government has gone to some lengths to standardize them. Today there are three generators approved for use in the U.S. (FIPS) Cryptographic Module Validation Program. Up until 2016, there were four. This last one, which is called the ANSI X9.31 generator, is the one we’re going to talk about here.

ANSI X9.31 is a legacy pseudorandom generator based on a block cipher, typically AES. It takes as its initial seed a pair of values (K, V) where K is a key and V is an initial “seed” (or “state”). The generator now produces a long stream of pseudorandom bits by repeatedly applying the block cipher in the crazy arrangement below:

The diagram above illustrates one of the funny properties of the ANSI generator: namely, that while the state value V updates for each iteration of the generator, the key K never changes. It remains fixed throughout the entire process.

And this is a problem. Nearly twenty years ago, Kelsey, Schneier, Wagner and Hall pointed out that this fact makes the ANSI generator terribly insecure in the event that an attacker should ever learn the key K.

Specifically, if an attacker were to obtain K somehow, and then was able to learn only a single 16-byte raw output block (Ri) from a working PRG, she could do the following: (1) guess the timestamp T, (2) work backwards (decrypting using K) in order to recover the corresponding state value V, and now (3) run the generator forwards or backwards (with guesses for T) to obtain every previous and subsequent output of the generator.

Thus, if an application uses the ANSI generator to produce something like a random nonce (something that is typically sent in a protocol in cleartext), and also uses the generator to produce secret keys, this means an attacker could potentially recover those secret keys and completely break the protocol.

Of course, all of this requires that somehow the attacker learns the secret value K. At the time Kelsey et al. published their result, this was viewed as highly unlikely. After all, we’re really good at keeping secrets.

### I assume you’re joking?

So far we’ve established that the ANSI generator is only secure if you can forever secure the value K. However, this seems fairly reasonable. Surely implementers won’t go around leaking their critical secrets all over the place. And certainly not in government-validated cryptographic modules. That would be crazy.

Yet crazy things do happen. We figured someone should probably check.

To see how the X9.31 key is managed in real products, our team developed a sophisticated analytic technique called “making a graduate student read every FIPS document on the CMVP website”.

Most of the documents were fairly vague. And yet, a small handful of widely-used cryptographic modules had language that was troubling. Specifically, several vendors include language in their security policy that indicates the ANSI key was either hard-coded, or at least installed in a factory — as opposed to being freshly generated at each device startup.

Of even more concern: at least one of the hard-coded vendors was Fortinet, a very popular and successful maker of VPN devices and firewalls.

To get more specific, it turns out that starting (apparently in 2009, or perhaps earlier), every FortiOS 4.x device has shipped with a hardcoded value for K. This key has been involved in generating virtually every random bit used to establish VPN connections on those appliances, using both the TLS and IPSec protocols. The implication is that anyone with the resources to simply reverse-engineer the FortiOS firmware (between 2009 and today) could theoretically have been able to recover K themselves — and thus passively decrypt any VPN connection.

(Note: Independent of our work, the ANSI generator was replaced with a more secure alternative as of FortiOS 5.x. As a result of our disclosure, it has also been patched in FortiOS 4.3.19. There are still lots of unpatched firewalls out there, however.)

### What does the attack look like?

Running an attack against a VPN device requires three ingredients. The first is the key K, which can be recovered from the FortiOS firmware using a bit of elbow grease. Shaanan Cohney (the aforementioned graduate student) was able to pull it out with a bit of effort.

Next, the attacker must have access to some VPN or TLS traffic. It’s important to note that this is not an active attack. All you really need is a network position that’s capable of monitoring full two-sided TLS or IPSec VPN connections.

Specifically, the attacker needs a full AES block (16 bytes) worth of output from the ANSI generator, plus part of a second block to check success against. Fortunately both TLS and IPSec (IKE) include nonces of sufficient length to obtain this output, and both are drawn from the ANSI generator, which lives in the FortiOS kernel. The attacker also needs the Diffie-Hellman ephemeral public keys, which are part of the protocol transcript.

Finally, you need to know the timestamp Ti that was used to operate the generator. In FortiOS, these timestamps have a 1-microsecond resolution, so guessing them is actually a bit of a challenge. Fortunately, TLS and other protocols include the time-in-seconds as one of the outputs of the TLS protocol, so the actually guessing space is typically only about 2^20 at most. Still, this guessing proves to be one of the most costly elements of the attack.

Given all of the ingredients above, the attacker now decrypts the output block taken from the protocol nonce using K, guesses each possible Ti value, and then winds forward or backwards until she finds the random bits that were used to generate that party’s Diffie-Hellman secret key. Fortunately, the key and nonce are generated one after the other, so this is not quite as painful as it sounds. But it is fairly time consuming. Fortunately, computers are fast, so this is not a dealbreaker.

With the secret key in hand, it’s possible to fully decrypt the VPN connection, read all traffic, and modify the data as needed.

### Does the attack really work?

Since we’re not the NSA, it’s awfully hard for us to actually apply this attack to real Fortinet VPN connections in the wild. Not to mention that it would be somewhat unethical.

However, there’s nothing really unethical about scanning for FortiOS devices that are online and willing to accept incoming traffic from the Internet. To validate the attack, the team conducted a large-scale scan of the entire IPv4 address space. Each time we found a device that appeared to present as a FortiOS 4.x VPN, we initiated a connection with it and tested to see if we could break our own connection.

It turns out that there are a lot of FortiOS 4.x devices in the wild. Unfortunately, only a small number of them accept normal IPSec connections from strangers. Fortunately, however, a lot of them do accept TLS connections. Both protocol implementations use the same ANSI generator for their random numbers.

This scan allowed us to validate that — as of  October 2017 — the vulnerability was present and exploitable on more than 25,000 Fortinet devices across the Internet. And this count is likely conservative, since these were simply the devices that bothered to answer us when we scanned. A more sophisticated adversary like a nation-state would have access to existing VPN connections in flight.

In short, if you’re using a legacy Fortinet VPN you should probably patch.

### So what does it all mean?

There are really three lessons to be learned from a bug like this one.

The first is that people make mistakes. We should probably design our crypto and certification processes to anticipate that, and make it much harder for these mistakes to become catastrophic decryption vulnerabilities like the one in FortiOS 4.x. Enough said.

The second is that government crypto certifications are largely worthless. I realize that seems like a big conclusion to draw from a single vulnerability. But this isn’t just a single vendor — it’s potentially several vendors that all fell prey to the same well-known 20-year old vulnerability. When a vulnerability is old enough to vote, your testing labs should be finding it. If they’re not finding things like this, what value are they adding?

Finally, there’s a lesson here about government standards. ANSI X9.31 (and its cousin X9.17) is over twenty years old. It’s (fortunately) been deprecated as of 2016, but a huge number of products still use it. This algorithm should have disappeared ten years earlier — and yet here we are. It’s almost certain that this small Fortinet vulnerability is just the tip of the iceberg. Following on revelations of a possible deliberate backdoor in the Dual EC generator, none of this stuff looks good. It’s time to give serious thought to how we make cryptographic devices resilient — even against the people who are supposed to be helping us secure them.

But that’s a topic for a much longer post.

# Patching is hard; so what?

It’s now been about a week since Equifax announced the record-breaking breach that affected 143 million Americans. We still don’t know enough — but a few details have begun to come out about the causes of the attack. It’s now being reported that Equifax’s woes stem from an unpatched vulnerability in Apache Struts that dates from March 2017, nearly two months before the breach began. This flaw, which allows remote command execution on affected servers, somehow allowed an attacker to gain access to a whopping amount of Equifax’s customer data.

While many people have criticized Equifax for its failure, I’ve noticed a number of tweets from information security professionals making the opposite case. Specifically, these folks point out that patching is hard. The gist of these points is that you can’t expect a major corporation to rapidly deploy something as complex as a major framework patch across their production systems. The stronger version of this point is that the people who expect fast patch turnaround have obviously never patched a production server.

I don’t dispute this point. It’s absolutely valid. My very simple point in this post is that it doesn’t matter. Excusing Equifax for their slow patching is both irrelevant and wrong. Worse: whatever the context, statements like this will almost certainly be used by Equifax to excuse their actions. This actively makes the world a worse place.

I don’t operate production systems, but I have helped to design a couple of them. So I understand something about the assumptions you make when building them.

If you’re designing a critical security system you have choices to make. You can build a system that provides defense-in-depth — i.e., that makes the assumption that individual components will fail and occasionally become insecure. Alternatively, you can choose to build systems that are fragile — that depend fundamentally on the correct operation of all components at all times. Both options are available to system designers, and making the decision is up to those designers; or just as accurately, the managers that approve their design.

The key point is that once you’ve baked this cake, you’d better be willing to eat it. If your system design assumes that application servers will not contain critical vulnerabilities — and you don’t have resilient systems in place to handle the possibility that they do — then you’ve implicitly made the decision that you’re never ever going to allow those vulnerabilities to fester. Once an in-the-wild vulnerability is detected in your system, you’d damn well better have a plan to patch, and patch quickly. That may involve automated testing. It may involve taking your systems down, or devoting enormous resources to monitoring activity. If you can’t do that, you’d better have an alternative. Running insecure is not an option.

So what would those systems look like? Among more advanced system designs I’ve begun to see a move towards encrypting back-end data. By itself this doesn’t do squat to protect systems like Equifax’s, because those systems are essentially “hot” databases that have to provide cleartext data to application servers — precisely the systems that Equifax’s attackers breached.

The common approach to dealing with this problem is twofold. First, you harden the cryptographic access control components that handle decryption and key management for the data — so that a breach in an application server doesn’t lead to the compromise of the access control gates. Second, you monitor, monitor, monitor. The sole advantage that encryption gives you here is that your gates for access control are now reduced to only the systems that manage encryption. Not your database. Not your web framework. Just a — hopefully — small and well-designed subsystem that monitors and grants access to each record. Everything else is monitoring.

Equifax claims to have resilient systems in place. Only time will tell if they looked like this. What seems certain is that whatever those systems are, they didn’t work. And given both the scope and scale of this breach, that’s a cake I’d prefer not to have to eat.

# The future of Ransomware

This is kind of a funny post for me to write, since it involves speculating about a very destructive type of software — and possibly offering some (very impractical) suggestions on how it might be improved in the future. It goes without saying that there are some real downsides to this kind of speculation. Nonetheless, I’m going ahead on the theory that it’s usually better to talk and think about the bad things that might happen to you — before you meet them on the street and they steal your lunch money.

On the other hand, just as there’s a part of every karate master that secretly wants to go out and beat up a bar full of people, there’s a part of every security professional that looks at our current generation of attackers and thinks: why can’t you people just be a bit more imaginative?! And wonders whether, if our attackers were just a little more creative, people would actually pay attention to securing their system before the bad stuff happens.

And ransomware is definitely a bad thing. According to the FBI it sucks up \$1 billion/year in payments alone, and some unimaginably larger amount in remediation costs. This despite the fact that many ransomware packages truly suck, and individual ransomware developers get routinely pwned due to making stupid cryptographic errors. If this strategy is working so well today, the question  we should be asking ourselves is: how much worse could it get?

So that’s what I’m going to muse about now. A few (cryptographic) ways that it might.

Some of these ideas are the result of collaboration with my students Ian Miers, Gabe Kaptchuk and Christina Garman. They range from the obvious to the foolish to the whimsical, and I would be utterly amazed if any of them really do happen. So please don’t take this post too seriously. It’s all just fun.

### Quick background: ransomware today

The amazing thing about ransomware is that something so simple could turn out to be such a problem. Modern ransomware consists of malware that infects your computer and then goes about doing something nasty: it encrypts every file it can get its hands on. This typically includes local files as well as network shares that can be reached from the infected machine.

Once your data has been encrypted, your options aren’t great. If you’re lucky enough to have a recent backup, you can purge the infected machine and restore. Otherwise you’re faced with a devil’s bargain: learn top live without that data, or pay the bastards.

If you choose to pay up, there are all sorts of different procedures. However most break down into the following three steps:

1. When the ransomware encrypts your files, it generates a secret key file and stores it on your computer.
2. You upload that file (or data string) to your attackers along with a Bitcoin payment.
3. They process the result with their secrets and send you a decryption key.

If you’re lucky, and your attackers are still paying attention (or haven’t screwed up the crypto beyond recognition) you get back a decryption key or a tool you can use to undo the encryption on your files. The whole thing is very businesslike. Indeed, recent platforms will allegedly offer you a discount if you infect recommend it to your friends — just like Lyft!

The problem of course, is that nothing in this process guarantees that your attacker will give you that decryption key. They might be scammers. They might not have the secret anymore. They might get tracked down and arrested. Or they might get nervous and bail, taking your precious data and your payment with them. This uncertainty makes ransomware payments inherently risky — and worse, it’s the victims who mostly suffer for it.

Perhaps it would be nice if we could make that work better.

### Verifiable key delivery using smart contracts

Most modern ransomware employs a cryptocurrency like Bitcoin to enable the payments that make the ransom possible. This is perhaps not the strongest argument for systems like Bitcoin — and yet it seems unlikely that Bitcoin is going away anytime soon. If we can’t solve the problem of Bitcoin, maybe it’s possible to use Bitcoin to make “more reliable” ransomware.

Recall that following a ransomware infection, there’s a possibility that you’ll pay the ransom and get nothing in return. Fundamentally there’s very little you can do about this. A conscientious ransomware developer might in theory offer a “proof of life” — that is, offer to decrypt a few files at random in order to prove their bonafides. But even if they bother with all the risk and interaction of doing this, there’s still no guarantee that they’ll bother to deliver the hostage alive.

An obvious approach to this problem is to make ransomware payments conditional. Rather than sending off your payment and hoping for the best, victims could use cryptocurrency features to ensure that ransomware operators can’t get paid unless they deliver a key. Specifically, a ransomware developer could easily perform payment via a smart contract script (in a system like Ethereum) that guarantees the following property:

This payment will be delivered to the ransomware operator if and only if the ransomware author unlocks it — by posting the ransomware decryption key to the same blockchain.

The basic primitive needed for this is called a Zero Knowledge Contingent Payment. This idea was proposed by Greg Maxwell and demonstrated by Sean Bowe of the ZCash team.**** The rough idea is to set the decryption key to be some pre-image k for some public hash value K that the ransomware generates and leaves on your system. It’s relatively easy to imagine a smart contract that allows payment if and only if the payee can post the input k such that K=SHA256(k). This could easily be written in Ethereum, and almost certainly has an analog for Bitcoin script.

The challenge here, of course, is to prove that k is actually a decryption key for your files, and that the files contain valid data. There are a handful of different ways to tackle this problem. One is to use complex zero-knowledge proof techniques (like zkSNARKs or ZKBoo) to make the necessary proofs non-interactively. But this is painful, and frankly above the level of most ransomware developers — who are still struggling with basic RSA.

An alternative approach is to use several such K challenges in combination with the “proof of life” idea. The ransomware operator would prove her bonafides by decrypting a small, randomly selected subset of files before the issuer issued payment. The operator could still “fake” the encryption — or lose the decryption key — but she would be exposed with reasonable probability before money changed hands.

### “Autonomous” ransomware

Of course, the problem with “verifiable” ransomware is: what ransomware developer would bother with this nonsense?

While the ability to verify decryption might conceivably improve customer satisfaction, it’s not clear that it would really offer that much value to ransomware deverlopers. At the same time, it would definitely add a lot of nasty complexity to their software.

Instead of pursuing ideas that offer developers no obvious upside, ransomware designers presumably will pursue ideas that offer them some real benefits. And that brings us to an idea time whose time has (hopefully) not quite come yet. The idea itself is simple:

Make ransomware that doesn’t require operators.

Recall that in the final step of the ransom process, the ransomware operator must deliver a decryption key to the victim. This step is the most fraught for operators, since it requires them to manage keys and respond to queries on the Internet. Wouldn’t it be better for operators if they could eliminate this step altogether?

Of course, to accomplish this seems to require a trustworthy third party — or better, a form of ransomware that can decrypt itself when the victim makes a Bitcoin payment. Of course this last idea seems fundamentally contradictory. The decryption keys would have to live on the victim’s device, and the victim owns that device. If you tried that, then victim could presumably just hack the secrets out and decrypt the ransomware without paying.

But what if the victim couldn’t hack their own machine?

This isn’t a crazy idea. In fact, it’s exactly the premise that’s envisioned by a new class of trusted execution environments, including Intel’s SGX and ARM TrustZone. These systems — which are built into the latest generation of many processors — allow users to instantiate “secure enclaves”: software environments that can’t be accessed by outside parties. SGX also isolates enclaves from other enclaves, which means the secrets they hold are hard to pry out.

Hypothetically, after infecting your computer a piece of ransomware could generate and store its decryption key inside of a secure enclave. This enclave could be programmed to release the key only on presentation of a valid Bitcoin payment to a designated address.

The beauty of this approach is that no third party even needs to verify the payment. Bitcoin payments themselves consist of a publicly-verifiable transaction embedded in a series of “blocks”, each containing an expensive computational “proof of work“. In principle, after paying the ransom the victim could present the SGX enclave with a fragment of a blockchain all by itself — freeing the ransomware of the need to interact with third parties. If the blockchain fragment exhibited sufficient hashpower along with a valid payment to a specific address, the enclave would release the decryption key.*

The good news is that Intel and ARM have devoted serious resources to preventing this sort of unauthorized access. SGX developers must obtain a code signing certificate from Intel before they can make production-ready SGX enclaves, and it seems unlikely that Intel would partner up with a ransomware operation. Thus a ransomware operator would likely have to (1) steal a signing key from a legitimate Intel-certified developer, or (2) find an exploitable vulnerability in another developer’s enclave.**, ***

This all seems sort of unlikely, and that appears to block most of the threat — for now. Assuming companies like Intel and Qualcomm don’t screw things up, and have a good plan for revoking enclaves (uh oh), this is not very likely to be a big threat.

Of course, in the long run developers might not need Intel SGX at all. An even more speculative concern is that developments in the field of cryptographic obfuscation will provide a software-only alternative means to implement this type of ransomware. This would eliminate the need for a dependency like SGX altogether, allowing the ransomware to do its work with no hardware at all.

At present such techniques are far north of practical, keep getting broken, and might not work at all. But cryptographic researchers keep trying! I guess the lesson is that it’s not all roses if they succeed.

### Ransomware Skynet

Since I’m already this far into what reads like a Peyote-fueled rant, let’s see if we can stretch the bounds of credibility just a little a bit farther. If ransomware can become partially autonomous — i.e., do part of its job without the need for human masters — what would it mean for it to become fully autonomous? In other words, what if we got rid of the rest of the human equation?

Ransomware with the ability to enforce payments would provide a potent funding source for another type of autonomous agent: a Decentralized Autonomous Organization, or (DAO). These systems are “corporations” that consist entirely of code that runs on a consensus network like Ethereum. They’re driven by rules, and are capable of both receiving and transmitting funds without (direct) instruction from human beings.

At least in theory it might be possible to develop a DAO that’s funded entirely by ransomware payments — and in turn mindlessly contracts real human beings to develop better ransomware, deploy it against human targets, and… rinse repeat. It’s unlikely that such a system would be stable in the long run — humans are clever and good at destroying dumb things — but it might get a good run. Who knows? Maybe this is how the Rampant Orphan Botnet Ecologies get started.

(I hope it goes without saying that I’m mostly not being serious about this part. Even though it would be totally awesome in a horrible sort of way.)

### In conclusion

This hasn’t been a terribly serious post, although it was fun to write. The truth is that as a defender, watching your attackers fiddle around is pretty much the most depressing thing ever. Sometimes you have to break the monotony a bit.

But insofar as there is a serious core to this post, it’s that ransomware currently is using only a tiny fraction of the capabilities available to it. Secure execution technologies in particular represent a giant footgun just waiting to go off if manufacturers get things only a little bit wrong.

Hopefully they won’t, no matter how entertaining it might be.

Notes:

* This technique is similar to SPV verification. Of course, it would also be possible for a victim to “forge” a blockchain fragment without paying the ransom. However, the cost of this could easily be tuned to significantly exceed the cost of paying the ransom. There are also many issues I’m glossing over here like difficulty adjustments and the possibility of amortizing the forgery over many different victims. But thinking about that stuff is a drag, and this is all for fun, right?

** Of course, if malware can exploit such a vulnerability in another developer’s enclave to achieve code execution for “ransomware”, then the victim could presumably exploit the same vulnerability to make the ransomware spit out its key without a payment. So this strategy seems self-limiting — unless the ransomware developers find a bug that can be “repaired” by changing some immutable state held by the enclave. That seems like a long shot. And no, SGX does not allow you to “seal” data to the current state of the enclave’s RAM image.

*** In theory, Intel or an ARM manufacturer could also revoke the enclave’s signing certificate. However, the current SGX specification doesn’t explain how such a revocation strategy should work. I assume this will be more prominent in future specifications.

**** The original version of this post didn’t credit Greg and Sean properly, because I honestly didn’t make the connection that I was describing the right primitive. Neat!

# Attack of the week: 64-bit ciphers in TLS

A few months ago it was starting to seem like you couldn’t go a week without a new attack on TLS. In that context, this summer has been a blessed relief. Sadly, it looks like our vacation is over, and it’s time to go back to school.

Today brings the news that Karthikeyan Bhargavan and Gaëtan Leurent out of INRIA have a new paper that demonstrates a practical attack on legacy ciphersuites in TLS (it’s called “Sweet32”, website here). What they show is that ciphersuites that use 64-bit blocklength ciphers — notably 3DES — are vulnerable to plaintext recovery attacks that work even if the attacker cannot recover the encryption key.

While the principles behind this attack are well known, there’s always a difference between attacks in principle and attacks in practice. What this paper shows is that we really need to start paying attention to the practice.

## So what’s the matter with 64-bit block ciphers?

Block ciphers are one of the most widely-used cryptographic primitives. As the nameimplies, these are schemes designed to encipher data in blocks, rather than a single bit at a time.

The two main parameters that define a block cipher are its block size (the number of bits it processes in one go), and its key size. The two parameters need not be related. So for example, DES has a 56-bit key and a 64-bit block. Whereas 3DES (which is built from DES) can use up to a 168-bit key and yet still has the same 64-bit block. More recent ciphers have opted for both larger blocks and larger keys.

When it comes to the security provided by a block cipher, the most important parameter is generally the key size. A cipher like DES, with its tiny 56-bit key, is trivially vulnerable to brute force attacks that attempt decryption with every possible key (often using specialized hardware). A cipher like AES or 3DES is generally not vulnerable to this sort of attack, since the keys are much longer.

However, as they say: key size is not everything. Sometimes the block size matters too.

You see, in practice, we often need to encrypt messages that are longer than a single block. We also tend to want our encryption to be randomized. To accomplish this, most protocols use a block cipher in a scheme called a mode of operation. The most popular mode used in TLS is CBC mode. Encryption in CBC looks like this:

The nice thing about CBC is that (leaving aside authentication issues) it can be proven (semantically) secure if we make various assumptions about the security of the underlying block cipher. Yet these security proofs have one important requirement. Namely, the attacker must not receive too much data encrypted with a single key.

The reason for this can be illustrated via the following simple attack.

Imagine that an honest encryptor is encrypting a bunch of messages using CBC mode. Following the diagram above, this involves selecting a random Initialization Vector ($IV$) of size equal to the block size of the cipher, then XORing $IV$ with the first plaintext block ($P$), and enciphering the result ($P \oplus IV$). The $IV$ is sent (in the clear) along with the ciphertext.

Most of the time, the resulting ciphertext block will be unique — that is, it won’t match any previous ciphertext block that an attacker may have seen. However, if the encryptor processes enough messages, sooner or later the attacker will see a collision. That is, it will see a ciphertext block that is the same as some previous ciphertext block. Since the cipher is deterministic, this means the cipher’s input ($P \oplus IV$) must be identical to the cipher’s previous input $(P' \oplus IV')$ that created the previous block.

In other words, we have $(P \oplus IV) = (P' \oplus IV')$, which can be rearranged as $(P \oplus P') = (IV \oplus IV')$. Since the IVs are random and known to the attacker, the attacker has (with high probability) learned the XOR of two (unknown) plaintexts!

What can you do with the XOR of two unknown plaintexts? Well, if you happen to know one of those two plaintext blocks — as you might if you were able to choose some of the plaintexts the encryptor was processing — then you can easily recover the other plaintext. Alternatively, there are known techniques that can sometimes recover useful data even when you don’t know both blocks.

The main lesson here is that this entire mess only occurs if the attacker sees a collision. And the probability of such a collision is entirely dependent on the size of the cipher block. Worse, thanks to the (non-intuitive) nature of the birthday bound, this happens much more quickly than you might think it would. Roughly speaking, if the cipher block is b bits long, then we should expect a collision after roughly $2^{b/2}$ encrypted blocks.

In the case of a 64-bit blocksize cipher like 3DES, this is somewhere in the vicinity of $2^{32}$, or around 4 billion enciphered blocks.

(As a note, the collision does not really need to occur in the first block. Since all blocks in CBC are calculated in the same way, it could be a collision anywhere within the messages.)

## Whew. I thought this was a practical attack. 4 billion is a big number!

It’s true that 4 billion blocks seems like an awfully large number. In a practical attack, the requirements would be even larger — since the most efficient attack is for the attacker to know a lot of the plaintexts, in the hope that she will be able to recover one unknown plaintext when she learns the value (P ⊕ P’).

However, it’s worth keeping in mind that these traffic numbers aren’t absurd for TLS. In practice, 4 billion 3DES blocks works out to 32GB of raw ciphertext. A lot to be sure, but not impossible. If, as the Sweet32 authors do, we assume that half of the plaintext blocks are known to the attacker, we’d need to increase the amount of ciphertext to about 64GB. This is a lot, but not impossible.

The Sweet32 authors take this one step further. They imagine that the ciphertext consists of many HTTPS connections, consisting of 512 bytes of plaintext, in each of which is embedded the same secret 8-byte cookie — and the rest of the session plaintext is known. Calculating from these values, they obtain a requirement of approximately 256GB of ciphertext needed to recover the cookie with high probability.

That is really a lot.

But keep in mind that TLS connections are being used to encipher increasingly more data. Moreover, a single open browser frame running attacker-controlled Javascript can produce many gigabytes of ciphertext in a single hour. So these attacks are not outside of the realm of what we can run today, and presumably will be very feasible in the future.

How does the TLS attack work?

While the cryptographic community has been largely pushing TLS away from ciphersuites like CBC, in favor of modern authenticated modes of operation, these modes still exist in TLS. And they exist not only for use not only with modern ciphers like AES, but they are often available for older ciphersuites like 3DES. For example, here’s a connection I just made to Google:

Of course, just because a server supports 3DES does not mean that it’s vulnerable to this attack. In order for a particular connection to be vulnerable, both the client and server must satisfy three main requirements:

1. The client and server must negotiate a 64-bit cipher. This is a relatively rare occurrence, but can happen in cases where one of the two sides is using an out-of-date client. For example, stock Windows XP does not support any of the AES-based ciphersuites. Similarly, SSL3 connections may negotiate 3DES ciphersuites.
2. The server and client must support long-lived TLS sessions, i.e., encrypting a great deal of data with the same key. Unfortunately, most web browsers place no limit on the length of an HTTPS session if Keep-Alive is used, provided that the server allows the session. The Sweet32 authors scanned and discovered that many servers (including IIS) will allow sessions long enough to run their attack. Across the Internet, the percentage of vulnerable servers is small (less than 1%), but includes some important sites.
3. The client must encipher a great deal of known data, including a secret session cookie. This is generally achieved by running adversarial Javascript code in the browser, although it could be done using standard HTML as well.

These caveats aside, the authors were able to run their attack using Firefox, sending at a rate of about 1500 connections per second. With a few optimizations, they were able to recover a 16-byte secret cookie in about 30 hours (a lucky result, given an expected 38 hour run time).The client must encipher a great deal of known data, including a secret session cookie. This is generally achieved by running adversarial Javascript code in the browser, although it could be done using standard HTML as well.

## So what do we do now?

While this is not an earthshaking result, it’s roughly comparable to previous results we’ve seen with legacy ciphers like RC4.

In short, while these are not the easiest attacks to run, it’s a big problem that there even exist semi-practical attacks that undo the encryption used in standard encryption protocols. This is a problem that we should address, and these attack papers help to make those problems more clear.

# Attack of the Week: Apple iMessage

Today’s Washington Post has a story entitled “Johns Hopkins researchers poke a hole in Apple’s encryption“, which describes the results of some research my students and I have been working on over the past few months.

As you might have guessed from the headline, the work concerns Apple, and specifically Apple’s iMessage text messaging protocol. Over the past months my students Christina Garman, Ian Miers, Gabe Kaptchuk and Mike Rushanan and I have been looking closely at the encryption used by iMessage, in order to determine how the system fares against sophisticated attackers. The results of this analysis include some very neat new attacks that allow us to — under very specific circumstances — decrypt the contents of iMessage attachments, such as photos and videos.

Now before I go further, it’s worth noting that the security of a text messaging protocol may not seem like the most important problem in computer security. And under normal circumstances I might agree with you. But today the circumstances are anything but normal: encryption systems like iMessage are at the center of a critical national debate over the role of technology companies in assisting law enforcement.

A particularly unfortunate aspect of this controversy has been the repeated call for U.S. technology companies to add “backdoors” to end-to-end encryption systems such as iMessage. I’ve always felt that one of the most compelling arguments against this approach — an argument I’ve made along with other colleagues — is that we just don’t know how to construct such backdoors securely. But lately I’ve come to believe that this position doesn’t go far enough — in the sense that it is woefully optimistic. The fact of the matter is that forget backdoors: we barely know how to make encryption work at all. If anything, this work makes me much gloomier about the subject.

But enough with the generalities. The TL;DR of our work is this:

Apple iMessage, as implemented in versions of iOS prior to 9.3 and Mac OS X prior to 10.11.4, contains serious flaws in the encryption mechanism that could allow an attacker — who obtains iMessage ciphertexts — to decrypt the payload of certain attachment messages via a slow but remote and silent attack, provided that one sender or recipient device is online. While capturing encrypted messages is difficult in practice on recent iOS devices, thanks to certificate pinning, it could still be conducted by a nation state attacker or a hacker with access to Apple’s servers. You should probably patch now.

For those who want the gory details, I’ll proceed with the rest of this post using the “fun” question and answer format I save for this sort of post.

### What is Apple iMessage and why should I care?

Those of you who read this blog will know that I have a particular obsession with Apple iMessage. This isn’t because I’m weirdly obsessed with Apple — although it is a little bit because of that. Mostly it’s because I think iMessage is an important protocol. The text messaging service, which was introduced in 2011, has the distinction of being the first widely-used end-to-end encrypted text messaging system in the world.

To understand the significance of this, it’s worth giving some background. Before iMessage, the vast majority of text messages were sent via SMS or MMS, meaning that they were handled by your cellular provider. Although these messages are technically encrypted, this encryption exists only on the link between your phone and the nearest cellular tower. Once an SMS reaches the tower, it’s decrypted, then stored and delivered without further protection. This means that your most personal messages are vulnerable to theft by telecom employees or sophisticated hackers. Worse, many U.S. carriers still use laughably weak encryption and protocols that are vulnerable to active interception.

So from a security point of view, iMessage was a pretty big deal. In a single stroke, Apple deployed encrypted messaging to millions of users, ensuring (in principle) that even Apple itself couldn’t decrypt their communications. The even greater accomplishment was that most people didn’t even notice this happened — the encryption was handled so transparently that few users are aware of it. And Apple did this at very large scale: today, iMessage handles peak throughput of more than 200,000 encrypted messages per second, with a supported base of nearly one billion devices.

### So iMessage is important. But is it any good?

Answering this question has been kind of a hobby of mine for the past couple of years. In the past I’ve written about Apple’s failure to publish the iMessage protocol, and on iMessage’s dependence on a vulnerable centralized key server. Indeed, the use of a centralized key server is still one of iMessage’s biggest weaknesses, since an attacker who controls the keyserver can use it to inject keys and conduct man in the middle attacks on iMessage users.

But while key servers are a risk, attacks on a key server seem fundamentally challenging to implement — since they require the ability to actively manipulate Apple infrastructure without getting caught. Moreover, such attacks are only useful for prospective surveillance. If you fail to substitute a user’s key before they have an interesting conversation, you can’t recover their communications after the fact.A more interesting question is whether iMessage’s encryption is secure enough to stand up against retrospective decryption attacks — that is, attempts to decrypt messages after they have been sent. Conducting such attacks is much more interesting than the naive attacks on iMessage’s key server, since any such attack would require the existence of a fundamental vulnerability in iMessage’s encryption itself. And in 2016 encryption seems like one of those things that we’ve basically figured out how to get right.

Which means, of course, that we probably haven’t.

### How does iMessage encryption work?

What we know about the iMessage encryption protocol comes from a previous reverse-engineering effort by a group from Quarkslab, as well as from Apple’s iOS Security Guide. Based on these sources, we arrive at the following (simplified) picture of the basic iMessage encryption scheme:

To encrypt an iMessage, your phone first obtains the RSA public key of the person you’re sending to. It then generates a random AES key k and encrypts the message with that key using CTR mode. Then it encrypts k using the recipient’s RSA key. Finally, it signs the whole mess using the sender’s ECDSA signing key. This prevents tampering along the way.

So what’s missing here?

Well, the most obviously missing element is that iMessage does not use a Message Authentication Code (MAC) or authenticated encryption scheme to prevent tampering with the message. To simulate this functionality, iMessage simply uses an ECDSA signature formulated by the sender. Naively, this would appear to be good enough. Critically, it’s not.

The attack works as follows. Imagine that a clever attacker intercepts the message above and is able to register her own iMessage account. First, the attacker strips off the original ECDSA signature made by the legitimate sender, and replaces it with a signature of her own. Next, she sends the newly signed message to the original recipient using her own account:

The outcome is that the user receives and decrypts a copy of the message, which has now apparently originated from the attacker rather than from the original sender. Ordinarily this would be a pretty mild attack — but there’s a useful wrinkle. In replacing the sender’s signature with one of her own, the attacker has gained a powerful capability. Now she can tamper with the AES ciphertext (red) at will.

Specifically, since in iMessage the AES ciphertext is not protected by a MAC, it is therefore malleable. As long as the attacker signs the resulting message with her key, she can flip any bits in the AES ciphertext she wants — and this will produce a corresponding set of changes when the recipient ultimately decrypts the message. This means that, for example, if the attacker guesses that the message contains the word “cat” at some position, she can flip bits in the ciphertext to change that part of the message to read “dog” — and she can make this change even though she can’t actually read the encrypted message.

Only one more big step to go.

Now further imagine that the recipient’s phone will decrypt the message correctly provided that the underlying plaintext that appears following decryption is correctly formatted. If the plaintext is improperly formatted — for a silly example, our tampering made it say “*7!” instead of “pig” — then on receiving the message, the recipient’s phone might return an error that the attacker can see.

It’s well known that such a configuration capability allows our attacker the ability to learn information about the original message, provided that she can send many “mauled” variants to be decrypted. By mauling the underlying message in specific ways — e.g., attempting to turn “dog” into “pig” and observing whether decryption succeeds — the attacker can gradually learn the contents of the original message. The technique is known as a format oracle, and it’s similar to the padding oracle attack discovered by Vaudenay.

### So how exactly does this format oracle work?

The format oracle in iMessage is not a padding oracle. Instead it has to do with the compression that iMessage uses on every message it sends.

You see, prior to encrypting each message payload, iMessage applies a complex formatting that happens to conclude with gzip compression. Gzip is a modestly complex compression scheme that internally identifies repeated strings, applies Huffman coding, then tacks a CRC checksum computed over the original data at the end of the compressed message. It’s this gzip-compressed payload that’s encrypted within the AES portion of an iMessage ciphertext.

It turns out that given the ability to maul a gzip-compressed, encrypted ciphertext, there exists a fairly complicated attack that allows us to gradually recover the contents of the message by mauling the original message thousands of times and sending the modified versions to be decrypted by the target device. The attack turns on our ability to maul the compressed data by flipping bits, then “fix up” the CRC checksum correspondingly so that it reflects the change we hope to see in the uncompressed data. Depending on whether that test succeeds, we can gradually recover the contents of a message — one byte at a time.

While I’m making this sound sort of simple, the truth is it’s not. The message is encoded using Huffman coding, with a dynamic Huffman table we can’t see — since it’s encrypted. This means we need to make laser-specific changes to the ciphertext such that we can predict the effect of those changes on the decrypted message, and we need to do this blind. Worse, iMessage has various countermeasures that make the attack more complex.The complete details of the attack appear in the paper, and they’re pretty eye-glazing, so I won’t repeat them here. In a nutshell, we are able to decrypt a message under the following conditions:

1. We can obtain a copy of the encrypted message
2. We can send approximately 2^18 (invisible) encrypted messages to the target device
3. We can determine whether or not those messages decrypted successfully or not
The first condition can be satisfied by obtaining ciphertexts from a compromise of Apple’s Push Notification Service servers (which are responsible for routing encrypted iMessages) or by intercepting TLS connections using a stolen certificate — something made more difficult due to the addition of certificate pinning in iOS 9. The third element is the one that initially seems the most challenging. After all, when I send an iMessage to your device, there’s no particular reason that your device should send me any sort of response when the message decrypts. And yet this information is fundamental to conducting the attack!

It turns out that there’s a big exception to this rule: attachment messages.

### How do attachment messages differ from normal iMessages?

When I include a photo in an iMessage, I don’t actually send you the photograph through the normal iMessage channel. Instead, I first encrypt that photo using a random 256-bit AES key, then I compute a SHA1 hash and upload the encrypted photo to iCloud. What I send you via iMessage is actually just an iCloud.com URL to the encrypted photo, the SHA1 hash, and the decryption key.

When you successfully receive and decrypt an iMessage from some recipient, your Messages client will automatically reach out and attempt to download that photo. It’s this download attempt, which happens only when the phone successfully decrypts an attachment message, that makes it possible for an attacker to know whether or not the decryption has succeeded.

One approach for the attacker to detect this download attempt is to gain access to and control your local network connections. But this seems impractical. A more sophisticated approach is to actually maul the URL within the ciphertext so that rather than pointing to iCloud.com, it points to a related URL such as i8loud.com. Then the attacker can simply register that domain, place a server there and allow the client to reach out to it. This requires no access to the victim’s local network.

By capturing an attachment message, repeatedly mauling it, and monitoring the download attempts made by the victim device, we can gradually recover all of the digits of the encryption key stored within the attachment. Then we simply reach out to iCloud and download the attachment ourselves. And that’s game over. The attack is currently quite slow — it takes more than 70 hours to run — but mostly because our code is slow and not optimized. We believe with more engineering it could be made to run in a fraction of a day.

The need for an online response is why our attack currently works against attachment messages only: those are simply the messages that make the phone do visible things. However, this does not mean the flaw in iMessage encryption is somehow limited to attachments — it could very likely be used against other iMessages, given an appropriate side-channel.

### How is Apple fixing this?

Apple’s fixes are twofold. First, starting in iOS 9.0 (and before our work), Apple began deploying aggressive certificate pinning across iOS applications. This doesn’t fix the attack on iMessage crypto, but it does make it much harder for attackers to recover iMessage ciphertexts to decrypt in the first place.

Unfortunately even if this works perfectly, Apple still has access to iMessage ciphertexts. Worse, Apple’s servers will retain these messages for up to 30 days if they are not delivered to one of your devices. A vulnerability in Apple Push Network authentication, or a compromise of these servers could read them all out. This means that pinning is only a mitigation, not a true fix.

As of iOS 9.3, Apple has implemented a short-term mitigation that my student Ian Miers proposed. This relies on the fact that while the AES ciphertext is malleable, the RSA-OAEP portion of the ciphertext is not. The fix maintains a “cache” of recently received RSA ciphertexts and rejects any repeated ciphertexts. In practice, this shuts down our attack — provided the cache is large enough. We believe it probably is.

In the long term, Apple should drop iMessage like a hot rock and move to Signal/Axolotl.

### So what does it all mean?

As much as I wish I had more to say, fundamentally, security is just plain hard. Over time we get better at this, but for the foreseeable future we’ll never be ahead. The only outcome I can hope for is that people realize how hard this process is — and stop asking technologists to add unacceptable complexity to systems that already have too much of it.