# What is the random oracle model and why should you care? (Part 5)

This is part five of a series on the Random Oracle Model.  See here for the previous posts:

About eight years ago I set out to write a very informal piece on a specific cryptographic modeling technique called the “random oracle model”. This was way back in the good old days of 2011, which was a more innocent and gentle era of cryptography. Back then nobody foresaw that all of our standard cryptography would turn out to be riddled with bugs; you didn’t have to be reminded that “crypto means cryptography“. People even used Bitcoin to actually buy things.

That first random oracle post somehow sprouted three sequels, each more ridiculous than the last. I guess at some point I got embarrassed about the whole thing — it’s pretty cheesy, to be honest — so I kind of abandoned it unfinished. And that’s been a major source of regret for me, since I had always planned a fifth, and final post, to cap the whole messy thing off. This was going to be the best of the bunch: the one I wanted to write all along.

To give you some context, let me briefly remind you what the random oracle model is, and why you should care about it. (Though you’d do better just to read the series.)

The random oracle model is a bonkers way to model (reason about) hash functions, in which we assume that these are actually random functions and use this assumption to prove things about cryptographic protocols that are way more difficult to prove without such a model. Just about all the “provable” cryptography we use today depends on this model, which means that many of these proofs would be called into question if it was “false”.

And to tease the rest of this post, I’ll quote the final paragraphs of Part 4, which ends with this:

You see, we always knew that this ride wouldn’t last forever, we just thought we had more time. Unfortunately, the end is nigh. Just like the imaginary city that Leonardo de Caprio explored during the boring part of Inception, the random oracle model is collapsing under the weight of its own contradictions.

As promised, this post will be about that collapse, and what it means for cryptographers, security professionals, and the rest of us.

First, to make this post a bit more self-contained I’d like to recap a few of the basics that I covered earlier in the series. You can feel free to skip this part if you’ve just come from there.

### In which we (very quickly) remind the reader what hash functions are, what random functions are, and what a random oracle is.

As discussed in the early sections of this series, hash functions (or hashing algorithms) are a standard primitive that’s used in many areas of computer science. They take in some input, typically a string of variable length, and repeatably output a short and fixed-length “digest”. We often denote these functions as follows:

${\sf digest} \leftarrow H({\sf message})$

Cryptographic hashing takes this basic template and tacks on some important security properties that we need for cryptographic applications. Most famously these provide  well-known properties like collision resistance, which is needed for applications like digital signatures. But hash functions turn up all over cryptography, sometimes in unexpected places — ranging from encryption to zero-knowledge protocols — and sometimes these systems demand stronger properties. Those can sometimes be challenging to put into formal terms: for example, many protocols require a hash function to produce output that is extremely “random-looking”.*

In the earliest days of provably security, cryptographers realized that the ideal hash function would behave like a “random function”. This term refers to a function that is uniformly sampled from the set of all possible functions that have the appropriate input/output specification (domain and range). In a perfect world your protocol could, for example, randomly sample one of vast number of possible functions at setup, bake the identifier of that function into a public key or something, and then you’d be good to go.

Unfortunately it’s not possible to actually use random functions (of reasonably-sized domain and range) in real protocols. That’s because sampling and evaluating those functions is far too much work.

For example, the number of distinct functions that consume a piddly 256-bit input and produce a 256-bit digest is a mind-boggling (2^{256})^{2^{256}}. Simply “writing down” the identity of the function you chose would require memory that’s exponential in the function’s input length. Since we want our cryptographic algorithms to be efficient (meaning, slightly more formally, they run in polynomial time), using random functions is pretty much unworkable.

So we don’t use random functions to implement our hashing. Out in “the real world” we use weird functions developed by Belgians or the National Security Agency, things like like SHA256 and SHA3 and Blake2. These functions come with blazingly fast and tiny algorithms for computing them, most of which occupy few dozen lines of code or less. They certainly aren’t random, but as best we can tell, the output looks pretty jumbled up.

Still, protocol designers continue to long for the security that using  truly random function could give their protocol. What if, they asked, we tried to split the difference. How about we model our hash functions using random functions — just for the sake of writing our security proofs —  and then when we go to implement (or “instantiate”) our protocols, we’ll go use efficient hash functions like SHA3? Naturally these proofs wouldn’t exactly apply to the real protocol as instantiated, but they might still be pretty good.

A proof that uses this paradigm is called a proof in the random oracle model, or ROM. For the full mechanics of how the ROM works you’ll have to go back and read the series from the beginning. What you do need to know right now is that proofs in this model must somehow hack around the fact that evaluating a random function takes exponential time. The way the model handles this is simple: instead of giving the individual protocol participants a description of the hash function itself — it’s way too big for anyone to deal with — they give each party (including the adversary) access to a magical “oracle” that can evaluate the random function H efficiently, and hand them back a result.

This means that any time one of the parties wants to compute the function $H({\sf message})$ they don’t do it themselves. They instead calling out to a third party, the “random oracle” who keeps a giant table of random function inputs and outputs. At a high level, the model looks like sort of like this:

Since all parties in the system “talk” to the same oracle, they all get the same hash result when they ask it to hash a given message. This is a pretty good standin for what happens with a real hash function. The use of an outside oracle allows us to “bury” the costs of evaluating a random function, so that nobody else needs to spend exponential time evaluating one. Inside this artificial model, we get ideal hash functions with none of the pain.

### This seems pretty ridiculous already…

It absolutely is!

However — I think there are several very important things you should know about the random oracle model before you write it off as obviously inane:

1. Of course everyone knows random oracle proofs aren’t “real”. Most conscientious protocol designers will admit that proving something secure in the random oracle model does not actually mean it’ll be secure “in the real world”. In other words, the fact that random oracle model proofs are kind of bogus is not some deep secret I’m letting you in on.

2. And anyway: ROM proofs are generally considered a useful heuristic. For those who aren’t familiar with the term, “heuristic” is a word that grownups use when they’re about to secure your life’s savings using cryptography they can’t prove anything about.

I’m joking! In fact, random oracle proofs are still quite valuable. This is mainly because they often help us detect bugs in our schemes. That is, while a random oracle proof doesn’t imply security in the real world, the inability to write one is usually a red flag for protocols. Moreover, the existence of a ROM proof is hopefully an indicator that the “guts” of the protocol are ok, and that any real-world issues that crop up will have something to do with the hash function.

3. ROM-validated schemes have a pretty decent track record in practice. If ROM proofs were kicking out absurdly broken schemes every other day, we would probably have abandoned this technique. Yet we use cryptography that’s proven (only) in the ROM just about ever day — and mostly it works fine.

This is not to say that no ROM-proven scheme has ever been broken, when instantiated with a specific hash function. But normally these breaks happen because the hash function itself is obvious broken (as happened when MD4 and MD5 both cracked up a while back.) Still, those flaws are generally fixed by simply switching to a better function. Moreover, the practical attacks are historically more likely to come from obvious flaws, like the discovery of hash collisions screwing up signature schemes, rather than from some exotic mathematical flaw. Which brings us to a final, critical note…

4. For years, many people believed that the ROM could actually be saved. This hope was driven by the fact that ROM schemes generally seemed to work pretty well when implemented with strong hash functions, and so perhaps all we needed to do was to find a hash function that was “good enough” to make ROM proofs meaningful. Some theoreticians hoped that fancy techniques like cryptographic obfuscation could somehow be used to make concrete hashing algorithms that behaved well enough to make (some) ROM proofs instantiable.**

So that’s kind of the state of the ROM, or at least — it was the state up until the late 1990s. We knew this model was artificial, and yet it stubbornly refused to explode or produce totally nonsense results.

And then, in 1998, everything went south.

### CGH98: an “uninstantiable” scheme

For theoretical cryptographers, the real breaking point for the random oracle model came in the form of a 1998 STOC paper by Canetti, Goldreich and Halevi (henceforth CGH). I’m going to devote the rest of this (long!) post to explaining the gist of what they found.

What CGH proved was that, in fact, there exist cryptographic schemes that can be proven perfectly secure in the random oracle model, but that — terrifyingly — become catastrophically insecure the minute you instantiate the hash function with any concrete function.

This is a really scary result, at least from the point of view of the provable security community. It’s one thing to know in theory that your proofs might not be that strong. It’s a different thing entirely to know that in practice there are schemes that can walk right past your proofs like a Terminator infiltrating the Resistance, and then explode all over you in the most serious way.

Before we get to the details of CGH and its related results, a few caveats.

First, CGH is very much a theory result. The cryptographic “counterexample” schemes that trip this problem generally do not look like real cryptosystems that we would use in practice, although later authors have offered some more “realistic” variants. They are, in fact, designed to do very artificial things that no “real” scheme would ever do. This might lead readers to dismiss them on the grounds of artificiality.

The problem with this view is that looks aren’t a particularly scientific way to judge a scheme. Both “real looking” and “artificial” schemes are, if proven correct, valid cryptosystems. The point of these specific counterexamples is to do deliberately artificial things in order to highlight the problems with the ROM. But that does not mean that “realistic” looking schemes won’t do them.

A further advantage of these “artificial” schemes is that they make the basic ideas relatively easy to explain. As a further note on this point: rather than explaining CGH itseld, I’m going to use a formulation of the same basic result that was proposed by Maurer, Renner and Holenstein (MRH).

### A signature scheme

The basic idea of CGH-style counterexamples is to construct a “contrived” scheme that’s secure in the ROM, but totally blows up when we “instantiate” the hash function using any concrete function, meaning a function that has a real description and can be efficiently evaluated by the participants in the protocol.

While the CGH techniques can apply with lots of different types of cryptosystem, in this explanation, we’re going to start our example using a relatively simple type of system: a digital signature scheme.

You may recall from earlier episodes of this series that a normal signature scheme consists of three algorithms: key generation, signing, and verification. The key generation algorithm outputs a public and secret key. Signing uses the secret key to sign a message, and outputs a signature. Verification takes the resulting signature, the public key and the message, and determines whether the signature is valid: it outputs “True” if the signature checks out, and “False” otherwise.

Traditionally, we demand that signature schemes be (at least) existentially unforgeable under chosen message attack, or UF-CMA. This means that that we consider an efficient (polynomial-time bounded) attacker who can ask for signatures on chosen messages, which are produced by a “signing oracle” that contains the secret signing key. Our expectation of a secure scheme is that, even given this access, no attacker will be able to come up with a signature on some new message that she didn’t ask the signing oracle to sign for her, except with negligible probability.****

Having explained these basics, let’s talk about what we’re going to do with it. This will involve several steps:

Step 1: Start with some existing, secure signature scheme. It doesn’t really matter what signature scheme we start with, as long as we can assume that it’s secure (under the UF-CMA definition described above.) This existing signature scheme will be used as a building block for the new scheme we want to build.*** We’ll call this scheme S.

Step 2: We’ll use the existing scheme S as a building block to build a “new” signature scheme, which we’ll call ${\bf S_{\sf broken}}$. Building this new scheme will mostly consist of grafting weird bells and whistles onto the algorithms of the original scheme S.

Step 3: Having described the working of ${\bf S_{\sf broken}}$ in detail, we’ll argue that it’s totally secure in the ROM. Since we started with an (assumed) secure signature scheme S, this argument mostly comes down to showing that in the random oracle model the weird additional features we added in the previous step don’t actually make the scheme exploitable.

Step 4: Finally, we’ll demonstrate that ${\bf S_{\sf broken}}$ is totally broken when you instantiate the random oracle with any concrete hash function, no matter how “secure” it looks. In short, we’ll show that one you replace the random oracle with a real hash function, there’s a simple attack that always succeeds in forging signatures.

We’ll start by explaining how ${\bf S_{\sf broken}}$ works.

### Building a broken scheme

To build our contrived scheme, we begin with the existing secure (in the UF-CMA sense) signature scheme S. That scheme comprises the three algorithms mentioned above: key generation, signing and verification.

We need to build the equivalent three algorithms for our new scheme.

To make life easier, our new scheme will simply “borrow” two of the algorithms from S, making no further changes at all. These two algorithms will be the key generation and signature verification algorithms So two-thirds of our task of designing the new scheme is already done.

Each of the novel elements that shows up in ${\bf S_{\sf broken}}$ will therefore appear in the signing algorithm. Like all signing algorithms, this algorithm takes in a secret signing key and some message to be signed. It will output a signature.

At the highest level, our new signing algorithm will have two subcases, chosen by a branch that depends on the input message to be signed. These two cases are given as follows:

The “normal” case: for most messages M, the signing algorithm will simply run the original signing algorithm from the original (secure) scheme S. This will output a perfectly nice signature that we can expect to work just fine.

The “evil” case: for a subset of (reasonably-sized) messages that have a different (and very highly specific) form, our signing algorithm will not output a signature. It will instead output the secret key for the entire signature scheme. This is an outcome that cryptographers will sometimes call “very, very bad.”

So far this description still hides all of the really important details, but at least it gives us an outline of where we’re trying to go.

Recall that under the UF-CMA definition I described above, our attacker is allowed to ask for signatures on arbitrary messages. When we consider using this definition with our modified signing algorithm, it’s easy to see that the presence of these two cases could make things exciting.

Specifically: if any attacker can construct a message that triggers the “evil” case, her request to sign a message will actually result in her obtaining the scheme’s secret key. From that point on she’ll be able to sign any message that she wants — something that obviously breaks the UF-CMA security of the scheme. If this is too theoretical for you: imagine requesting a signed certificate from LetsEncrypt, and instead obtaining a copy of LetsEncrypt’s signing keys. Now you too are a certificate authority. That’s the situation we’re describing.

The only way this scheme could ever be proven secure is if we could somehow rule out the “evil” case happening at all.

More concretely: we would have to show that no attacker can construct a message that triggers the “evil case” — or at least, that their probability of coming up with such a message is very, very low (negligible). If we could prove this, then our scheme ${\bf S_{\sf broken}}$ basically just reduces to being the original secure scheme. Which means our new scheme would be secure.

In short: what we’ve accomplished is to build a kind of “master password” backdoor into our new scheme ${\bf S_{\sf broken}}$. Anyone who knows the password can break the scheme. Everything now depends on whether an attacker can figure out that password.

### So what is the “backdoor”?

The message that breaks the scheme ${\bf S_{\sf broken}}$ isn’t a password at all, of course. Because this is computer science and nothing is ever easy, the message will actually be a computer program. We’ll call it P.

More concretely, it will be some kind of program that can decoded within our new signing algorithm, and then evaluated (on some input) by an interpreter that we will also place within that algorithm.

If we’re being formal about this, we’d say the message contains an encoding of a program for a universal Turing machine (UTM), along with a unary-encoded integer t that represents the number of timesteps that the machine should be allowed to run for. However, it’s perfectly fine with me if you prefer to think of the message as containing a hunk of Javascript, an Ethereum VM blob combined with some maximum “gas” value to run on, a .tgz encoding of a Docker container, or any other executable format you fancy.

What really matters is the functioning of the program P.

A program P that successfully triggers the “evil case” is one that contains an efficient (e.g., polynomial-sized) implementation of a hash function. And not just any hash function. To actually trigger the backdoor, the algorithm P must a function that is identical to, or at least highly similar to, the random oracle function H.

There are several ways that the signing algorithm can verify this similarity. The MRH paper gives a very elegant one, which I’ll discuss further below. But for the purposes of this immediate intuition, let’s assume that our signing algorithm verifies this similarity probabilistically. Specifically: to check that P matches H, it won’t verify the correspondence at every possible input. It might, for example, simply verify that P(x) = H(x) for some large (but polynomial) number of random input values x.

So that’s the backdoor.

Let’s think briefly about what this means for security, both inside and outside of the random oracle mode.

#### Case 1: in the random oracle model

Recall that in the random oracle model, the “hash function” H is modeled as a random function. Nobody in the protocol actually has a copy of that function, they just have access to a third party (the “random oracle”) who can evaluate it for them.

If an attacker wishes to trigger the “evil case” in our signing scheme, they will somehow need to download a description of the random function from the oracle. then encode it into a program P, and send it to the signing oracle. This seems fundamentally hard.

To do this precisely — meaning that P would match H on every input — the attacker would need to query the random oracle on every possible input, and then design a program P that encodes all of these results. It suffices to say that this strategy would not be practical: it would require an exponential amount of time to do any of these, and the size of P would also be exponential in the input length of the function. So this attacker would seem virtually guaranteed to fail.

Of course the attacker could try to cheat: make a small function P that only matches H on a small of inputs, and hope that the signer doesn’t notice. However, even this seems pretty challenging to get away with. For example, to perform a probabilistic check, the signing algorithm can simply verify that P(x) = H(x) for a large number of random input points x. This approach will catch a cheating attacker with very high probability.

(We will end up using a slightly more elegant approach to checking the function and arguing this point further below.)

The above is hardly an exhaustive security analysis. But at a high level our argument should now be clear: in the random oracle model, the scheme ${\bf S_{\sf broken}}$ is secure because the attacker can’t know a short enough backdoor “password” that breaks the scheme. Having eliminated the “evil case”, the scheme ${\bf S_{\sf broken}}$ simply devolves to the original, secure scheme S.

#### Case 2: In the “real world”

Out in the real world, we don’t use random oracles. When we want to implement a scheme that has a proof in the ROM, we must first “instantiate” the scheme by substituting in some real hash function in place of the random oracle H.

This instantiated hash function must, by definition, be efficient to evaluate and describe. This means implicitly that it possesses a polynomial-size description and can be evaluated in expected polynomial time. If we did not require this, our schemes would never work. Moreover, we must further assume that all parties, including the attacker, possess a description of the hash function. That’s a standard assumption in cryptography, and is merely a statement of Kerckhoff’s principle.

With these facts stipulated, the problem with our new signature scheme becomes obvious.

In this setting, the attacker actually does have access to a short, efficient program P that matches the hash function H. In practice, this function will probably be something like SHA2 or Blake2. But even in a weird case where it’s some crazy obfuscated function, the attacker is still expected to have a program that they can efficiently evaluate. Since the attacker possesses this program, they can easily encode it into a short enough message and send it to the signing oracle.

When the signing algorithm receives this program, it will perform some kind of test of this function P against its own implementation of H, and — when it inevitably finds a match between the two functions with high probability — it will output the scheme’s secret key.

Hence, out in the real world our scheme ${\bf S_{\sf broken}}$ is always and forever, totally broken.

### A few boring technical details (that you can feel free to skip)

If you’re comfortable with the imprecise technical intuition I’ve given above, feel free to skip this section. You can jump on to the next part, which tries to grapple with tough philosophical questions like “what does this mean for the random oracle model” and “I think this is all nonsense” and “why do we drive on a parkway, and park in a driveway?

All I’m going to do here is clean up a few technical details.

One of the biggest pieces that’s missing from the intuition above is a specification of how the signing algorithm verifies that the program P it receives from the attacker actually “matches” the random oracle function H. The obvious way is to simply evaluate P(x) = H(x) on every possible input x, and output the scheme’s secret key if every comparison succeeds. But doing this exhaustively requires exponential time.

The MRH paper proposes a very neat alternative way to tackle this. They propose to test the functions on a few input values, and not even random ones. More concretely, they propose checking that P(x) = H(x) for values of $x \in \{1, \dots, q\}$ with the specific requirement that q is an integer such that $q = 2|P| + k$. Here $|P|$ represents the length of the encoding of program P in bits, and k is the scheme’s adjustable security parameter (for example, k=128).

What this means is that to trigger the backdoor, the attacker must come up with a program P that can be described in some number of bits (let’s call it n) , and yet will be able to correctly match the outputs of H at e.g., q=2n+128 different input points. If we conservatively assume that H produces (at least) a 1-bit digest, that means we’re effectively encoding at least 2n+128 bits of data into a string of length n.

If the function H is a real hash function like SHA256, then it should be reasonably easy for the attacker to find some n-bit program that matches H at, say, q=2n+128 different points. For example, here’s a Javascript implementation of SHA256 that fits into fewer than 8,192 bits. If we embed a Javascript interpreter into our signing algorithm, then it simply needs to evaluate this given program on q = 2(8,192)+128 = 16,512 different input points, compare each result to its own copy of SHA256, and if they all match, output the secret key.

However, if H is a random oracle, this is vastly harder for the attacker to exploit. The result of evaluating a random oracle at q distinct points should be a random string of (at minimum) q bits in length. Yet in order for the backdoor to be triggered, we require the encoding of program P to be less than half that size. You can therefore think of the process by which the attacker compresses a random string into that program P to be a very effective compression algorithm, one takes in a random string, and compresses it into a string of less than half the size.

Despite what you may have seen on Silicon Valley (NSFW), compression algorithms do not succeed in compressing random strings that much with high probability. Indeed, for a given string of bits, this is so unlikely to occur that the attacker succeeds with at probability that is at most negligible in the scheme’s security parameter k. This effectively neutralizes the backdoor when H is a random oracle.

Phew.

### So what does this all mean?

Judging by actions, and not words, the cryptographers of the world have been largely split on this question.

Theoretical cryptographers, for their part, gently chuckled at the silly practitioners who had been hoping to use random functions as hash functions. Brushing pipe ash from their lapels, they returned to more important tasks, like finding ways to kill off cryptographic obfuscation.

Applied academic cryptographers greeted the new results with joy — and promptly authored 10,000 new papers, each of which found some new way to remove random oracles from an existing construction — while at the same time making said construction vastly slower, more complicated, and/or based on entirely novel made-up and flimsy number-theoretic assumptions. (Speaking from personal experience, this was a wonderful time.)

Practitioners went right on trusting the random oracle model. Because really, why not?

And if I’m being honest, it’s a bit hard to argue with the practitioners on this one.

That’s because a very reasonable perspective to take is that these “counterexample” schemes are ridiculous and artificial. Ok, I’m just being nice. They’re total BS, to be honest. Nobody would ever design a scheme that looks so ridiculous.

Specifically, you need a scheme that explicitly parses an input as a program, runs that program, and then checks to see whether the program’s output matches a different hash function. What real-world protocol would do something so stupid? Can’t we still trust the random oracle model for schemes that aren’t stupid like that?

Well, maybe and maybe not.

One simple response to this argument is that there are examples of schemes that are significantly less artificial, and yet still have random oracle problems. But even if one still views those results as artificial — the fact remains that while we only know of random oracle counterexamples that seem artificial, there’s no principled way for us to prove that the badness will only affect “artificial-looking” protocols. In fact, the concept of “artificial-looking” is largely a human judgement, not something one can realiably think about mathematically.

In fact, at any given moment someone could accidentally (or on purpose) propose a perfectly “normal looking” scheme that passes muster in the random oracle model, and then blows to pieces when it gets actually deployed with a standard hash function. By that point, the scheme may be powering our certificate authority infrastructure, or Bitcoin, or our nuclear weapons systems (if one wants to be dramatic.)

The probability of this happening accidentally seems low, but it gets higher as deployed cryptographic schemes get more complex. For example, people at Google are now starting to deploy complex multi-party computation and others are launching zero-knowledge protocols that are actually capable of running (or proving things about the execution of) arbitrary programs in a cryptographic way. We can’t absolutely rule out the possibility that the CGH and MRH-type counterexamples could actually be made to happen in these weird settings, if someone is a just a little bit careless.

It’s ultimately a weird and frustrating situation, and frankly, I expect it all to end in tears.

Photo by Flickr user joyosity.

Notes:

* Intuitively, this definition sounds a lot like “pseudorandomness”. Pseudorandom functions are required to be indistinguishable from random functions only in a setting where the attacker does not know some “secret key” used for the function. Whereas hash functions are often used in protocols where there is no opporunity to use a secret key, such as in public key encryption protocols.

** One particular hope was that we could find a way to obfuscate pseudorandom function families (PRFs). The idea would be to wrap up a keyed PRF that could be evaluated by anyone, even if they didn’t actually know the key. The result would be indistinguishable from a random function, without actually being one.

*** It might seem like “assume the existence of a secure signature scheme” drags in an extra assumption. However: if we’re going to make statements in the random oracle model it turns out there’s no additional assumption. This is because in the ROM we have access to “secure” (at least collision-resistant, [second] pre-image resistant) hash function, which means that we can build hash-based signatures. So the existence of signature schemes comes “free” with the random oracle model.

**** The “except with negligible probability [in the adjustable security parameter of the scheme]” caveat is important for two reasons. First, a dedicated attacker can always try to forge a signature just by brute-force guessing values one at a time until she gets one that satisfies the verification algorithm. If the attacker can run for an unbounded number of time steps, she’ll always win this game eventually. This is why modern complexity-theoretic cryptography assumes that our attackers must run in some reasonable amount of time — typically a number of time steps that is polynomial in the scheme’s security parameter. However, even a polynomial-time bounded adversary can still try to brute force the signature. Her probability of succeeding may be relatively small, but it’s non-zero: for example, she might succeed after the first guess. So in practice what we ask for in security definitions like UF-CMA is not “no attacker can ever forge a signature”, but rather “all attackers succeed with at most negligible probability [in the security parameter of the scheme]”, where negligible has a very specific meaning.

# Looking back at the Snowden revelations

Edward Snowden recently released his memoirs. In some parts of the Internet, this has rekindled an ancient debate: namely, was it all worth it? Did Snowden’s leaks make us better off, or did Snowden just embarass us and set back U.S. security by decades? Most of the arguments are so familiar that they’re boring at this point. But no matter how many times I read them, I still feel that there’s something important missing.

It’s no coincidence that this is a cryptography blog, which means that I’m not concerned with the same things as the general public. That is, I’m not terribly interested in debating the value of whistleblower laws (for some of that, see this excellent Twitter thread by Jake Williams). Instead, when it comes to Snowden’s leaks, I think the question we should be asking ourselves is very different. Namely:

What did the Snowden leaks tell us about modern surveillance capabilities? And what did we learn about our ability to defend against them?

And while the leaks themselves have receded into the past a bit — and the world has continued to get more complicated — the technical concerns that Snowden alerted us to are only getting more salient.

### Life before June 2013

It’s difficult to believe that the Snowden revelations began over six years ago. It’s also easy to forget how much things have changed in the intervening years.

Six years ago, vast portions of our communication were done in plaintext. It’s hard to believe how bad things were, but back in 2013, Google was one of the only major tech companies who had deployed HTTPS in its services by default, and even there they had some major exceptions. Web clients were even worse. These graphs (source and source) don’t cover the whole time period, but they give some of the flavor:

Outside of HTTPS, the story was even worse. In 2013 the vast majority of text messages were sent via unencrypted SMS/MMS or poorly-encrypted IM services, which were a privacy nightmare. Future developments like the inclusion of default end-to-end encryption in WhatsApp were years away. Probably the sole (and surprising) exception to was Apple, which had been ahead of the curve in deploying end-to-end encryption. This was largely counterbalanced by the tire fire that was Android back in those days.

But even these raw facts don’t tell the full story.

What’s harder to present in a chart is how different attitudes were towards surveillance back before Snowden. The idea that governments would conduct large-scale interception of our communications traffic was a point of view that relatively few “normal people” spent time thinking about — it was mostly confined to security mailing lists and X-Files scripts. Sure, everyone understood that government surveillance was a thing, in the abstract. But actually talking about this was bound to make you look a little silly, even in paranoid circles.

That these concerns have been granted respectability is one of the most important things Snowden did for us.

### So what did Snowden’s leaks really tell us?

The brilliant thing about the Snowden leaks was that he didn’t tell us much of anything. He showed us. Most of the revelations came in the form of a Powerpoint slide deck, the misery of which somehow made it all more real. And despite all the revelation fatigue, the things he showed us were remarkable. I’m going to hit a few of the highlights from my perspective. Many are cryptography-related, just because that’s what this blog is about. Others tell a more basic story about how vulnerable our networks are.

### “Collect it all”

Prior to Snowden, even surveillance-skeptics would probably concede that, yes, the NSA collects data on specific targets. But even the most paranoid observers were shocked by the sheer scale of what the NSA was actually doing out there.

The Snowden revelations detailed several programs that were so astonishing in the breadth and scale of the data being collected, the only real limits on them were caused by technical limitations in the NSA’s hardware. Most of us are familiar with the famous examples, like nationwide phone metadata collection. But it’s the bizarre, obscure leaks that really drive this home. For example:

“Optic Nerve”. From 2008-2010 the NSA and GCHQ collected millions of still images from every Yahoo! Messenger webchat stream, and used them to build a massive database for facial recognition. The collection of data had no particular rhyme or reason — i.e., it didn’t target specific users who might be a national security threat. It was just… everything. Don’t believe me? Here’s how we know how indiscriminate this was: the program didn’t even necessarily target faces. It got… other things:

MYSTIC/SOMALGET. In addition to collecting massive quantities of Internet metadata, the NSA recorded the full audio every cellular call made in the Bahamas. (Note: this is not simply calls to the Bahamas, which might be sort of a thing. They abused a law enforcement access feature in order to record all the mobile calls made within the country.) Needless to say, the Bahamian government was not party to this secret.

MUSCULAR. In case anyone thought the NSA avoided attacks on American providers, a series of leaks in 2014 documented that the NSA had tapped the internal leased lines used to connect Google and Yahoo datacenters. This gave the agencies access to vast and likely indiscriminate access to torrents of data on U.S. and European users, information was likely above and beyond the data that these companies already shared with the U.S. under existing programs like PRISM. This leak is probably most famous for this slide:

Yahoo!, post-Snowden. And in case you believe that this all ended after Snowden’s leaks, we’ve learned even more disturbing things since. For example, in 2015, Yahoo got caught installing what has been described as a “rootkit” that scanned every single email in its database for specific selectors, at the request of the U.S. government. This was so egregious that the company didn’t even tell it’s CISO, who left the next week. In fact, we know a lot more about Yahoo’s collaboration during this time period, thanks to Snowden.

These examples are not necessarily the worst things we learned from the Snowden leaks. I chose them only to illustrate how completely indiscriminate the agency’s surveillance really was. And not because the NSA was especially evil, but just because it was easy to do. If you had any illusions that this data was being carefully filtered to exclude capturing data belonging to U.S. citizens, or U.S. companies, the Snowden leaks should have set you straight.

### SIGINT Enabling

The Snowden leaks also helped shatter a second illusion: the idea that the NSA was on the side of the angels when it comes to making the Internet more secure. I’ve written about this plenty on this blog (with sometimes exciting results), but maybe this needs to be said again.

One of the most important lessons we learned from the Snowden leaks was that the NSA very much prioritizes its surveillance mission, to the point where it is willing to actively insert vulnerabilities into encryption products and standards used on U.S. networks. And this kind of thing wasn’t just an occasional crime of opportunity — the agency spent $250 million per year on a program called the SIGINT Enabling Project. Its goal was, basically, to bypass our commercial encryption at any cost. This kind of sabotage is, needless to say, something that not even the most paranoid security researchers would have predicted from our own intelligence agencies. Agencies that, ostensibly have a mission to protect U.S. networks. The Snowden reporting not only revealed the existence of these overall programs, but they uncovered a lot of unpleasant specifics, leading to a great deal of follow-up investigation. For example, the Snowden leaks contained specific allegations of a vulnerability in a NIST standard called Dual EC. The possibility of such a vulnerability had previously been noted by U.S. security researchers Dan Shumow and Niels Ferguson a few years earlier. But despite making a reasonable case for re-designing this algorithm, those researchers (and others) were basically brushed off by the “serious” people at NIST. The Snowden documents changed all that. The leaks were a devastating embarassment to the U.S. cryptographic establishment, and led to some actual changes. Not only does it appear that the NSA deliberately backdoored Dual EC, it seems that they did so (and used NIST) in order to deploy the backdoor into U.S. security products. Later investigations would show that Dual EC was present in software by RSA Security (allegedly because of a secret contract with the NSA) and in firewalls made by Juniper Networks. (Just to make everything a bit more horrifying, Juniper’s Dual EC backdoor would later be hijacked and turned against the United States by unknown hackers — illustrating exactly how reckless this all was.) And finally, there are the mysteries. Snowden slides indicate that the NSA has been decrypting SSL/TLS and IPsec connections at vast scale. Even beyond the SIGINT Enabling-type sabotage, this raises huge questions about what the hell is actually going on here. There are theories. These may or may not be correct, but at least now people are thinking about them. At very least, it’s clear that something is very, very wrong. ### Have things improved? This is the$250 million question.

Some of the top-level indicators are surprisingly healthy. HTTPS adoption has taken off like a rocket, driven in part by Google’s willingness to use it as a signal for search rankings — and the rise of free Certificate Authorities like LetsEncrypt. It’s possible that these things would have happened eventually without Snowden, but it’s less likely.

End-to-end encrypted messaging has also taken off, largely due to adoption by WhatsApp and a host of relatively new apps. It’s reached the point where law enforcement agencies have begun to freak out, as the slide below illustrates.

Does Snowden deserve credit for this? Maybe not directly, but it’s almost certain that concerns over the surveillance he revealed did play a role. (It’s worth noting that this adoption is not evenly distributed across the globe.)

It’s also worth pointing out that at least in the open source community the quality of our encryption software has improved enormously, largely due to the fact that major companies made well-funded efforts to harden their systems, in part as a result of serious flaws like Heartbleed — and in part as a response to the company’s own concerns about surveillance.

It might very well be that the NSA has lost a significant portion of its capability since Snowden.

### The future isn’t American

I’ve said this before, as have many others: even if you support the NSA’s mission, and believe that the U.S. is doing everything right, it doesn’t matter. Unfortunately, the future of surveillance has very little to do with what happens in Ft. Meade, Maryland. In fact, the world that Snowden brought to our attention isn’t necessarily a world that Americans have much say in.

As an example: today the U.S. government is in the midst of forcing a standoff with China over the global deployment of Huawei’s 5G wireless networks around the world. This is a complicated issue, and financial interest probably plays a big role. But global security also matters here. This conflict is perhaps the clearest acknowledgement we’re likely to see that our own government knows how much control of communications networks really matters, and our inability to secure communications on these networks could really hurt us. This means that we, here in the West, had better get our stuff together — or else we should be prepared to get a taste of our own medicine.

If nothing else, we owe Snowden for helping us to understand how high the stakes might be.

# On Ghost Users and Messaging Backdoors

The past few years have been an amazing time for the deployment of encryption. In ten years, encrypted web connections have gone from a novelty into a requirement for running a modern website. Smartphone manufacturers deployed default storage encryption to billions of phones. End-to-end encrypted messaging and phone calls are now deployed to billions of users.

While this progress is exciting to cryptographers and privacy advocates, not everyone sees it this way. A few countries, like the U.K. and Australia, have passed laws in an attempt to gain access to this data, and at least one U.S. proposal has made it to Congress. The Department of Justice recently added its own branding to the mix, asking tech companies to deploy “responsible encryption“.

What, exactly, is “responsible encryption”? Well, that’s a bit of a problem. Nobody on the government’s side of the debate has really been willing to get very specific about that. In fact, a recent speech by U.S. Deputy Attorney General Rod Rosenstein implored cryptographers to go figure it out.

With this as background, a recent article by GCHQ’s Ian Levy and Crispin Robinson reads like a breath of fresh air. Unlike their American colleagues, the British folks at GCHQ — essentially, the U.K.’s equivalent of NSA — seem eager to engage with the technical community and to put forward serious ideas. Indeed, Levy and Robinson make a concrete proposal in the article above: they offer a new solution designed to surveil both encrypted messaging and phone calls.

In this post I’m going to talk about that proposal as fairly as I can — given that I only have a high-level understanding of the idea. Then I’ll discuss what I think could go wrong.

### A brief, illustrated primer on E2E

The GCHQ proposal deals with law-enforcement interception on messaging systems and phone calls. To give some intuition about the proposal, I first need to give a very brief (and ultra-simplified) explanation of how those systems actually work.

The basic idea in any E2E communication systems is that each participant encrypts messages (or audio/video data) directly from one device to the other. This layer of encryption reduces the need to trust your provider’s infrastructure — ranging from telephone lines to servers to undersea cables — which gives added assurance against malicious service providers and hackers.

If you’ll forgive a few silly illustrations, the intuitive result is a picture that looks something like this:

If we consider the group chat/call setting, the picture changes slightly, but only slightly. Each participant still encrypts data to the other participants directly, bypassing the provider. The actual details (specific algorithms, key choices) vary between different systems. But the concept remains the same.

The problem with the simplified pictures above is that there’s actually a lot more going on in an E2E system than just encryption.

In practice, one of the most challenging problems in encrypted messaging stems is getting the key you need to actually perform the encryption. This problem, which is generally known as key distribution, is an age-old concern in the field of computer security. There are many ways for it to go wrong.

In the olden days, we used to ask users to manage and exchange their own keys, and then select which users they wanted to encrypt to. This was terrible and everyone hated it. Modern E2E systems have become popular largely because they hide all of this detail from their users. This comes at the cost of some extra provider-operated infrastructure.

In practice, systems like Apple iMessage, WhatsApp and Facebook Messenger actually look more like this:

The Apple at the top of the picture above stands in for Apple’s “identity service”, which is a cluster of servers running in Apple’s various data centers. These servers perform many tasks, but most notably: they act as a directory for looking up the encryption key of the person you’re talking to. If that service misfires and gives you the wrong key, the best ciphers in the world won’t help you. You’ll just be encrypting to the wrong person.

These identity services do more than look up keys. In at least some group messaging systems like WhatsApp and iMessage, they also control the membership of group conversations. In poorly-designed systems, the server can add and remove users from a group conversation at will, even if none of the participants have requested this. It’s as though you’re having a conversation in a very private room — but the door is unlocked and the building manager controls who can come enter and join you.

(A technical note: while these two aspects of the identity system serve different purposes, in practice they’re often closely related. For example, in many systems there is little distinction between “group” and “two-participant” messaging. For example, in systems that support multiple devices connected to a single account, like Apple’s iMessage, every single device attached to your user account is treated as a separate party to the conversation. Provided either party has more than one device on their account [say, an iPhone and an iPad] , you can think of every iMessage conversation as being a group conversation.)

Most E2E systems have basic countermeasures against bad behavior by the identity service. For example, client applications will typically alert you when a new user joins your group chat, or when someone adds a new device to your iMessage account. Similarly, both WhatsApp and Signal expose “safety numbers” that allow participants to verify that they received the right cryptographic keys, which offers a check against dishonest providers.

But these countermeasures are not perfect, and not every service offers them. Which brings me to the GCHQ proposal.

### What GCHQ wants

The Lawfare article by Levy and Robinson does not present GCHQ’s proposal in great detail. Fortunately, both authors have spent most of the touring the U.S., giving several public talks about their ideas. I had the privilege of speaking to both of them earlier this summer when they visited Johns Hopkins, so I think I have a rough handle on what they’re thinking.

In its outlines, the idea they propose is extremely simple. The goal is to take advantage of existing the weaknesses in the identity management systems of group chat and calling systems. This would allow law enforcement — with the participation of the service provider — to add a “ghost user” (or in some cases, a “ghost device”) to an existing group chat or calling session. In systems where group membership can be modified by the provider infrastructure, this could mostly be done via changes to the server-side components of the provider’s system.

I say that it could mostly be done server-side, because there’s a wrinkle. Even if you modify the provider infrastructure to add unauthorized users to a conversation, most existing E2E systems do notify users when a new participant (or device) joins a conversation. Generally speaking, having a stranger wander into your conversation is a great way to notify criminals that the game’s afoot or what have you, so you’ll absolutely want to block this warning.

While the GCHQ proposal doesn’t go into great detail, it seems to follow that any workable proposal will require providers to suppress those warning messages at the target’s device. This means the proposal will also require changes to the client application as well as the server-side infrastructure.

(Certain apps like Signal are already somewhat hardened against these changes, because group chat setup is handled in an end-to-end encrypted/authenticated fashion by clients. This prevents the server from inserting new users without the collaboration of at least one group participant. At the moment, however, both WhatsApp and iMessage seem vulnerable to GCHQ’s proposed approach.)

Due to this need for extensive server and client modifications, the GCHQ proposal actually represents a very significant change to the design of messaging systems. It seems likely that the client-side code changes would need to be deployed to all users, since you can’t do targeted software updates just against criminals. (Or rather, if you could rely on such targeted software updates, you would just use that capability instead of the thing that GCHQ is proposing.)

Which brings us to the last piece: how do get providers to go along with all of this?

While optimism and cooperation are nice in principle, it seems unlikely that communication providers are going to to voluntarily insert a powerful eavesdropping capability into their encrypted services, if only because it represents a huge and risky modification. Presumably this means that the UK government will have to compel cooperation. One potential avenue for this is to use Technical Capability Notices from the UK’s Investigatory Powers Act. Those notices mandate that a provider offer real-time decryption for sets of users ranging from 1-10,000 users, and moreover, that providers must design their systems to ensure this such a capability remains available.

And herein lies the problem.

### Providers are already closing this loophole

The real problem with the GCHQ proposal is that it targets a weakness in messaging/calling systems that’s already well-known to providers, and moreover, a weakness that providers have been working to close — perhaps because they’re worried that someone just like GCHQ (or probably, much worse) will try to exploit it. By making this proposal, the folks at GCHQ have virtually guaranteed that those providers will move much, much faster on this.

And they have quite a few options at their disposal. Over the past several years researchers have proposed several designs that offer transparency to users regarding which keys they’re obtaining from a provider’s identity service. These systems operate by having the identity service commit to the keys that are associated with individual users, such that it’s very hard for the provider to change a user’s keys (or to add a device) without everyone in the world noticing.

As mentioned above, advanced messengers like Signal have “submerged” the group chat management into the encrypted communications flow, so that the server cannot add new users without the digitally authenticated approval of one of the existing participants. This design, if ported to in more popular services like WhatsApp, would seem to kill the GCHQ proposal dead.

Of course, these solutions highlight the tricky nature of GCHQ’s proposal. Note that in order to take advantage of existing vulnerabilities, GCHQ is going to have to require that providers change their system. And of course, once you’ve opened the door to forcing providers to change their system, why stop with small changes? What stops the UK government from, say, taking things a step farther, and using the force of law to compel providers not to harden their systems against this type of attack?

Which brings us to the real problem with the GCHQ proposal. As far as I can see, there are two likely outcomes. In the first, providers rapidly harden their system — which is good! — and in the process kill off the vulnerabilities that make GCHQ’s proposal viable (which is bad, at least for GCHQ). The more interest that governments express towards the proposal, the more likely this first outcome is. In the second outcome, the UK government, perhaps along with other governments, solve this problem by forcing the providers to keep their systems vulnerable. This second outcome is what I worry about.

More concretely, it’s true that today’s systems include existing flaws that are easy to exploit. But that does not mean we should entomb those flaws in concrete. And once law enforcement begins to rely on them, we will effectively have done so. Over time what seems like a “modest proposal” using current flaws will rapidly become an ossifying influence that holds ancient flaws in place. In the worst-case outcome, we’ll be appointing agencies like GCHQ as the ultimate architect of Apple and Facebook’s communication systems.

That is not a good outcome. In fact, it’s one that will likely slow down progress for years to come.

# Friday Dachshund Blogging

For over a year this blog has failed to deliver on an essential promise — that there would someday be pictures of dachshunds. Today we deliver.

This is Callie (short for Calliope) working her way through a bit of summer crypto reading:

But sometimes that’s exhausting and you’ve gotta take a break.

A visit from a strange metallic dachshund:

Summer:

And in memoriam, Zoe and Sophie, who helped me start this blog.

# Wonk post: chosen ciphertext security in public-key encryption (Part 2)

This continues the post from Part 1. Note that this is a work in progress, and may have some bugs in it 🙂 I’ll try to patch them up as I go along.

In the previous post I discussed the problem of building CCA-secure public key encryption. Here’s a quick summary of what we discussed in the first part:

• We covered the definition of CCA2 security.
• We described how you can easily achieve this notion in the symmetric encryption setting using a CPA-secure encryption scheme, plus a secure MAC.
• We talked about why this same approach doesn’t work for standard public-key encryption.

In this post I’m going to discuss a few different techniques that actually do provide CCA security for public key encryption. We’ll be covering these in no particular order.

A quick note on security proofs. There are obviously a lot of different ways you could try to hack together a CCA2 secure scheme out of different components. Some of those might be secure, or they might not be. In general, the key difference between a “secure” and “maybe secure” scheme is the fact that we can construct some kind of security proof for it.

The phrase “some kind” will turn out to be an important caveat, because these proofs might require a modest amount of cheating.

### The bad and the ugly

Before we get to the constructive details, it’s worth talking a bit about some ideas that don’t work to achieve CCA security. The most obvious place to start is with some of the early RSA padding schemes, particularly the PKCS#1v1.5 padding standard.

PKCS#1 padding was developed in the 1980s, when it was obvious that public key encryption was going to become widely deployed. It was intended as a pre-processing stage for messages that were going to be encrypted using an RSA public key.

This padding scheme had two features. First, it added randomness to the message prior to encrypting it. This was designed to defeat the simple ciphertext guessing attacks that come from deterministic encryption schemes like RSA. It can be easily shown that randomized encryption is absolutely necessary for any IND-CPA (and implicitly, IND-CCA) secure public key encryption scheme. Second, the padding added some “check” bytes that were intended to help detect mangled ciphertexts after decryption; this was designed (presumably) to shore the scheme up against invalid decryption attempts.

PKCS#1v1.5 is still widely used in protocols, including all versions of TLS prior to TLS 1.3. The diagram below shows what the padding scheme looks like when used in TLS with a 2048-bit RSA key. The section labeled “48 bytes PMS” (pre-master secret) in this example represents the plaintext being encrypted. The 205 “non-zero padding” consists of purely random bytes that exclude the byte “0”, because that value is reserved to indicate the end of the padding section and the beginning of the plaintext.

After using the RSA secret key to recover the padded message, the decryptor is supposed to parse the message and verify that the first two bytes (“00 02”) and the boundary “00” byte are all correct and in not violating any rules. The decryptor may optionally conduct other checks like verifying the length and structure of the plaintext, in case that’s known in advance.

One of the most immediate observations about PKCS#1v1.5 is that the designers kind of intuitively understood that chosen ciphertext attacks were a thing. They clearly added some small checks to make sure that it would be hard for an attacker to modify a given ciphertext (e.g., by multiplying it by a chosen value). It’s also obvious that these checks aren’t very strong. In the standardized version of the padding scheme, there are essentially three bytes to check — and one of them (the “00” byte after the padding) can “float” at a large number of different positions, depending on how much padding and plaintext there is in the message.

The use of a weak integrity check leads to a powerful CCA2 attack on the encryption scheme that was first discovered by Daniel Bleichenbacher. The attack is powerful due to the fact that it actually leverages the padding check as a way to learn information about the plaintext. That is: the attacker “mauls” a ciphertext and sends it to be decrypted, and relies on the fact that the decryptor will simply perform the decryption checks they’re supposed to perform — and output a noticable error if they fail. Given only this one bit of information per decryption, the attack can gradually recover the full plaintext of a specific ciphertext by (a) multiplying it with some value, (b) sending the result to be decrypted, (c) recording the success/failure result, (d) adaptively picking a new value and repeating step (a) many thousands or millions of times.

The PKCS#1v1.5 padding scheme is mainly valuable to us today because it provides an excellent warning to cryptographic engineers, who would otherwise continue to follow the “you can just hack together something that looks safe” school of building protocols. Bleichenbacher-style attacks have largely scared the crypto community straight. Rather than continuing to use this approach, the crypto community has (mostly) moved towards techniques that at least offer some semblance of provable security.

That’s what we’ll cover in just a moment.

### A few quick notes on achieving CCA2-secure public key encryption

Before we get to a laundry list of specific techniques and schemes, it’s worth asking what types of design features we might be looking for in a CCA2 public key encryption scheme. Historically there have been two common requirements:

• It would be super convenient if we could start with an existing encryption scheme, like RSA or Elgamal encryption, and generically tweak (or “compile”) that scheme into a CCA2-secure scheme. (Re-usable generic techniques are particularly useful in the event that someone comes up with new underlying encryption schemes, like post-quantum secure ones.)
• The resulting scheme should be pretty efficient. That rules out most of the early theoretical techniques that use huge zero knowledge proofs (as cool as they are).

Before we get to the details, I also want to repeat the intuitive description of the CCA2 security game, which I gave in the previous post. The game (or “experiment”) works like this:

1. I generate an encryption keypair for a public-key scheme and give you the public key.
2. You can send me (sequentially and adaptively) many ciphertexts, which I will decrypt with my secret key. I’ll give you the result of each decryption.
3. Eventually you’ll send me a pair of messages (of equal length) $M_0, M_1$ and I’ll pick a bit $b$ at random, and return to you the encryption of $M_b$, which I will denote as $C^* \leftarrow {\sf Encrypt}(pk, M_b)$.
4. You’ll repeat step (2), sending me ciphertexts to decrypt. If you send me $C^*$ I’ll reject your attempt. But I’ll decrypt any other ciphertext you send me, even if it’s only slightly different from $C^*$.
5. You (the attacker) will output your guess $b'$. They “win” the game if $b'=b$.
6. We say a scheme is IND-CCA2 secure if the attacker wins with probability “not much greater” than 1/2 (which is the best an attacker can do if they just guess randomly.)

A quick review of this definition shows that we need a CCA2-encryption scheme to provide at least two major features.

First off, it should be obvious that the scheme must not leak information about the secret key, even when I’m using it to decrypt arbitrary chosen ciphertexts of your choice. There are obvious examples of schemes that fail to meet this requirement: the most famous is the (textbook) Rabin cryptosystem — where the attacker’s ability to obtain the decryption of a single chosen ciphertext can leak the entire secret key.

More subtly, it seems obvious that CCA2 security is related to non-malleabilityHere’s why: suppose I receive a challenge ciphertext $C^*$ at step (3). It must be the case that I cannot easily “maul” that ciphertext into a new ciphertext $C'$ that contains a closely related plaintext (and that the challenger will be able and willing to meaingfully decrypt). It’s easy to see that if I could get away with this, by the rules of the game I could probably win at step (4), simply by sending $C'$ in to be decrypted, getting the result, and seeing whether it’s more closely related to $M_0$ or $M_1$. (This is, in fact, a very weak explanation of what the Bleichenbacher attack does.)

It turns out that an even stronger property that helps achieve both of these conditions is something called plaintext awareness. There are various subtly-different mathematical formulations of this idea, but here I’ll try to give only the English-language intuition:

If the attacker is able to submit a (valid) ciphertext to be decrypted, it must be the case that she already knows the plaintext of that message.

This guarantee is very powerful, because it helps us to be sure that the decryption process doesn’t give the attacker any new information that she doesn’t already have. She can submit any messages she wants (including mauling the challenge ciphertext $C^*$) but if this plaintext-awareness property holds in the strongest sense, those decryptions won’t tell her anything she doesn’t already know.

Of course, just because your scheme appears to satisfy the above conditions does not mean it’s secure. Both rules above are heuristics: that is, they’re necessary conditions to prevent attacks, but they may or may not be sufficient. To really trust a scheme (in the cryptographic sense) we should be able to offer a proof (under some assumptions) that these guarantees hold. We’ll address that a bit as we go forward.

### Technique 1: Optimal Asymmetric Encryption Padding

One of the earlier practical CCA2 transforms was developed by Bellare and Rogaway as a direct replacement for the PKCS#1v1.5 padding scheme in RSA encryption. The scheme they developed — called Optimal Asymmetric Encryption Padding — represents a “drop-in” replacement for the earlier, broken v1.5 padding scheme. It also features a security proof. (Mostly. We’ll come back to this last point.)

(Confusingly, OAEP was adopted into the PKCS#1 standards as of version 2.0, so sometimes you’ll see it referred to as PKCS#1v2.0-OAEP.)

OAEP’s most obvious advance over the previous state of the art is the addition of not one, but two separate hash functions $G()$ and $H()$ that don’t exist in the v1.5 predecessor. (These are sometimes referred to as “mask generation functions”, which is just a fancy way of saying they’re hash functions with outputs of a custom, chosen size. Such functions can be easily built from existing hash functions like SHA256.)

Expressed graphically, this is what OAEP it looks like:

If you’ve ever seen the DES cipher, this structure should look familiar to you. Basically OAEP is a two-round (unkeyed) Feistel network that uses a pair of hash functions to implement the round functions. There are a few key observations you can make right off the bat:

• Just looking at the diagram above, you can see that it’s very easy to compute this padding function forward (going from a plaintext $m$ and some random padding $r$ to a padded message) and backwards — that is, it’s an easily-invertible permutation. The key to this feature is the Feistel network structure.
• Upon decryption, a decryptor can invert the padding of a given message and verify that the “check string” (the string of k1 “0” bits) is correctly structured. If this string is not structured properly, the decryptor can simply output an error. This comprises the primary decryption check.
• Assuming some (strong) properties of the hash functions, it intuitively seems that the OAEP transform is designed to create a kind of “avalanche effect” where even a small modification of a padded message will result in a very different unpadded result when the transform is inverted. In practice any such modification should “trash” the check string with overwhelming probability.

From an intuitive point of view, these last two properties are what makes OAEP secure against chosen-ciphertext attacks. The idea here is that, due to the random properties of the hash function, it should be hard for the attacker to construct a valid ciphertext (one that has a correct check string) if she does not already know the plaintext that goes into the transform. This should hold even if the attacker already has some known valid ciphertext (like $C^*$) that she wishes to maul.

More specifically related to mauling: if I send an RSA-OAEP ciphertext $C^*$ that encrypts a specific message $m$, the attacker should not be able to easily maul that ciphertext into a different ciphertext $C'$ that will still pass the decryption checks. This is due to two facts: (1) because RSA is a (trapdoor) permutation, any change to $C^*$ will implicitly change the padded message your recover after inverting the RSA function. And (2) sending this altered padded message backwards through the OAEP transform should, with overwhelming probability, trash the check string (and the message $m$). The result is that the adversary can’t maul someone else’s ciphertext.

This all assumes some very strong assumptions about the hash functions, which we’ll discuss below.

The OAEP proof details (at the most ridiculously superficial level)

Proving OAEP secure requires two basic techniques. Both fundamentally rely on the notion that the functions $G()$ and $H()$ are random oraclesThis is important for two very different reasons.

First: assuming a function is a “random oracle” means that we’re assuming it to have the same behavior as a random function. This is an awesome property for a hash function to have! (Note: real hash functions don’t have it. This means that hypothetically they could have very ‘non-random’ behavior that would make RSA-OAEP insecure. In practice this has not yet been a practical concern for real OAEP implementations, but it’s worth keeping in mind.

It’s easy to see that if the hash functions $G()$ and $H()$ were random functions, it would give OAEP some very powerful properties. Remember, one of the main intuitive goals of the OAEP scheme is to prevent attackers from successfully getting you to decrypt an improperly-constructed (e.g., mauled) ciphertext. If both hash functions are truly random, then this implies that any invalid ciphertext will almost certainly fail decryption, because the padding check will fail.

At a much deeper level, the use of random oracles in RSA’s security proof gives the security reduction a great deal of “extra power” to handle things like decrypt chosen ciphertexts. This is due to the fact that, in a random oracle proof, the proof reduction is allowed to both “see” every value hashed through those hash functions, and also to “program” the functions so that they will produce specific outputs. This would not be possible if $G()$ and $H()$ were implemented using real hash functions, and so the entire security proof would break down.

These properties provide a tool in the security proof to enable decryption even when the secret key is unknown. In a traditional proof of the RSA-OAEP scheme, the idea is to show that an attacker who breaks the encryption (in the IND-CCA2 sense) can be used to construct a second attacker who solves the RSA problem. This is done by taking some random values $(N, e, C)$ where $N, e$ is an RSA public key of unknown factorization and “programming” the random oracles such that $C^* = C$. The intuitive idea is that an attacker who is able to learn something about the underlying message must query the functions $G()$ and $H()$ on correct inputs that, ultimately will allow the security reduction to obtain the RSA inverse of $C^*$ even when the reduction does not know the RSA secret key, That is, such an attacker will allow us to find an integer $M'$ such that $M'^e = C$.

(There turned out to be some issues in the original OAEP proof that make it not quite work for arbitrary trapdoor permutations. Shoup fixed these by providing a new padding padding scheme called OAEP+, but the original OAEP had since gone into heavy usage within standards! It turns out that RSA-OAEP does work, however, for RSA with public exponents 3 and other exponents, though proving this required some ugly band-aids. This whole story is part of a cautionary tail about provably security, which Koblitz discusses here.)

### Technique 2: The Fujisaki-Okamoto Transform

One limitation of OAEP (and OAEP+) padding is that it requires a trapdoor permutation in order to work. This applies nicely to RSA encryption, but does not necessarily work with every existing public-key encryption scheme. This motivates the need for other CCA transforms that work with arbitrary existing (non-CCA) encryption schemes.

One of the nicest generic techniques for building CCA2-secure public-key encryption is due to Eiichiro Fujisaki and Tatsuaki Okamoto. The idea of this transform is to begin with a scheme that already meets the definition of IND-CPA security — that is, it is semantically secure, but not against chosen ciphertext attacks. (For this description, we’ll also require that this scheme has a large [exponentially-sized] message space and some specific properties related to randomness.) The beauty of the “Fujisaki-Okamoto transform” (henceforth: F-O) is that, like OAEP before it, given a working public-key encryption scheme, it requires only the addition of some hash functions, and can be proven secure in the random oracle model.

Let’s imagine that we have an IND-CPA encryption public-key encryption algorithm that consists of the algorithms ${\sf KeyGen}, {\sf Encrypt}, {\sf Decrypt}$. We’ll also make use of two independent hash functions $H_1, H_2$.

A key observation here is that in every IND-CPA (semantically secure) public key encryption scheme, the ${\sf Encrypt}$ algorithm is randomized. This actually has to be the case, due to the definition of IND-CPA. (See here for a discussion of why that is.) Put more explicitly, what this means is that the encryption algorithm must have acccess to some set of random bits that will be used to produce the ciphertext.

The main trick that the F-O transform uses is to de-randomize this public-key encryption algorithm. Instead of using real random bits to encrypt, it will instead use the output of the hash function $H_1$ to produce the random bits that will be used for encryption. This turns a randomized encryption into a deterministic one. (This, of course, requires that both the input and the internals of $H_1$ are capable of producing bits that “look” random.)

Let’s get to the nuts and bolts. The F-O transform does not change the key generation algorithm of the original encryption scheme at all, except to specify the hash functions $H_1, H_2$. The main changes come in the new encryption and decryption algorithms. I’m going to present one variant of the transform, though there are others. This one works as follows.

To encrypt a message $M$, which we’ll express as some fixed-length string of bits:

1. Instead of encrypting the actual message $M$, we instead sample a random message $R$ from the message space of the original CPA-secure scheme.
2. We hash the random message $R$ together with the original message $M$ using that first hash function $H_1$. The output of this function will give us a ‘random’ bitstring. Let’s denote this as: $r \leftarrow H_1(R \| M)$.
3. Next, we’ll encrypt the new random message $R$ using the original (CPA-secure) encryption scheme’s ${\sf Encrypt}$ algorithm, but critically: we will use the bits $r$ as the randomness for that encryption. The result of this process will give the first part of the ciphertext: $C_1 \leftarrow {\sf Encrypt}(pk, R; r)$. Note that here $r$ just refers to the randomness for the encryption algorithm, not an actual message being encrypted.
4. Finally, we derive a “key” for encrypting the real message we want to send. We can compute this as $K \leftarrow H_2(R)$.
5. We now encrypt the original message $M$ we want to send using some secure encryption scheme, for example the simple one-time pad: $C_2 \leftarrow M \oplus K$.
6. We output the “ciphertext” $C = (C_1, C_2)$.

To decrypt $C = (C_1, C_2)$, we would perform the following steps:

1. First, use the original public-key encryption scheme’s secret key to decrypt the ciphertext $C_1$, which (if all is well) should give us $R' \leftarrow {\sf Decrypt}(sk, C_1)$.
2. Now use knowledge of $R'$ to recover the key $K' \leftarrow H_2(R')$ and thus the message $M'$ which we can obtain as $M' \leftarrow C_2 \oplus K'$.
3. Now check that both $R', M'$ are valid by re-computing the randomness $r' \leftarrow H_1(R' \| M')$ and verifying the condition $C_1 = {\sf Encrypt}(pk, R'; r')$. If this final check fails, simply output a decryption error.

Phew. So what the heck is going on here?

Let’s tackle this scheme from a practical perspective. Earlier in this post, we said that to achieve IND-CCA2 security, a scheme must have two features. First, it must be plaintext aware, which means that in order to construct a valid ciphertext (that passes all decryption checks) the attacker should already know the plaintext.

Does F-O have this property? Well, intuitively we would hope that the answer is “yes”. Note for some valid F-O ciphertext $C = (C_1, C_2)$ the decrypted plaintext is implicitly defined as $M' \leftarrow C_2 \oplus H_2(R')$. So what we really want to prove is that in order to construct a valid ciphertext the attacker must already know $R'$ and $M'$ prior to sending the message for decryption.

This guarantee (with high probability) comes from the structure of $C_1$. In order for the ciphertext to be considered valid by the decryptor, it must be the case that $C_1$ satisfies the check $C_1 = {\sf Encrypt}(pk, M'; r' = H_1(R' \| M'))$. The idea of this proof is that it should be hard for an attacker to construct such a $C_1$ unless she has previously called the hash function $H_1$ on input $(R', M')$. If she has called the hash function to produce this portion of the ciphertext, then she already knows those values and the decryption oracle provides her with no additional information she didn’t already have. (Alternatively, if she did not call the hash function, then her probability of making a valid $C_1$ should be extremely low.)

Of course, this is only one strategy available to the attacker. She could also maul an existing ciphertext like $C^* = (C_1^*, C_2^*)$. In this case her strategy is twofold: she can tamper with the first portion of the ciphertext and/or she can tamper with the second. But it’s easy to see that this will tend to break some portion of the decryption checks:

1. If she tampers with any bit of $C_2^*$, she will change the recovered message into a new value that we can call $M''$. However this will in turn (with overwhelming probability) cause the decryptor to recover very different random coins $r'' \leftarrow H_1(R' \| M'')$ than were used in the original construction of $C_1^*$, and thus decryption check on that piece will probably fail.
2. If she tampers with any bit of $C_1^*$, the decryption check \$latex C_1^* = {\sf Encrypt}(pk, M’; r’) ought not to pass, and decryption will just produce an error.
3. She might try to tamper with both parts of the ciphertext, of course. But this would seem even more challenging.

The problem with the exercise above is that none of this constitutes a proof that the approach works. There is an awful lot of should and probably in this argument, and none of this ought to make you very happy. A rough sketch of the proof for an F-O scheme can be found here. (I warn you that it’s probably got some bugs in it, and I’m offering it mainly as an intuition.)

The F-O scheme has many variants. A slightly different and much more formal treatment by Hofheinz and Kiltz can be found here, and deals with some other requirements on the underlying CPA-secure scheme.

### To be continued…

So far in this discussion we’ve covered two basic techniques — both at a very superficial level — that achieve CCA2 security under the ridiculously strong assumption that random oracles exist. Unfortunately, they don’t. This motivates the need for better approaches that don’t require random oracles at all.

There are a couple of those that, sadly, nobody uses. Those will have to wait until the next post.

# Was the Efail disclosure horribly screwed up?

TL;DR. No. Or keep reading if you want.

On Monday a team of researchers from Münster, RUB and NXP disclosed serious cryptographic vulnerabilities in a number of encrypted email clients. The flaws, which go by the cute vulnerability name of “Efail”, potentially allow an attacker to decrypt S/MIME or PGP-encrypted email with only minimal user interaction.

By the standards of cryptographic vulnerabilities, this is about as bad as things get. In short: if an attacker can intercept and alter an encrypted email — say, by sending you a new (altered) copy, or modifying a copy stored on your mail server — they can cause many GUI-based email clients to send the full plaintext of the email to an attacker controlled-server. Even worse, most of the basic problems that cause this flaw have been known for years, and yet remain in clients.

The big (and largely under-reported) story of EFail is the way it affects S/MIME. That “corporate” email protocol is simultaneously (1) hated by the general crypto community because it’s awful and has a slash in its name, and yet (2) is probably the most widely-used email encryption protocol in the corporate world. The table at the right — excerpted from the paper — gives you a flavor of how Efail affects S/MIME clients. TL;DR it affects them very badly.

Efail also happens to affect a smaller, but non-trivial number of OpenPGP-compatible clients. As one might expect (if one has spent time around PGP-loving folks) the disclosure of these vulnerabilities has created something of a backlash on HN, and among people who make and love OpenPGP clients. Mostly for reasons that aren’t very defensible.

So rather than write about fun things — like the creation of CFB and CBC gadgets — today, I’m going to write about something much less exciting: the problem of vulnerability disclosure in ecosystems like PGP. And how bad reactions to disclosure can hurt us all.

### How Efail was disclosed to the PGP community

Putting together a comprehensive timeline of the Efail disclosure process would probably be a boring, time-intensive project. Fortunately Thomas Ptacek loves boring and time-intensive projects, and has already done this for us.

Briefly, the first Efail disclosures to vendors began last October, more than 200 days prior to the agreed publication date. The authors notified a large number of vulnerable PGP GUI clients, and also notified the GnuPG project (on which many of these projects depend) by February at the latest. From what I can tell every major vendor agreed to make some kind of patch. GnuPG decided that it wasn’t their fault, and basically stopped corresponding.

All parties agreed not to publicly discuss the vulnerability until an agreed date in April, which was later pushed back to May 15. The researchers also notified the EFF and some journalists under embargo, but none of them leaked anything. On May 14 someone dumped the bug onto a mailing list. So the EFF posted a notice about the vulnerability (which we’ll discuss a bit more below), and the researchers put up a website. That’s pretty much the whole story.

There are three basic accusations going around about the Efail disclosure. They can be summarized as (1) maintaining embargoes in coordinated disclosures is really hard, (2) the EFF disclosure “unfairly” made this sound like a serious vulnerability “when it isn’t”, and (3) everything was already patched anyway so what’s the big deal.

### Disclosures are hard; particularly coordinated ones

I’ve been involved in two disclosures of flaws in open encryption protocols. (Both were TLS issues.) Each one poses an impossible dilemma. You need to simultaneously (a) make sure every vendor has as much advance notice as possible, so they can patch their software. But at the same time (b) you need to avoid telling literally anyone, because nothing on the Internet stays secret. At some point you’ll notify some FOSS project that uses an open development mailing list or ticket server, and the whole problem will leak out into the open.

Disclosing bugs that affect PGP is particularly fraught. That’s because there’s no such thing as “PGP”. What we have instead is a large and distributed community that revolves around the OpenPGP protocol. The pillar of this community is the GnuPG project, which maintains the core GnuPG tool and libraries that many clients rely on. Then there are a variety of niche GUI-based clients and email plugin projects. Finally, there are commercial vendors like Apple and Microsoft. (Who are mostly involved in the S/MIME side of things, and may reluctantly allow PGP plugins.)

Then, of course there are thousands of end-users, who will generally fail to update their software unless something really bad and newsworthy happens.

The obvious solution to the disclosure problem to use a staged disclosure. You notify the big commercial vendors first, since that’s where most of the affected users are. Then you work your way down the “long tail” of open source projects, knowing that inevitably the embargo could break and everyone will have to patch in a hurry. And you keep in mind that no matter what happens, everyone will blame you for screwing up the disclosure.

For the PGP issues in Efail, the big client vendors are Mozilla (Thunderbird), Microsoft (Outlook) and maybe Apple (Mail). The very next obvious choice would be to patch the GnuPG tool so that it no longer spits out unauthenticated plaintext, which is the root of many of the problems in Efail.

The Efail team appears to have pursued exactly this approach for the client-side vulnerabilities. Sadly, the GnuPG team made the decision that it’s not their job to pre-emptively address problems that they view as ‘clients misusing the GnuPG API’ (my paraphrase), even when that misuse appears to be rampant across many of the clients that use their tool. And so the most obvious fix for one part of the problem was not available.

This is probably the most unfortunate part of the Efail story, because in this case GnuPG is very much at fault. Their API does something that directly violates cryptographic best practices — namely, releasing unauthenticated plaintext prior to producing an error message. And while this could be understood as a reasonable API design at design time, continuing to support this API even as clients routinely misuse it has now led to flaws across the ecosystem. The refusal of GnuPG to take a leadership role in preemptively safeguarding these vulnerabilities both increases the difficulty of disclosing these flaws, and increases the probability of future issues.

### So what went wrong with the Efail disclosure?

Despite what you may have heard, given the complexity of this disclosure, very little went wrong. The main issues people have raised seem to have to do with the contents of an EFF post. And with some really bad communications from Robert J. Hansen at the Enigmail (and GnuPG) project.

The EFF post. The Efail researchers chose to use the Electronic Frontier Foundation as their main source for announcing the existence of the vulnerability to the privacy community. This hardly seems unreasonable, because the EFF is generally considered a trusted broker, and speaks to the right community (at least here in the US).

The EFF post doesn’t give many details, nor does it give a list of affected (or patched) clients. It does give two pretty mild recommendations:

1. Temporarily disable or uninstall your existing clients until you’ve checked that they’re patched.
2. Maybe consider using a more modern cryptosystem like Signal, at least until you know that your PGP client is safe again.

This naturally led to a huge freakout by many in the PGP community. Some folks, including vendors, have misrepresented the EFF post as essentially pushing people to “permanently” uninstall PGP, which will “put lives at risk” because presumably these users (whose lives are at risk, remember) will immediately fall back to sending incriminating information via plaintext emails — rather than temporarily switching their communications to one of several modern, well-studied secure messengers, or just not emailing for a few hours.

The most reasonable criticism I’ve heard of the EFF post is that it doesn’t give many details about which clients are patched, and which are vulnerable. This could presumably give someone the impression that this vulnerability is still present in their email client, and thus would cause them to feel less than secure in using it.

I have to be honest that to me that sounds like a really good outcome. The problem with Efail is that it doesn’t matter if your client is secure. The Efail vulnerability could affect you if even a single one of your communication partners is using an insecure client.

So needless to say I’m not very sympathetic to the reaction around the EFF post. If you can’t be sure whether your client is secure, you probably should feel insecure.

Bad communications from GnuPG and Enigmail. On the date of the disclosure, anyone looking for accurate information about security from two major projects — GnuPG and Enigmail — would not have been able to find it.

They wouldn’t have found it because developers from both Enigmail and GnuPG were on mailing lists and Twitter claiming that they had never heard of Efail, and hadn’t been notified by the researchers. Needless to say, these allegations took off around the Internet, sometimes in place of real information that could have helped users (like, whether either project had patched.)

It goes without saying that neither allegation was actually true. In fact, both project members soon checked with their fellow developers (and their memories) and found out that they’d both been given months of notice by the researchers, and that Enigmail had even developed a patch. (However, it turned out that even this patch may not perfectly address the issue, and the community is still working to figure out exactly what still needs to be done.)

This is an understandable mistake, perhaps. But it sure is a bad one.

Now that I’ve made it clear that neither the researchers nor the EFF is out to get the PGP community, let me put on my mask and horns and tell you why someone should be.

I’ve written extensively about PGP on this blog, but in the past I’ve written mostly from a technical point of view about the problems with PGP. But what’s really problematic about PGP is not just the cryptography; it’s the story it tells about path dependence and how software communities work.

The fact of the matter is that OpenPGP is not really a cryptography project. That is, it’s not held together by cryptography.  It’s held together by backwards-compatibility and (increasingly) a kind of an obsession with the idea of PGP as an end in and of itself, rather than as a means to actually make end-users more secure.

Let’s face it, as a protocol, PGP/OpenPGP is just not what we’d develop if we started over today. It was formed over the years out of mostly experimental parts, which were in turn replaced, bandaged and repaired — and then worked into numerous implementations, which all had to be insanely flexible and yet compatible with one another. The result is bad, and most of the software implementing it is worse. It’s the equivalent of a beloved antique sports car, where the electrical system is totally shot, but it still drives. You know, the kind of car where the owner has to install a hand-switch so he can turn the reverse lights on manually whenever he wants to pull out of a parking space.

If PGP went away, I estimate it would take the security community less than a year to entirely replace (the key bits of) the standard with something much better and modern. It would have modern crypto and authentication, and maybe even extensions for future post-quantum future security. It would be simple. Many bright new people would get involved to help write the inevitable Rust, Go and Javascript clients and libraries.

Unfortunately for us all, (Open)PGP does exist. And that means that even fancy greenfield email projects feel like they need to support OpenPGP, or at least some subset of it. This in turn perpetuates the PGP myth, and causes other clients to use it. And as a direct result, even if some clients re-implement OpenPGP from scratch, other clients will end up using tools like GnuPG which will support unauthenticated encryption with bad APIs. And the cycle will go round and around, like a spaceship stuck near the event horizon of a black hole.

And as the standard perpetuates itself, largely for the sake of being a standard, it will fail to attract new security people. It will turn away exactly the type of people who should be working on these tools. Those people will go off and build encryption systems in a totally different area, or they’ll get into cryptocurrency. And — with some exceptions — the people who work in the community will increasingly work in that community because they’re supporting PGP, and not because they’re trying to seek out the best security technologies for their users. And the serious (email) users of PGP will be using it because they like the idea of using PGP better than they like using an actual, secure email standard.

And as things get worse, and fail to develop, people who work on it will become more dogmatic about its importance, because it’s something threatened and not a real security protocol that anyone’s using. To me that’s where PGP is going today, and that is why the community has such a hard time motivating itself to take these vulnerabilities seriously, and instead reacts defensively.

Maybe that’s a random, depressing way to end a post. But that’s the story I see in OpenPGP. And it makes me really sad.

# Wonk post: chosen ciphertext security in public-key encryption (Part 1)

In general I try to limit this blog to posts that focus on generally-applicable techniques in cryptography. That is, I don’t focus on the deeply wonky. But this post is going to be an exception. Today, I’m going to talk about a topic that most “typical” implementers don’t — and shouldn’t — think about.

Specifically: I’m going to talk about various techniques for making public key encryption schemes chosen ciphertext secure. I see this as the kind of post that would have saved me ages of reading when I was a grad student, so I figured it wouldn’t hurt to write it all down (even though this absolutely shouldn’t serve as a replacement for just reading the original papers!)

### Background: CCA(1/2) security

Early (classical) ciphers used a relatively weak model of security, if they used one at all. That is, the typical security model for an encryption scheme was something like the following:

1. I generate an encryption key (or keypair for public-key encryption)
2. I give you the encryption of some message of my choice
3. You “win” if you can decrypt it

This is obviously not a great model in the real world, for several reasons. First off, in some cases the attacker knows a lot about the message to be decrypted. For example: it may come from a small space (like a set of playing cards). For this reason we require a stronger definition like “semantic security” that assumes the attacker can choose the plaintext distribution, and can also obtain the encryption of messages of his/her own choice. I’ve written more about this here.

More relevant to this post, another limitation of the above game is that — in some real-world examples — the attacker has even more power. That is: in addition to obtaining the encryption of chosen plaintexts, they may be able to convince the secret keyholder to decrypt chosen ciphertexts of their choice.

The latter attack is called a chosen-ciphertext (CCA) attack.

At first blush this seems like a really stupid model. If you can ask the keyholder to decrypt chosen ciphertexts, then isn’t the scheme just obviously broken? Can’t you just decrypt anything you want?

The answer, it turns out, is that there are many real-life examples where the attacker has decryption capability, but the scheme isn’t obviously broken. For example:

1. Sometimes an attacker can decrypt a limited set of ciphertexts (for example, because someone leaves the decryption machine unattended at lunchtime.) The question then is whether they can learn enough from this access to decrypt other ciphertexts that are generated after she loses access to the decryption machine — for example, messages that are encrypted after the operator comes back from lunch.
2. Sometimes an attacker can submit any ciphertext she wants — but will only obtain a partial decryption of the ciphertext. For example, she might learn only a single bit of information such as “did this ciphertext decrypt correctly”. The question, then, is whether she can leverage this tiny amount of data to fully decrypt some ciphertext of her choosing.

The first example is generally called a “non-adaptive” chosen ciphertext attack, or a CCA1 attack (and sometimes, historically, a “lunchtime” attack). There are a few encryption schemes that totally fall apart under this attack — the most famous textbook example is Rabin’s public key encryption scheme, which allows you to recover the full secret key from just a single chosen-ciphertext decryption.

The more powerful second example is generally referred to as an “adaptive” chosen ciphertext attack, or a CCA2 attack. The term refers to the idea that the attacker can select the ciphertexts they try to decrypt based on seeing a specific ciphertext that they want to attack, and by seeing the answers to specific decryption queries.

In this article we’re going to use the more powerful “adaptive” (CCA2) definition, because that subsumes the CCA1 definition. We’re also going to focus primarily on public-key encryption.

With this in mind, here is the intuitive definition of the experiment we want a CCA2 public-key encryption scheme to be able to survive:

1. I generate an encryption keypair for a public-key scheme and give you the public key.
2. You can send me (sequentially and adaptively) many ciphertexts, which I will decrypt with my secret key. I’ll give you the result of each decryption.
3. Eventually you’ll send me a pair of messages (of equal length) $M_0, M_1$ and I’ll pick a bit $b$ at random, and return to you the encryption of $M_b$, which I will denote as $C^* \leftarrow {\sf Encrypt}(pk, M_b)$.
4. You’ll repeat step (2), sending me ciphertexts to decrypt. If you send me $C^*$ I’ll reject your attempt. But I’ll decrypt any other ciphertext you send me, even if it’s only slightly different from $C^*$.
5. The attacker outputs their guess $b'$. They “win” the game if $b'=b$.

We say that our scheme is secure if the attacker wins only with a significantly greater probability than they would win with if they simply guessed $b'$ at random. Since they can win this game with probability 1/2 just by guessing randomly, that means we want (Probability attacker wins the game) – 1/2 to be “very small” (typically a negligible function of the security parameter).

You should notice two things about this definition. First, it gives the attacker the full decryption of any ciphertext they send me. This is obviously much more powerful than just giving the attacker a single bit of information, as we mentioned in the example further above. But note that powerful is good. If our scheme can remain secure in this powerful experiment, then clearly it will be secure in a setting where the attacker gets strictly less information from each decryption query.

The second thing you should notice is that we impose a single extra condition in step (4), namely that the attacker cannot ask us to decrypt $C^*$. We do this only to prevent the game from being “trivial” — if we did not impose this requirement, the attacker could always just hand us back $C^*$ to decrypt, and they would always learn the value of $b$.

(Notice as well that we do not give the attacker the ability to request encryptions of chosen plaintexts. We don’t need to do that in the public key encryption version of this game, because we’re focusing exclusively on public-key encryption here — since the attacker has the public key, she can encrypt anything she wants without my help.)

With definitions out of the way, let’s talk a bit about how we achieve CCA2 security in real schemes.

### A quick detour: symmetric encryption

This post is mainly going to focus on public-key encryption, because that’s actually the problem that’s challenging and interesting to solve. It turns out that achieving CCA2 for symmetric-key encryption is really easy. Let me briefly explain why this is, and why the same ideas don’t work for public-key encryption.

(To explain this, we’ll need to slightly tweak the CCA2 definition above to make it work in the symmetric setting. The changes here are small: we won’t give the attacker a public key in step (1), and at steps (2) and (4) we will allow the attacker to request the encryption of chosen plaintexts as well as the decryption.)

The first observation is that many common encryption schemes — particularly, the widely-used cipher modes of operation like CBC and CTR — are semantically secure in a model where the attacker does not have the ability to decrypt chosen ciphertexts. However, these same schemes break completely in the CCA2 model.

The simple reason for this is ciphertext malleability. Take CTR mode, which is particularly easy to mess with. Let’s say we’ve obtained a ciphertext $C^*$ at step (4) (recall that $C^*$ is the encryption of $M_b$), it’s trivially easy to “maul” the ciphertext — simply by flipping, say, a bit of the message (i.e., XORing it with “1”). This gives us a new ciphertext $C' = C^* \oplus 1$ that we are now allowed to submit for decryption. We are now allowed (by the rules of the game) to submit this ciphertext, and obtain $M_b \oplus 1$, which we can use to figure out $b$.

(A related, but “real world” variant of this attack is Vaudenay’s Padding Oracle Attack, which breaks actual implementations of symmetric-key cryptosystems. Here’s one we did against Apple iMessage. Here’s an older one on XML encryption.)

So how do we fix this problem? The straightforward observation is that we need to prevent the attacker from mauling the ciphertext $C^*$. The generic approach to doing this is to modify the encryption scheme so that it includes a Message Authentication Code (MAC) tag computed over every CTR-mode ciphertext. The key for this MAC scheme is generated by the encrypting party (me) and kept with the encryption key. When asked to decrypt a ciphertext, the decryptor first checks whether the MAC is valid. If it’s not, the decryption routine will output “ERROR”. Assuming an appropriate MAC scheme, the attacker can’t modify the ciphertext (including the MAC) without causing the decryption to fail and produce a useless result.

So in short: in the symmetric encryption setting, the answer to CCA2 security is simply for the encrypting parties to authenticate each ciphertext using a secret authentication (MAC) key they generate. Since we’re talking about symmetric encryption, that extra (secret) authentication key can be generated and stored with the decryption key. (Some more efficient schemes make this all work with a single key, but that’s just an engineering convenience.) Everything works out fine.

So now we get to the big question.

### CCA security is easy in symmetric encryption. Why can’t we just do the same thing for public-key encryption?

As we saw above, it turns out that strong authenticated encryption is sufficient to get CCA(2) security in the world of symmetric encryption. Sadly, when you try this same idea generically in public key encryption, it doesn’t always work. There’s a short reason for this, and a long one. The short version is: it matters who is doing the encryption.

Let’s focus on the critical difference. In the symmetric CCA2 game above, there is exactly one person who is able to (legitimately) encrypt ciphertexts. That person is me. To put it more clearly: the person who performs the legitimate encryption operations (and has the secret key) is also the same person who is performing decryption.

Even if the encryptor and decryptor aren’t literally the same person, the encryptor still has to be honest. (To see why this has to be the case, remember that the encryptor has shared secret key! If that party was a bad guy, then the whole scheme would be broken, since they could just output the secret key to the bad guys.) And once you’ve made the stipulation that the encryptor is honest, then you’re almost all the way there. It suffices simply to add some kind of authentication (a MAC or a signature) to any ciphertext she encrypts. At that point the decryptor only needs to determine whether any given ciphertexts actually came from the (honest) encryptor, and avoid decrypting the bad ones. You’re done.

Public key encryption (PKE) fundamentally breaks all these assumptions.

In a public-key encryption scheme, the main idea is that anyone can encrypt a message to you, once they get a copy of your public key. The encryption algorithm may sometimes be run by good, honest people. But it can also be run by malicious people. It can be run by parties who are adversarial. The decryptor has to be able to deal with all of those cases. One can’t simply assume that the “real” encryptor is honest.

Let me give a concrete example of how this can hurt you. A couple of years ago I wrote a post about flaws in Apple iMessage, which (at the time) used simple authenticated (public key) encryption scheme. The basic iMessage encryption algorithm used public key encryption (actually a combination of RSA with some AES thrown in for efficiency) so that anyone could encrypt a message to my key. For authenticity, it required that every message be signed with an ECDSA signature by the sender.

When I received a message, I would look up the sender’s public key and first make sure the signature was valid. This would prevent bad guys from tampering with the message in flight — e.g., executing nasty stuff like adaptive chosen ciphertext attacks. If you squint a little, this is almost exactly a direct translation of the symmetric crypto approach we discussed above. We’re simply swapping the MAC for a digital signature.

The problems with this scheme start to become apparent when we consider that there might be multiple people sending me ciphertexts. Let’s say the adversary is on the communication path and intercepts a signed message from you to me. They want to change (i.e., maul) the message so that they can execute some kind of clever attack. Well, it turns out this is simple. They simply rip off the honest signature and replace it one they make themselves:

The new message is identical, but now appears to come from a different person (the attacker). Since the attacker has their own signing key, they can maul the encrypted message as much as they want, and sign new versions of that message. If you plug this attack into (a version) of the public-key CCA2 game up top, you see they’ll win quite easily. All they have to do is modify the challenge ciphertext $C^*$ at step (4) to be signed with their own signing key, then they can change it by munging with the CTR mode encryption, and request the decryption of that ciphertext.

Of course if I only accept messages from signed by some original (guaranteed-to-be-honest) sender, this scheme might work out fine. But that’s not the point of public key encryption. In a real public-key scheme — like the one Apple iMessage was trying to build — I should be able to (safely) decrypt messages from anyone, and in that setting this naive scheme breaks down pretty badly.

Whew.

Ok, this post has gotten a bit long, and so far I haven’t actually gotten to the various “tricks” for adding chosen ciphertext security to real public key encryption schemes. That will have to wait until the next post, to come shortly.