What is the random oracle model and why should you care? (Part 5)

What is the random oracle model and why should you care? (Part 5)

This is part five of a series on the Random Oracle Model.  See here for the previous posts:

Part 1: An introduction
Part 2: The ROM formalized, a scheme and a proof sketch
Part 3: How we abuse the ROM to make our security proofs work
Part 4: Some more examples of where the ROM is used

About eight years ago I set out to write a very informal piece on a specific cryptographic modeling technique called the “random oracle model”. This was way back in the good old days of 2011, which was a more innocent and gentle era of cryptography. Back then nobody foresaw that all of our standard cryptography would turn out to be riddled with bugs; you didn’t have to be reminded that “crypto means cryptography“. People even used Bitcoin to actually buy things.

That first random oracle post somehow sprouted three sequels, each more ridiculous than the last. I guess at some point I got embarrassed about the whole thing — it’s pretty cheesy, to be honest — so I kind of abandoned it unfinished. And that’s been a major source of regret for me, since I had always planned a fifth, and final post, to cap the whole messy thing off. This was going to be the best of the bunch: the one I wanted to write all along.

To give you some context, let me briefly remind you what the random oracle model is, and why you should care about it. (Though you’d do better just to read the series.)

The random oracle model is a bonkers way to model (reason about) hash functions, in which we assume that these are actually random functions and use this assumption to prove things about cryptographic protocols that are way more difficult to prove without such a model. Just about all the “provable” cryptography we use today depends on this model, which means that many of these proofs would be called into question if it was “false”.

And to tease the rest of this post, I’ll quote the final paragraphs of Part 4, which ends with this:

You see, we always knew that this ride wouldn’t last forever, we just thought we had more time. Unfortunately, the end is nigh. Just like the imaginary city that Leonardo de Caprio explored during the boring part of Inception, the random oracle model is collapsing under the weight of its own contradictions. 

As promised, this post will be about that collapse, and what it means for cryptographers, security professionals, and the rest of us.

First, to make this post a bit more self-contained I’d like to recap a few of the basics that I covered earlier in the series. You can feel free to skip this part if you’ve just come from there.

In which we (very quickly) remind the reader what hash functions are, what random functions are, and what a random oracle is.

As discussed in the early sections of this series, hash functions (or hashing algorithms) are a standard primitive that’s used in many areas of computer science. They take in some input, typically a string of variable length, and repeatably output a short and fixed-length “digest”. We often denote these functions as follows:

{\sf digest} \leftarrow H({\sf message})

Cryptographic hashing takes this basic template and tacks on some important security properties that we need for cryptographic applications. Most famously these provide  well-known properties like collision resistance, which is needed for applications like digital signatures. But hash functions turn up all over cryptography, sometimes in unexpected places — ranging from encryption to zero-knowledge protocols — and sometimes these systems demand stronger properties. Those can sometimes be challenging to put into formal terms: for example, many protocols require a hash function to produce output that is extremely “random-looking”.*

In the earliest days of provably security, cryptographers realized that the ideal hash function would behave like a “random function”. This term refers to a function that is uniformly sampled from the set of all possible functions that have the appropriate input/output specification (domain and range). In a perfect world your protocol could, for example, randomly sample one of vast number of possible functions at setup, bake the identifier of that function into a public key or something, and then you’d be good to go.

Unfortunately it’s not possible to actually use random functions (of reasonably-sized domain and range) in real protocols. That’s because sampling and evaluating those functions is far too much work.

For example, the number of distinct functions that consume a piddly 256-bit input and produce a 256-bit digest is a mind-boggling (2^{256})^{2^{256}}. Simply “writing down” the identity of the function you chose would require memory that’s exponential in the function’s input length. Since we want our cryptographic algorithms to be efficient (meaning, slightly more formally, they run in polynomial time), using random functions is pretty much unworkable.

So we don’t use random functions to implement our hashing. Out in “the real world” we use weird functions developed by Belgians or the National Security Agency, things like like SHA256 and SHA3 and Blake2. These functions come with blazingly fast and tiny algorithms for computing them, most of which occupy few dozen lines of code or less. They certainly aren’t random, but as best we can tell, the output looks pretty jumbled up.

Still, protocol designers continue to long for the security that using  truly random function could give their protocol. What if, they asked, we tried to split the difference. How about we model our hash functions using random functions — just for the sake of writing our security proofs —  and then when we go to implement (or “instantiate”) our protocols, we’ll go use efficient hash functions like SHA3? Naturally these proofs wouldn’t exactly apply to the real protocol as instantiated, but they might still be pretty good.

A proof that uses this paradigm is called a proof in the random oracle model, or ROM. For the full mechanics of how the ROM works you’ll have to go back and read the series from the beginning. What you do need to know right now is that proofs in this model must somehow hack around the fact that evaluating a random function takes exponential time. The way the model handles this is simple: instead of giving the individual protocol participants a description of the hash function itself — it’s way too big for anyone to deal with — they give each party (including the adversary) access to a magical “oracle” that can evaluate the random function H efficiently, and hand them back a result.

This means that any time one of the parties wants to compute the function H({\sf message}) they don’t do it themselves. They instead calling out to a third party, the “random oracle” who keeps a giant table of random function inputs and outputs. At a high level, the model looks like sort of like this:

b68a0-diagram

Since all parties in the system “talk” to the same oracle, they all get the same hash result when they ask it to hash a given message. This is a pretty good standin for what happens with a real hash function. The use of an outside oracle allows us to “bury” the costs of evaluating a random function, so that nobody else needs to spend exponential time evaluating one. Inside this artificial model, we get ideal hash functions with none of the pain.

This seems pretty ridiculous already…

It absolutely is!

However — I think there are several very important things you should know about the random oracle model before you write it off as obviously inane:

1. Of course everyone knows random oracle proofs aren’t “real”. Most conscientious protocol designers will admit that proving something secure in the random oracle model does not actually mean it’ll be secure “in the real world”. In other words, the fact that random oracle model proofs are kind of bogus is not some deep secret I’m letting you in on.

2. And anyway: ROM proofs are generally considered a useful heuristic. For those who aren’t familiar with the term, “heuristic” is a word that grownups use when they’re about to secure your life’s savings using cryptography they can’t prove anything about.

I’m joking! In fact, random oracle proofs are still quite valuable. This is mainly because they often help us detect bugs in our schemes. That is, while a random oracle proof doesn’t imply security in the real world, the inability to write one is usually a red flag for protocols. Moreover, the existence of a ROM proof is hopefully an indicator that the “guts” of the protocol are ok, and that any real-world issues that crop up will have something to do with the hash function.

3. ROM-validated schemes have a pretty decent track record in practice. If ROM proofs were kicking out absurdly broken schemes every other day, we would probably have abandoned this technique. Yet we use cryptography that’s proven (only) in the ROM just about ever day — and mostly it works fine.

This is not to say that no ROM-proven scheme has ever been broken, when instantiated with a specific hash function. But normally these breaks happen because the hash function itself is obvious broken (as happened when MD4 and MD5 both cracked up a while back.) Still, those flaws are generally fixed by simply switching to a better function. Moreover, the practical attacks are historically more likely to come from obvious flaws, like the discovery of hash collisions screwing up signature schemes, rather than from some exotic mathematical flaw. Which brings us to a final, critical note…

4. For years, many people believed that the ROM could actually be saved. This hope was driven by the fact that ROM schemes generally seemed to work pretty well when implemented with strong hash functions, and so perhaps all we needed to do was to find a hash function that was “good enough” to make ROM proofs meaningful. Some theoreticians hoped that fancy techniques like cryptographic obfuscation could somehow be used to make concrete hashing algorithms that behaved well enough to make (some) ROM proofs instantiable.**

So that’s kind of the state of the ROM, or at least — it was the state up until the late 1990s. We knew this model was artificial, and yet it stubbornly refused to explode or produce totally nonsense results.

And then, in 1998, everything went south.

CGH98: an “uninstantiable” scheme

For theoretical cryptographers, the real breaking point for the random oracle model came in the form of a 1998 STOC paper by Canetti, Goldreich and Halevi (henceforth CGH). I’m going to devote the rest of this (long!) post to explaining the gist of what they found.

What CGH proved was that, in fact, there exist cryptographic schemes that can be proven perfectly secure in the random oracle model, but that — terrifyingly — become catastrophically insecure the minute you instantiate the hash function with any concrete function.

This is a really scary result, at least from the point of view of the provable security community. It’s one thing to know in theory that your proofs might not be that strong. It’s a different thing entirely to know that in practice there are schemes that can walk right past your proofs like a Terminator infiltrating the Resistance, and then explode all over you in the most serious way.

Before we get to the details of CGH and its related results, a few caveats.

First, CGH is very much a theory result. The cryptographic “counterexample” schemes that trip this problem generally do not look like real cryptosystems that we would use in practice, although later authors have offered some more “realistic” variants. They are, in fact, designed to do very artificial things that no “real” scheme would ever do. This might lead readers to dismiss them on the grounds of artificiality.

The problem with this view is that looks aren’t a particularly scientific way to judge a scheme. Both “real looking” and “artificial” schemes are, if proven correct, valid cryptosystems. The point of these specific counterexamples is to do deliberately artificial things in order to highlight the problems with the ROM. But that does not mean that “realistic” looking schemes won’t do them.

A further advantage of these “artificial” schemes is that they make the basic ideas relatively easy to explain. As a further note on this point: rather than explaining CGH itseld, I’m going to use a formulation of the same basic result that was proposed by Maurer, Renner and Holenstein (MRH).

A signature scheme

The basic idea of CGH-style counterexamples is to construct a “contrived” scheme that’s secure in the ROM, but totally blows up when we “instantiate” the hash function using any concrete function, meaning a function that has a real description and can be efficiently evaluated by the participants in the protocol.

While the CGH techniques can apply with lots of different types of cryptosystem, in this explanation, we’re going to start our example using a relatively simple type of system: a digital signature scheme.

You may recall from earlier episodes of this series that a normal signature scheme consists of three algorithms: key generation, signing, and verification. The key generation algorithm outputs a public and secret key. Signing uses the secret key to sign a message, and outputs a signature. Verification takes the resulting signature, the public key and the message, and determines whether the signature is valid: it outputs “True” if the signature checks out, and “False” otherwise.

Traditionally, we demand that signature schemes be (at least) existentially unforgeable under chosen message attack, or UF-CMA. This means that that we consider an efficient (polynomial-time bounded) attacker who can ask for signatures on chosen messages, which are produced by a “signing oracle” that contains the secret signing key. Our expectation of a secure scheme is that, even given this access, no attacker will be able to come up with a signature on some new message that she didn’t ask the signing oracle to sign for her, except with negligible probability.****

Having explained these basics, let’s talk about what we’re going to do with it. This will involve several steps:

Step 1: Start with some existing, secure signature scheme. It doesn’t really matter what signature scheme we start with, as long as we can assume that it’s secure (under the UF-CMA definition described above.) This existing signature scheme will be used as a building block for the new scheme we want to build.*** We’ll call this scheme S.

Step 2: We’ll use the existing scheme S as a building block to build a “new” signature scheme, which we’ll call {\bf S_{\sf broken}}. Building this new scheme will mostly consist of grafting weird bells and whistles onto the algorithms of the original scheme S.

Step 3: Having described the working of {\bf S_{\sf broken}} in detail, we’ll argue that it’s totally secure in the ROM. Since we started with an (assumed) secure signature scheme S, this argument mostly comes down to showing that in the random oracle model the weird additional features we added in the previous step don’t actually make the scheme exploitable.

Step 4: Finally, we’ll demonstrate that {\bf S_{\sf broken}} is totally broken when you instantiate the random oracle with any concrete hash function, no matter how “secure” it looks. In short, we’ll show that one you replace the random oracle with a real hash function, there’s a simple attack that always succeeds in forging signatures.

We’ll start by explaining how {\bf S_{\sf broken}} works.

Building a broken scheme

To build our contrived scheme, we begin with the existing secure (in the UF-CMA sense) signature scheme S. That scheme comprises the three algorithms mentioned above: key generation, signing and verification.

We need to build the equivalent three algorithms for our new scheme.

To make life easier, our new scheme will simply “borrow” two of the algorithms from S, making no further changes at all. These two algorithms will be the key generation and signature verification algorithms So two-thirds of our task of designing the new scheme is already done.

Each of the novel elements that shows up in {\bf S_{\sf broken}} will therefore appear in the signing algorithm. Like all signing algorithms, this algorithm takes in a secret signing key and some message to be signed. It will output a signature.

At the highest level, our new signing algorithm will have two subcases, chosen by a branch that depends on the input message to be signed. These two cases are given as follows:

The “normal” case: for most messages M, the signing algorithm will simply run the original signing algorithm from the original (secure) scheme S. This will output a perfectly nice signature that we can expect to work just fine.

The “evil” case: for a subset of (reasonably-sized) messages that have a different (and very highly specific) form, our signing algorithm will not output a signature. It will instead output the secret key for the entire signature scheme. This is an outcome that cryptographers will sometimes call “very, very bad.”

So far this description still hides all of the really important details, but at least it gives us an outline of where we’re trying to go.

Recall that under the UF-CMA definition I described above, our attacker is allowed to ask for signatures on arbitrary messages. When we consider using this definition with our modified signing algorithm, it’s easy to see that the presence of these two cases could make things exciting.

Specifically: if any attacker can construct a message that triggers the “evil” case, her request to sign a message will actually result in her obtaining the scheme’s secret key. From that point on she’ll be able to sign any message that she wants — something that obviously breaks the UF-CMA security of the scheme. If this is too theoretical for you: imagine requesting a signed certificate from LetsEncrypt, and instead obtaining a copy of LetsEncrypt’s signing keys. Now you too are a certificate authority. That’s the situation we’re describing.

The only way this scheme could ever be proven secure is if we could somehow rule out the “evil” case happening at all.

More concretely: we would have to show that no attacker can construct a message that triggers the “evil case” — or at least, that their probability of coming up with such a message is very, very low (negligible). If we could prove this, then our scheme {\bf S_{\sf broken}} basically just reduces to being the original secure scheme. Which means our new scheme would be secure.

In short: what we’ve accomplished is to build a kind of “master password” backdoor into our new scheme {\bf S_{\sf broken}}. Anyone who knows the password can break the scheme. Everything now depends on whether an attacker can figure out that password.

So what is the “backdoor”?

The message that breaks the scheme {\bf S_{\sf broken}} isn’t a password at all, of course. Because this is computer science and nothing is ever easy, the message will actually be a computer program. We’ll call it P.

More concretely, it will be some kind of program that can decoded within our new signing algorithm, and then evaluated (on some input) by an interpreter that we will also place within that algorithm.

If we’re being formal about this, we’d say the message contains an encoding of a program for a universal Turing machine (UTM), along with a unary-encoded integer t that represents the number of timesteps that the machine should be allowed to run for. However, it’s perfectly fine with me if you prefer to think of the message as containing a hunk of Javascript, an Ethereum VM blob combined with some maximum “gas” value to run on, a .tgz encoding of a Docker container, or any other executable format you fancy.

What really matters is the functioning of the program P.

A program P that successfully triggers the “evil case” is one that contains an efficient (e.g., polynomial-sized) implementation of a hash function. And not just any hash function. To actually trigger the backdoor, the algorithm P must a function that is identical to, or at least highly similar to, the random oracle function H.

There are several ways that the signing algorithm can verify this similarity. The MRH paper gives a very elegant one, which I’ll discuss further below. But for the purposes of this immediate intuition, let’s assume that our signing algorithm verifies this similarity probabilistically. Specifically: to check that P matches H, it won’t verify the correspondence at every possible input. It might, for example, simply verify that P(x) = H(x) for some large (but polynomial) number of random input values x.

So that’s the backdoor.

Let’s think briefly about what this means for security, both inside and outside of the random oracle mode.

Case 1: in the random oracle model

Recall that in the random oracle model, the “hash function” H is modeled as a random function. Nobody in the protocol actually has a copy of that function, they just have access to a third party (the “random oracle”) who can evaluate it for them.

If an attacker wishes to trigger the “evil case” in our signing scheme, they will somehow need to download a description of the random function from the oracle. then encode it into a program P, and send it to the signing oracle. This seems fundamentally hard.

To do this precisely — meaning that P would match H on every input — the attacker would need to query the random oracle on every possible input, and then design a program P that encodes all of these results. It suffices to say that this strategy would not be practical: it would require an exponential amount of time to do any of these, and the size of P would also be exponential in the input length of the function. So this attacker would seem virtually guaranteed to fail.

Of course the attacker could try to cheat: make a small function P that only matches H on a small of inputs, and hope that the signer doesn’t notice. However, even this seems pretty challenging to get away with. For example, to perform a probabilistic check, the signing algorithm can simply verify that P(x) = H(x) for a large number of random input points x. This approach will catch a cheating attacker with very high probability.

(We will end up using a slightly more elegant approach to checking the function and arguing this point further below.)

The above is hardly an exhaustive security analysis. But at a high level our argument should now be clear: in the random oracle model, the scheme {\bf S_{\sf broken}} is secure because the attacker can’t know a short enough backdoor “password” that breaks the scheme. Having eliminated the “evil case”, the scheme {\bf S_{\sf broken}} simply devolves to the original, secure scheme S.

Case 2: In the “real world”

Out in the real world, we don’t use random oracles. When we want to implement a scheme that has a proof in the ROM, we must first “instantiate” the scheme by substituting in some real hash function in place of the random oracle H.

This instantiated hash function must, by definition, be efficient to evaluate and describe. This means implicitly that it possesses a polynomial-size description and can be evaluated in expected polynomial time. If we did not require this, our schemes would never work. Moreover, we must further assume that all parties, including the attacker, possess a description of the hash function. That’s a standard assumption in cryptography, and is merely a statement of Kerckhoff’s principle.

With these facts stipulated, the problem with our new signature scheme becomes obvious.

In this setting, the attacker actually does have access to a short, efficient program P that matches the hash function H. In practice, this function will probably be something like SHA2 or Blake2. But even in a weird case where it’s some crazy obfuscated function, the attacker is still expected to have a program that they can efficiently evaluate. Since the attacker possesses this program, they can easily encode it into a short enough message and send it to the signing oracle.

When the signing algorithm receives this program, it will perform some kind of test of this function P against its own implementation of H, and — when it inevitably finds a match between the two functions with high probability — it will output the scheme’s secret key.

Hence, out in the real world our scheme {\bf S_{\sf broken}} is always and forever, totally broken.

A few boring technical details (that you can feel free to skip)

If you’re comfortable with the imprecise technical intuition I’ve given above, feel free to skip this section. You can jump on to the next part, which tries to grapple with tough philosophical questions like “what does this mean for the random oracle model” and “I think this is all nonsense” and “why do we drive on a parkway, and park in a driveway?

All I’m going to do here is clean up a few technical details.

One of the biggest pieces that’s missing from the intuition above is a specification of how the signing algorithm verifies that the program P it receives from the attacker actually “matches” the random oracle function H. The obvious way is to simply evaluate P(x) = H(x) on every possible input x, and output the scheme’s secret key if every comparison succeeds. But doing this exhaustively requires exponential time.

The MRH paper proposes a very neat alternative way to tackle this. They propose to test the functions on a few input values, and not even random ones. More concretely, they propose checking that P(x) = H(x) for values of x \in \{1, \dots, q\} with the specific requirement that q is an integer such that q = 2|P| + k. Here |P| represents the length of the encoding of program P in bits, and k is the scheme’s adjustable security parameter (for example, k=128).

What this means is that to trigger the backdoor, the attacker must come up with a program P that can be described in some number of bits (let’s call it n) , and yet will be able to correctly match the outputs of H at e.g., q=2n+128 different input points. If we conservatively assume that H produces (at least) a 1-bit digest, that means we’re effectively encoding at least 2n+128 bits of data into a string of length n.

If the function H is a real hash function like SHA256, then it should be reasonably easy for the attacker to find some n-bit program that matches H at, say, q=2n+128 different points. For example, here’s a Javascript implementation of SHA256 that fits into fewer than 8,192 bits. If we embed a Javascript interpreter into our signing algorithm, then it simply needs to evaluate this given program on q = 2(8,192)+128 = 16,512 different input points, compare each result to its own copy of SHA256, and if they all match, output the secret key.

However, if H is a random oracle, this is vastly harder for the attacker to exploit. The result of evaluating a random oracle at q distinct points should be a random string of (at minimum) q bits in length. Yet in order for the backdoor to be triggered, we require the encoding of program P to be less than half that size. You can therefore think of the process by which the attacker compresses a random string into that program P to be a very effective compression algorithm, one takes in a random string, and compresses it into a string of less than half the size.

Despite what you may have seen on Silicon Valley (NSFW), compression algorithms do not succeed in compressing random strings that much with high probability. Indeed, for a given string of bits, this is so unlikely to occur that the attacker succeeds with at probability that is at most negligible in the scheme’s security parameter k. This effectively neutralizes the backdoor when H is a random oracle.

Phew.

So what does this all mean?

Judging by actions, and not words, the cryptographers of the world have been largely split on this question.

Theoretical cryptographers, for their part, gently chuckled at the silly practitioners who had been hoping to use random functions as hash functions. Brushing pipe ash from their lapels, they returned to more important tasks, like finding ways to kill off cryptographic obfuscation.

Applied academic cryptographers greeted the new results with joy — and promptly authored 10,000 new papers, each of which found some new way to remove random oracles from an existing construction — while at the same time making said construction vastly slower, more complicated, and/or based on entirely novel made-up and flimsy number-theoretic assumptions. (Speaking from personal experience, this was a wonderful time.)

Practitioners went right on trusting the random oracle model. Because really, why not?

And if I’m being honest, it’s a bit hard to argue with the practitioners on this one.

That’s because a very reasonable perspective to take is that these “counterexample” schemes are ridiculous and artificial. Ok, I’m just being nice. They’re total BS, to be honest. Nobody would ever design a scheme that looks so ridiculous.

Specifically, you need a scheme that explicitly parses an input as a program, runs that program, and then checks to see whether the program’s output matches a different hash function. What real-world protocol would do something so stupid? Can’t we still trust the random oracle model for schemes that aren’t stupid like that?

Well, maybe and maybe not.

One simple response to this argument is that there are examples of schemes that are significantly less artificial, and yet still have random oracle problems. But even if one still views those results as artificial — the fact remains that while we only know of random oracle counterexamples that seem artificial, there’s no principled way for us to prove that the badness will only affect “artificial-looking” protocols. In fact, the concept of “artificial-looking” is largely a human judgement, not something one can realiably think about mathematically.

In fact, at any given moment someone could accidentally (or on purpose) propose a perfectly “normal looking” scheme that passes muster in the random oracle model, and then blows to pieces when it gets actually deployed with a standard hash function. By that point, the scheme may be powering our certificate authority infrastructure, or Bitcoin, or our nuclear weapons systems (if one wants to be dramatic.)

The probability of this happening accidentally seems low, but it gets higher as deployed cryptographic schemes get more complex. For example, people at Google are now starting to deploy complex multi-party computation and others are launching zero-knowledge protocols that are actually capable of running (or proving things about the execution of) arbitrary programs in a cryptographic way. We can’t absolutely rule out the possibility that the CGH and MRH-type counterexamples could actually be made to happen in these weird settings, if someone is a just a little bit careless.

It’s ultimately a weird and frustrating situation, and frankly, I expect it all to end in tears.

Photo by Flickr user joyosity.

Notes:

* Intuitively, this definition sounds a lot like “pseudorandomness”. Pseudorandom functions are required to be indistinguishable from random functions only in a setting where the attacker does not know some “secret key” used for the function. Whereas hash functions are often used in protocols where there is no opporunity to use a secret key, such as in public key encryption protocols.

** One particular hope was that we could find a way to obfuscate pseudorandom function families (PRFs). The idea would be to wrap up a keyed PRF that could be evaluated by anyone, even if they didn’t actually know the key. The result would be indistinguishable from a random function, without actually being one.

*** It might seem like “assume the existence of a secure signature scheme” drags in an extra assumption. However: if we’re going to make statements in the random oracle model it turns out there’s no additional assumption. This is because in the ROM we have access to “secure” (at least collision-resistant, [second] pre-image resistant) hash function, which means that we can build hash-based signatures. So the existence of signature schemes comes “free” with the random oracle model.

**** The “except with negligible probability [in the adjustable security parameter of the scheme]” caveat is important for two reasons. First, a dedicated attacker can always try to forge a signature just by brute-force guessing values one at a time until she gets one that satisfies the verification algorithm. If the attacker can run for an unbounded number of time steps, she’ll always win this game eventually. This is why modern complexity-theoretic cryptography assumes that our attackers must run in some reasonable amount of time — typically a number of time steps that is polynomial in the scheme’s security parameter. However, even a polynomial-time bounded adversary can still try to brute force the signature. Her probability of succeeding may be relatively small, but it’s non-zero: for example, she might succeed after the first guess. So in practice what we ask for in security definitions like UF-CMA is not “no attacker can ever forge a signature”, but rather “all attackers succeed with at most negligible probability [in the security parameter of the scheme]”, where negligible has a very specific meaning.

Let’s talk about PAKE

The first rule of PAKE is: nobody ever wants to talk about PAKE. The second rule of passwordPAKE is that this is a shame, because PAKE — which stands for Password Authenticated Key Exchange — is actually one of the most useful technologies that (almost) never gets used. It should be deployed everywhere, and yet it isn’t.

To understand why this is such a damn shame, let’s start by describing a very real problem.

Imagine I’m operating a server that has to store user passwords. The traditional way to do this is to hash each user password and store the result in a password database. There are many schools of thought on how to handle the hashing process; the most common recommendation these days is to use a memory-hard password hashing function like scrypt or argon2 (with a unique per-password salt), and then store only the hashed result. There are various arguments about which hash function to use, and whether it could help to also use some secret value (called “pepper“), but we’ll ignore these for the moment.

Regardless of the approach you take, all of these solutions have a single achilles heel:

When the user comes back to log into your website, they will still need to send over their (cleartext) password, since this is required in order for the server to do the check. 

This requirement can lead to disaster if your server is ever persistently compromised, or if your developers make a simple mistake. For example, earlier this year Twitter asked all of its (330 million!) users to change their passwords — because it turned out that company had been logging cleartext (unhashed) passwords.

Now, the login problem doesn’t negate the advantage of password hashing in any way. But it does demand a better solution: one where the user’s password never has to go to the server in cleartext. The cryptographic tool that can give this to us is PAKE, and in particular a new protocol called OPAQUE, which I’ll get to at the end of this post.

What’s a PAKE?

A PAKE protocol, first introduced by Bellovin and Merritt, is a special form of cryptographic key exchange protocol. Key exchange (or “key agreement”) protocols are designed to help two parties (call them a client and server) agree on a shared key, using public-key cryptography. The earliest key exchange protocols — like classical Diffie-Hellman — were unauthenticated, which made them vulnerable to man-in-the-middle attacks. The distinguishing feature of PAKE protocols is the client will authenticate herself to the server using a password. For obvious reasons, the password, or a hash of it, is assumed to be already known to the server, which is what allows for checking.

If this was all we required, PAKE protocols would be easy to build. What makes a PAKE truly useful is that it should also provide protection for the client’s password. A stronger version of this guarantee can be stated as follows: after a login attempt (valid, or invalid) both the client and server should learn only whether the client’s password matched the server’s expected value, and no additional information. This is a powerful guarantee. In fact, it’s not dissimilar to what we ask for from a zero knowledge proof.

pakediagram
Ideal representation of a PAKE protocol. The two parties’ inputs also include some randomness, which isn’t shown. An eavesdropper should not learn the strong shared secret key K, which should itself be random and not simply a function of the password.

Of course, the obvious problem with PAKE is that many people don’t want to run a “key exchange” protocol in the first place! They just want to verify that a user knows a password.

The great thing about PAKE is that the simpler “login only” use-case is easy to achieve. If I have a standard PAKE protocol that allows a client and server to agree on a shared key K if (and only if) the client knows the right password, then all we need add is a simple check that both parties have arrived at the same key. (This can be done, for example, by having the parties compute some cryptographic function with it and check the results.) So PAKE is useful even if all you’ve got in mind is password checking.

SRP: The PAKE that Time Forgot

The PAKE concept seems like it provides an obvious security benefit when compared to the naive approach we use to log into servers today. And the techniques are old, in the sense that PAKEs have been known since way back in 1992! Despite this, they’ve seen from almost no adoption. What’s going on?

There are a few obvious reasons for this. The most obvious has to do with the limitations of the web: it’s much easier to put a password form onto a web page than it is to do fancy crypto in the browser. But this explanation isn’t sufficient. Even native applications rarely implement PAKE for their logins. Another potential explanation has to do with patents, though most of these are expired now. To me there are two likely reasons for the ongoing absence of PAKE: (1) there’s a lack of good PAKE implementations in useful languages, which makes it a hassle to use, and (2) cryptographers are bad at communicating the value of their work, so most people don’t know PAKE is even an option.

Even though I said PAKE isn’t deployed, there are some exceptions to the rule.

One of the remarkable ones is a 1998 protocol designed by Tom Wu [correction: not Tim Wu] and called “SRP”. Short for “Secure Remote Password“, this is a simple three-round PAKE with a few elegant features that were not found in the earliest works. Moreover, SRP has the distinction of being (as far as I know) the most widely-deployed PAKE protocol in the world. I cite two pieces of evidence for this claim:

  1. SRP has been standardized as a TLS ciphersuite, and is actually implemented in libraries like OpenSSL, even though nobody seems to use it much.
  2. Apple uses SRP extensively in their iCloud Key Vault.

This second fact by itself could make SRP one of the most widely used cryptographic protocols in the world, so vast is the number of devices that Apple ships. So this is nothing to sneer at.

Industry adoption of SRP is nice, but also kind of a bummer: mainly because while any PAKE adoption is cool, SRP itself isn’t the best PAKE we can deploy. I was planning to go into the weeds about why I feel so strongly about SRP, but it got longwinded and it distracted from the really nice protocol I actually want to talk about further below. If you’re still interested, I moved the discussion onto this page.

In lieu of those details, let me give a quick and dirty TL;DR on SRP:

  1. SRP does some stuff “right”. For one thing, unlike early PAKEs it does not require you to store a raw password on the server (or, equivalently, a hash that could be used by a malicious client in place of the password). Instead, the server stores a “verifier” which is a one-way function of the password hash. This means a leak of the password database does not (immediately) allow the attacker to impersonate the user — unless they conduct further expensive dictionary attacks. (The technical name for this is “asymmetric” PAKE.)
  2. Even better, the current version of SRP (v4 v6a) isn’t obviously broken!
  3. However (and with no offense to the designers) the SRP protocol design is completely bonkers, and earlier versions have been broken several times — which is why we’re now at revision 6a. Plus the “security proof” in the original research paper doesn’t really prove anything meaningful.
  4. SRP currently relies on integer (finite field) arithmetic, and for various reasons (see point 3 above) the construction is not obviously transferable to the elliptic curve setting. This requires more bandwidth and computation, and thus SRP can’t take advantage of the many efficiency improvements we’ve developed in settings like Curve25519.
  5. SRP is vulnerable to pre-computation attacks, due to the fact that it hands over the user’s “salt” to any attacker who can start an SRP session. This means I can ask a server for your salt, and build a dictionary of potential password hashes even before the server is compromised.
  6. Despite all these drawbacks, SRP is simple — and actually ships with working code. Plus there’s working code in OpenSSL that even integrates with TLS, which makes it relatively easy to adopt.

Out of all these points, the final one is almost certainly responsible for the (relatively) high degree of commercial success that SRP has seen when compared to other PAKE protocols. It’s not ideal, but it’s real. This is something for cryptographers to keep in mind.

OPAQUE: The PAKE of a new generation

When I started thinking about PAKEs a few months ago, I couldn’t help but notice that most of the existing work was kind of crummy. It either had weird problems like SRP, or it required the user to store the password (or an effective password) on the server, or it revealed the salt to an attacker — allowing pre-computation attacks.

Then earlier this year, Jarecki, Krawczyk and Xu proposed a new protocol called OPAQUE. Opaque has a number of extremely nice advantages:

  1. It can be implemented in any setting where Diffie-Hellman and discrete log (type) problems are hard. This means that, unlike SRP, it can be easily instantiated using efficient elliptic curves.
  2. Even better: OPAQUE does not reveal the salt to the attacker. It solves this problem by using an efficient “oblivious PRF” to combine the salt with the password, in a way that ensures the client does not learn the salt and the server does not learn the password.
  3. OPAQUE works with any password hashing function. Even better, since all the hashing work is done on the client, OPAQUE can actually take load off the server, freeing an online service up to use much strong security settings — for example, configuring scrypt with large RAM parameters.
  4. In terms of number of messages and exponentiations, OPAQUE is not much different from SRP. But since it can be implemented in more efficient settings, it’s likely to be a lot more efficient.
  5. Unlike SRP, OPAQUE has a reasonable security proof (in a very strong model).

There’s even an Internet Draft proposal for OPAQUE, which you can read here. Unfortunately, at this point I’m not aware of any production quality implementations of the code (if you know of one, please link to it in the comments and I’ll update). (Update: There are several potential implementations listed in the comments — I haven’t looked closely enough to endorse any, but this is great!) But that should soon change.

The full OPAQUE protocol is given a little bit further below. In the rest of this section I’m going to go into the weeds on how OPAQUE works.

Problem 1: Keeping the salt secret. As I mentioned above, the main problem with earlier PAKEs is the need to transmit the salt from a server to a (so far unauthenticated) client. This enables an attacker to run pre-computation attacks, where they can build an offline dictionary based on this salt.

The challenge here is that the salt is typically fed into a hash function (like scrypt) along with the password. Intuitively someone has to compute that function. If it’s the server, then the server needs to see the password — which defeats the whole purpose. If it’s the client, then the client needs the salt.

In theory one could get around this problem by computing the password hashing function using secure two-party computation (2PC). In practice, solutions like this are almost certainly not going to be efficient — most notably because password hashing functions are designed to be complex and time consuming, which will basically explode the complexity of any 2PC system.

OPAQUE gets around this with the following clever trick. They leave the password hash on the client’s side, but they don’t feed it the stored salt. Instead, they use a special two-party protocol called an oblivious PRF to calculate a second salt (call it salt2) so that the client can use salt2 in the hash function — but does not learn the original salt.

The basic idea of such a function is that the server and client can jointly compute a function PRF(salt, password), where the server knows “salt” and the client knows “password”. Only the client learns the output of this function. Neither party learns anything about the other party’s input.

The gory details:

The actual implementation of the oblivious PRF relies on the idea that the client has the password P and the server has the salt, which is expressed as a scalar value s. The output of the PRF function should be of the form H(P)^s, where H:\{0,1\}^* \rightarrow {\mathcal G} is a special hash function that hashes passwords into elements of a cyclic (prime-order) group.

To compute this PRF requires a protocol between the client and server. In this protocol, the client first computes H(P) and then “blinds” this password by selecting a random scalar value r, and blinding the result to obtain C = H(P)^r. At this point, the client can send the blinded value C over to the server, secure in the understanding that (in a prime-order group), the blinding by r hides all partial information about the underlying password.

The server, which has a salt value s, now further exponentiates this calue to obtain R = C^s and sends the result R back to the client. If we write this out in detail, the result can be expressed as $R = H(P)^{rs}$. The client now computes the inverse of its own blinding value r and exponentiates one more time as follows: R' = R^{r^{-1}} = H(P)^s. This element R', which consists of the hash of the password exponentiated by the salt, is the output of the desired PRF function.

A nice feature of this protocol is that, if the client enters the wrong password into the protocol, she should obtain a value that is very different from the actual value she wants. This guarantee comes from the fact that the hash function is likely to produce wildly different outputs for distinct passwords.

Problem 2: Proving that the client got the right key K. Of course, at this point, the client has derived a key K, but the server has no idea what it is. Nor does the server know whether it’s the right key.

The solution OPAQUE uses based an old idea due to Gentry, Mackenzie and Ramzan. When the user first registers with the server, she generates a strong public and private key for a secure agreement protocol (like HMQV), and encrypts the resulting private key under K, along with the server’s public key. The resulting authenticated ciphertext (and the public key) is stored in the password database.

C = Encrypt(K, client secret key | server’s public key)

opaqueprotocol
Full OPAQUE protocol, excerpted from the paper.

When the client wishes to authenticate using the OPAQUE protocol, the server sends it the stored ciphertext C. If the client entered the right password into the first phase, she can derive K, and now decrypt this ciphertext. Otherwise it’s useless. Using the embedded secret key, she can now run a standard authenticated key agreement protocol to complete the handshake. (The server verifies the clients’ inputs against its copy of the client’s public key, and the client does similarly.)

Putting it all together. All of these different steps can be merged together into a single protocol that has the same number of rounds as SRP. Leaving aside the key verification steps, it looks like the protocol above. Basically, just two messages: one from the client and one returned to the server.

The final aspect of the OPAQUE work is that it includes a strong security proof that shows the resulting protocol can be proven secure under the 1-more discrete logarithm assumption in the random oracle model, which is a (well, relatively) standard assumption that appears to hold in the settings we work with.

In conclusion

So in summary, we have this neat technology that could make the process of using passwords much easier, and could allow us to do it in a much more efficient way — with larger hashing parameters, and more work done by the client? Why isn’t this everywhere?

Maybe in the next few years it will be.

 

 

 

 

In defense of Provable Security

It’s been a long time with no blogging, mostly thanks to travel and deadlines. In fact I’m just coming back from a workshop in Tenerife, where I learned a lot. Some of which I can’t write about yet, but am really looking forward to sharing with you soon.

During the workshop I had some time to talk to Dan Bernstein (djb), and to hear his views on the relevance of provable security. This is a nice coincidence, since I notice that Dan’s slides have been making the rounds on Twitter — to the general approval of some who, I suspect, agree with Dan because they think that security proofs are hard.

The problem here is that this isn’t what Dan’s saying. Part of the issue is that his presentation is short, so it’s easy to misinterpret his position as a call to just start designing cryptosystems the way we design software. That’s not right, or if it is: get ready for a lot of broken crypto.

This post is my attempt to explain what Dan’s saying, and then (hopefully) convince you he’s not recommending the crazy things above.

There’s no such thing as a “security proof”

Dan’s first point is that we’re using the wrong nomenclature. The term ‘security proof’ is misleading in that it gives you the impression that a scheme is, well… provably secure. There aren’t many schemes that can make this claim (aside from the One-Time Pad). Most security proofs don’t say this, and that can lead to misunderstandings.

The proofs that we see in day-to-day life are more accurately referred to as security reductions. These take something (like a cryptographic scheme) and reduce its security to the hardness of some other problem — typically a mathematical problem, but sometimes even another cryptosystem.

A classic example of this is something like the RSA-PSS signature, which is unforgeable if the RSA problem is hard, or Chaum-van Heijst-Pfitzmann hashing, which reduce to the hardness of the Discrete Logarithm problem. But there are more complex examples like block cipher modes of operation, which can often be reduced to the (PRP) security of a block cipher.

So the point here is that these proofs don’t actually prove security — since the RSA problem or Discrete Log or block cipher could still be broken. What they do is allow us to generalize: instead of analyzing many different schemes, we can focus our attention one or a small number of hard problems. In other words, it’s a different — and probably much better — way to allocate our (limited) cryptanalytic effort.

But we don’t study those problems well, and when we do, we break them

Dan argues that this approach is superficially appealing, but concretely it can be harmful. Take the Chaum et al. hash function listed above. Nobody should ever use this thing: it’s disastrously slow and there’s no solid evidence that it’s truly more secure than something like SHA-3 or even SHA-3’s lamest competitors.

And here (unfortunately) he’s got some evidence on his side: we’ve been amazingly unsuccessful at cryptanalyzing complex new cipher/hash primitives like AES, BLAKE and Keccak, despite the fact that these functions don’t have [real] security proofs. Where we make cryptanalytic progress, it’s almost always on first-round competition proposals, or on truly ancient functions like MD5. Moreover, if you take a look at ‘provably-secure’ number theoretic systems from the same era, you’ll find that they’re equally broken — thanks to bad assumptions about key and parameter sizes.

We’ve also gotten pretty good at chipping away at classic problems like the Discrete Logarithm. The charitable interpretation is that this is a feature, not a bug — we’re focusing cryptanalytic effort on those problems, and we’re making progress, whereas nobody’s giving enough attention to all these new ciphers. The less charitable interpretation is that the Discrete Logarithm problem is a bad problem to begin with. Maybe we’re safer with unprovable schemes that we can’t break, then provable schemes that seem to be slowly failing.

You need a cryptanalyst…

This is by far the fuzziest part (for me) of what Dan’s saying. Dan argues that security proofs are a useful tool, but they’re no substitute for human cryptanalysis. None of which I would argue with at all. But the question is: cryptanalysis of what?

The whole point of a security reduction is to reduce the amount of cryptanalysis we have to do. Instead of a separate signature and encryption scheme to analyze, we can design two schemes that both reduce to the RSA problem, then we can cryptanalyze that. Instead of analyzing a hundred different authenticated cipher modes, we can simply analyze one AES — and know that OCB and GCM and CBC and CTR will all be secure (for appropriate definitions of ‘secure’).

This is good, and it’s why we should be using security proofs. Not to mislead people, but to help us better allocate our very scarce resources — of smart people who can do this work (and haven’t sold out to the NSA).

…because people make mistakes

One last point: errors in security proofs are pretty common, but this isn’t quite what Dan is getting at. We both agree that this problem can be fixed, hopefully with the help of computer-aided proof techniques. Rather, he’s concerned that security proofs only prove that something is secure within a given model. There are  many examples of provably-secure schemes that admit attacks because those attacks were completely outside of that threat model.

As an example, Dan points to some older EC key agreement protocols that did not explicitly include group membership tests in their description. Briefly, these schemes are secure if the attacker submits valid elements of an elliptic curve group. But of course, a real life attacker might not. The result can be disastrously insecure.

So where’s the problem here? Technically the proof is correct — as long as the attacker submits group elements, everything’s fine. What the protocol doesn’t model is the fact that an attacker can cheat — it just assumes honesty. Or as Dan puts it: ‘the attack can’t even be modeled in the language of the proof’.

What Dan’s not saying

The one thing you should not take away from this discussion is the idea that security proofs have no value. What Dan is saying is that security proofs are one element of the design process, but not 100% of it. And I can live with that.

The risk is that some will see Dan’s talk as a justification for using goofy, unprovable protocols like PKCS#1v1.5 signature or the SRP password protocol. It’s not. We have better protocols that are just as well analyzed, but actually have a justification behind them.

Put it this way: if you have a choice between driving on a suspension bridge that was designed using scientific engineering techniques, and one that simply hasn’t fallen down yet, which one are you going to take? Me, I’ll take the scientific techniques. But I admit that scientifically-designed bridges sometimes do fall down.

In conclusion

While I’ve done my best to sum up Dan’s position, what I’ve written above is probably still a bit inaccurate. In fact, it’s entirely possible that I’ve just constructed a ‘strawman djb’ to argue with here. If so, please don’t blame me — it’s a whole lot easier to argue with a straw djb than the real thing.

On the (provable) security of TLS: Part 2

This is the second post in a series on the provable security of TLS. If you haven’t read the first part, you really should!

The goal of this series is to try to answer an age-old question that is often asked and rarely answered. Namely: is the TLS protocol provably secure?

While I find the question interesting in its own right, I hope to convince you that it’s of more than academic interest. TLS is one of the fundamental security protocols on the Internet, and if it breaks lots of other things will too. Worse, it has broken — repeatedly. Rather than simply patch and hope for the best, it would be fantastic if we could actually prove that the current specification is the right one.

Unfortunately this is easier said than done. In the first part of this series I gave an overview of the issues that crop up when you try to prove TLS secure. They come at you from all different directions, but most stem from TLS’s use of ancient, archaic cryptography; gems like, for example, the ongoing use of RSA-PKCS#1v1.5 encryption fourteen years after it was shown to be insecure.

Despite these challenges, cryptographers have managed to come up with a handful of nice security results on portions of the protocol. In the previous post I discussed Jonnson and Kaliski’s proof of security for the RSA-based TLS handshake. This is an important and confidence-inspiring result, given that the RSA handshake is used in almost all TLS connections.

In this post we’re going to focus on a similarly reassuring finding related to the the TLS record encryption protocol — and the ‘mandatory’ ciphersuites used by the record protocol in TLS 1.1 and 1.2 (nb: TLS 1.0 is broken beyond redemption). What this proof tells us is that TLS’s CBC mode ciphersuites are secure, assuming… well, a whole bunch of things, really.

The bad news is that the result is extremely fragile, and owes its existence more to a series of happy accidents than from any careful security design. In other words, it’s just like TLS itself.

Records and handshakes

Let’s warm up with a quick refresher.

TLS is a layered protocol, with different components that each do a different job. In the previous post I mostly focused on the handshake, which is a beefed-up authenticated key agreement protocol. Although the handshake does several things, its main purpose is to negotiate a shared encryption key between a client and a server — parties who up until this point may be complete strangers.

The handshake gets lots of attention from cryptographers because it’s exciting. Public key crypto! Certificates! But really, this portion of the protocol only lasts for a moment. Once it’s done, control heads over to the unglamorous record encryption layer which handles the real business of the protocol: securing application data.

Most kids don’t grow up dreaming about a chance to work on the TLS record encryption layer, and that’s fine — they shouldn’t have to. All the record encryption layer does is, well, encrypt stuff. In 2012 that should be about as exciting as mailing a package.

And yet TLS record encryption still manages to be a source of endless excitement! In the past year alone we’ve seen three critical (and exploitable!) vulnerabilities in this part of TLS. Clearly, before we can even talk about the security of record encryption, we have to figure out what’s wrong with it.

Welcome to 1995

Development of the SSLv1
record encryption layer

The problem (again) is TLS’s penchant for using prehistoric cryptography, usually justified on some pretty shaky ‘backwards compatibility‘ grounds. This excuse is somewhat bogus, since the designers have actually changed the algorithms in ways that break compatibility with previous versions — and yet retained many of the worst features of the originals.

The most widely-used ciphersuites employ a block cipher configured in CBC mode, along with a MAC to ensure record authenticity. This mode can be used with various ciphers/MAC algorithms, but encryption always involves the following steps:

  1. If both sides support TLS compression, first compress the plaintext.
  2. Next compute a MAC over the plaintext, record type, sequence number and record length. Tack the MAC onto the end of the plaintext.
  3. Pad the result with up to 256 bytes of padding, such that the padded length is a multiple of the cipher’s block size. The last byte of the padding should contain the padding length (excluding this byte), and all padding bytes must also contain the same value. A padded example (with AES) might look like:

    0x MM MM MM MM MM MM MM MM MM 06 06 06 06 06 06 06

  4. Encrypt the padded message using CBC mode. In TLS 1.0 the last block of the previous ciphertext (called the ‘residue’) is used as the Initialization Vector. Both TLS 1.1 and 1.2 generate a fresh random IV for each record.
Upon decryption the attacker verifies that the padding is correctly structured, checks the MAC, and outputs an error if either condition fails.

To get an idea of what’s wrong with the CBC ciphersuite, you can start by looking at the appropriate section of the TLS 1.2 spec — which reads more like the warning label on a bottle of nitroglycerin than a cryptographic spec. Allow me sum up the problems.

First, there’s the compression. It’s long been known that compression can leak information about the contents of a plaintext, simply by allowing the adversary to see how well it compresses. The CRIME attack recently showed how nasty this can get, but the problem is not really news. Any analysis of TLS encryption begins with the assumption that compression is turned off.

Next there’s the issue of the Initialization Vectors. TLS 1.0 uses the last block of the previous ciphertext, which is absurd, insecure and — worse — actively exploitable by the BEAST attack. Once again, the issue has been known for years yet was mostly ignored until it was exploited.

So ok: no TLS 1.0, no compression. Is that all?

Well, we still haven’t discussed the TLS MAC, which turns out to be in the wrong place — it’s applied before the message is padded and encrypted. This placement can make the protocol vulnerable to padding oracle attacks, which (amazingly) will even work across handshakes. This last fact is significant, since TLS will abort the connection (and initiate a new handshake) whenever a decryption error occurs in the record layer. It turns out that this countermeasure is not sufficient.

To deal with this, recent versions of TLS have added the following patch: they require implementers to hide the cause of each decryption failure — i.e., make MAC errors indistinguishable from padding failures. And this isn’t just a question of changing your error codes, since clever attackers can learn this information by measuring the time it takes to receive an error. From the TLS 1.2 spec:

In general, the best way to do this is to compute the MAC even if the padding is incorrect, and only then reject the packet. For instance, if the pad appears to be incorrect, the implementation might assume a zero-length pad and then compute the MAC. This leaves a small timing channel, since MAC performance depends to some extent on the size of the data fragment, but it is not believed to be large enough to be exploitable.

To sum up: TLS is insecure if your implementation leaks the cause of a decryption error, but careful implementations can avoid leaking much, although admittedly they probably will leak some — but hopefully not enough to be exploited. Gagh!

At this point, just take a deep breath and say ‘all horses are spherical‘ three times fast, cause that’s the only way we’re going to get through this.

Accentuating the positive

Having been through the negatives, we’re almost ready to say nice things about TLS. Before we do, let’s just take a second to catch our breath and restate some of our basic assumptions:

  1. We’re not using TLS 1.0 because it’s broken.
  2. We’re not using compression because it’s broken.
  3. Our TLS implementation is perfect — i.e., doesn’t leak any information about why a decryption failed. This is probably bogus, yet we’ve decided to look the other way.
  4. Oh yeah: we’re using a secure block cipher and MAC (in the PRP and PRF sense respectively).**

And now we can say nice things. In fact, thanks to a recent paper by Kenny Paterson, Thomas Ristenpart and Thomas Shrimpton, we can say a few surprisingly positive things about TLS record encryption.

What Paterson/Ristenpart/Shrimpton show is that TLS record encryption satisfies a notion they call ‘length-hiding authenticated encryption‘, or LHAE. This new (and admittedly made up) notion not only guarantees the confidentiality and authenticity of records, but ensures that the attacker can’t tell how long they are. The last point seems a bit extraneous, but it’s important in the case of certain TLS libraries like GnuTLS, which actually add random amounts of padding to messages in order to disguise their length.

There’s one caveat to this proof: it only works in cases where the MAC has an output size that’s greater or equal to the cipher’s block size. This is, needless to say, a totally bizarre and fragile condition for the security of a major protocol to hang on. And while the condition does hold for all of the real TLS ciphersuites we use — yay! — this is more a happy accident than the result of careful design on anyone’s part. It could easily have gone the other way.

So how does the proof work?

Good question. Obviously the best way to understand the proof is to read the paper itself. But I’d like to try to give an intuition.

First of all, we can save a lot of time by starting with the fact that CBC-mode encryption is already known to be IND-CPA secure if implemented with a secure block cipher (PRP). This result tells us only that CBC is secure against passive attackers who can request the encryption of chosen messages. (In fact, a properly-formed CBC mode ciphertext should be indistinguishable from a string of random bits.)

The problem with plain CBC-mode is that these security results don’t hold in cases where the attacker can ask for the decryption of chosen ciphertexts.

This limitation is due to CBC’s malleability — specifically, the fact that an attacker can tamper with a ciphertext, then gain useful information by sending the result to be decrypted. To show that TLS record encryption is secure, what we really want to prove is that tampering gives no useful results. More concretely, we want to show that asking for the decryption of a tampered ciphertext will always produce an error.

We have a few things working in our favor. First, remember that the underlying TLS record has a MAC on it. If the MAC is (PRF) secure, then any ciphertext tampering that results in a change to this record data or its MAC will be immediately detected (and rejected) by the decryptor. This is good.

Unfortunately the TLS MAC doesn’t cover the padding. To continue our argument, we need to show that no attacker can produce a legitimate ciphertext, and that includes tampering that messes with the padding section of the message. Here again things look intuitively good for TLS. During decryption, the decryptor checks the last byte of the padded message to see how much padding there is, then verifies that all padding bytes contain the same numeric value. Any tampering that affects this section of the plaintext should either:

  1. Produce inconsistencies in some padding bytes, resulting in a padding error, or
  2. Cause the wrong amount of padding to be stripped off, resulting in a MAC error.
Both of these error conditions will appear the same to the attacker, thanks to the requirement that TLS implementations hide the reason for a decryption failure. Once again, the attacker should learn nothing useful.

This all seems perfectly intuitive, and you can imagine the TLS developers making exactly this argument as they wrote up the spec. However there’s one small exception to the rule above, which can turn TLS implementations that add an unnecessarily large amount of padding to the plaintext. (For example, GnuTLS.)

To give an example, let’s say the unpadded record + MAC is 15 bytes. If we’re using AES, then this plaintext can be padded with a single byte. Of course, if we’re inclined to add extra padding, it could also be padded with seventeen bytes — both are valid padding strings. The two possible paddings are presented below:

If the extra-long padding is used, the attacker could maul the longer ciphertext by truncating it — throwing away the last 16-byte ciphertext block. Truncation is a viable way to maul CBC ciphertexts, since it has the same effect on the underlying plaintext. The CBC decryption of the truncated ciphertext would yield:
Which not very useful, since the invalid padding would lead to a decryption error. However, the attacker could fix this — this time, using the fact that TLS can be mauled by flipping bits in the last byte of the Initialization Vector. Such an attack would allow the attacker to convert that trailing 0x10 into an 0x00. This result is now valid ciphertext that will not produce an error on decryption.
So what has the attacker learned by this attack? In practice, not very much. Mostly what they know is the length of the original ciphertext — so much for GnuTLS’s attempt to disguise the length. But more fundamentally: this attacker of ‘mauling’ the ciphertext totally screws up the nice idea we were going for in our proof.
So the question is: can this attack be used against real TLS? And this is where the funny restriction about MAC size comes into play.

You see, if TLS MACs are always bigger than a ciphertext block, then all messages will obey a strict rule: no padding will ever appear in the first block of the CBC ciphertext.

Since the padding is now guaranteed to start in the second (or later) block of the CBC ciphertext, the attacker cannot ‘tweak’ it by modifying the IV (this attack only works against the first block of the plaintext). Instead, they would have to tamper with a ciphertext block. And in CBC mode, tampering with ciphertext blocks has consequences! Such a tweak will allow the attacker to change padding bytes, but as a side effect it will cause one entire block of the record or MAC to be randomized when decrypted. And what Paterson/Ristenpart/Shrimpton prove is that this ‘damage’ will inevitably lead to a MAC error.

This ‘lucky break’ means that an attacker can’t successfully tamper with a CBC-mode TLS ciphertext. And that allows us to push our way to a true proof of the CBC-mode TLS ciphersuites. By contrast, if the MAC was only 80 bits (as it is in some IPSEC configurations), the proof would not be possible. So it goes.

Now I realize this has all been pretty wonky, and that’s kind of the point! The moral to the story is that we shouldn’t need this proof in the first place! What it illustrates is how fragile and messy the TLS design really is, and how (once again) it achieves security by luck and the skin of its teeth, rather than secure design.

What about stream ciphers?

The good news — to some extent — is that none of the above problems apply to stream ciphers, which don’t attempt to hide the record length, and don’t use padding in the first place. So the security of these modes is much ‘easier’ to argue.

Of course, this is only ‘good news’ if you believe that the stream ciphers included with TLS are good in the first place. The problem, of course, is that the major supported stream cipher is RC4. I will leave it to the reader to decide if that’s a worthwhile tradeoff.

In conclusion

There’s probably a lot more that can be said about TLS record encryption, but really… I think this post is probably more than anyone (outside of the academic community and a few TLS obsessives) has ever wanted to read on the subject.

In the next posts I’m going to come back to the much more exciting Diffie-Hellman handshake and then try to address the $1 million and $10 million questions respectively. First: does anyone really implement TLS securely? And second: when can we replace this thing?

Notes:

* One thing I don’t mention in this post is the TLS 1.0 ’empty fragment’ defense, which actually works against BEAST and has been deployed in OpenSSL for several years. The basic idea is to encrypt an empty record of length 0 before each record goes over the wire. In practice, this results in a full record structure with a MAC, and prevents attackers from exploiting the residue bug. Although nobody I know of has ever proven it secure, the proof is relatively simple and can be arrived at using standard techniques.

** The typical security definition for a MACs is SUF-CMA (strongly unforgeable under chosen message attack). This result uses the stronger — but also reasonable — assumption that the MAC is actually a PRF.

On the (provable) security of TLS: Part 1

If you sit a group of cryptographers down and ask them whether TLS is provably secure, you’re liable to get a whole variety of answers. Some will just giggle. Others will give a long explanation that hinges on the definitions of ‘prove‘ and ‘secure‘. What you will probably not get is a clear, straight answer.

In fairness, this is because there is no clear, straight answer. Unfortunately, like all the things you really need to know in life, it’s complicated.

Still, just because something’s complicated doesn’t mean that we get to ignore it. And the security of TLS is something we really shouldn’t ignore —  because TLS has quietly become the most important and trusted security protocol on the Internet. While plenty of security experts will point out the danger of improperly configured TLS setups, most tend to see (properly configured) TLS as a kind of gold standard — and it’s now used in all kinds of applications that its designers would probably never have envisioned.

And this is a problem, because (as Marsh points out) TLS (or rather, SSL) was originally designed to secure $50 credit card transactions, something it didn’t always do very well. Yes, it’s improved over the years. On the other hand, gradual improvement is no substitute for correct and provable security design.

All of which brings us to the subject of this — and subsequent — posts: even though TLS wasn’t designed to be a provably-secure protocol, what can we say about it today? More specifically, can we prove that the current version of TLS is cryptographically secure? Or is there something fundamental that’s holding it back?

The world’s briefest overview of TLS

If you’re brand new to TLS, this may not be the right post for you. Still, if you’re looking for a brief refresher, here it is:

TLS is a protocol for establishing secure (Transport layer) communications between two parties, usually denoted as a Client and a Server.

The protocol consists of several phases. The first is a negotiation, in which the two peers agree on which cryptographic capabilities they mutually support — and try to decide whether a connection is even possible. Next, the parties engage in a Key Establishment protocol that (if successful) authenticates one or both of the parties, typically using certificates, and allows the pair to establish a shared Master Secret. This is done using a handful of key agreement protocols, including various flavors of Diffie-Hellman key agreement and an RSA-based protocol.

Finally, the parties use this secret to derive encryption and/or authentication keys for secure communication. The rest of the protocol is very boring — all data gets encrypted and sent over the wire in a (preferably) secure and authenticated form. This image briefly sums it up:
sy10660a
Overview of TLS (source)

So why can’t we prove it secure?

TLS sounds very simple. However, it turns out that you run into a few serious problems when you try to assemble a security proof for the protocol. Probably the most serious holdup stems from the age of the cryptography used in TLS. To borrow a term from Eric Rescorla: TLS’s crypto is downright prehistoric.

This can mostly be blamed on TLS’s predecessor SSL, which was designed in the mid-90s — a period better known as ‘the dark ages of industrial crypto‘. All sorts of bad stuff went down during this time, much of which we’ve (thankfully) forgotten about. But TLS is the exception! Thanks to years of backwards compatibility requirements, it’s become a sort of time capsule for all sorts of embarrassing practices that should have died a long slow death in a moonlit graveyard.

For example:

  1. The vast majority of TLS handshakes use an RSA padding scheme known as PKCS#1v1.5. PKCS#1v1.5 is awesome — if you’re teaching a class on how to attack cryptographic protocols. In all other circumstances it sucks. The scheme was first broken all the way back in 1998. The attacks have gotten better since.
  2. The (AES)-CBC encryption used by SSL and TLS 1.0 is just plain broken, a fact that was recently exploited by the BEAST attack. TLS 1.1 and 1.2 fix the glaring bugs, but still use non-standard message authentication techniques, that have themselves been attacked.
  3. Today — in the year 2012 — the ‘recommended’ patch for the above issue is to switch to RC4. Really, the less said about this the better.

Although each of these issues have all been exploited in the past, I should be clear that current implementations do address them. Not by fixing the bugs, mind you — but ratherby applying band-aid patches to thwart the known attacks. This mostly works in practice, but it makes security proofs an exercise in frustration.

The second obstacle to proving things about TLS is the sheer complexity of the protocol. In fact, TLS is less a ‘protocol’ than it is a Frankenstein monster of of options and sub-protocols, all of which provide a breeding ground for bugs and attacks. Unfortunately, the vast majority of academic security analyses just ignore all of the ‘extra junk’ in favor of addressing the core crypto. Which is good! But also too bad — since these options where practically every real attack on TLS has taken place.

Lastly, it’s not clear quite which definition of security we should even use in our analysis. In part this is because the TLS specification doesn’t exactly scream ‘I want to be analyzed‘. In part it’s because definitions have been evolving over the years.

So you’re saying TLS is unprovable?

No. I’m just lowering expectations.

The truth is there’s a lot we do know about the security of TLS, some of it surprisingly positive. Several academic works have even given us formal ‘proofs’ of certain aspects of the protocol. The devil here lies mainly in the definitions of ‘proof‘ and — worryingly — ‘TLS‘.

For those who don’t live for the details, I’m going to start off by listing a few of the major known results here. In no particular order, these are:

  1. If properly implemented with a secure block cipher, the TLS 1.1/1.2 record encryption protocol meets a strong definition of (chosen ciphertext attack) security. Incidentally, the mechanism is also capable of hiding the length of the encrypted records. (Nobody even bothers to claim this for TLS 1.0 — which everybody agrees is busted.)
  2. A shiny new result from CRYPTO 2012 tells us that (a stripped-down version of) the Diffie-Hellman handshake (DHE) is provably secure in the ‘standard model’ of computation. Moreover, combined with result (1), the authors prove that TLS provides a secure channel for exchanging messages. Yay!This result is important — or would be, if more people actually used DHE. Unfortunately, at this point more people bike to work with their cat than use TLS-DHE.
  3. At least two excellent works have tackled the RSA-based handshake, which is the one most people actually use. Both works succeed in proving it secure, under one condition: you don’t actually use it! To be more explicit: both proofs quietly replace the PKCS#v1.5 padding (in actual use) with something better. If only the real world worked this way.
  4. All is not lost, however: back in 2003 Jonnson and Kaliski were able to prove security for the real RSA handshake without changing the protocol. Their proof is in the random oracle model (no biggie), but unfortunately it requires a new number-theoretic hardness assumption that nobody has talked about or revisited since. In practice this may be fine! Or it may not be. Nobody’s been able to investigate it further, since every researcher who tried to look into it… wound up dead. (No, I’m just kidding. Although that would be cool.) 
  5. A handful of works have attempted to analyze the entirety of SSL or TLS using machine-assisted proof techniques. This is incredibly ambitious, and moreover it’s probably the only real way to tackle the problem. Unfortunately, the proofs hugely simplify the underlying cryptography, and thus don’t cover the full range of attacks. Moreover, only computers can read them.

While I’m sure there are some important results I’m missing here, this is probably enough to take us 95% of the way to the ‘provable’ results on TLS. What remains is the details.

And oh boy, are there details. More than I can fit in one post. So I’m going to take this one chunk at a time, and see how many it takes.

An aside: what are we trying to prove, anyway?

One thing I’ve been a bit fast-and-loose on is: what are we actually proving about these protocols in the first place, anyway? The question deserves at least a few words — since it’s received thousands in the academic literature.

One obvious definition of security can be summed up as ‘an attacker on the wire can’t intercept modify data’, or otherwise learn the key being established. But simply keeping the data secret isn’t enough; there’s also a matter of freshness. Consider the following attack:

  1. I record communications between a legitimate client and his bank’s server, in which the client requests $5,000 to be transferred from her account to mine.
  2. Having done this, I replay the client’s messages to the server a hundred times. If all goes well, I’m  a whole lot richer.

Now this is just one example, but it shows that the protocol does require a bit of thinking. Taking this a step farther, we might also want to ensure that the master secret is random, meaning even if (one of) the Client or Server are dishonest, they can’t force the key to a chosen value. And beyond that, what we really care about is that the protocol data exchange is secure.

Various works take different approaches to defining security for TLS. This is not surprising, given the ‘security analysis‘ provided in the TLS spec itself is so incomplete that we don’t quite know what the protocol was intended to do in the first place. We’ll come back to these definitional issues in a bit.

The RSA handshake

TLS supports a number of different ciphersuites and handshake protocols. As I said above, the RSA-based handshake is the most popular one — almost an absurd margin. Sure, a few sites (notably Google) have recently begun supporting DHE and ECDH across the board. For everyone else there’s RSA.

So RSA seems as good a place as any to start. Even better, if you ignore client authentication and strip the handshake down to its core, it’s a pretty simple protocol:

Clearly the ‘engine’ of this protocol is the third step, where the Client picks a random 48-byte pre-master secret (PMS) and encrypts it under the server’s key. Since the Master Secret is derived from this (plus the Client and Server Randoms), an attacker who can’t ‘break’ this encryption shouldn’t be able to derive the Master Key and thus produce the correct check messages at the bottom.

So now for the $10,000,000 question: can we prove that the RSA encryption secure? And the simple answer is no, we can’t. This is because the encryption scheme used by TLS is RSA-PKCS#1v1.5, which is not — by itself — provably secure.

Indeed, it’s worse than that. Not only is PKCS#1v1.5 not provably secure, but it’s actually provably insecure. All the way back in 1998 Daniel Bleichenbacher demonstrated that PKCS#1v1.5 ciphertexts can be gradually decrypted, by repeatedly modifying them and sending the modified versions to be decrypted. If the decryptor (the server in this case) reveals whether the ciphertext is correctly formatted, the attacker can actually recover the encrypted PMS.

              0x 00 02 { at least 8 non-zero random bytes } 00 { two-byte length } { 48-byte PMS }

RSA-PKCS #1v1.5 encryption padding for TLS

Modern SSL/TLS servers are wise to this attack, and they deal with it in a simple way. Following decryption of the RSA ciphertext they check the padding. If it does not have the form above, they select a random PMS and go forward with the calculation of the Master Secret using this bogus replacement value. In principle, this means that the server calculates a basically random key — and the protocol doesn’t fail until the very end.

In practice this seems to work, but proving it is tricky! For one thing, it’s not enough to treat the RSA encryption as a black box — all of these extra steps and the subsequent calculation of the Master Secret are now intimately bound up with the security of the protocol, in a not-necessarily-straightforward way.

There are basically two ways to deal with this. Approach #1, taken by Morrisey, Smart and Warinschi and Gajek et al., follows what I call the ‘wishful thinking‘ paradigm. Both show that if you modify the encryption scheme used in TLS — for example, by eliminating the ‘random’ encryption padding, or swapping in a CCA-secure scheme like RSA-OAEP — the handshake is secure under a reasonable definition. This is very interesting from an academic point of view; it just doesn’t tell us much about TLS.

Fortunately there’s also the ‘realist‘ approach. This is embodied by an older CRYPTO paper by Jonnson and Kaliski. These authors considered the entirety of the TLS handshake, and showed that (1) if you model the PRF (or part of it) as a random oracle, and (2) assume some very non-standard number-theoretic assumptions, and (more importantly) (3) the TLS implementation is perfect, then you can actually prove that the RSA handshake is secure.

This is a big deal!

Unfortunately it has a few problems. First, it’s highly inelegant, and basically depends on all the parts working in tandem. If any part fails to work properly — if for example, the server leaks any information that could indicate whether the encryption padding is valid or not — then the entire thing crashes down. That the combined protocol works is in fact, more an accident of fate than the result of any real security engineering.

Secondly, the reduction extremely ‘loose’. This means that a literal interpretation would imply that we should be using ginormous RSA keys, much bigger than we do today. We obviously don’t pay attention to this result, and we’re probably right not to. Finally, it requires that we adopt odd number-theoretic assumption involving a ‘partial-RSA decision oracle’, something that quite frankly, feels kind of funny. While we’re all guilty of making up an assumption from time to time, I’ve never seen this assumption either before or since, and it’s significant that the scheme has no straightforward reduction the RSA problem (even in the random oracle model), something we would get if we used RSA-OAEP.

If your response is: who cares — well, you may be right. But this is where we stand.

So where are we?

Good question. We’ve learned quite a lot actually. When I started this post my assumption was that TLS was going to be a giant unprovable mess. Now that I’m halfway through this summary, my conclusion is that TLS is, well, still mostly a giant mess — but one that smacks of potential.

Of course so far I’ve only covered a small amount of ground. We still have yet to cover the big results, which deal with the Diffie-Hellman protocol, the actual encryption of data (yes, important!) and all the other crud that’s in the real protocol.

But those results will have to wait until the next post: I’ve hit my limit for today.

Click here for Part 2.

EAX’, Knight Rider, and an admission of defeat

A few weeks back I found myself on the wrong side of Daniel Bernstein over something I’d tweeted the week before. This was pretty surprising, since my original tweet hadn’t seemed all that controversial.

What I’d said was that cryptographic standards aren’t always perfect, but non-standard crypto is almost always worse. Daniel politely pointed out that I was nuts — plenty of bad stuff appears in standards, and conversely, plenty of good stuff isn’t standardized. (As you can see, the conversation got a little weirder after that.)

Today I’m here to say that I’ve found religion. Not only do I see where Daniel’s coming from, I’m here to surrender, throw down my hat and concede defeat. Daniel: you win. I still think standards are preferable in theory, but only if they’re promulgated by reasonable standards bodies. And we seem to have a shortage of those.

My new convictions are apropos of an innocuous-looking ePrint just posted by Kazuhiko Minematsu, Hiraku Morita and Tetsu Iwata. These researchers have found serious flaws in an authenticated block cipher mode of operation called EAX’ (henceforth: EAXprime). EAXprime was recently adopted as the encryption mode for ANSI’s Smart Grid standard, and (until today) was practically a shoo-in to become a standalone NIST-certified mode of operation.

Ok, so standards get broken. Why I am I making such a big deal about this one? The simple reason is that EAXprime isn’t just another standard. It’s a slightly-modified version of EAX mode, which was proposed by Bellare, Rogaway and Wagner. And the important thing to know about EAX (non-prime) is that it comes with a formal proof of security.

It’s hard to explain how wonderful this is. The existence of such a proof means that (within limits) a vulnerability in EAX mode would indicate a problem with the underlying cipher (e.g., AES) itself. Since we’re pretty confident in the security of our standard block ciphers, we can extend that confidence to EAX. And the best part: this wonderful guarantee costs us almost nothing — EAX is a very efficient mode of operation.

But not efficient enough for ANSI, which decided to standardize on a variant called EAXprime. EAXprime is faster: it uses 3-5 fewer block cipher calls to encrypt each message, and (in the case of AES) about 40 bytes less RAM to store scheduled keys. (This is presumably important when your target is a tiny little embedded chip in a smart meter.)

Unfortunately, there’s a cost to that extra speed: EAXprime is no longer covered by the original EAX security proof. Which brings us towards the moral of the story, and to the Minematsu, Morita and Iwata paper.

Did you ever see that old episode of Knight Rider where the knight-riderbad guys figure out how to neutralize KITT‘s bulletproof coating? Reading this paper is kind of like watching the middle part of that episode. Everything pretty much looks the same but holy crap WTF the bullets aren’t bouncing off anymore.

The MMI attacks allow an adversary to create ciphertexts (aka forgeries) that seem valid even though they weren’t created by the actual encryptor. They’re very powerful in that sense, but they’re limited in others (they only work against very short messages). Still, at the end of the day, they’re attacks. Attacks that couldn’t possibly exist if the standards designers had placed a high value on EAX’s security proof, and had tried to maintain that security in their optimized standard.

And this is why I’m admitting defeat on this whole standards thing. How can I advocate for crypto standards when standards bodies will casually throw away something as wonderful as a security proof? At least when KITT lost his bulletproof coating it was because of something the bad guys did to him. Can you imagine the good guys doing that to him on purpose?