What is the random oracle model and why should you care? (Part 5)

This is part five of a series on the Random Oracle Model.  See here for the previous posts:

About eight years ago I set out to write a very informal piece on a specific cryptographic modeling technique called the “random oracle model”. This was way back in the good old days of 2011, which was a more innocent and gentle era of cryptography. Back then nobody foresaw that all of our standard cryptography would turn out to be riddled with bugs; you didn’t have to be reminded that “crypto means cryptography“. People even used Bitcoin to actually buy things.

That first random oracle post somehow sprouted three sequels, each more ridiculous than the last. I guess at some point I got embarrassed about the whole thing — it’s pretty cheesy, to be honest — so I kind of abandoned it unfinished. And that’s been a major source of regret for me, since I had always planned a fifth, and final post, to cap the whole messy thing off. This was going to be the best of the bunch: the one I wanted to write all along.

To give you some context, let me briefly remind you what the random oracle model is, and why you should care about it. (Though you’d do better just to read the series.)

The random oracle model is a bonkers way to model (reason about) hash functions, in which we assume that these are actually random functions and use this assumption to prove things about cryptographic protocols that are way more difficult to prove without such a model. Just about all the “provable” cryptography we use today depends on this model, which means that many of these proofs would be called into question if it was “false”.

And to tease the rest of this post, I’ll quote the final paragraphs of Part 4, which ends with this:

You see, we always knew that this ride wouldn’t last forever, we just thought we had more time. Unfortunately, the end is nigh. Just like the imaginary city that Leonardo de Caprio explored during the boring part of Inception, the random oracle model is collapsing under the weight of its own contradictions.

As promised, this post will be about that collapse, and what it means for cryptographers, security professionals, and the rest of us.

First, to make this post a bit more self-contained I’d like to recap a few of the basics that I covered earlier in the series. You can feel free to skip this part if you’ve just come from there.

In which we (very quickly) remind the reader what hash functions are, what random functions are, and what a random oracle is.

As discussed in the early sections of this series, hash functions (or hashing algorithms) are a standard primitive that’s used in many areas of computer science. They take in some input, typically a string of variable length, and repeatably output a short and fixed-length “digest”. We often denote these functions as follows:

${\sf digest} \leftarrow H({\sf message})$

Cryptographic hashing takes this basic template and tacks on some important security properties that we need for cryptographic applications. Most famously these provide  well-known properties like collision resistance, which is needed for applications like digital signatures. But hash functions turn up all over cryptography, sometimes in unexpected places — ranging from encryption to zero-knowledge protocols — and sometimes these systems demand stronger properties. Those can sometimes be challenging to put into formal terms: for example, many protocols require a hash function to produce output that is extremely “random-looking”.*

In the earliest days of provably security, cryptographers realized that the ideal hash function would behave like a “random function”. This term refers to a function that is uniformly sampled from the set of all possible functions that have the appropriate input/output specification (domain and range). In a perfect world your protocol could, for example, randomly sample one of vast number of possible functions at setup, bake the identifier of that function into a public key or something, and then you’d be good to go.

Unfortunately it’s not possible to actually use random functions (of reasonably-sized domain and range) in real protocols. That’s because sampling and evaluating those functions is far too much work.

For example, the number of distinct functions that consume a piddly 256-bit input and produce a 256-bit digest is a mind-boggling (2^{256})^{2^{256}}. Simply “writing down” the identity of the function you chose would require memory that’s exponential in the function’s input length. Since we want our cryptographic algorithms to be efficient (meaning, slightly more formally, they run in polynomial time), using random functions is pretty much unworkable.

So we don’t use random functions to implement our hashing. Out in “the real world” we use weird functions developed by Belgians or the National Security Agency, things like like SHA256 and SHA3 and Blake2. These functions come with blazingly fast and tiny algorithms for computing them, most of which occupy few dozen lines of code or less. They certainly aren’t random, but as best we can tell, the output looks pretty jumbled up.

Still, protocol designers continue to long for the security that using  truly random function could give their protocol. What if, they asked, we tried to split the difference. How about we model our hash functions using random functions — just for the sake of writing our security proofs —  and then when we go to implement (or “instantiate”) our protocols, we’ll go use efficient hash functions like SHA3? Naturally these proofs wouldn’t exactly apply to the real protocol as instantiated, but they might still be pretty good.

A proof that uses this paradigm is called a proof in the random oracle model, or ROM. For the full mechanics of how the ROM works you’ll have to go back and read the series from the beginning. What you do need to know right now is that proofs in this model must somehow hack around the fact that evaluating a random function takes exponential time. The way the model handles this is simple: instead of giving the individual protocol participants a description of the hash function itself — it’s way too big for anyone to deal with — they give each party (including the adversary) access to a magical “oracle” that can evaluate the random function H efficiently, and hand them back a result.

This means that any time one of the parties wants to compute the function $H({\sf message})$ they don’t do it themselves. They instead calling out to a third party, the “random oracle” who keeps a giant table of random function inputs and outputs. At a high level, the model looks like sort of like this:

Since all parties in the system “talk” to the same oracle, they all get the same hash result when they ask it to hash a given message. This is a pretty good standin for what happens with a real hash function. The use of an outside oracle allows us to “bury” the costs of evaluating a random function, so that nobody else needs to spend exponential time evaluating one. Inside this artificial model, we get ideal hash functions with none of the pain.

It absolutely is!

However — I think there are several very important things you should know about the random oracle model before you write it off as obviously inane:

1. Of course everyone knows random oracle proofs aren’t “real”. Most conscientious protocol designers will admit that proving something secure in the random oracle model does not actually mean it’ll be secure “in the real world”. In other words, the fact that random oracle model proofs are kind of bogus is not some deep secret I’m letting you in on.

2. And anyway: ROM proofs are generally considered a useful heuristic. For those who aren’t familiar with the term, “heuristic” is a word that grownups use when they’re about to secure your life’s savings using cryptography they can’t prove anything about.

I’m joking! In fact, random oracle proofs are still quite valuable. This is mainly because they often help us detect bugs in our schemes. That is, while a random oracle proof doesn’t imply security in the real world, the inability to write one is usually a red flag for protocols. Moreover, the existence of a ROM proof is hopefully an indicator that the “guts” of the protocol are ok, and that any real-world issues that crop up will have something to do with the hash function.

3. ROM-validated schemes have a pretty decent track record in practice. If ROM proofs were kicking out absurdly broken schemes every other day, we would probably have abandoned this technique. Yet we use cryptography that’s proven (only) in the ROM just about ever day — and mostly it works fine.

This is not to say that no ROM-proven scheme has ever been broken, when instantiated with a specific hash function. But normally these breaks happen because the hash function itself is obvious broken (as happened when MD4 and MD5 both cracked up a while back.) Still, those flaws are generally fixed by simply switching to a better function. Moreover, the practical attacks are historically more likely to come from obvious flaws, like the discovery of hash collisions screwing up signature schemes, rather than from some exotic mathematical flaw. Which brings us to a final, critical note…

4. For years, many people believed that the ROM could actually be saved. This hope was driven by the fact that ROM schemes generally seemed to work pretty well when implemented with strong hash functions, and so perhaps all we needed to do was to find a hash function that was “good enough” to make ROM proofs meaningful. Some theoreticians hoped that fancy techniques like cryptographic obfuscation could somehow be used to make concrete hashing algorithms that behaved well enough to make (some) ROM proofs instantiable.**

So that’s kind of the state of the ROM, or at least — it was the state up until the late 1990s. We knew this model was artificial, and yet it stubbornly refused to explode or produce totally nonsense results.

And then, in 1998, everything went south.

CGH98: an “uninstantiable” scheme

For theoretical cryptographers, the real breaking point for the random oracle model came in the form of a 1998 STOC paper by Canetti, Goldreich and Halevi (henceforth CGH). I’m going to devote the rest of this (long!) post to explaining the gist of what they found.

What CGH proved was that, in fact, there exist cryptographic schemes that can be proven perfectly secure in the random oracle model, but that — terrifyingly — become catastrophically insecure the minute you instantiate the hash function with any concrete function.

This is a really scary result, at least from the point of view of the provable security community. It’s one thing to know in theory that your proofs might not be that strong. It’s a different thing entirely to know that in practice there are schemes that can walk right past your proofs like a Terminator infiltrating the Resistance, and then explode all over you in the most serious way.

Before we get to the details of CGH and its related results, a few caveats.

First, CGH is very much a theory result. The cryptographic “counterexample” schemes that trip this problem generally do not look like real cryptosystems that we would use in practice, although later authors have offered some more “realistic” variants. They are, in fact, designed to do very artificial things that no “real” scheme would ever do. This might lead readers to dismiss them on the grounds of artificiality.

The problem with this view is that looks aren’t a particularly scientific way to judge a scheme. Both “real looking” and “artificial” schemes are, if proven correct, valid cryptosystems. The point of these specific counterexamples is to do deliberately artificial things in order to highlight the problems with the ROM. But that does not mean that “realistic” looking schemes won’t do them.

A further advantage of these “artificial” schemes is that they make the basic ideas relatively easy to explain. As a further note on this point: rather than explaining CGH itseld, I’m going to use a formulation of the same basic result that was proposed by Maurer, Renner and Holenstein (MRH).

A signature scheme

The basic idea of CGH-style counterexamples is to construct a “contrived” scheme that’s secure in the ROM, but totally blows up when we “instantiate” the hash function using any concrete function, meaning a function that has a real description and can be efficiently evaluated by the participants in the protocol.

While the CGH techniques can apply with lots of different types of cryptosystem, in this explanation, we’re going to start our example using a relatively simple type of system: a digital signature scheme.

You may recall from earlier episodes of this series that a normal signature scheme consists of three algorithms: key generation, signing, and verification. The key generation algorithm outputs a public and secret key. Signing uses the secret key to sign a message, and outputs a signature. Verification takes the resulting signature, the public key and the message, and determines whether the signature is valid: it outputs “True” if the signature checks out, and “False” otherwise.

Traditionally, we demand that signature schemes be (at least) existentially unforgeable under chosen message attack, or UF-CMA. This means that that we consider an efficient (polynomial-time bounded) attacker who can ask for signatures on chosen messages, which are produced by a “signing oracle” that contains the secret signing key. Our expectation of a secure scheme is that, even given this access, no attacker will be able to come up with a signature on some new message that she didn’t ask the signing oracle to sign for her, except with negligible probability.****

Having explained these basics, let’s talk about what we’re going to do with it. This will involve several steps:

Step 1: Start with some existing, secure signature scheme. It doesn’t really matter what signature scheme we start with, as long as we can assume that it’s secure (under the UF-CMA definition described above.) This existing signature scheme will be used as a building block for the new scheme we want to build.*** We’ll call this scheme S.

Step 2: We’ll use the existing scheme S as a building block to build a “new” signature scheme, which we’ll call ${\bf S_{\sf broken}}$. Building this new scheme will mostly consist of grafting weird bells and whistles onto the algorithms of the original scheme S.

Step 3: Having described the working of ${\bf S_{\sf broken}}$ in detail, we’ll argue that it’s totally secure in the ROM. Since we started with an (assumed) secure signature scheme S, this argument mostly comes down to showing that in the random oracle model the weird additional features we added in the previous step don’t actually make the scheme exploitable.

Step 4: Finally, we’ll demonstrate that ${\bf S_{\sf broken}}$ is totally broken when you instantiate the random oracle with any concrete hash function, no matter how “secure” it looks. In short, we’ll show that one you replace the random oracle with a real hash function, there’s a simple attack that always succeeds in forging signatures.

We’ll start by explaining how ${\bf S_{\sf broken}}$ works.

Building a broken scheme

To build our contrived scheme, we begin with the existing secure (in the UF-CMA sense) signature scheme S. That scheme comprises the three algorithms mentioned above: key generation, signing and verification.

We need to build the equivalent three algorithms for our new scheme.

To make life easier, our new scheme will simply “borrow” two of the algorithms from S, making no further changes at all. These two algorithms will be the key generation and signature verification algorithms So two-thirds of our task of designing the new scheme is already done.

Each of the novel elements that shows up in ${\bf S_{\sf broken}}$ will therefore appear in the signing algorithm. Like all signing algorithms, this algorithm takes in a secret signing key and some message to be signed. It will output a signature.

At the highest level, our new signing algorithm will have two subcases, chosen by a branch that depends on the input message to be signed. These two cases are given as follows:

The “normal” case: for most messages M, the signing algorithm will simply run the original signing algorithm from the original (secure) scheme S. This will output a perfectly nice signature that we can expect to work just fine.

The “evil” case: for a subset of (reasonably-sized) messages that have a different (and very highly specific) form, our signing algorithm will not output a signature. It will instead output the secret key for the entire signature scheme. This is an outcome that cryptographers will sometimes call “very, very bad.”

So far this description still hides all of the really important details, but at least it gives us an outline of where we’re trying to go.

Recall that under the UF-CMA definition I described above, our attacker is allowed to ask for signatures on arbitrary messages. When we consider using this definition with our modified signing algorithm, it’s easy to see that the presence of these two cases could make things exciting.

Specifically: if any attacker can construct a message that triggers the “evil” case, her request to sign a message will actually result in her obtaining the scheme’s secret key. From that point on she’ll be able to sign any message that she wants — something that obviously breaks the UF-CMA security of the scheme. If this is too theoretical for you: imagine requesting a signed certificate from LetsEncrypt, and instead obtaining a copy of LetsEncrypt’s signing keys. Now you too are a certificate authority. That’s the situation we’re describing.

The only way this scheme could ever be proven secure is if we could somehow rule out the “evil” case happening at all.

More concretely: we would have to show that no attacker can construct a message that triggers the “evil case” — or at least, that their probability of coming up with such a message is very, very low (negligible). If we could prove this, then our scheme ${\bf S_{\sf broken}}$ basically just reduces to being the original secure scheme. Which means our new scheme would be secure.

In short: what we’ve accomplished is to build a kind of “master password” backdoor into our new scheme ${\bf S_{\sf broken}}$. Anyone who knows the password can break the scheme. Everything now depends on whether an attacker can figure out that password.

So what is the “backdoor”?

The message that breaks the scheme ${\bf S_{\sf broken}}$ isn’t a password at all, of course. Because this is computer science and nothing is ever easy, the message will actually be a computer program. We’ll call it P.

More concretely, it will be some kind of program that can decoded within our new signing algorithm, and then evaluated (on some input) by an interpreter that we will also place within that algorithm.

If we’re being formal about this, we’d say the message contains an encoding of a program for a universal Turing machine (UTM), along with a unary-encoded integer t that represents the number of timesteps that the machine should be allowed to run for. However, it’s perfectly fine with me if you prefer to think of the message as containing a hunk of Javascript, an Ethereum VM blob combined with some maximum “gas” value to run on, a .tgz encoding of a Docker container, or any other executable format you fancy.

What really matters is the functioning of the program P.

A program P that successfully triggers the “evil case” is one that contains an efficient (e.g., polynomial-sized) implementation of a hash function. And not just any hash function. To actually trigger the backdoor, the algorithm P must a function that is identical to, or at least highly similar to, the random oracle function H.

There are several ways that the signing algorithm can verify this similarity. The MRH paper gives a very elegant one, which I’ll discuss further below. But for the purposes of this immediate intuition, let’s assume that our signing algorithm verifies this similarity probabilistically. Specifically: to check that P matches H, it won’t verify the correspondence at every possible input. It might, for example, simply verify that P(x) = H(x) for some large (but polynomial) number of random input values x.

So that’s the backdoor.

Let’s think briefly about what this means for security, both inside and outside of the random oracle mode.

Case 1: in the random oracle model

Recall that in the random oracle model, the “hash function” H is modeled as a random function. Nobody in the protocol actually has a copy of that function, they just have access to a third party (the “random oracle”) who can evaluate it for them.

If an attacker wishes to trigger the “evil case” in our signing scheme, they will somehow need to download a description of the random function from the oracle. then encode it into a program P, and send it to the signing oracle. This seems fundamentally hard.

To do this precisely — meaning that P would match H on every input — the attacker would need to query the random oracle on every possible input, and then design a program P that encodes all of these results. It suffices to say that this strategy would not be practical: it would require an exponential amount of time to do any of these, and the size of P would also be exponential in the input length of the function. So this attacker would seem virtually guaranteed to fail.

Of course the attacker could try to cheat: make a small function P that only matches H on a small of inputs, and hope that the signer doesn’t notice. However, even this seems pretty challenging to get away with. For example, to perform a probabilistic check, the signing algorithm can simply verify that P(x) = H(x) for a large number of random input points x. This approach will catch a cheating attacker with very high probability.

(We will end up using a slightly more elegant approach to checking the function and arguing this point further below.)

The above is hardly an exhaustive security analysis. But at a high level our argument should now be clear: in the random oracle model, the scheme ${\bf S_{\sf broken}}$ is secure because the attacker can’t know a short enough backdoor “password” that breaks the scheme. Having eliminated the “evil case”, the scheme ${\bf S_{\sf broken}}$ simply devolves to the original, secure scheme S.

Case 2: In the “real world”

Out in the real world, we don’t use random oracles. When we want to implement a scheme that has a proof in the ROM, we must first “instantiate” the scheme by substituting in some real hash function in place of the random oracle H.

This instantiated hash function must, by definition, be efficient to evaluate and describe. This means implicitly that it possesses a polynomial-size description and can be evaluated in expected polynomial time. If we did not require this, our schemes would never work. Moreover, we must further assume that all parties, including the attacker, possess a description of the hash function. That’s a standard assumption in cryptography, and is merely a statement of Kerckhoff’s principle.

With these facts stipulated, the problem with our new signature scheme becomes obvious.

In this setting, the attacker actually does have access to a short, efficient program P that matches the hash function H. In practice, this function will probably be something like SHA2 or Blake2. But even in a weird case where it’s some crazy obfuscated function, the attacker is still expected to have a program that they can efficiently evaluate. Since the attacker possesses this program, they can easily encode it into a short enough message and send it to the signing oracle.

When the signing algorithm receives this program, it will perform some kind of test of this function P against its own implementation of H, and — when it inevitably finds a match between the two functions with high probability — it will output the scheme’s secret key.

Hence, out in the real world our scheme ${\bf S_{\sf broken}}$ is always and forever, totally broken.

A few boring technical details (that you can feel free to skip)

If you’re comfortable with the imprecise technical intuition I’ve given above, feel free to skip this section. You can jump on to the next part, which tries to grapple with tough philosophical questions like “what does this mean for the random oracle model” and “I think this is all nonsense” and “why do we drive on a parkway, and park in a driveway?

All I’m going to do here is clean up a few technical details.

One of the biggest pieces that’s missing from the intuition above is a specification of how the signing algorithm verifies that the program P it receives from the attacker actually “matches” the random oracle function H. The obvious way is to simply evaluate P(x) = H(x) on every possible input x, and output the scheme’s secret key if every comparison succeeds. But doing this exhaustively requires exponential time.

The MRH paper proposes a very neat alternative way to tackle this. They propose to test the functions on a few input values, and not even random ones. More concretely, they propose checking that P(x) = H(x) for values of $x \in \{1, \dots, q\}$ with the specific requirement that q is an integer such that $q = 2|P| + k$. Here $|P|$ represents the length of the encoding of program P in bits, and k is the scheme’s adjustable security parameter (for example, k=128).

What this means is that to trigger the backdoor, the attacker must come up with a program P that can be described in some number of bits (let’s call it n) , and yet will be able to correctly match the outputs of H at e.g., q=2n+128 different input points. If we conservatively assume that H produces (at least) a 1-bit digest, that means we’re effectively encoding at least 2n+128 bits of data into a string of length n.

If the function H is a real hash function like SHA256, then it should be reasonably easy for the attacker to find some n-bit program that matches H at, say, q=2n+128 different points. For example, here’s a Javascript implementation of SHA256 that fits into fewer than 8,192 bits. If we embed a Javascript interpreter into our signing algorithm, then it simply needs to evaluate this given program on q = 2(8,192)+128 = 16,512 different input points, compare each result to its own copy of SHA256, and if they all match, output the secret key.

However, if H is a random oracle, this is vastly harder for the attacker to exploit. The result of evaluating a random oracle at q distinct points should be a random string of (at minimum) q bits in length. Yet in order for the backdoor to be triggered, we require the encoding of program P to be less than half that size. You can therefore think of the process by which the attacker compresses a random string into that program P to be a very effective compression algorithm, one takes in a random string, and compresses it into a string of less than half the size.

Despite what you may have seen on Silicon Valley (NSFW), compression algorithms do not succeed in compressing random strings that much with high probability. Indeed, for a given string of bits, this is so unlikely to occur that the attacker succeeds with at probability that is at most negligible in the scheme’s security parameter k. This effectively neutralizes the backdoor when H is a random oracle.

Phew.

So what does this all mean?

Judging by actions, and not words, the cryptographers of the world have been largely split on this question.

Theoretical cryptographers, for their part, gently chuckled at the silly practitioners who had been hoping to use random functions as hash functions. Brushing pipe ash from their lapels, they returned to more important tasks, like finding ways to kill off cryptographic obfuscation.

Applied academic cryptographers greeted the new results with joy — and promptly authored 10,000 new papers, each of which found some new way to remove random oracles from an existing construction — while at the same time making said construction vastly slower, more complicated, and/or based on entirely novel made-up and flimsy number-theoretic assumptions. (Speaking from personal experience, this was a wonderful time.)

Practitioners went right on trusting the random oracle model. Because really, why not?

And if I’m being honest, it’s a bit hard to argue with the practitioners on this one.

That’s because a very reasonable perspective to take is that these “counterexample” schemes are ridiculous and artificial. Ok, I’m just being nice. They’re total BS, to be honest. Nobody would ever design a scheme that looks so ridiculous.

Specifically, you need a scheme that explicitly parses an input as a program, runs that program, and then checks to see whether the program’s output matches a different hash function. What real-world protocol would do something so stupid? Can’t we still trust the random oracle model for schemes that aren’t stupid like that?

Well, maybe and maybe not.

One simple response to this argument is that there are examples of schemes that are significantly less artificial, and yet still have random oracle problems. But even if one still views those results as artificial — the fact remains that while we only know of random oracle counterexamples that seem artificial, there’s no principled way for us to prove that the badness will only affect “artificial-looking” protocols. In fact, the concept of “artificial-looking” is largely a human judgement, not something one can realiably think about mathematically.

In fact, at any given moment someone could accidentally (or on purpose) propose a perfectly “normal looking” scheme that passes muster in the random oracle model, and then blows to pieces when it gets actually deployed with a standard hash function. By that point, the scheme may be powering our certificate authority infrastructure, or Bitcoin, or our nuclear weapons systems (if one wants to be dramatic.)

The probability of this happening accidentally seems low, but it gets higher as deployed cryptographic schemes get more complex. For example, people at Google are now starting to deploy complex multi-party computation and others are launching zero-knowledge protocols that are actually capable of running (or proving things about the execution of) arbitrary programs in a cryptographic way. We can’t absolutely rule out the possibility that the CGH and MRH-type counterexamples could actually be made to happen in these weird settings, if someone is a just a little bit careless.

It’s ultimately a weird and frustrating situation, and frankly, I expect it all to end in tears.

Photo by Flickr user joyosity.

Notes:

* Intuitively, this definition sounds a lot like “pseudorandomness”. Pseudorandom functions are required to be indistinguishable from random functions only in a setting where the attacker does not know some “secret key” used for the function. Whereas hash functions are often used in protocols where there is no opporunity to use a secret key, such as in public key encryption protocols.

** One particular hope was that we could find a way to obfuscate pseudorandom function families (PRFs). The idea would be to wrap up a keyed PRF that could be evaluated by anyone, even if they didn’t actually know the key. The result would be indistinguishable from a random function, without actually being one.

*** It might seem like “assume the existence of a secure signature scheme” drags in an extra assumption. However: if we’re going to make statements in the random oracle model it turns out there’s no additional assumption. This is because in the ROM we have access to “secure” (at least collision-resistant, [second] pre-image resistant) hash function, which means that we can build hash-based signatures. So the existence of signature schemes comes “free” with the random oracle model.

**** The “except with negligible probability [in the adjustable security parameter of the scheme]” caveat is important for two reasons. First, a dedicated attacker can always try to forge a signature just by brute-force guessing values one at a time until she gets one that satisfies the verification algorithm. If the attacker can run for an unbounded number of time steps, she’ll always win this game eventually. This is why modern complexity-theoretic cryptography assumes that our attackers must run in some reasonable amount of time — typically a number of time steps that is polynomial in the scheme’s security parameter. However, even a polynomial-time bounded adversary can still try to brute force the signature. Her probability of succeeding may be relatively small, but it’s non-zero: for example, she might succeed after the first guess. So in practice what we ask for in security definitions like UF-CMA is not “no attacker can ever forge a signature”, but rather “all attackers succeed with at most negligible probability [in the security parameter of the scheme]”, where negligible has a very specific meaning.

Zero Knowledge Proofs: An illustrated primer

One of the best things about modern cryptography is the beautiful terminology. You could start any number of punk bands (or Tumblrs) named after cryptography terms like ‘hard-core predicate’, ‘trapdoor function’, or ‘impossible differential cryptanalysis’. And of course, I haven’t even mentioned the one term that surpasses all of these. That term is ‘zero knowledge‘.

In fact, the term ‘zero knowledge’ is so appealing that it leads to problems. People misuse it, assuming that zero knowledge must be synonymous with ‘really, really secure‘. Hence it gets tacked onto all kinds of stuff — like encryption systems and anonymity networks — that really have nothing to do with true zero knowledge protocols.

This all serves to underscore a point: zero-knowledge proofs are one of the most powerful tools cryptographers have ever devised. But unfortunately they’re also relatively poorly understood. In this series of posts I’m going try to give a (mostly) nonmathematical description of what ZK proofs are, and what makes them so special. In this post and the next I’ll talk about some of the ZK protocols we actually use.

Origins of Zero Knowledge

The notion of ‘zero knowledge’ was first proposed in the 1980s by MIT researchers Shafi Goldwasser, Silvio Micali and Charles Rackoff. These researchers were working on problems related to interactive proof systems, theoretical systems where a first party (called a ‘Prover’) exchanges messages with a second party (‘Verifier’) to convince the Verifier that some mathematical statement is true.*

Prior to Goldwasser et al., most work in this area focused the soundness of the proof system. That is, it considered the case where a malicious Prover attempts to ‘trick’ a Verifier into believing a false statement. What Goldwasser, Micali and Rackoff did was to turn this problem on its head. Instead of worrying only about the Prover, they asked: what happens if you don’t trust the Verifier?

The specific concern they raised was information leakage. Concretely, they asked, how much extra information is the Verifier going to learn during the course of this proof, beyond the mere fact that the statement is true?

It’s important to note that this is not simply of theoretical interest. There are real, practical applications where this kind of thing matters.

Here’s one: imagine that a real-world client wishes to log into a web server using a password. The standard ‘real world’ approach to this problem involves storing a hashed version of the password on the server. The login can thus be viewed as a sort of ‘proof’ that a given password hash is the output of a hash function on some password — and more to the point, that the client actually knows the password.

Most real systems implement this ‘proof’ in the absolute worst possible way. The client simply transmits the original password to the server, which re-computes the password hash and compares it to the stored value. The problem here is obvious: at the conclusion of the protocol, the server has learned my cleartext password. Modern password hygiene therefore involves a good deal of praying that servers aren’t compromised.

What Goldwasser, Micali and Rackoff proposed was a new hope for conducting such proofs. If fully realized, zero knowledge proofs would allow us to prove statements like the one above, while provably revealing no information beyond the single bit of information corresponding to ‘this statement is true’.

A ‘real world’ example

So far this discussion has been pretty abstract. To make things a bit more concrete, let’s go ahead and give a ‘real’ example of a (slightly insane) zero knowledge protocol.

For the purposes of this example, I’d like you to imagine that I’m a telecom magnate in the process of deploying a new cellular communications network. My network structure is represented by the graph below. Each vertex in this graph represents a cellular radio tower, and the connecting lines (edges) indicate locations where two cells overlap, meaning that their transmissions are likely to interfere with each other.

This overlap is problematic, since it means that signals from adjacent towers are likely to scramble reception. Fortunately my network design allows me to configure each tower to one of three different frequency bands to avoid such interference.

Thus the challenge in deploying my network is to assign frequency bands to the towers such that no two overlapping cells share the same frequencies. If we use colors to represent the frequency bands, we can quickly work out one solution to the problem:

Of course, many of you will notice that what I’m describing here is simply an instance of the famous theory problem called the graph three-coloring problem. You might also know that what makes this problem interesting is that, for some graphs, it can be quite hard to find a solution, or even to determine if a solution exists. In fact, graph three-coloring — specifically, the decision problem of whether a given graph supports a solution with three colors — is known to be in the complexity class NP-complete.

It goes without saying that the toy example above is easy to solve by hand. But what if it wasn’t? For example, imagine that my cellular network was very large and complex, so much so that the computing power at my disposal was not sufficient to find a solution. In this instance, it would be desirable to outsource the problem to someone else who has plenty of computing power. For example, I might hire my friends at Google to solve it for me on spec.

But this leads to a problem.

Suppose that Google devotes a large percentage of their computing infrastructure to searching for a valid coloring for my graph. I’m certainly not going to pay them until I know that they really have such a coloring. At the same time, Google isn’t going to give me a copy of their solution until I’ve paid up. We’ll wind up at an impasse.

In real life there’s probably a common-sense answer to this dilemma, one that involves lawyers and escrow accounts. But this is not a blog about real life, it’s a blog about cryptography. And if you’ve ever read a crypto paper, you’ll understand that the right way to solve this problem is to dream up an absolutely crazy technical solution.

A crazy technical solution (with hats!)

The engineers at Google consult with Silvio Micali at MIT, who in consultation with his colleagues Oded Goldreich and Avi Wigderson, comes up with the following clever protocol — one so elegant that it doesn’t even require any computers. All it requires is a large warehouse, lots of crayons, and plenty of paper. Oh yes, and a whole bunch of hats.**

Here’s how it works.

First I will enter the warehouse, cover the floor with paper, and draw a blank representation of my cell network graph. Then I’ll exit the warehouse. Google can now enter enter, shuffle a collection of three crayons to pick a random assignment of the three agreed-upon crayon colors (red/blue/purple, as in the example above), and color in the graph in with their solution. Note that it doesn’t matter which specific crayons they use, only that the coloring is valid.

Before leaving the warehouse, Google covers up each of the vertices with a hat. When I come back in, this is what I’ll see:

Obviously this approach protects Google’s secret coloring perfectly. But it doesn’t help me at all. For all I know, Google might have filled in the graph with a random, invalid solution. They might not even have colored the graph at all.

To address my valid concerns, Google now gives me an opportunity to ‘challenge’ their solution to the graph coloring. I’m allowed to pick — at random — a single ‘edge’ of this graph (that is, one line between two adjacent hats). Google will then remove the two corresponding hats, revealing a small portion of their solution:

Notice that there are two outcomes to my experiment:

1. If the two revealed vertices are the same color (or aren’t colored in at all!) then I definitely know that Google is lying to me. Clearly I’m not going to pay Google a cent.
2. If the two revealed vertices are different colors, then Google might not be lying to me.
Hopefully the first proposition is obvious. The second one requires a bit more consideration. The problem is that even after our experiment, Google could still be lying to me — after all, I only looked under two of the hats. If there are E different edges in the graph, then Google could fill in an invalid solution and still get away with it most of the time. Specifically, after one test they could succeed in cheating me with probability up to (E-1)/E (which for a 1,000 edge graph works out to 99.9% of the time).

Fortunately Google has an answer to this. We’ll just run the protocol again!

We put down fresh paper with a new, blank copy of the graph. Google now picks a new (random) shuffle of the three crayons. Next they fill in the graph with a valid solution, but using the new random ordering of the three colors.

The hats go back on. I come back in and repeat the challenge process, picking a new random edge. Once again the logic above applies. Only this time if all goes well, I should now be slightly more confident that Google is telling me the truth. That’s because in order to cheat me, Google would have had to get lucky twice in a row. That can happen — but it happens with relatively lower probability. The chance that Google fools me twice in a row is now (E-1)/E * (E-1)/(or about 99.8% probability for our 1,000 edge example above).

Fortunately we don’t have to stop at two challenges. In fact, we can keep trying this over and over again until I’m confident that Google is probably telling me the truth.

But don’t take my word for it. Thanks to some neat Javascript, you can go try it yourself.

Note that I’ll never be perfectly certain that Google is being honest — there’s always going to be a tiny probability that they’re cheating me. But after a large number of iterations (E^2, as it happens) I can eventually raise my confidence to the point where Google can only cheat me with negligible probability — low enough that for all practical purposes it’s not worth worrying about. And then I’ll be able to safely hand Google my money.

What you need to believe is that Google is also protected. Even if I try to learn something about their solution by keeping notes between protocol runs, it shouldn’t matter. I’m foiled by Google’s decision to randomize their color choices between each iteration. The limited information I obtain does me no good, and there’s no way for me to link the data I learn between interactions.

What makes it ‘zero knowledge’?

I’ve claimed to you that this protocol leaks no information about Google’s solution. But don’t let me get away with this! The first rule of modern cryptography is never to trust people who claim such things without proof.

Goldwasser, Micali and Rackoff proposed three following properties that every zero-knowledge protocol must satisfy. Stated informally, they are:

1. Completeness. If Google is telling the truth, then they will eventually convince me (at least with high probability).
2. Soundness. Google can only convince me if they’re actually telling the truth.
3. Zero-knowledgeness. (Yes it’s really called this.) I don’t learn anything else about Google’s solution.
We’ve already discussed the argument for completeness. The protocol will eventually convince me (with a negligible error probability), provided we run it enough times. Soundness is also pretty easy to show here. If Google ever tries to cheat me, I will detect their treachery with overwhelming probability.

The hard part here is the ‘zero knowledgeness’ property. To do this, we need to conduct a very strange thought experiment.

A thought experiment (with time machines)

First, let’s start with a crazy hypothetical. Imagine that Google’s engineers aren’t quite as capable as people make them out to be. They work on this problem for weeks and weeks, but they never manage to come up with a solution. With twelve hours to go until showtime, the Googlers get desperate. They decide to trick me into thinking they have a coloring for the graph, even though they don’t.

Their idea is to sneak into the GoogleX workshop and borrow Google’s prototype time machine. Initially the plan is to travel backwards a few years and use the extra working time to take another crack at solving the problem. Unfortunately it turns out that, like most Google prototypes, the time machine has some limitations. Most critically: it’s only capable of going backwards in time four and a half minutes.

So using the time machine to manufacture more working time is out. But still, it turns out that even this very limited technology can still be used to trick me.

 I don’t really know what’s going on here but it seemed apropos.

The plan is diabolically simple. Since Google doesn’t actually know a valid coloring for the graph, they’ll simply color the paper with a bunch of random colors, then put the hats on. If by sheer luck, I challenge them on a pair of vertices that happen to be different colors, everyone will heave a sigh of relief and we’ll continue with the protocol. So far so good.

Inevitably, though, I’m going to pull off a pair of hats and discover two vertices of the same color. In the normal protocol, Google would now be totally busted. And this is where the time machine comes in. Whenever Google finds themselves in this awkward situation, they simply fix it. That is, a designated Googler pulls a switch, ‘rewinds’ time about four minutes, and the Google team recolors the graph with a completely new random solution. Now they let time roll forward and try again.

In effect, the time machine allows Google to ‘repair’ any accidents that happen during their bogus protocol execution, which makes the experience look totally legitimate to me. Since bad challenge results will occur only 1/3 of the time, the expected runtime of the protocol (from Google’s perspective) is only moderately greater than the time it takes to run the honest protocol. From my perspective I don’t even know that the extra time machine trips are happening.

This last point is the most important. In fact, from my perspective, being unaware that the time machine is in the picture, the resulting interaction is exactly the same as the real thing. It’s statistically identical. And yet it’s worth pointing out again that in the time machine version, Google has absolutely no information about how to color the graph.

What the hell is the point of this?

What we’ve just shown is an example of a simulation. Note that in a world where time runs only forward and nobody can trick me with a time machine, the hat-based protocol is correct and sound, meaning that after E^2 rounds I should be convinced (with all but negligible probability) that the graph really is colorable and that Google is putting valid inputs into the protocol.

What we’ve just shown is that if time doesn’t run only forward — specifically, if Google can ‘rewind’ my view of time — then they can fake a valid protocol run even if they have no information at all about the actual graph coloring.

From my perspective, what’s the difference between the two protocol transcripts? When we consider the statistical distribution of the two, there’s no difference at all. Both convey exactly the same amount of useful information.

Believe it or not, this proves something very important.

Specifically, assume that I (the Verifier) have some strategy that ‘extracts’ useful information about Google’s coloring after observing an execution of the honest protocol. Then my strategy should work equally well in the case where I’m being fooled with a time machine. The protocol runs are, from my perspective, statistically identical. I physically cannot tell the difference.

Thus if the amount of information I can extract is identical in the ‘real experiment’ and the ‘time machine experiment’, yet the amount of information Google puts into the ‘time machine’ experiment is exactly zero — then this implies that even in the real world the protocol must not leak any useful information.

Thus it remains only to show that computer scientists have time machines. We do! (It’s a well-kept secret.)

Getting rid of the hats (and time machines)

Of course we don’t actually want to run a protocol with hats. And even Google (probably?) doesn’t have a literal time machine.

To tie things together, we first need to bring our protocol into the digital world. This requires that we construct the digital equivalent of a ‘hat’: something that both hides a digital value, while simultaneously ‘binding’ (or ‘committing’) the maker to it, so she can’t change her mind after the fact.

Fortunately we have a perfect tool for this application. It’s called a digital commitment scheme. A commitment scheme allows one party to ‘commit’ to a given message while keeping it secret, and then later ‘open’ the resulting commitment to reveal what’s inside. They can be built out of various ingredients, including (strong) cryptographic hash functions.******

Given a commitment scheme, we now have all the ingredients we need to run the zero knowledge protocol electronically. The Prover first encodes its vertex colorings as a set of digital messages (for example, the numbers 0, 1, 2), then generates digital commitments to each one. These commitments get sent over to the Verifier. When the Verifier challenges on an edge, the Prover simply reveals the opening values for the commitments corresponding to the two vertices.

So we’ve managed to eliminate the hats. But how do we prove that this protocol is zero knowledge?

Fortunately now that we’re in the digital world, we no longer need a real time machine to prove things about this protocol. A key trick is to specify in our setting that the protocol is not going to be run between two people, but rather between two different computer programs (or, to be more formal, probabilistic Turing machines.)

What we can now prove is the following theorem: if you could ever come up with a computer program (for the Verifier) that extracts useful information after participating in a run of the protocol, then it would be possible to use a ‘time machine’ on that program in order to make it extract the same amount of useful information from a ‘fake’ run of the protocol where the Prover doesn’t put in any information to begin with.

And since we’re now talking about computer programs, it should be obvious that rewinding time isn’t such an extraordinary feat at all. In fact, we rewind computer programs all the time. For example, consider using virtual machine software with a snapshot capability.

 Example of rewinding through VM snapshots. An initial VM is played forward, rewound to an initial snapshot, then execution is forked to a new path.

Even if you don’t have fancy virtual machine software, any computer program can be ‘rewound’ to an earlier state, simply by starting the program over again from the beginning and feeding it exactly the same inputs. Provided that the inputs — including all random numbers — are fixed, the program will always follow the same execution path. Thus you can rewind a program just by running it from the start and ‘forking’ its execution when it reaches some desired point.

Ultimately what we get is the following theorem. If there exists any Verifier computer program that successfully extracts information by interactively running this protocol with some Prover, then we can simply use the rewinding trick on that program to commit to a random solution, then ‘trick’ the Verifier by rewinding its execution whenever we can’t answer its challenge correctly. The same logic holds as we gave above: if such a Verifier succeeds in extracting information after running the real protocol, then it should be able to extract the same amount of information from the simulated, rewinding-based protocol. But since there’s no information going into the simulated protocol, there’s no information to extract. Thus the information the Verifier can extract must always be zero.

Ok, so what does this all mean?

So let’s recap. We know that the protocol is complete and sound, based on our analysis above. The soundness argument holds in any situation where we know that nobody is fiddling with time — that is, the Verifier is running normally and nobody is rewinding its execution.

At the same time, the protocol is also zero knowledge. To prove this, we showed that any Verifier program that succeeds in extracting information must also be able to extract information from a protocol run where rewinding is used and no information is available in the first place. Which leads to an obvious contradiction, and tells us that the protocol can’t leak information in either situation.

There’s an important benefit to all this. Since it’s trivial for anyone to ‘fake’ a protocol transcript, even after Google proves to me that they have a solution, I can’t re-play a recording of the protocol transcript to prove anything to anyone else (say, a judge). That’s because the judge would have no guarantee that the video was recorded honestly, and that I didn’t simply edit in the same way Google might have done using the time machine. This means that protocol transcripts themselves contain no information. The protocol is only meaningful if I myself participated, and I can be sure that it happened in real time.

Proofs for all of NP!

If you’ve made it this far, I’m pretty sure you’re ready for the big news. Which is that 3-coloring cellphone networks isn’t all that interesting of a problem — at least, not in and of itself.

The really interesting thing about the 3-coloring problem is that it’s in the class NP-complete. To put this informally, the wonderful thing about such problems is that any other problem in the class NP can be translated into an instance of that problem.In a single stroke, this result — due to Goldreich, Micali and Wigderson — proves that ‘efficient’ ZK proofs exists for a vast class of useful statements, many of which are way more interesting than assigning frequencies to cellular networks. You simply find a statement (in NP) that you wish to prove, such as our hash function example from above, then translate it into an instance of the 3-coloring problem. At that point you simply run the digital version of the hat protocol.

In summary, and next time

Of course, actually running this protocol for interesting statements would be an insanely silly thing for anyone to do, since the cost of doing so would include the total size of the original statement and witness, plus the reduction cost to convert it into a graph, plus the E^2 protocol rounds you’d have to conduct in order to convince someone that the proof is valid. Theoretically this is ‘efficient’, since the total cost of the proof would be polynomial in the input size, but in practice it would be anything but.

So what we’ve shown so far is that such proofs are possible. It remains for us to actually find proofs that are practical enough for real-world use.

In the next post I’ll talk about some of those — specifically, the efficient proofs that we use for various useful statements. I’ll give some examples (from real applications) where these things have been used. Also at reader request: I’ll also talk about why I dislike SRP so much.

See here for Part 2.

Notes:

* Formally, the goal of an interactive proof is to convince the Verifier that a particular string belongs to some language. Typically the Prover is very powerful (unbounded), but the Verifier is limited in computation.

** This example is based on the original solution of Goldwasser, Micali and Rackoff, and the teaching example using hats is based on an explanation by Silvio Micali. I take credit only for the silly mistakes.

****** A simple example of a commitment can be built using a hash function. To commit to the value “x” simply generate some (suitably long) string of random numbers, which we’ll call ‘salt’, and output the commitment C = Hash(salt || x). To open the commitment, you simply reveal ‘x’ and ‘salt’. Anyone can check that the original commitment is valid by recomputing the hash. This is secure under some (moderately strong) assumptions about the function itself.

Cryptographic obfuscation and ‘unhackable’ software

I have a thing for over-the-top cryptography headlines — mostly because I enjoy watching steam come out of researchers’ ears when their work gets totally misrepresented. And although I’ve seen quite a few good ones, last week WIRED managed a doozy.

The headline in question, Cryptography Breakthrough Could Make Software Unhackable, managed to accomplish something that few cryptography headlines do. It sent its own protagonist, Amit Sahai, into the comments section to perform intellectual garbage pickup.

In contrast to the headline, which is quite bad, the article is actually pretty decent. Still, the discussion around it has definitely led to some confusion. As a result, many now think an amazing breakthrough has taken place — one that will finally make software secure. They’re probably right about the first part. They may be disappointed about the rest.

The truth, as usual, is complicated. There is, indeed, something very neat going on with the new obfuscation results. They’re just not likely to make software ‘unhackable’ anytime soon. They might, however, radically expand what we can do with cryptography. Someday. When they’re ready. And in ways we don’t fully understand yet.

But before I go into all that, it’s probably helpful to give some background.

Program obfuscation

The Wired article deals with the subject of ‘program obfuscation‘, which is a term that software developers and cryptographers have long been interested in. The motivation here is pretty simple: find a way that we can give people programs they can run — without letting them figure out how the programs work.

Note that the last part necessarily covers a lot of ground. In principle it includes aspects ranging from the nature of specific secret algorithms used — which may be proprietary and thus worth money — to secret information like passwords and cryptographic keys that might be hardcoded into the program.

For a simple example, consider the following routine:

// Check a password and print out top secret information if it’s correct
//
print(“Congratulations. Here’s some super secret private information: ….\n”);
} else {
}
}

If you’re like me, you probably wrote a program like this at some point in your life. You may have eventually realized how ineffective it would be — against a smart attacker who could dump the program code. This is because most programs (and even compiled binaries) are pretty easy to read. An attacker could just look at this program to recover the secret password.

Program obfuscation is motivated by the idea that many useful programs would benefit if we could somehow ‘stop’ people from doing this, while still letting them possess and run the code on their own computers.

In real world software systems, ‘obfuscation’ usually refers to a collection of ad-hoc techniques that turn nice, sensible programs into a morass of GOTOs and spaghetti code. Sometimes important constants are chopped up and distributed around the code. Some portions of the code may even be encrypted — though only temporarily, since decryption keys must ship with the program so it can actually be run. Malware authors and DRM folks love this kind of obfuscation.

 A chunk of ‘birken’, one of the winning entries in the 2013 obfuscated C contest.

While these techniques are nice, they’re not really ‘unhackable’ or even secure against reverse-engineering in the strong sense we’d like. Given enough time, brains and tools, you can get past most common software obfuscation techniques. If you have, say, Charlie Miller or Dion Blazakis on your side, you probably do it quickly. This isn’t just because those guys are smart (though they are), it’s because the techniques are not mathematically rigorous. Most practical obfuscation amounts to roadblocks — designed to slow reverse-engineers down and make them give up.

So what does it mean to securely ‘obfuscate’ a program?

The poor quality of existing software obfuscation set cryptographers up with a neat problem. Specifically, they asked, could we define a strong definition of program obfuscation that would improve on, to put it politely, the crap people were actually using? Given such an obfuscator I could hand you my obfuscated program to run while provably protecting all partial information — except for the legitimate inputs and outputs.

If this could be done, it would have an incredible number of applications. Stock traders could obfuscate their proprietary trading algorithms and send them to the cloud or to end-customers. The resulting programs would still work — in the sense that they produced the right results — but customers would never learn anything about ‘how they worked’. The internals of the program would be secret sauce.

Unfortunately, when the first researchers started looking at this problem, they ran into a very serious problem: nobody had any idea what it meant to securely ‘obfuscate a program in the first place.

Do think about it for a second before you roll your eyes. What do you think it means to obfuscate a program? You’re probably thinking something like ‘people shouldn’t learn stuff‘ about the program. But can you explain what stuff? And does ‘stuff‘ depend on the program? What about programs where you can efficiently learn the whole program from sending it inputs and seeing the results? What does it mean to obfuscate those?

Clearly before progress could begin on solving the problem, cryptographers needed to sit down and figure out what they were trying to do. And indeed, several cryptographers immediately began to do exactly this.

Black-box cryptographic obfuscation

The first definitions cryptographers came up addressed a very powerful type of obfuscation called ‘virtual black box obfuscation‘. Roughly speaking, it starts from the following intuitive thought experiment.

Imagine you have a program P, as well as some code obfuscation technique you’ve developed. It’s important that the obfuscation be efficient, meaning it doesn’t slow the program down too much. To determine whether the obfuscation is successful, you can conduct the following experiment.

1. Give Alice a copy of the obfuscated program code Obf(P), in a form that she can look at and run on her own computer.
2. Take the original (unobfuscated) program P, and seal it in a special computer located inside an armored ‘black box’. Let a user Sam interact with the program by sending it inputs and receiving outputs. But don’t let him access the code.

Roughly speaking, what we want from secure obfuscation is that Alice should learn ‘no more‘ information after seeing the obfuscated program code, than could some corresponding user Sam who interacts with the same program locked up in the black box.

What’s nice here is that this is we have the beginnings of an intuitive definition. In some sense the obfuscation should render the program itself basically unintelligible — Alice should not get any more information from seeing the obfuscated program than Sam could get simply by interacting with its input/output interface. If Sam can ‘learn’ how the program works just by talking to it, even that’s ok. What’s not ok is for Alice to learn more than Sam.

The problem with this intuition is, that of course, it’s hard to formulate. One thing you may have noticed is that the user Alice who views the obfuscated program truly does learn something more than the user Sam, who only interacts with it via the black box. When the experiment is over, the user who got the obfuscated program still has a copy of the obfuscated program. The user who interacted with the black box does not.

To give a practical example, let’s say this program P is a certificate signing program that has a digital signing key hard-coded inside of it — say, Trustwave’s — that will happily attach valid digital signatures on certificates (CSRs) of the user’s devising. Both Alice and Sam can formulate certificate signing requests and send them into the program (either the obfuscated copy or the version in the ‘black box’) and get valid, signed certificates out the other end.

But here’s the thing: when the experiment is over and we shut down Sam’s access to the black box, he won’t be able to sign any more certificates. On the other hand, Alice, who actually who learned the obfuscated program Obf(P), will still have the program! This means she can keep signing certificates on and forevermore. Knowing a program that can sign arbitrary certificates is a pretty darn significant ‘piece’ of information for Alice to have!

Worse, there’s no way the ‘black box’ user Sam can ever learn a similar piece of information — at least not if the signature scheme is any good. If Sam could ask the black box for a bunch of signatures, then somehow learn a program that could make more signatures, this would imply a fundamental weakness in the digital signature scheme. This cuts against a basic property we expect of secure signatures.

Barak et al. proposed a clever way to get around this problem — rather than outputting an arbitrary piece of information, Alice and Sam would be restricted to outputting a single bit at the end of their experiment (i.e., computing a predicate function). This helps to avoid the problems above and seems generally like the “right” definition.

An impossibility result

Having proposed this nice definition, Barak et al. went on to do an irritating thing that cryptographers sometimes do: they proved that even their definition doesn’t work. Specifically, they showed that there exist programs that simply can’t be obfuscated under this definition.

The reason is a little wonky, but I’m going to try to give the flavor for it below.

Imagine that you have two programs A and B, where A is similar to our password program above. That is, it contains a hard-coded, cryptographic secret which we’ll denote by password. When you run A(x), the program checks whether (x == password) and if so, it outputs a second cryptographically strong password, which we’ll cleverly denote by password_two. If you run A on any other input, it doesn’t output anything.

Now imagine the second program B works similarly to the first one. It contains both password and password_two hardcoded within it. A major difference is that B doesn’t take a string as input. It takes another computer program. You feed a program in as input to B, which then:

1. Executes the given program on input password to get a result r.
2. If (r == password_two)it outputs a secret bit.

It should be apparent to you that if you run the program B on input the program A — that is, you compute B(A) — it will always output the secret bit.* It doesn’t even matter if either B or A has been obfuscated, since obfuscation doesn’t change the behavior of the programs.

At the same time, Barak et al. pointed out that this trick only works if you actually have code for the program A. If I give you access to a black box that contains the programs A, B and will simply let you query them on chosen inputs, you’re screwed. As long as password and password_two are cryptographically strong, you won’t be able to get A to output anything in any reasonable amount of time, and hence won’t be able to get B to output the secret bit.

What this means is that Alice, who gets the obfuscated copies of both programs, will always learn the secret bit. But if Sam is a reasonable user (that is, he can’t run long enough to brute-force the passwords) he won’t be able to learn the secret bit. Fundamentally, the obfuscated pair of programs always gives more information than black-box access to them.

It remains only to show that the two programs can be combined into one program that still can’t be obfuscated. And this is what Barak et al. did, completing their work and showing the general programs can’t be obfuscated using their definition.

Now you can be forgiven for looking askance at this. You might, for example, point out that the example above only shows that it’s hard to obfuscate a specific and mildly ridiculous program. Or that this program is some kind of weird exception. But in general it’s not. There are many other useful programs that also can’t obfuscate, particularly when you try to use them in real systems. These points were quickly pointed out by Barak et al., and later in other practical settings by Goldwasser and Kalai, among others.

You might also think that obfuscation is plain impossible. But here cryptography strikes again — it’s never quite that simple.

We can obfuscate some things!

Before we completely write off black box obfuscation, let’s take a moment to go back to the password checking program I showed at the beginning of this post.

Here’s a slightly simpler version:

// Check a password and return true if it’s correct, false otherwise
//
return true;
} else {
return false;
}
}

This function is an example of a ‘point function‘: that is, a functions that returns false on most inputs, but returns true at exactly one point. As you can see, there is exactly one point (the correct password) that makes this routine happy.

Point functions are interesting for a two simple reasons: the first is that we use them all the time in real systems — password checking programs being the obvious example. The second is that it turns out we can obfuscate them, at least under some strong assumptions. Even better, we can do it in a way that should be pretty familiar to real system designers.

Let be a secure hash function — we’ll get back to what that means in a second — and consider the following obfuscated password checking routine:

// Check a password and return ‘true’ if it’s correct, ‘false’ otherwise
// But do it all *obfuscated* and stuff
//

static string HARDCODED_SALT          = 0x……; // this is a salt value

static string HARDCODED_PASSWORD_HASH = 0x……; // this value is H(salt + target password)

// First, hash the input password with the salt
hashedPass = H(HARDCODED_SALT + passwd);

return true;
} else {
return false;
}
}

Note that our ‘obfuscated’ program no longer stores the plaintext of the password. Instead we store its hash only (and salt), and wrap it inside a program that simply compares this hash to a hash of the user’s input. This should be familiar, since it’s the way you should be storing password files today.**

A number of cryptographers looked at formulations like this and showed the following: if the password is very hard to guess (for example, it’s secret and drawn at random from an exponentially-sized space of passwords) and the hash function is ‘strong’ enough, then the hash-checking program counts as a secure ‘obfuscation’ of the basic password comparison program above it.

The intuition for this is fairly simple: imagine Alice and Sam don’t know the password. If the password is strong, i.e., it’s drawn from a large enough space, then their probability of ever getting the program to output ‘true’ is negligible. If we assume an ideal hash function — in the simplest case, a random oracle — then the hard-coded hash value that Alice learns is basically just a useless random string, and hides all partial input about the real password. This means, in practice, that any general question Alice can answer at the end of the experiment, Sam can answer with about the same probability.***

Finding better definitions

More than a decade after the definition was formulated, there are basically two kinds of results about ‘strong’ virtual-black-box obfuscation. The first set shows that it’s impossible to do it for general programs, and moreover, that many of the interesting functions we want to obfuscate (like some signatures and pseudorandom functions) can’t be obfuscated in this powerful sense.

The second class of results shows that we can black-box obfuscate certain functions, but only very limited ones like, say, point functions and re-encryption. These results are neat, but they’re hardly going to set the world on fire.

What cryptographers have been trying to achieve since then is a different — and necessarily weaker — definition that could capture interesting things we wanted from obfuscating general programs, but without all the nasty impossibility results.

And that, finally, brings us to the advances described in the WIRED article.

Indistinguishability Obfuscation

In shooting down their own proposed definitions, Barak et al. also left some breadcrumbs towards a type of obfuscation that might actually work for general programs. They called their definition “indistinguishability obfuscation” (IO) and roughly speaking it says the following:

Imagine we have two programs C1, C2 — which we’ll describe as similarly-sized circuits — that compute the same function. More concretely, let’s say they have exactly the same input/output behavior, although they may be implemented very differently inside. The definition of indistinguishability obfuscation states that it should be possible to obfuscate the two circuits C1, C2 such that no efficient algorithm will be able to tell the difference between Obf(C1) from Obf(C2).

While this idea was proposed years ago, nobody actually knew how to build any such thing, and it was left as one of those ‘open problems’ that cryptographers love to tear their hair out over. This remained the case until just last year, when a group of authors from UCLA, UT Austin and IBM Research proposed a ‘candidate construction’ for building such obfuscators, based on the new area of multilinear map-based cryptography.

Another interesting variant of this notion is called extractability obfuscation (EO), which implies not only that you can’t distinguish between Obf(C1) and Obf(C2), but moreover, that if you could distinguish the two, then you could necessarily find an input value on which both C1 and C2 would produce different outputs. Moreover, other work indicates IO and EO give essentially the ‘best possible’ obfuscation you can provide for general programs.

The question you’re probably asking is: so what? What can we do with indistinguishability obfuscation?

And this is where the WIRED article differs substantially from the reality. The truth is that IO will probably bring us major cryptographic and software advances — it’s already been shown to bring about succinct functional encryption for all circuits, as well as advances in deniable encryption. Moreover it can be used to build known forms of encryption such as public-key encryption based on new techniques that differ from what we use today — for example, by obfuscating ‘symmetric’ primitives like pseudorandom functions.****

And if this doesn’t seem quite as exciting as the WIRED article would imply, that’s because we’re still learning what the possibilities are. The new techniques could, for example, allow us to build exciting new types of security systems — or else show us radical new ways to build systems that we already have. Even the latter is very important, in case advances in our field result in ‘breaks’ to our existing constructions.

What IO and EO will probably not do is make programs unhackable, since that implies something a whole lot stronger than either of these techniques currently provide. In fact, it’s not clear what the heck you’d need to make programs unhackable.

And the reason for that is quite simple: even obfuscated software can still suck.

Notes:

Thanks to Susan Hohenberger and Zooko for comments on this post.

The bit in the Barak et al. proof doesn’t have to be secret.

** With a pretty damn big caveat, which is that in real life (as opposed to cryptographic examples) users pick terrible passwords, which makes your password hashes vulnerable to dictionary attacks. We have a variety of countermeasures to slow down these attacks, including salt and computationally intensive password hashing functions.

*** God I hate footnotes. But it’s worth unpacking this. Imagine that Alice learns ‘some information’ from seeing an average program containing the hash of a strong password. If the password is strong, then Alice can only guess the right input password with negligible probability. Then Sam can ‘use’ Alice to get the same information as follows. He formulates a ‘fake’ program that contains a completely bogus password hash and mails it to Alice. Whatever information she learns from his program, he takes and outputs as the information he ‘learned’. It remains to show that if the password hash is strong (a random oracle), Alice isn’t going to be able to learn more from a random hash than she would from the hash of a strong password.

**** The basic idea here is that symmetric encryption (like AES, say) can’t be used for public key encryption, since anyone who learns your encryption key also learns your decryption key. But if you can obfuscate an encryption program that contains your symmetric encryption key, then the new program becomes like a public key. You can hand it out to people to encrypt with.

Is the cryptopocalypse nigh?

I’ve been traveling a bit over the past couple of weeks, so I haven’t had much of a chance to keep up on blogging. One consequence is that I completely missed my chance to say something about, well, anything that happened at BlackHat or Def Con.

Which is too bad, since a surprising amount of crypto stuff did go down! One thing that I wish I’d had a chance to talk about was a presentation at BlackHat called ‘The Factoring Dead‘ by Tom Ritter, Javed Samuel and Alex Stamos. (Thomas Ptacek was also involved, but he insists that he did nothing and they only included him out of pity.)

Although I don’t quite agree with the premise of this presentation, talks like it are fun — in the way that zombie movies are fun — because you get to abandon reality and think about some non-serious things for a while.

Factually, the presentation addresses some new results on the discrete logarithm problem published by Antoine Joux and his co-authors Razvan Barbulescu, Pierrick Gaudry and Emmanuel Thomé — developments the presenters cite as a very serious reason for people to get worried. And we’re not talking about the usual ‘you’re going to use crypto wrong’ kind of worry, but a more earthshaking kind: namely that RSA and Diffie-Hellman and DSA are soon going to be broken altogether.

Now let me be clear that Ritter, Samuel and Stamos and even lame, non-contributing Ptacek (henceforth RSSp) are all smart guys. In fact, I would venture to say they’re probably much more familiar with the Joux et al. work than I am since they seem interested in the details. Me, I like my hardness assumptions the way I like my hamburgers: neatly ground up and without the mooing. I could live my whole life without voluntarily digging into the efficiency of discrete logarithm solvers in fields of small characteristic.

Moreover, it’s hard to really argue with the content of RSSp’s presentation, since the bulk of what they do is to simply present facts. There really have been some major recent advances in solving discrete logarithms over certain special fields. There have been major attacks in the more distant past that took us by surprise. And yes, it would really awesome if people stopped fiddling with 1024-bit RSA keys and moved to elliptic curve crypto.

What’s concerning is the conclusions they (and other sources) have reached: namely, that factoring-based cryptosystems could be dead in just a few years. This kind of thing could incite panic! (I mean, it could if people actually cared about cryptography. Which unfortunately they mostly don’t.)

So let’s spend some time examining this.

 Razvan Barbulescu, Emmanuel Thomé and Antoine Joux hungrily eye a defenseless discrete logarithm instance. (Source: Steven Galbraith)

The background

The jumping off point for RSSp’s slides is a set of recent advances made by Joux and subsequently by Barbulescu, Gaudry, Joux and Thomé. The discrete logarithm problem (which you can learn about in this highly informative video) is noteworthy for two reasons. First, it’s believed that in many settings, the discrete logarithm problem is difficult to solve. Second: that assumption is critical to the security of many of the cryptosystems we know and love — for example, Diffie-Hellman and DSA.

Now the Joux and Barbulescu et al. results are important work, and really do deserve attention from cryptographers and non-cryptographers alike. What they show is that there exist relatively efficient algorithms for solving discrete logarithms in certain very specific types of field. Even more amazingly, the new algorithms are efficient enough to actually implement and run — against parameters that were previously thought to have cryptographic security!

Indeed this has already had some (limited) impact on practitioners in the research community. For example, many of the pairing-based cryptography libraries I work with ship with parameters that are now deemed to be too risky thanks to these new attacks. However — and this is key — these are research libraries. To my knowledge, none of these fields is actually being used in deployment, let alone standardized cryptography.

In other words, this is the kind of result that should receive (and has received!) lots of attention from cryptographers. But not necessarily from people who use cryptography. And here’s why.

You see, while the Joux and Barbulescu et al. algorithms really are efficient, they only work in fields with very specific properties. Namely, the fields must have small characteristic. Indeed, this feature of the field is critical to certain key steps of the algorithm. Take this property away and you still get some advances over the previous state of the art, but the advances are decidedly more theoretical.

Which brings us to the payoff: all of the fields we use to implement most cryptography — things like (non-elliptic-curve) DSA, Diffie-Hellman, and even the fields we use to implement NIST standard elliptic curves — are prime fields and hence don’t have the necessary properties to make the Joux results meaningful. Hence these attacks don’t seem to apply. Moreover there’s really no good reason to believe that they will anytime soon.

The BlackHat presentation

Which brings us to the RSSp BlackHat presentation. The overall premise of RSSp’s presentation is that advances happen rapidly. It’s not unprecedented for theoretical attacks in the literature to rapidly morph into real things that keep security engineers up at night. They also point out that attacks on the DLP have closely tracked attacks on factoring, both in the classical and the quantum world. (Ok, they don’t say the last part but it’s also true.)

RSSp also correctly imply that we should be switching away from cryptosystems that rely on the hardness of the (field-based) discrete logarithm problem, and should instead be moving to cryptosystems based on the elliptic curve discrete logarithm problem (ECDLP).* This is because none of the efficient attacks on DLP — including Joux’s algorithms — seem to apply in the (standardized) EC setting.

Lastly, they correctly point out that cryptosystems based on factoring and (field-based) Discrete Logarithms are already being deprecated by organizations like NIST for a variety of good — though not panic-related — reasons. Mostly this is because our current pre-Joux algorithms against those settings have made it too costly to get long-term (256-bit) security; you just need enormous keys. This was the case before Joux came along, and it’s still the case now.

The last point RSSp make is also absolutely correct: we should be switching to elliptic curve cryptography (ECC) as soon as possible, in part just so people can start using high-security cryptosystems without paying performance and bandwidth through the nose for the privilege. This isn’t totally academic, since — as Thomas Ptacek reminds me — we’re getting close to the death of 1024-bit keys. If your adversary is the NSA, anyway.

(It also doesn’t hurt that getting more people on this bandwagon will reduce the number of people rolling their own RSA implementation.)

So should we stock up on canned goods and move to Vermont?

Vermont is lovely. But you shouldn’t move there because of this presentation.

In fact this is hardly the first time we’ve seen a major breakthrough against an important cryptographic problem. In the 90s it was fast number field sieving against factoring-based systems and slightly later, things like the MOV attack on the ECDLP. In both cases, there was a small degree of panic, but ultimately a pretty mild result: cryptographers carefully examined the attack and chose new parameters that made it impractical. Then everyone went back to business.

In this case it looks like we’ve already got a set of parameters that keep us safe, so it’s even more unlikely — that except for a few researchers doing what researchers do — any of us will have to think about this in three years or five or even ten.

And by the way, you should not believe this because I say so — that would be foolish. You should believe it because the people who work in this area also don’t seem to think it’s an issue. If you doubt this, go to CRYPTO this week look for people running around with their hair on fire. The number should be barely higher than usual.

What would we do if there was a real cryptpocalypse?

Right! If we’re going to gin up a cryptpocalypse let’s have a real one. What if in 2015, Joux and his co-authors publish a new algorithm that efficiently solves the DLP in prime fields and at realistic key sizes, and moreover has a factoring analog that breaks RSA? Well, this would certainly be very bad for everything we’ve encrypted in the past, but at least we’d have an immediate solution: a rapid transition to elliptic curve crypto. Whew!

But this is not much fun: like watching an alien invasion movie where the aliens are allergic to water.

So let’s go way out on a limb and imagine that in 2017, after everyone has piled into ECC, Joux et al. and a team from the University of Waterloo team up to publish a revolutionary new attack that reduces ECDLP to roughly the hardness of field-based DLP. What would happen then?

Well, this would be really bad.

Let me reiterate that there’s a reason we like our current batch of public-key cryptosystems — EC, field-based and factoring-based systems. They’re relatively easy to understand, they’ve all been studied quite a bit. But most importantly: they’re really efficient.

Once you leave this domain you enter a region that the maps label with ‘here be dragons‘. Not because this space is empty. It’s just that there are relatively few efficient schemes that have received anywhere near the level of study that our beloved ones have.

Probably the oldest and most familiar of the alternative encryption schemes is the McEliece cryptosystem, which was developed way back in 1978 (that’s one year after RSA, in case you’re keeping score at home). McEliece and its modern variants are based on problems in algebraic coding theory: they depend for security on the hardness of decoding general codes, as well as some assumptions about the specific code used.

McEliece is surprisingly fast and (so far as we know) quite secure. There’s only one major problem: the public keys are big. According to a 2008 analysis by Bernstein, Lange and Peters, achieving security equivalent to a 3072-bit RSA key (aka the ‘128 bit’ symmetric-equivalent security level) requires a stunning 187 kilobyte McEliece public key. Moving up to 256-bit security — notably hard even for RSA — takes this to nearly 1MB. Recent improvements may cut that down a bit, but they’re still relatively unstudied.

Another possibility is to use Lattice-based cryptosystems. While there are several in the research literature, one of the most studied is the NTRU cryptosystem. I won’t confess to caring much about NTRU, except to note that it’s relatively well-studied by the standards of such alternative schemes and even shows up in some standards. Unfortunately that doesn’t mean everyone loves it. The inventors also hold a patent on it.

Lastly, for signatures at least we can always fall back on old standbys such as hash based signatures, which should hold us as long as Joan Daemen’s team can think up new hash functions.

Conclusion

We live in frightening times and yes, it’s always tempting to peek under the bed and imagine scary monsters. In practice, the reality is probably a bit more mundane.

As much as we love to read stories of solitary mathematicians making revolutionary leaps in their unlit apartment, this is rarely how things go. Even the most significant advances are usually telegraphed via a series of public, widely-read research papers.

In other words: when RSA and DSA really are in danger you’ll know about it. Just look for a bunch of cryptographers running around with their hair on fire.

Notes:

* By ‘based on’ I don’t mean that these cryptosystems necessarily reduce to the ECDLP, but rather that their security depends upon the hardness of the ECDLP.

The Ideal Cipher Model (wonky)

A friend who’s learning cryptography writes with a few questions about block ciphers:

(1) Let’s say we’re using AES-128 — 128 bit keys, 128 bit blocks.

• For a given 128 bit block of plaintext “P” – if I was to iterate through all 2**128 key permutations and encrypt the same plaintext P with each key, would the outputs all be unique, or would there be collisions?
• For a given 128 bit key “K”, if I was to use K to encrypt every possible (2**128) plaintext, would the outputs all be unique, or would there be collisions?
These are all reasonable questions with simple answers. But I’m not going to give them. Why bother with two simple answers when we can give one really complicated intuition?

What’s Claude Shannon got to do with it?

Back in the late 1940s when people were still thinking on this whole information theory business, a gentleman named Claude Shannon came up with a model for what a block cipher should do. His idea was — and yes, I’m embellishing a lot here — that an ‘ideal’ block cipher would work kind of like a magical elf.*

You could ask the elf to encipher and decipher things for you. But instead of using a mathematical algorithm, it would just keep a blackboard with a big table. The table would have three columns (Key, Plaintext, Ciphertext), and it would start off empty.

When you asked the elf to encipher a plaintext P under a key K, it would do the following:
1. Check to see if (K, P) were already recorded in some row of the table. If they were, it would read off the corresponding ciphertext value C from the same row, and return C.
2. If no matching row was found, it would generate a perfectly random ciphertext C.
3. It would then check for an existing table entry containing the same key and ciphertext (K, C). If this entry was found in the table, it would throw C away and repeat step (2).
4. Otherwise it would add (K, P, C) to the table and return C to the caller.

Now I realize this looks complicated, but if you think about this for a little while you’ll realize that this little thought experiment does ‘work’ a lot like a real block cipher.

Just like a block cipher, it will always give the same output when you pass it a given key and plaintext. Furthermore — for a given key K, no two plaintexts will ever encipher to the same ciphertext. This models the fact that a block cipher is a permutation.

Lastly, the output of the encipherment is very ‘random’ — in the sense that it’s intuitively not linked to the input. Calling the cipher on different inputs should produce values that are very much unrelated.

Of course, you also need the elf to decipher stuff as well. Decipherment works mostly like the process above. When you ask to decipher (K, C), it checks to see whether the given key and ciphertext are already in the table (i.e., they were previously enciphered). If so it looks up and returns the corresponding plaintext. Otherwise it generates a new random plaintext, makes sure it hasn’t previously appeared in the table with that key, and returns the result.

Under the assumption that AES works like our elf, we can now answer my friend’s questions. Clearly if I encrypt the same plaintext with many different keys, I will get many different ciphertexts. They’ll each be random and 128 bits long, which means the probability of any two ciphertexts being the same (colliding) is quite small. But there’s no rule preventing it, and over a large enough set it’s actually quite likely.

Similarly, the second question is answered by the fact that the cipher is a permutation, something that’s captured in our elf’s rule (3).

This is an awful lot of work to answer a few simple questions…

Yes, it certainly is. But the ‘Elf’ model is useful for answering much more interesting questions — like: is a given encryption scheme secure?

For example, take the CBC mode of operation, which is a way to turn a block cipher into an encryption scheme:

CBC encryption takes in a random key (K) and a random ‘Initialization Vector’ (IV), both of which are chosen by the encryptor. Now let’s see what happens if we CBC-encrypt an N-block message using our elf.

1. The encryptor XORs the first plaintext block with the (random) Initialization Vector (IV), which should give a randomly-distributed value. We’ll call this P’.
2. He then sends (K, P’) to be enciphered by the elf. Since the elf has never seen (K, P’) — meaning it’s not in the table — it will generate a random value (C1), which will become the first ciphertext block.
3. The encryptor now XORs the next plaintext block with C1, which again should yield a randomly-distributed value which we’ll call P”.
4. He sends (K, P”) to be enciphered by the elf. Since P” is also random (and, say, 128 bits long), the probability that it’s already in the table is astronomically low. Thus with high probability the elf will again generate a random value (C2), which becomes the second ciphertext block.
5. He then repeats steps (3) and (4) for all the remaining blocks.
Note that with overwhelming probability — and unless you encrypt really long ciphertexts** — the elf will generate a random value for each ciphertext block. Hence the entire ciphertext will consist of a random string, which means it really shouldn’t leak anything about the message (except for the length).

Of course, an attacker might also talk to the elf to try to learn something about the message. But note that even this attack isn’t useful unless he can guess the (random) key K, since the elf will give him random results unless he asks for a value that includes K. Since K is a randomly-chosen secret key, the attacker should not be able to guess it.

Do ideal ciphers exist?

Now all of this is well and good, but it leaves us with an important question: is AES actually as good as an ideal cipher? Unfortunately, the resounding answer to this question is no.

The problem here is that ideal ciphers are totally unworkable. Obviously we can’t have an actual elf randomly filling in a bulletin board as we encrypt things. I want to carry a copy of the block cipher around with me (in software or hardware). I also want my copy of the cipher to be consistent with your copy, so that I can send you messages and you can decrypt them.

To make the elf idea work, we would each need to carry around a copy of the elf’s table that’s completely filled in from the start — meaning it would already contain entries for every possible plaintext (resp. ciphertext) and key I might ever need to encipher/decipher. When you consider that there would be 2^128 rows just for a single key, you realize that, no, this is not a question of running out to Best Buy and picking up some more SD cards. It’s fundamentally hard.

So carrying ideal ciphers around is not possible. The question then is: is there something that’s as good as an ideal cipher, without requiring us to carry around an exponentially-sized table.

The answer to that question is a mixed bag. The bad news is that nothing really works as well as an ideal cipher. Worse yet, there exists schemes that would be provably secure with an ideal cipher, but would fail catastrophically if you implemented them with any real block cipher. So that sucks.

On the other hand, those are theoretical results. Unless you’re doing some very specific things, the ideal cipher model is a moderately helpful tool for understanding what block ciphers are capable of. It’s also a good jumping off point for understanding the real proofs we actually use for modes like CBC. These proofs use more ‘realistic’ assumptions, e.g., that the block cipher is a pseudorandom permutation.

But those proofs will have to wait for another time. I’ve reached my wonkery limit for today.

Notes:

* For the record, at no point did Claude Shannon ever refer to an ‘elf’. At least not in writing.

** The probability of a ‘collision’ (i.e., asking the elf to encipher a value that’s already been enciphered) goes up as you put encipher more blocks. At a certain point (for AES, close to 2^64 blocks) it becomes quite high. Not coincidentally, this roughly coincides with the number of blocks NIST says you should encrypt with CBC mode.

Why I hate CBC-MAC

If you’re like most people, you don’t have a strong opinion about CBC-MAC. In fact, if you’re like most people, you don’t have a strong opinion about any crypto primitive.

This is healthy. Keep up the good work.

I’m not most people. I’ve spent the last week thinking about and dealing with CBC-MAC — or more specifically, code that uses it in various contexts — and I need to share with you how much I despise this little algorithm. And beg you never to use it.

Oh yes, I know the temptation. You have this nice block cipher just sitting around — maybe you’re encrypting something — and you’ve heard how serious this whole message authentication thing is. Maybe you’ve even thought about using one of those fancy authenticated encryption modes, but found them to be too exotic and complicated.

Then it comes to you that all your problems would be solved if you just used CBC-MAC. This is too bad, because now your troubles are just beginning.

Now a quick note: there’s nothing really wrong with CBC-MAC, when implemented correctly. And it’s not even that hard to implement properly. The problem is that many people who use CBC-MAC (rather than HMAC or a proper AEAD mode) seem incapable of actually doing this. They get it wrong in hilariously, embarassingly, stupid, complicated ways.

But of course you wanted examples. Ok, let’s give some.

1. Your implementation doesn’t handle variable-length messages.

A quick reminder. CBC-MAC is very similar to the classic CBC mode for encryption, with a few major differences. First, the Initialization Vector (IV) is a fixed value, usually zero. Second, CBC-MAC only outputs the last block of the ciphertext — this single value forms the MAC.

Many dumb implementations stop here. And that leads to big problems.

Most notably, if your system allows for variable-length messages — as it should — there is a simple attack that allows you to forge new messages. First, get a MAC T on a message M1. Now XOR the tag T into the first block of some arbitrary second message M2, and get a MAC on the modified version of M2.

The resulting tag T’ turns out to be a valid MAC for the combined message (M1 || M2). This is a valid forgery, and in some cases can actually be useful.

The standard fix to prepend the message length to the first block of the message before MACing it. But a surprisingly large number of (dumb) implementations skip this extra step. And many CBC-MAC implementations are dumb implementations.

2. Your implementation uses a random Initialization Vector.

If CBC-MAC with a fixed IV is great, surely CBC-MAC with a random IV must be super-great. But no, it isn’t.

Using a random (or variable IV) is bad for the simple reason that verifying a CBC-MAC requires you to know the IV, and to know the IV you probably need to read it from somewhere. Typically this means the same untrusted place where you were storing your message.

If the attacker can change the CBC-MAC IV, they can also change the first block of the MACed message in an equivalent manner. This works because the first step of CBC-MAC is to XOR the IV with the message. There are all kinds of silly variants of this problem, and all of them hurt.

3. You’ve used the same key for MAC and encryption.

A general rule in cryptography is that you shouldn’t use the same key for two different cryptographic primitives — encryption and signature, for example. Or encryption and MAC.

Some people figure that rules were made to be broken.

Note that shared keys can actually be ok, in some cases. Combined modes like CCM (short for CTR + CBC-MAC) actually do use the same key for both operations. However, these modes do it in a very careful, thoughtful manner. Your garden-variety implementation doesn’t.

One particularly ugly pattern I’ve seen is to use (dumb) CBC-MAC on a plaintext, then to encrypt said plaintext in CTR mode using some initial counter (C). This is insecure for a bunch of reasons, but specifically because I might be able to completely decrypt your ciphertext.

To do this, I simply ask you to encrypt a series of small files corresponding to the counter values C, C+1, etc. of the ciphertext I want to attack. The CBC-MAC of each of these files lets me recreate the CTR-mode keystream I need to decrypt the original ciphertext. Now I have your message.

4. You’ve used CBC-MAC as a hash function.

This one isn’t really a problem with CBC-MAC, but it does crop up. In fact, it happened recently to the file sharing site Mega.

To make a long story short: cryptographic hash functions are public functions (i.e., no secret key) that have the property of collision-resistance (it’s hard to find two messages with the same hash). MACs are keyed functions that (typically) provide message unforgeability — a very different property. Moreover, they guarantee this only when the key is secret.

If you attempt to use CBC-MAC with a non-secret key, it becomes a very bad candidate for anything. In fact, you can trivially find useful collisions in the output, something that’s very bad if you’re using it to authenticate code. Which is what Mega was doing with it.

This isn’t true of all MACs — HMAC, for example, should retain the collision resistance of the underlying hash function even if the MAC key is compromised. This is yet another reason to prefer it for cases where cryptographic expertise is not a sure bet.

In summary

I’ll repeat that none of these are really problems with CBC-MAC, which is a perfectly lovely algorithm if implemented and used correctly. The problems above only crop up when people try to whip it up themselves, without using a standard construction.

If you must write your own code, my recommendation is to use HMAC — which is extremely hard to screw up. If you’re doing combined encryption/MAC and you only have a block cipher, then look into the CCM spec, which is a patent free AEAD mode. This should address all of these problems and give you some nice test vectors too.

What you shouldn’t do is code up some half-assed CBC-MAC thing and expect you’ll be ok. The fact is, you probably won’t.

Indifferentiability

After umpteen weeks writing about broken stuff, I’m thrilled to tell you that for once, nothing awful is happening in the crypto world. It won’t last. But while it does, let’s grab the opportunity to talk about something constructive.

Now a word of warning: what I’m going to talk about today is fairly wonky and (worse), involves hash functions. If you’re not feeling up for this, this is your cue to bail and go read something nice about buffer overflows.

For those still with me, the subject of this post is the design of hash functions, and more specifically: the indifferentiability proofs that designers write to argue for their security. I was surprised to find that most people have never heard of these proofs, and thus have no idea why they’re useful. That’s too bad, since they’re extremely important to the way we analyze hash functions today.

Merkle-Damgård

 This is not Ivan Damgård. (Seriously Google?)

The best way to begin any discussion of hash function design is to take a quick peek inside of the hash functions we actually use. Since the most popular hashes today are MD5 (ugh) and SHA, the right place to start is with the ‘Merkle-Damgård’ paradigm.

To understand Merkle-Damgård, you need to understand that cryptographers love to build complicated things out of simpler components. Under the hood of most block ciphers you’ll find S-boxes. Similarly, if you take the lid off a Merkle-Damgård hash function — surprise! — you find block ciphers. Or at least, something very much like them.

This approach dates to a 1979 proposal by a young cryptographer named Ralph Merkle. What Merkle showed is a way to build hash functions with a variable-length input, using any fixed one-way compression function (a one-way function that spits out fewer bits than it takes in). While Merkle wasn’t specific about the function, he suggested that DES might be a good candidate.

Expressed as a colorful diagram, the Merkle construction looks something like this:

 Merkle-Damgård Construction (source: Wikipedia because I’m too lazy to draw my own diagrams). IV is a fixed value. f is a one-way compression function.

The beauty of Merkle’s proposal is that it’s relatively simple to understand. You simply chop your message into blocks, then feed each block into the function f along with the output of the previous function evaluation. Throw in a finalization stage and you’re done.

Of course there’s a difference between proposing a technique and showing that it actually works. It would take ten more years, but at CRYPTO 1989, Merkle and another cryptographer named Ivan Damgård independently submitted formal analyses of Merkle’s proposal. What they showed is that as long as the function f has certain ideal properties, the resulting hash function is guaranteed to be collision-resistant.  The rest, as they say, is history.

The popularity of Merkle-Damgård can be attributed in part to its security proof. But it also owes something to some major practical advantages:

1. You can use any secure block cipher as the function f, with just a few tweaks.
2. M-D hash functions can be pretty darn fast, again depending on f and how you use it.
3. M-D hashes allow you to digest ‘live’ data streams, where you don’t know in advance how much data you’re going to be hashing.
Of course, Merkle-Damgård hashes also have serious weaknesses. The most famous is the ‘length extension attack‘ in which an attacker, given only H(M) for some unknown message M, can ‘tack on’ additional blocks of her own choosing. This issue spells big trouble for people who think that H(key || message) is a good What’s interesting about the length-extension issue is not that it leads to broken MACs. I mean, that is interesting, and it’s why you should use HMAC. But what’s really interesting is that this flaw doesn’t represent a violation of the collision-resistance guarantee. The two issues are in fact completely orthogonal. And this tells us something fundamental. Namely: collision-resistance is not enough.Today’s implementers do all kinds of crazy things with hash functions, and many of those applications require much more than collision-resistance. To achieve the necessary properties, we first need to figure out what they are. And that requires us to think hard about the following question:

What the heck is a secure hash function?

If you crack a typical security textbook (or visit the Wikipedia page on hash functions), you’ll see a long list of things of things a hash function ‘must’ accomplish. The list usually starts with these:
1. Collision resistance. It should be hard to find any pair of messages M1, M2 such that H(M1) == H(M2).
2. Pre-image resistance. Given only h it should be hard to find a ‘pre-image’ M2 such that H(M2) == h.

Now leave aside the technical fact that none of the unkeyed hash functions we use today are ‘truly’ collision-resistant. Or that the above definition of pre-image resistance implies that I can hash my cat’s name (‘fluffy’) and nobody can invert the hash (note: not true. Go ask LinkedIn if you don’t believe me.) The real problem is that these definitions don’t cover the things that people actually do with hash functions.

For example, take the construction of PRNGs. A common PRNG design hashes together large pools of gathered entropy in the hope that the result will be sufficiently uniform for cryptographic work. This is so common that it’s probably happening somewhere on your computer right now. And yet, absolutely nothing in the definitions above implies that this technique is safe!* Similar problems exist for key derivation functions, and even for signature schemes like ECDSA which clearly require hash functions that are more than just collision-resistant.
The more you look into the way that people use hash functions, the more you realize that they really need something that produces ‘random-looking’ output. Unfortunately, this notion is surprisingly hard to formalize. Hash functions are unkeyed, so they’re not pseudo-random functions. What in the world are people asking for?

Random oracles and indifferentiability

The answer, if you dig hard enough, is that people want hash functions to be random oracles.

Random oracles are cryptographers’ conception of what an ‘ideal’ hash function should be. Put succinctly, a random oracle is a perfectly random function that you can evaluate quickly. Random functions are beautiful not just because the output is random-looking (of course), but also because they’re automatically collision-resistant and pre-image resistant. It’s the only requirement you ever need.

The problem with random functions is that you just can’t evaluate them quickly: you need exponential storage space to keep them, and exponential time to evaluate one. Moreover, we know of nothing in the ‘real’ world that can approximate them. When cryptographers try to analyze their schemes with random functions, they have to go off into an imaginary fantasy world that we call the.

But ok, this post is not to judge. For the moment, let’s imagine that we are willing to visit this fantasy world. An obvious question is: what would it take to build a random oracle? If we had a compression function that was good enough — itself a random function — could we use a technique like Merkle-Damgård to get the rest of the way?

In 2004, Maurer, Renner and Holenstein gave us a powerful tool for answering this question. What they showed is that it’s always possible to replace functionality A (e.g., a random oracle) with another functionality B (e.g., an ideal compression function) provided that the following rules are satisfied:

1. There exists a way to ‘construct’ something ‘like’ A out of B.
2. There exists a way to ‘simulate’ something ‘like’ B using A.
3. An attacker who interacts with {constructed A-like thing, B} cannot tell the difference (i.e., can’t differentiate it) from {A, simulated B-like thing}

The definition of simulation gets a bit wonky. but expressed in simpler language all this means is: if you can show that your hash function, instantiated with an ‘ideal’ compression function, looks indistinguishable from a random oracle. And you can show that a manufactured compression function, built using a random oracle as an ingredient, looks indistinguishable from an ideal compression function, then you can always replace one with the other. That is, your hash function is ‘good enough’ to be a random oracle.

The following year, Coron, Dodis, Malinaud and Puniya applied this framework to Merkle-Damgård-hash functions. Their first result was immediate: such a proof does not work for Merkle-Damgård. Of course this shouldn’t actually surprise us. We already know that Merkle-Damgård doesn’t behave like a random oracle, since random oracles don’t exhibit length-extension attacks. Still it’s one thing to know this, and another to see a known problem actually turn up and screw up a proof. So far, no problem.

What Coron et al. showed next is much more interesting:
• They proved formally that Merkle-Damgård can be made indifferentiable from a random oracle, as long as you apply a prefix-free encoding to the input before hashing it. Prefix-free encodings prevent length-extensions by ensuring that no message can ever be a prefix of another.
• Next, they proved the security of HMAC applied to a Merkle-Damgård hash.
• Finally, and best of all, they showed that if you simply drop some bits from the last output block — something called a ‘chop’ construction — you can make Merkle-Damgård hashes secure with much less work.

The best part of Coron et al.‘s findings is that the chop construction is already (inadvertently) in place on SHA384, which is constructed by dropping some output bits from its big-brother hash SHA512. The modern hash variants SHA512/224 and SHA512/256 also have this property.** So this theoretical work already has one big payoff: we know that (under certain assumptions) these hashes may be better than some of the others.

And these results have bigger implications. Now that we know how to do this, we can repeat the process for just about every candidate hash function anyone proposes. This lets us immediately weed out obvious bugs, and avoid standardizing another hash with problems like the length extension attack. This process has become so common that all of the SHA3 candidates now sport exactly such an indifferentiability proof.

Of course, in the real world, indifferentiability only takes you so far. It does tell us something, but it doesn’t tell us everything. Sure, if the compression function is perfect, you obtain a strong result about the hash function. But compression functions are never perfect. Real compression functions have glitches and oddities that can make these theoretical results irrelevant. This is why we’ll always need smart people to arm wrestle over which hash we get to use next.

In conclusion

If I had it in me, I’d go on to talk about the SHA3 candidates, and the techniques that each uses to achieve security in this model. But this has already been a long post, so that will have to wait for another time.

I want to say only one final thing.

This is a practical blog, and I admit that I try to avoid theory. What fascinates me about this area is that it’s a great example of a place where theory has directly come to the aid of practice. You may think of hash functions as whizzing little black boxes of ad-hoc machinery, and to some extent they are. But without theoretical analysis like this, they’d be a whole lot more ad-hoc. They might not even work.

Remember this when NIST finally gets around to picking Keccak BLAKE.

Notes:

* For a ridiculous example, imagine that you have a secure (collision-resistant, pre-image resistant) hash function H. Now construct a new hash function H’ such that H'(M) = {“long string of 0s” || H(M)}. This function is as collision-resistant as the original, but won’t be very useful if you’re generating keys with it.

** Thanks to Paulo Barreto for fixing numerous typos and pointing out that SHA512/256 and /224 make excellent candidates for chop hashes!