Thursday, September 29, 2011

What is the Random Oracle Model and why should you care? (Part 1)

This is the first in a series of posts on the Random Oracle Model.  This might get a little wonky for some people's tastes, so if you're not interested in provable security, no problem.  I'll be back with more on software and physical security once I get this out of my system.

An oracle.
For the most part I like to talk about how cryptography gets implemented.  But implementation doesn't matter a lick if you're starting with insecure primitives.

It happens that I'm scheduled to teach a class today on provable security, and something called the "Random Oracle Model".  While getting my thoughts together, it occurred to me that a) this subject might be of interest to people who aren't my grad students, and b) there doesn't seem to be a good "layperson's" introduction to it.

So at the risk of alienating my tiny readership, I'm going to take a break from our regularly scheduled "how cryptographic systems go bad" programming to talk about this wonkish subject --- which, as we'll see in a bit, is itself just another flavor of how crypto goes wrong.

Hash Functions for Dummies

Before we talk about Random Oracles, we first need to discuss hash functions.  These are functions that take a (potentially) long input, and 'hash' it down into a fixed-size value that we refer to as a Message Digest.

We use hash functions a lot in cryptography.  They get the most press for turning up in digital signature schemes, but they also make an appearance in many encryption schemes.  In fact, throughout the field, it's pretty hard to throw a stone without hitting one.

For the most part, a cryptographic hash function has to satisfy at least some of the following properties.  First of all, we expect it to be "one way", meaning that it's hard to 'invert' the function, i.e., to find the original message given only a Message Digest.

Secondly, a hash function should be "collision resistant".  In other words, it should be hard to find two different messages (M, M') such that Hash(M) == Hash(M').  This property is very important for digital signatures, since we typically don't sign the message directly, we sign a hash of the message.  If an adversary can find two messages that hash to the same value, then a signature on (the hash of) one message is also a signature on the other.  In practice this can lead to bad consequences.
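The two properties above are easy to see in action. Here's a quick sketch using Python's `hashlib` (SHA-256 is chosen purely for illustration): no matter how long the input message is, the digest is always the same fixed size, and distinct messages should produce distinct digests.

```python
import hashlib

# Hash a short message and a very long one.
short_digest = hashlib.sha256(b"hi").hexdigest()
long_digest = hashlib.sha256(b"a" * 1_000_000).hexdigest()

# Both digests have the same fixed length, regardless of input size.
print(len(short_digest), len(long_digest))  # 64 64

# Collision resistance means we should never be able to *find* two
# messages with equal digests (though collisions must exist, since
# there are far more messages than digests).
print(short_digest == long_digest)  # False
```

If the second check ever printed `True` for two messages you could actually exhibit, the hash function would be broken for signature use: a signature on one message would also verify for the other.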

Sometimes we demand even more from our hash functions.  The tricky part is knowing how much more we can ask.  To illustrate, let me give an example.

Encryption in Delphia

Imagine that you live in Delphia, a nation that bans all encryption algorithms.  Despite this, you have secrets to keep.  You notice that your government has not banned hash functions.  This gives you an idea: perhaps you can 'hack' an encryption scheme out of a hash function.

This isn't crazy.  The outputs of many hash algorithms look pretty random.  Maybe you could use a hash function to build a stream cipher.  Basically, you'll use a hash function to hash a secret key (along with some kind of counter value); this will result in a stream of 'random looking' bits which you can then XOR with your plaintext.

The question is: would a "one-way" and "collision-resistant" hash function be sufficient to ensure that this scheme is secure?  I'll give you a blunt hint: no.  Technically, neither of these properties implies that the output of a hash function looks 'random'.  You can conceive of hash functions that meet both guarantees, yet produce outputs that would be useless for building a stream cipher (they might, say, contain long runs of predictable values).  If you use such a function to build your cipher, you'll be terribly disappointed.
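To make the Delphia scheme concrete, here's a minimal sketch of the hash-based stream cipher described above: hash the key together with a counter to produce 'random-looking' blocks, concatenate them into a keystream, and XOR with the plaintext. The function names are my own, and as the text warns, the security of this construction depends on the hash outputs actually being unpredictable --- a property that one-wayness and collision resistance alone do not guarantee.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    # Hash (key || counter) repeatedly to generate a stream of bytes.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_cipher(key: bytes, message: bytes) -> bytes:
    # XOR the message with the keystream. Since XOR is its own inverse,
    # the same function both encrypts and decrypts.
    ks = keystream(key, len(message))
    return bytes(m ^ k for m, k in zip(message, ks))

ciphertext = xor_cipher(b"my secret key", b"attack at dawn")
recovered = xor_cipher(b"my secret key", ciphertext)
print(recovered)  # b'attack at dawn'
```

(The usual stream-cipher caveat applies on top of everything else: reusing the same key and counter sequence for two messages is fatal, since XORing the two ciphertexts cancels the keystream.)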

Spherical horses

When cryptographers first started pondering problems like the one above, the natural question was: what would an "ideal" hash function look like?  To a mathematician, the answer turned out to be obvious.  They wanted it to behave like a random function.  That term has a very specific mathematical definition; for the purposes of this discussion we'll use an equivalent formulation that's easier to talk about.

Imagine that we wanted to implement an ideal hash function.  Instead of designing a small algorithm that uses confusion and diffusion (as we do with most real hash functions), we might instead create a big hard-coded table.  The table would have the input messages in one column, and the corresponding outputs on the other.

The important thing about this table is that every single output value is an independent, random string.

       Input (binary)      Output
       0                   Totally random bitstring 1
       1                   Totally random bitstring 2
       00                  Totally random bitstring 3
       01                  Totally random bitstring 4
       10                  Totally random bitstring 5
       11                  Totally random bitstring 6
       and so on...

Obviously the full table for a useful hash function would be pretty big.  How big?  Well, SHA1 accepts messages up to 2^64 bits in length.  So the answer is: much too big to fit in this blog.  In fact, the full table would contain more entries than there are atoms in the visible universe.

Even if we had room to store such a table, the process of hashing --- looking up an input and finding the output --- would be awfully time consuming.  If we stored our hash table on the tape of a Turing machine (the quintessential computer), just moving the tape head to the right place would take (on average) a number of time steps that's exponential in the length of the input.

From time to time when I mention all of this in my intro course, some ambitious student tries to come up with a clever way to shrink the table down into something that we can carry around.  But here's the problem: the set of output strings is perfectly random.  Finding a way to represent them in compact form would be equivalent to compressing random data, and information theory tells us that truly random data can't be compressed.
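We can't compress the table, but there is a trick that proofs use all the time: nobody ever queries more than a tiny sliver of it, so we can fill in entries lazily, sampling each output fresh the first time its input is asked for. Here's a sketch of that idea (class and method names are my own). Note this simulates a random function for one party's benefit; it is not a usable hash, since two parties could never agree on the same table without sharing all of it.

```python
import os

class RandomFunction:
    """Simulate the giant random table by sampling entries on demand."""

    def __init__(self, out_len: int = 32):
        self.out_len = out_len
        self.table = {}  # input message -> its (lazily sampled) random output

    def query(self, message: bytes) -> bytes:
        # First query for a message: pick an independent, uniformly random
        # output and remember it. Repeat queries return the stored value,
        # so this behaves as a function.
        if message not in self.table:
            self.table[message] = os.urandom(self.out_len)
        return self.table[message]

H = RandomFunction()
d1 = H.query(b"hello")
d2 = H.query(b"hello")
print(d1 == d2)  # True: same input always maps to the same output
```

Each stored output is a fresh random string with no relationship to its input, which is exactly the property the table above was meant to capture.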

Doubling down

Let's take stock for a moment.  So far I hope that I've convinced you of at least two things.  First, cryptographers would really like their hash functions to behave like random functions.

Second, "real" hash functions can't be random functions, because that would be totally unworkable and impractical.  If you look at practical hash functions like SHA1, SHA512 and RIPEMD160 --- all of which have nice, short implementations consisting of a few KB of code --- it should be obvious that these are not random functions, no matter how 'random' the outputs look.

Cambridge, Massachusetts, 10:35am

So if we can't use random functions in practice, what about an alternative approach?  Perhaps we could model our hash functions as random functions, just for the purposes of the security proof.  Then in real life we'd substitute in SHA or RIPEMD or some practical hash function.  It's not totally clear that the security proof would still mean anything, but at least we'd be doing something.

I'll describe a scene as Hollywood would present it: an unrealistically well-lit office at MIT.  A noted cryptographer, played by Steve Buscemi, has just proposed to model hash functions as random functions.  His colleague, played by Kate Winslet, sets him straight.

"Can't do it," she explains, shaking her head sadly.  "It's not just inconvenient that a random function takes exponential time to evaluate.  If we allowed this sort of hashing in our security proof, we would have to let the parties --- including the Adversary --- run for an exponential number of time steps.  But we can't do that.  Our entire security proof is based on the idea that the parties can only perform limited computations, specifically those that can be conducted in polynomial time.  If we let the parties perform arbitrary exponential-time computations, then the adversary could do anything: he could even brute-force the key."

(Hollywood would probably punch up this dialog a bit.)

The breakthrough comes from a passing janitor (Matt Damon, reprising his role from Good Will Hunting): "What if you just create an imaginary world where everyone is limited to polynomial-time computations, but there's a special exception for hashing?  Hashing, and hashing alone, takes no time at all.  When you try to hash something, the clock stops."

Towards a paradigm

And so this is where we are.  The perfect hash function would be a random function.  But random functions are impractical --- we can't use them in real life, or even in our security proofs without breaking them.

Cryptographers are stubborn people, and we have good imaginations.  So we've come up with a daydream, a way to pretend that random functions are practical --- just for the purposes of our security proof.

The implications of this daydream turn out to be both wonderful and terrible.  On the one hand, it allows us to prove the security of schemes that can't be proven secure in other ways, and the schemes themselves tend to be very efficient.  On the other hand, it leads to security proofs that are, in a technical sense, totally meaningless.

After all, we will ultimately have to implement the scheme with a real hash function like SHA, which is decidedly not random.  What does a random oracle security proof buy us then?  This is one of the big questions that has plagued cryptographers since this idea came about.

And if you think that this is all academic, you'd be wrong.  The security proofs for some of the most widely-used cryptographic primitives are set in this model: this includes most RSA signing and encryption, as well as the Elgamal signature scheme upon which DSA is based.

If you take nothing else away from this post, you should notice that the schemes listed above are important.  If we're doing something funny in the security proofs, people should understand what it is.

In my next post I'll actually explain what a "random oracle" is, and how it relates to the crazy assumptions we made above.  I'll also explain how we can prove things in this model, and talk about why cryptographers love to hate it so much.

For the next post in this series, click here.

6 comments:

  2. Your writing is great, it would make a lay man understand the stuff easily

  3. nice written about some good concepts

  4. great explanation, thx a looooooot :)

  5. Hi Matthew,

    I am wondering about what you wrote for computation time of random functions. It appears to me you're saying a random function, one implemented using a Turing machine, would take too long to run. That does not seem right - why are we using a Turing machine to calculate computational complexity? What about [word-]RAM etc.?

    I need to mention - I do understand the idea that a program necessary to generate a random function, based on Kolmogorov's complexity principles and that the function has to be deterministic, would have to be practically as long as the sum of all possible outputs, i.e. we can't do better than storing all values in a table for example. That said, in a "reasonable" computation model, element lookup can be done instantaneously.

    So that's my question. Great post BTW :)

    Cheers,
    - Mateusz
