On symbol signs, the adversary, and announcing a contest

Update 11/26/2011: I’ve had two excellent responses to my challenge. You can see them all on this page.

A couple of people have asked me about the symbols that I use to illustrate protocol diagrams. Let me first say that I claim no particular credit or blame for this style; all of that goes to Thomas Ptacek.

These symbols are part of the AIGA Symbol Signs package, a collection of freely downloadable stencils for the standard international symbols. They’re wonderful for someone as graphically challenged as I am. You can download a stencil set for Omnigraffle, or get the EPS templates from the AIGA site.

As Thomas points out, there’s all kinds of useful security stuff in them: the international symbols for Alice and Bob, the passport agent and keys, and some less-useful things like pictures of ships and planes. (Though even these turned out to be useful once, when I reviewed a security system for a cruise ship.)

However, I think what prompts questions is not Symbol Signs, but rather, it’s the symbol I’ve been using to represent the adversary. Some people seem to think the baby is an odd choice. (My guess is that these people simply don’t have one.)

However you feel about my choices, I think we can all agree that there’s a tremendous need for better design style in scientific presentations. It’s 2011, and Comic Sans + Microsoft clip art no longer cuts the mustard.

Announcing a contest. In order to do my small part to improve the appearance of security presentations, and to resolve this baby-as-adversary issue, I have decided to offer a whopping $200 (plus fame and glory) to any artist, or person of artistic bent, who can come up with a better AIGA-style symbol for the adversary, and/or other security-related symbols if they’re exceptional. This is a one-time prize, and will be offered to a single winner of my choice. I realize this isn’t a whole ton of money, but if I like the result I promise I’ll do my level best to praise your talents far and wide.

What is the Random Oracle Model and why should you care? (Part 4)

This is part four of a series on the Random Oracle Model.  See here for the previous posts:

Part 1: An introduction
Part 2: The ROM formalized, a scheme and a proof sketch
Part 3: How we abuse the ROM to make our security proofs work

This is the penultimate post in a series on the Random Oracle Model. I originally meant for this to be short and tidy, something that could be easily digested by a non-expert who was interested in what cryptographers were up to. More to the point, I expected to be done by now.

But the end is in sight, and I feel that there are only a few things left that I want to cover. Even better, one of these topics is of a very practical bent, which is in keeping with the theme of this blog.

Specifically: I keep talking about the random oracle model, but what impact does it have on anyone’s life? Where is this thing used?

Giving a complete answer to that question is hard — it’s kind of like coming up with a list of all the foods that have sodium benzoate in their ingredients list. So in this post I’m going to at least hit at least a couple of the high points, meaning: the schemes that are probably most relevant to our daily security lives.

RSA Signing and Encryption

RSA is probably the most famous cryptosystem in use today. I don’t know exactly why it’s so popular, but I suspect there are few reasons. First of all, RSA is flexible; it provides signature and encryption in one scheme. It’s relatively simple — you can compute simple encryptions with a calculator. It’s been around for a while and hasn’t been broken yet. And I would be remiss if I didn’t mention that there are no current patents on RSA.

This is all great, but in some sense it’s also too bad. Over the past twenty years, attacks on the factoring problem (and by implication, RSA) have been getting better at the margins. This means that RSA key sizes have had to get bigger just to keep up. Sooner or later we’re going to lose this game.*

More interesting (to me) is RSA’s status as a sort-of-provably-secure cryptosystem. There’s a general understanding that RSA encryption is provably secure under the hardness of the RSA problem. However, this really isn’t true, at least given the way that most people have traditionally used RSA. You can get a reduction to the RSA assumption, but you need to take some very specific steps. Most importantly: you need to invoke the random oracle model.

To explain all this, I need to take you back in time to the late 1990s and I need to talk about padding.

If you’ve ever looked at a real RSA implementation, you probably know that we don’t use plain-old ‘textbook’ RSA (“m^e mod N“) to encrypt or sign raw messages. There are lots of reasons for this. One is that plain-old RSA isn’t randomized, meaning that a given message will always encrypt to the same ciphertext.

Another reason is that RSA is malleable, meaning that you can mess with ciphertexts and signatures in creative ways — for example, by multiplying them by constants of your choice. When you do this, the decrypted (resp: verified) message will be altered in predictable ways. This is bad for encryption. It’s downright awful for signatures.

We deal with these problems by applying padding to the message before encrypting or signing it. Padding schemes can add randomness, apply hashing in the case of signature, and most importantly, they typically add structure that lets us know when a message has been tampered with.

Some historical background: RSA before the ROM

The first widely-used, standard padding schemes were described in RSA’s PKCS#1 standard. Nowadays we refer to these as the “PKCS #1 v1.5” standards, with the implication that the standard (now at version 2.1) has left them behind.

Back in the 90s these standards were the only game in town. Sure, there were a few pedantic cryptographers who objected to the lack of formal security justification for the standards. But these troublemakers were mainly confined to academic institutions, which meant that the real security professionals could get on with business.

And that business was to implement the PKCS padding standards everywhere. PKCS #1 v1.5 padding still turns up in products that I evaluate. Most notably at the time, it was all over a young standard known as SSL, which some people were using to encrypt credit card numbers traveling over the Internet.

Danielbleichen
Daniel Bleichenbacher
shortly after his arrest. Or
possibly just posing for a
Bell Labs publicity photo.

Now, I said that almost nobody of a practical bent had a problem with PKCS. One notable exception was a bright cryptographer from Bell Labs named Daniel Bleichenbacher.

Dr. Bleichenbacher had a real problem with the PKCS padding standards. In fact, one could say that he made it his life’s mission to destroy them. This vendetta reached its zenith in 2006, when he showed how to break common implementations of PKCS signature with a pencil and paper, shortly before driving a fireworks-laden minivan into the headquarters of RSA Data Security.

(Ok, the last part did not happen. But the rest of it did.)

Bleichenbacher took his first crack at PKCS in CRYPTO 1998. In a surprisingly readable paper, he proposed the first practical adaptive chosen ciphertext attack against “protocols using the PKCS #1” encryption standard. Since, as I’ve just mentioned, the major protocol that met this description was SSL, it was a pretty big deal.

The attack goes something like this.

Whenever somebody sends their credit card info over an SSL connection to a web server, their browser and server execute a handshake protocol to derive a secret key. In particular, at one step of the protocol, the browser encrypts a very important value called the “pre-master secret” under the server’s RSA public key, and ships the resulting ciphertext to it over net.

This encryption is important, because anyone who decrypts that ciphertext — immediately, or five years later — can ultimately recover the SSL transport key, and hence decrypt all communications between the server and browser.

What Bleichenbacher pointed out was very specific to how most web servers processed RSA messages. Specifically, every time the server receives an RSA ciphertext, it first decrypts it, and then checks that the padding is valid. For a typical RSA key, PKCS #1 padding looks like this:

0x 00 02 { at least 8 non-zero random bytes } 00 { message }
When an SSL webserver doesn’t like the padding it sees, it may kick back a specific ‘bad padding’ error. Bleichenbacher first showed that if he could intercept a legitimate RSA ciphertext “C = M^e mod N” on the wire, then he could ‘maul’ it by multiplying it by a chosen value. In other words, given some C, he would pick some “s” and compute:
C' = C * s^e mod N

This value C’ will decrypt to “M * s mod N”. Bleichenbacher showed that for some values of “s”, this mauled message might also be a valid PKCS-padded message, meaning that it would begin with “0x 00 02” etc. This wasn’t all that unlikely, mainly because just looking at a couple of bytes was a pretty crappy padding check in the first place.

It was quite common for servers to leak the result of the padding check back to the sender, by way of some specialized error code like “BAD PADDING”. Even if they didn’t do that, you could sometimes detect a failure in the padding check by measuring the time it took the server to get back to you.

Bleichenbacher’s main trick was noting that the “s” values that led to a successful padding check could also be used to learn something about the original message M. To make a long story short, he showed how to choose these values adaptively in such a way as to gradually zero in on the actual plaintext, until he had just one candidate. And that was the ballgame.

The actual attack could require a couple of million decryption queries, on average. This sounds like a lot, but it meant you could decrypt an SSL session overnight, as long as nobody was paying attention to the server logs.

Random Oracles to the Rescue

The Bleichenbacher attack was a big deal for a whole variety of reasons. First of all, it was practical. It worked on a widely-deployed system, and it could be implemented and executed even by non-experts. It completely negated the protections offered by SSL, unless you were willing to monitor your SSL webserver literally night and day.

But to cryptographers the attack had special significance. First, it demonstrated that adaptive-chosen ciphertext attacks were a real problem, not just some airy academic topic. I suspect that even the cryptographers who worked in this area were a bit surprised by this. Some probably started looking for less useful topics to work on.

More importantly, the Bleichenbacher demonstration was seen as a victory for proponents of provable security. Up until now, their concerns had mostly fallen on deaf ears. Now a lot of people were listening.

And even better, the crypto research community had something to say to them. The overwhelming consensus was to immediately switch to a new RSA padding scheme that had been proposed all the way back in 1994. This scheme was called Optimal Asymmetric Encryption Padding (OAEP), and it was only a little bit more complex than PKCS #1. Even better, and unlike PKCS #1, RSA encryption with OAEP could be provably reduced to the hardness of the RSA problem, in the random oracle model. 

The success of OAEP also led to the adoption of the PSS padding scheme, which did essentially the same thing for RSA signatures.

OAEP padding, courtesy Wikipedia.  G and H are independent hash functions with different output lengths.

Now, to hear some people tell it, this is where the story ends. And it’s a good story — bad, unproven security protocol gets broken. New, provably-secure protocol steps in and saves the day.

And for the most part the story is true. With some important caveats.

The first of these is implementation-related. Just as PKCS #1 was broken because the server gave too many detailed errors, even OAEP encryption can break if the server leaks too much info on its error conditions. And when OAEP implementations go bad, they can really go bad. James Manger showed that you can break 1024-bit RSA-OAEP in a mere 1100 queries, assuming your server leaks a little bit too much information on why a given decryption attempt went wrong.****

But even if you do everything correctly, the OAEP security proof only holds in the random oracle model. That means that whatever guarantees it supposedly offers are only valid if the hash function is ideal. That’s a good heuristic, meaning that OAEP is unlikely to be laden with obvious flaws.

But if you dig into the OAEP security proof, you’ll see that it’s using random oracles in about the strongest ways imaginable. The essential concept of OAEP is to use a pair of hash functions to construct a tiny, invertible Feistel network. The beauty is that every time you encrypt, you’re essentially sending the message (and some randomness) directly “through” the random oracle, in a way that permits all sorts of nonsense.

The reduction proof can then see what the adversary’s encrypting, which turns out to be very useful when you’re building a system that’s secure against chosen ciphertext attacks. This would never work if you implemented OAEP with a real hash function, like the ones that are actually used to implement it in real life.

So in one very real sense, we still don’t really have a practical, provably-secure way to encrypt with RSA. Moreover, there’s little reason to believe that we ever will, given a slew of impossibility results that seem to keep getting in the way.*****

All of which raises an important question. Given all the limitations of RSA, why are we so darn attached to it?

The Digital Signature Algorithm (DSA/DSS)

You might find it a bit strange that I’m mentioning DSA in a discussion about provable security. This is mainly because, unlike the Elgamal signature on which DSA is based, the DSA standard has no security proof whatsoever.******

There’s no really defensible reason for this. As best I can tell, NIST took a look at the Elgamal signature and thought it was cute, but the signatures were a little bit too long. So they took some scissors to it and produced something that looks basically the same, but doesn’t have the very nice reduction to the Discrete Logarithm assumption that Elgamal did.

This kind of behavior is par for the course, and honestly it gets me a little down on government standards bodies.

For the record, while DSA isn’t really provably-secure at all, the Elgamal scheme is. But unfortunately, the Elgamal scheme is only provably secure in the random oracle model. This means that if your hash function has the very nice property of being a perfectly random function and programmable, Elgamal signatures can be reduced to the hardness of the Discrete Logarithm problem. But otherwise you’re on your own.

Key Derivation and Beyond

There are probably a zillion cryptosystems that ought to be mentioned here, and we can’t cover them all. Still, there’s one more class of ‘things that hash functions are used for’ that probably isn’t really secure at all, unless you’re willing to make some pretty strong assumptions about the properties of your hash function.

The example I’m thinking of is the use of hashing to derive cryptographic keys. For example, many password-based encryption schemes use a hash function to combine a password and some salt to derive a key for encryption. This approach seems to work ok in most cases (assuming a high entropy password), but it doesn’t have any strong theoretical legs to stand on.

In general, if you’re trying to extract a uniform random key from from a ‘lumpy’, non-uniform source like a password, the proper tool to use is something called a randomness extractor. These can be implemented in the standard model (no random oracles), but all of the existing constructions pretty much suck. This sounds rude, but it’s actually a technical term that cryptographers use to indicate that a scheme is complicated and slow.

So in real life, nobody uses these things. Instead they just hash the password and salt together, and assume that the result will be pretty good. If they’re feeling frisky they might even hash it a bunch of times, which is the crypto equivalent of shooting a turkey with an AK-47. It’s probably just fine, although technically you can never rule out bulletproof turkeys.*******

Whenever people bother to argue formally about the security of schemes like this (which is, essentially, never), the usual argument goes like this: let’s just assume the hash function is a random oracle. In the random oracle model, of course, every hash function is a perfect extractor. Stick any high-entropy source into an ideal hash function, no matter how ‘lumpy’ or ‘weird’ the input is, and you get a fantastic key out the other end.

So even if your favorite key derivation has no formal security proof whatsoever, you can bet somebody, somewhere has made this argument, if only to help themselves sleep better at night.

In Summary

This was supposed to be a post about the random oracle model, but I feel that it took on a life of its own. That’s not a bad thing. I promised that I’d spend some time talking about practical crypto, and I’ve certainly accomplished that.

Although I could say a million things about the random oracle model, I only have one more post left in me. And in some ways, this is going to be the most important post of all.

You see, we always knew that this ride wouldn’t last forever, we just thought we had more time. Unfortunately, the end is nigh. Just like the imaginary city that Leonardo de Caprio explored during the boring part of Inception, the random oracle model is collapsing under the weight of its own contradictions.

This collapse — and what it means for cryptographers, security professionals, and everyone else — will be the subject of my final post.

This series concludes in Part 5.

Notes:

* NIST has begun to deprecate RSA in its FIPS standards. You can use it in some places, but it’s no longer a major supported algorithm.

**** Specifically, the OAEP decoding algorithm has a step in which it decodes the padded plaintext (a value between 0 and N-1) into a bitstring. This requires it to check that the decrypted RSA value is less than 2^n, where n is some number of bits that can be reliably encoded within the range. The Manger attack assumes that the decoding algorithm might leak the result of this check.

***** Although, on the signature side things are a little better. Hohenberger and Waters recently proposed a hash-and-sign RSA signature that’s secure under the RSA assumption without random oracles. This is really a very neat result. Unfortunately it’s inefficient as hell.

****** This is not to say people haven’t tried to piece a proof together. You can do it, if you’re willing to work in the random oracle model, and also assume the existence of leprechauns and unicorns and similarly unrealistic things.

******* In fact, repeated hashing is usually just as as much about making the hash process slow as it is about achieving a strong key. This can be helpful in thwarting dictionary attacks.

Attack of the week: XML Encryption

Unfortunately I had to skip this year’s CCS because I was traveling for work. This means I missed out on a chance to go to my favorite Chicago restaurant and on the architecture cruise. I’m told that there were also some research papers being given somewhere.

The Internet can’t make up for every missed opportunity, but it can help with the research. In fact, today’s news is all about one of those papers. I’m talking about Tibor Jager and Juraj Somorovsky’s work entitled “How To Break XML Encryption“.*

In case I had any doubts, it only took one glance at the first page to confirm that we’re back in “someone screwed up basic symmetric encryption” territory. But really, to describe Jager and Somorovsky’s findings like this does them no justice. It’s like comparing the time that kid built a nuclear reactor in his garage to, say, the incident at Fukushima.

Some background is in order.

What is XML Encryption and why should I care?

You probably already know that XML is the world’s most popular way to get structured data from point A to point B.** Following the success of HTML, the computing world decided that we needed a single “flexible” and ASCII-based format for handling arbitrary data types. (We’ve spent much of the subsequent period regretting this decision.)

The one upshot of this XML epidemic was the emergence of some standards and libraries, which — among other things — were supposed to help application developers process data in a reliable and secure way. One of those standards is SOAP, which is used to transport data in web services frameworks.

Another standard is the W3C XML Encryption Standard, which was dropped like a log in 2002 and doesn’t seem to have been updated since. The point of this standard was to give developers a uniform (and secure!) way to protect their XML documents so that we wouldn’t wind up with five thousand incompatible, insecure ways of doing it. (Oh the irony.)

Finally, a very common implementation of both standards can be found in the Apache Axis2 web services framework and in the RedHat JBoss framework. These are probably the most common open-source SOAP frameworks. They power a number of systems you don’t necessarily think about, but you probably should because they carry a lot of your personal information.

So what about the encryption?

To make a very long story short, the W3C standard recommends to encrypt messages using a block cipher configured using (our old friend) CBC-mode. Here’s a quick recap on CBC mode, courtesy Wikipedia:

CBC mode encryption. The message is first subdivided into equal-length blocks and encrypted as in the diagram.  The circular plus symbol denotes XOR.

There are two basic things you need to know about CBC mode, and ought to know if you ever plan to use it.***

First: CBC requires that every plaintext be an even multiple of the block size. In the case of AES, this means 16 bytes. If the message is not the right length, then CBC implementations will pad the message with additional bytes. Of course it must be possible to recognize this padding so that it can be stripped off after decryption. There are various schemes for doing this.

The W3C standard uses the following padding approach. Let “MM” indicate message bytes, and let “NN” be the total number of padding bytes being added. “XX” represents arbitrary padding bytes, which can hold any value you want. A padded block will look like this:

0x MM MM MM MM MM MM MM MM MM MM XX XX XX XX XX NN
When a decryptor encounters a padded message, it looks at the last byte to figure out how many bytes to strip off. If it encounters padding that doesn’t make sense, i.e., NN > 16 or NN < 1, then it should see something funny going on and reject the message.
Second: CBC ciphertexts are malleable. This means that you can modify a CBC-encrypted ciphertext such that your modifications will carry through decryption, and have a meaningful effect on the resulting decrypted plaintext.
In the simplest case, you can truncate the message by stripping blocks off the end. This does the same thing to the resulting plaintext.
More interestingly, you can flip any bit (or set of bits) in the IV of a CBC-encrypted message, and upon decryption you’ll find that the same bits have been flipped in the first block of the decrypted plaintext. You can also do a lot more, but it’s not that important right now.

Anything else I should know?

Yup. You need to know how XML messages are formatted. They’re encoded using UTF-8 — essentially ASCII — with some special rules.

These are described in the paper. Briefly: any XML message that contains bytes ranging from 0x00-0x1F (with the exception of tabs, LF and CR) may be considered invalid. Similarly, the ampersand (&) character is used as an escape character, and must be followed by a valid string. Lastly, open brackets “<” must be properly closed. Messages that don’t meet this standard should be rejected.  ****
Of course, the message will have to be decrypted (using the appropriate key) before any of these checks can be run.

This is all fascinating, but how does it lead to an attack?

There’s one more detail I haven’t given you. You see, in addition to the details above, the Axis2 server (ditto JBoss) is kind enough to let you know when you haven’t met its standards for an XML message.

Specifically, after it decrypts a message, it kicks back an error if the message either: (1) has bad padding, or (2) contains invalid characters. In both cases, the error is the same. And this error is your way in the door.

I’m not going to completely describe the attack, but I’ll try to give an overview.

Imagine that you’ve intercepted a legitimately-encoded, encrypted XML message (IV, C1, …, CN) and you want to know what it says. You also have access to an Axis2 or JBoss server that knows the key and can decrypt it. The server won’t tell you what the message says, but it will give you an error if the encoding doesn’t meet its standards.

Sending the original, intercepted message won’t tell you much. We know that it’s encoded correctly. But what if you tamper with the message? This is precisely what Jager and Somorovsky proceed to do.

Step 1: Truncate the ciphertext. A good way to start is to chop off everything after the first ciphertext block, and send the much-shorter message consisting of (IV, C1) in to be decrypted. Chances are good that this new message will produce a decryption error, since the server will interpret the last byte of the decrypted message as padding — a number that should be between 0x01 and 0x10.

Step 2: Tweak the padding. I already said that if you flip bits in the IV, this will result in a similar change to the decrypted plaintext. Using this concept, you can force the last byte of the IV through all possible values and ask the server to decrypt each version of the ciphertext. More formally, the ‘new’ last IV byte can be computed as (‘original last IV byte’ ⊕ i) for i from 0 to 255.

In the best case, 16 of these test ciphertexts will decrypt without error. These correspond to the decrypted values 0x01 through 0x10, i,e., valid padding. If fewer than 16 of your test ciphertexts decrypt, this means that there’s something wrong with the message: probably it contains an unclosed “<” tag.

Step 3: Squish the bug(s). This is no problem. If exactly only one of your ciphertexts decrypts successfully, that means the open “<” character must in the first byte of the message. You caused the last byte of the message to decrypt to 0x10 (16 decimal), and the decryptor treated the whole block as padding. There are no errors in an empty message.

If two ciphertexts decrypt successfully, then the “<” character must be in the second position of the message, because now there are only two padding lengths (16 and 15) that would cause it to be stripped off. And so on. The number of successful decryptions tells you where the “<” is.

Now that you know where it is, kill it by flipping the last bit of the appropriate byte in the IV. This turns “<” into a harmless “=”.

Step 4: Learn the last byte of the block. You should now be able to come up with 16 test ciphertexts that decrypt successfully. This means you have 16 values of “i” such that ‘last byte of the original IV’ ⊕ i results in a successful decryption. One of these “i” values will differ from the others in the fourth most significant bit. This one will correspond to the padding value 0x10.

If “x” is the original plaintext byte and “i” is the value you just identified, you know now the last byte of the block. Since we have x ⊕ i = 0x10, then through the very sophisticated process of rearranging a simple equation we have 0x10 ⊕ i = x.

Now, using our knowledge of the last byte of the plaintext, manually tweak the IV so that the last byte (padding) of the plaintext will decrypt to 0x01.

Step 5: Learn everything else. The rest of the attack hinges on the fact that all of the bytes in the message should be ‘acceptable’ UTF-8 characters. Thanks to our trick with the IV, we can now flip arbitrary bits in any given byte, and see whether or not that leads to another ‘acceptable’ character or not.

Believe it or not, the results of this experiment tell us enough to figure out each character in turn. I won’t belabor this, because it takes a little bit of squinting at the ASCII table, but fundamentally there’s not much magic beyond that. It’s very simple and elegant, and it works.

Step 6: Finish it. You’ve done this for one block. Now do it for the rest. Just set your ‘ciphertext’ to be the next block in the message, and your ‘IV’ to be the ciphertext block immediately preceding it. Now go back to Step 1.

And that’s the ballgame.

Obviously all of this takes a lot of tests, which you can reduce a bit using some of the optimizations suggested in the paper. Jager and Somorovsky are able to recover plaintexts with “14 requests per plaintext byte on average.” But at the end of the day, who cares if it takes 50? The box is sitting there ready and willing to decrypt ciphertexts. And it’s incredibly unlikely that anyone is paying attention.

Ok. Can’t you fix this by authenticating the ciphertexts?

This entire attack is a special example of an adaptive chosen ciphertext attack. (Specifically, it’s a super-duper variation of Vaudenay’s padding oracle attack, which he discovered in 2002, the same year the W3C standard hit!)

These attacks can almost always be prevented with proper authentication of the ciphertexts. If the decryptor checks that the ciphertexts are valid before decrypting them, the attacker can’t tamper with them. Hence, no useful information should leak out to permit these attacks at all.

The designers could have added a mandatory MAC or even a signature on the ciphertext, or they could have used an authenticated mode of operation.

But the W3C standard does provide for MACs and signatures!

Indeed. As the authors point out, there’s an optional field in the specification where you can add a signature or a MAC over the ciphertext and its meta data. But there’s one tiny, hilarious thing about that signature…

Did I mention it’s optional?

You can quite easily take a signed/MACed ciphertext and ‘convert’ it into one that’s not signed or MACed at all, simply by stripping the ciphertext contents out and placing them into a section that does not claim to have a signature. Since the spec doesn’t mandate that the authentication be on every message, the decryptor will be totally cool with this.

So in summary: optional MACs and signatures don’t buy you much.

Is there anything else to say about this?

Lots. Stop using unauthenticated CBC mode. Stop using any encryption standard designed before 2003 that hasn’t been carefully analyzed and recently updated. Stop using any unauthenticated encryption at all. Wrap your SOAP connections with TLS (a recent version!) if you can.

People didn’t really think much about active chosen ciphertext attacks on symmetric encryption before 2000, but now they’re the new hip thing. If your system is online and doesn’t have a solid, well-analyzed protection against them, don’t pretend that you’re doing anything at all to secure your data.

I wish I had a funny, pithy way to sum this all up. But honestly, I’m just a little depressed. It’s not Jager and Somorosky’s fault at all — this is a great paper. It’s just that there’s too much embarrassingly bad crypto going around.

While I love reading these papers, I’m starting to feel that this one-off approach is not sufficient to the problem. Maybe what the community needs is a centralized clearinghouse for this stuff, a website where suspect crypto standards (and implementations) can be identified and listed. Crowdsourced and analyzed. Probably many are just fine, but I doubt most have been looked at (or even thought about in a while).

Anyone looking for a project?

Notes:
  
* Tibor Jager and Juraj Somorovsky, “How To Break XML Encryption”. In ACM CCS 2011. ACM Press.

** Someone is inevitably going to tell me that JSON is the world’s most popular way to move structured data. Ok. I don’t care.

*** But better yet, don’t use it. Use a standard authenticated encryption scheme like CCM, OCB or GCM, and use someone else’s implementation according to the standards. These modes should prevent all of this nonsense.

**** These details are quoted straight from the Jager and Somorovsky paper. I’m not 100% sure if all implementations enforce this, but Axis2 does.

What is the Random Oracle Model and why should you care? (Part 3)

This is part 3 of a series on the Random Oracle Model.  See here for the previous posts:

Part 1: An introduction
Part 2: The ROM formalized, a scheme and a proof sketch

I began this series in the hope of writing a true layperson’s introduction to the Random Oracle Model. At the outset I figured I probably knock this thing out in a couple of not-too-technical posts. Yet, here I am at post three, and there’s still a big chunk of technical left to go.

Please bear with me. We’re going to get back to some solid, practical crypto with the next post in the series.

A quick recap

In the last two posts we’ve covered a bit of ground. We talked about cryptographic hash functions and their properties, and discussed why the cryptographer’s ‘ideal’ hash function would be a random function. We noted that random functions would never work in real life (too big to store, too slow to compute), so we came to a strange resolution: we would pretend that our hashes were random functions, just for the purposes of our security analysis. Then of course we would use real hash functions like SHA.

Of course, even this required us to tamper with the fabric of the universe, since we want to restrict our ‘adversaries’ to running in a reasonable amount of time, and they can’t do that if they’re busy evaluating random functions. So we created a new model of computation (aka ‘thought experiment’) called the Random Oracle Model. In this model all hashing is computed by a magical third party called an oracle.

If you know a little about the random oracle model, say from reading the previous posts, you could be forgiven for thinking that it isn’t all that bad. Basically we’re assuming that hash functions are super-duper “random”. Maybe a little more so than a real function can be, but big deal.

But unfortunately it’s much more than that.

The world’s shortest lesson on provable security

To explain what I mean, we first need to understand a little about provable security, and particularly, a type of proof called a reduction proof.

When we say that an encryption scheme is “provably secure”, we usually mean that it’s secure under some assumption. For most practical schemes the assumption is mathematical in nature. For example, we might found an encryption scheme on the assumption that it’s hard to factor large composite numbers, or to find the shortest vector in a lattice. In this case ‘hard’ doesn’t mean ‘I can’t figure out how to do it’. It means: there exists no (efficient) algorithm that can do it.1

Now, I mentioned that these are assumptions. We don’t absolutely know that these problems are hard. However, we like to think that if some bright people have looked at the problem for a while, and nobody has made any progress towards solving it, that means it’s probably not easy. In some cases the problem may even fall into a useful class of problems, i.e., it may be NP-Hard.

Even if you’re not willing to blindly trust in these assumptions, security proofs are still useful. Analyzing a new cryptosystem can soak up huge amounts of time and energy. If we can at least ‘reduce’ each new scheme to one of a small number of mathematical problems, we can pour our brainpower into attacking just those problems, knowing that our work will apply to every scheme that uses them.

All of this brings me to my main point, namely, how do these proofs work. Getting specific, the general flow of such a proof is as follows:

  1. Assume that problem X is hard.1
  2. Show that if there exists an (efficient) algorithm that “breaks” our cryptographic scheme, then there exists an (efficient) algorithm that solves problem X.
  3. Snottily point out that (2) means if the scheme can be broken, then problem X can’t be hard.

Clearly point (3) cannot be true if point (1) is. Following the logic, this allows us to conclude that if Problem X is hard, then there can’t be an algorithm that breaks our encryption scheme.

Of course I haven’t explained how to actually accomplish step (2), which is pretty important. It’s also the beautiful part. You see, in most reductions, step (2) consists of actually writing an efficient algorithm that solves problem X.

You probably think I’m crazy. I just claimed that problem X can’t be efficiently solved, now I’m telling you that we write an algorithm for solving it. But the genius of it is that our solver algorithm won’t quite work. It’s missing an important subroutine.

That subroutine is identical — at the API level — to an algorithm that ‘breaks’ our encryption scheme.

So here is how the argument breaks down: if there was, hypothetically, an algorithm that efficiently ‘broke’ our encryption scheme, then we could plug it into the hole we left in our ‘solver’ algorithm. This would yield an algorithm that efficiently solves problem X. Hence, it completely satisfies what we’re asking for in point (2).

Black Boxes

With this foundation laid, I only need to make one more point. There’s a basic restriction that standard reduction proofs must obey. Our ‘Problem X solver’ algorithm must use the ‘encryption scheme breaker’ subroutine as a black box. It cannot make assumptions about how the breaker algorithm works internally.

Let me be more clear about this. We cannot, for example, have a step in our ‘solver’ algorithm where we decompile the breaker algorithm, or peek inside its internal memory registers, and then assume that they’ll contain a partial result we need.

This is because our reduction only works if it works with every possible algorithm that breaks the scheme. Since we can’t guess a priori how such an algorithm might work internally, we can’t depend on its internal workings. The breaker algorithm could hypothetically be horrifically obfuscated, to the point where we couldn’t understand it even if we held the code in our hands.

Thus in a traditional black box reduction, we can count on the breaker algorithm having a standard “API” (which we’ll define at some point). This typically takes in public keys and/or ciphertexts (or signatures, etc.) It outputs some useful information that constitutes our definition of ‘breaking’. And though it doesn’t always need to succeed, we expect that it does so with reasonable probability. But everything that happens inside the algorithm must be hidden from us.

Dirty laundry

You wouldn’t think this has
anything to do with provable
security.

Let’s pause for a moment to consider some related problems in the area of celebrity journalism.

Imagine you’re a muckraking celebrity reporter who’s been assigned to get a scoop on Brad Pitt and Angelina Jolie’s latest adopted baby. The Pitt/Jolies are extremely private, and won’t even acknowledge that they’re adopting. You’re a clever reporter, and more importantly, you have no morals.

You know that if you could just get a peek at the family laundry, you’d have your whole story. Sex, size, weight, favorite color. Even diet (Angelina insists on eco-safe cloth diapers). In a perfect world you could drive over to Pitt/Jolie’s laundry company, slip someone $500, and lo — your whole story would be there in the clothes. If you were really devious, you could start your own celebrity laundry service and trick them into sending it right to you.

But this is all wishful thinking. Pitt and Jolie don’t send out their laundry — they have it done at home.  In truth, you’re not sure that they even do laundry. You heard a rumor that Angelina has her clothes ritually burned after each use.

In case you don’t see the relevance, this is an analogy to reduction proofs. To you, the ‘breaker’ algorithm is a black box. You can see inputs go in, and see what comes out, but you can’t see what it’s doing in between. If your reduction depends on peeking at its dirty laundry, you’d better come up with a better reduction. It does its own laundry.

Back to the Random Oracle Model

The Random Oracle Model is different. In my previous post I explained that it’s just like the real world, with one exception: nobody can compute hashes by themselves. Every party — including the adversary (synonymous with ‘breaker’ algorithm) — must hash by sending messages to a special outside party. This party computes the hashes using a random function, and sends them back to the caller.

To make this abundantly clear, in the Random Oracle Model, everyone sends out their laundry.

The Random Oracle Model. The oracle computes hashes for all parties, including the adversary.

Speaking more formally, in the Random Oracle Model, an adversary (aka our ‘breaker’ algorithm) is no longer quite a black box. In one very important way we have visibility into it. It must call outside of itself if it ever wants to hash something.

If we’re building a reduction, that means we can take advantage of the ‘extra’ info that leaks from these calls, just as our gossip reporter could take advantage of celebrities’ laundry. We know that the algorithm has to make these hash calls, so there must exist a ‘port’ in the API for it to make them. Once we make this stipulation, our reduction can tap into this port and intercept the adversary’s hash calls to see which messages it’s asking for, and possibly even tamper with the results it gets back.

In an extreme case, our reduction can even run its own fake ‘oracle’ (aka laundry service), as long as we always hand back convincing responses.

And while this may not seem a big deal, it really is. Just as Brangelina’s dirty laundry could reveal major details about the stars’ private life, the choice of which messages a ‘breaker’ algorithm chooses to hash can reveal major details about what it’s thinking internally. By tampering with those responses we can also fundamentally change its operation. These details can be the difference between a reduction proof that works, and one that doesn’t.

Taking Stock

At this point, I think it’s worth taking a breath and asking what we’ve gotten ourselves into here. When we started with this nonsense, all we wanted was a better hash function. One that looked slightly more ‘random’ than any real function we could construct.

Along the way we realized that random functions were totally impractical. Even to pretend we could use them, we had to tamper with our computational model and “pull the random function” outside of the parties. After all, they run in polynomial time and couldn’t possibly evaluate such a function.

Each of these steps seemed perfectly reasonable, and yet somewhere we jumped the shark. We took the most sacrosanct principle of reductionist security and violated it. Instead of “our argument works with any adversary”, we now say “our argument works only with adversaries who kindly tell us whatever and whenever they need to hash”.

Real adversaries don’t work like this. If someone comes up with an algorithm that breaks an implementation of RSA-OAEP that uses a real hash function (say SHA256), that algorithm won’t need to tell us what it’s hashing. It will just do it internally. And if its code is obfuscated, we won’t be even able to connect it to our reduction, if our reduction depends on knowing (or tampering with) the hash calls.

An Example: RSA Signing

At this point I’ve run out of ways to talk about the Random Oracle Model without actually invoking a real scheme. So let’s bring this back to something concrete.

The RSA scheme is probably the most famous cryptosystem of the modern age. Many people mistakenly believe that RSA is based on the hardness of factoring large integers. In fact, that’s not quite accurate. RSA is actually based on a slightly different, but related assumption not-creatively named the “RSA assumption“.

This assumption holds that the following problem is fundamentally hard, at least when the numbers get sufficiently large.1 Given three values (N, e, M), where:

  1. N is the product of two prime numbers (p, q) that you don’t know.
  2. 2 < e < N and gcd(e, (p-1)(q-1)) = 1.
  3. 0 <= M < N.

The problem is to find a value “S” such that S^e mod N = M.  2

It’s conjectured that this problem is hard1 if you only have the values given above. However, if you know (p, q) or an additional ‘secret key’ derived from those values, it’s quite easy to solve.

The RSA signature scheme depends on this. In a textbook RSA signature, you “sign” a message by first encoding it as an integer m, such that 0 <= m < N.  Then, using your secret key you compute a ‘signature’ “s”, such that s^e mod N = m. (See here for the exact algorithm.)

The intuitive argument for RSA signatures rests on three legs:

  1. Since only you know the secret key, it’s easy for you to “sign” messages of your choice.
  2. Anybody else can verify your signatures given (m, s) and the “public key” (N, e).
  3. Intuitively, nobody else can sign messages on your behalf (this is known as ‘forging’), because doing so without the secret key would be equivalent to solving the RSA problem.

This is, of course, total nonsense.

If you look carefully, what the RSA assumption actually implies is that it’s hard to come up with a “signature”(S) on a particular message (M). If the adversary has freedom to choose the message himself, then it’s easy for him to come up with a whole bunch of ‘signed messages’. This makes textbook RSA signatures pretty useless.3

More importantly, to prove that RSA signatures are secure under the RSA assumption, we can’t just have the adversary spitting out any old message/signature pair. To make a reduction work, we must use the adversary to solve a specific RSA problem instance (N, e, M). That means whatever the adversary outputs, we must be able to use that value in order to compute an “S” that satisfies S^e mod N = M.

And now here, finally is where the hash functions come through for us.

Adding a Hash

I’ve mentioned previously that most signature schemes don’t literally sign a raw message. Rather, they sign the hash of a message. This is convenient, since messages can be large and hashes are usually small. But in the Random Oracle Model it has a special significance.

Let’s say that we have a hash function H() with the odd property that it outputs an integer between 0 and N-1. This is called a “full domain hash”. This is a little unusual for a hash function, but I promise that such things can be built out of more “traditional” hash functions like SHA.

Instead of signing the message directly, we could have our signer first hash the message (which can now be an arbitrary string), and sign the resulting hash. More formally, a signature “s” must now satisfy:

s^e mod N = H(message)

Can we prove that this scheme is secure under the RSA assumption?

A proof sketch

Imagine that A is some algorithm that breaks the signature scheme, meaning that it takes in a public key (N, e), and eventually outputs an “existential forgery”, i.e., any pair (message, s’) that satisfies the verification equation above.

Our goal is to design a solving algorithm for that can take in an arbitrary RSA problem instance (N, e, M) and somehow force A to help us solve it. Our goal is to eventually spit out a value S where S^e mod N = M.

First, our solver will send A a “public key” consisting of (N, e). Since we’re in the Random Oracle Model, A will eventually have to hash something. Whenever A asks to hash a message, we’ll intercept its attempt to call the oracle, pick a random value 0 <= r < N and answer its query as follows:

H(message) = M * r^e mod N

I will claim that since “r” is random, then our “hash” result is also randomly-distributed, and thus “looks right” to A. So he won’t twig to what we’re doing. Most importantly, we’ve managed to tie every single hash value back to the one value “M” that we took from the RSA problem instance. Of course, we’ll also store each value “r” so we can retrieve it later.

We assumed that eventually A will output a valid “forged” message/signature pair of the form (message, s’), where this pair satisfies s’^e mod N = H(message). If this is the case, then all we have to do is look up the “r” value used to compute H(message) and output:

S = s’ / r mod N         (or more accurately, s’ * inverse(r) mod N)

This is the answer to our RSA problem instance, since S^e mod N = M. 4, 5

Why does this work? Well, let’s simplify the equation “s’^e mod N = H(message)” a bit:

s’^e mod N                            = H(message)
s’^e mod N                            = M * r^e mod N         (this is the hash we computed)
s’                                            = M^{1/e} * r mod N  (raising both sides to the inverse of e)
        s’                                            = S * r mod N              (because S = M^{1/e}.)

A full version of this proof can be found here. None of it works in the real world, of course. The minute you sub in a real hash function, this whole proof is bunk.

A few notes and a conclusion

This has been a long post; even my footnotes are getting out of control. I do hope I was able to make my main point, which is that the Random Oracle Model is both deeply unnatural, and deeply useful to cryptographers writing security proofs.

This is a dangerous combination, since it allows us to take shortcuts and make assumptions, that simply can’t be supported in the real world we live in. And yet, it’s so powerful that we can find it hard to stop.

More to the point, even when cryptographers stop relying on this crutch (and it’s started to happen), we have a hard time convincing end users to get rid of schemes that have been proven in the ROM. Heck, we have a hard time convincing our friends in the systems security community. They think we’re nuts.

And it’s hard to blame them. Even when we have proper replacements, they’re usually funny looking, almost always more “costly”, and sometimes rest on bizarre new mathematical assumptions that people don’t trust. Plus, it’s not like the random oracle schemes are broken.

In my next post I’m going to talk about all this, pull back from the crypto a bit, and focus on this dispute. I’ll also talk a bit about the state of the art in ROM/non-ROM schemes.

And oh yes, I’ll explain why cryptographers have started to get religion on this point.  Hint: someone proved something. Something very bad for the ROM.

Click here for the next post in this series.

Notes:

1. More formally, saying that a problem is “hard” means that that all probabilistic polynomial time Turing machines solve it with at most negligible probability.

2. For a quick explanation of modular arithmetic, see here.

3. For example, he can pick an arbitrary number as “S”, then — using only the public key — work backwards to compute “M” as M = S^e mod N. This pair (M, S) is a valid message/signature pair. Similarly, he can mangle legitimate message/signature pairs in various ways, to get new ones.

4. In practice, there’s no division mod N. Instead, we compute the inverse of r” mod N, then multiply by that number.

5. In a real proof, we would also consider the possibility that the adversary can ask for and obtain signatures on chosen messages. Our reduction would have to answer these requests. This can be done, but it adds a bit of complexity. See the full proof for some details on this process.

Oh my…

Neal Koblitz is not a happy camper:

At Crypto 2011 the Program Chair, Phil Rogaway, explained that the paper [2] was chosen for the Best Paper Award “overwhelmingly” by the Program Committee, which “praised the work for its broad appeal…and its potential impact.”

And, indeed, the treatment of hashed ElGamal encryption in [2] is in some sense a remarkable achievement. Usually one has to leave the sciences entirely if one wishes to find works of scholarship — for example, Michel Foucault’s turgid 760-page three-volume philosophical treatise on sex [14] — that have been so successful in turning something that should be interesting and accessible to everyone into something lengthy, unreadable, and boring.

DESFire

If you skipped today’s crypto news, you missed some good press for Daviddesfire Oswald and Christof Paar’s recent side-channel attack on the Mifare DESFire MF3ICD40 chip.* I’m hardly an expert on side-channel attacks, but being uninformed has never stopped me before, and I figured it might be fun to read the paper and talk about this very neat attack.

What’s the DESFire MF3ICD40 and where is it used?

I’m guessing MF3ICD40 probably isn’t a household name where you live, but you might be familiar with one of the systems that use it.  These include the San Francisco Clipper card, the Czech “in-karta”, and a cute Australian card transit card known as the myki.  It also powers a number of access control systems, like the prox cards that let you into some high security buildings.
Clipper card.

The MF3ICD40, henceforth “MF3”, is a capable little chip.  It’s a Radio Frequency Identification (RFID) device, meaning that it communicates wirelessly with a reader — e.g., a Clipper card turnstyle.  It can store a certain amount of information and retrieve it on demand.  Like the DST chip I mentioned in an earlier post, every MF3 contains a cryptographic key and a cipher, which it can use to (among other things) perform a challenge/response authentication protocol.

Unlike the DST, the MF3 uses a pretty reasonable cipher for its authentication protocol. Specifically, it employs Triple-DES with a 112-bit key.  While 3DES is definitely getting a bit long in the tooth (it’s been deprecated in FIPS), it’s not thought to be vulnerable to any truly practical attacks.  Nor, to the best of my knowledge, is there anything wrong with the way that the MF3 uses 3DES.

So if the crypto is ok, what’s the problem?

Just like the DST, the MF3 chip contains no internal power source.  Instead, power for the chip is extracted from a magnetic field generated by the reader.  This is mostly a good thing — by powering the chip this way, the designers were able to dramatically reduce its size and cost.  Moreover, you don’t need to replace your MF3 when some crappy battery dies.

Unfortunately, the use of an external power source opens the device up to a special class of side-channel attack known as Correlation Power Analysis, or CPA.  Since the MF3 draws power from the reader, a malicious reader can actually tell — from moment to moment — how much power the device is using.

Ordinarily this wouldn’t be very interesting. However, if you know exactly how the device performs the computation, and you can repeatedly measure the device’s power consumption while sending it carefully crafted input values, you can actually learn quite a lot.  Specifically, when the device encrypts (or decrypts) your chosen inputs, those power fluctuations can actually leak the bits of the cryptographic key.  In the case of MF3, you can obtain the full 112-bit key in about seven hours.

Isn’t side channel analysis pretty well known?

Yes, and that’s what makes this interesting.  With caveats.

Side channel attacks themselves are hardly a new thing.  The concept of using power analysis to attack cipher implementations was first proposed in CRYPTO ’99 by Paul Kocher, Joshua Jaffe and Benjamin Jun, all of whom worked for Paul’s company, Cryptography Research.  In a truly inspired move, Paul and his researchers turned around and patented the standard set of countermeasures to the very attack they had invented.  These countermeasures can be found in just about every high-security smart card manufactured today.

This story is important because the MF3 clearly didn’t have the latest and greatest hardware countermeasures built into it.  If it had, the Oswald and Paar attacks wouldn’t have gone anywhere. (Though interestingly, it did have one countermeasure, see below.)

This seems to be a common situation for a certain class of RFID devices, of which the MF3 is one.  These devices toe a funny line between “contactless smart card” (where SCA countermeasures would be expected) and “plain-old RFID” where security isn’t thought to be as important. Unfortunately, the people who select these systems may not realize this, and that’s when bad things can happen.

I should also add that — unlike a recent side-channel attack on KeeLoq — this attack does not require the adversary to physically crack open the device. It’s implemented purely via the EM field.

So how does the attack work?

This is clearly the crux of the matter, and unfortunately this is where I’m going to have to let you off with a warning.  Side channel analysis is just not my area, and there’s little I can offer here that would improve on reading the paper itself.**, ***

But to make a very short go of it, DES works by breaking the key up into smaller pieces called “subkeys”, each of which is used in a different round of the cipher.  Portions of these subkeys are combined with bits of of an attacker-supplied input and sent to a series of small functions called S-boxes.  Since only a few bits go into each S-box, there is a relatively small set of possible input values, each of which results in a slightly different power consumption profile.

By observing the power consumed when calculating these S-boxes on different inputs — both for known and unknown keys, and over many, many experiments — the attacker can correlate these traces to figure out the bits that come from an unknown key.  The full key recovery attack takes ~250,000 traces, which require about 7 hours to collect.

The most interesting aspect of the paper is that the authors didn’t actually know how the DES implementation in the MF3 worked.  Much of the paper is devoted to reverse-engineering the necessary details of the operation by examining power traces.  What’s most interesting about this discussion is that along the way, they discovered that the MF3 actually does implement a countermeasure to CPA — specifically, it randomly adds up to eight “dummy rounds” to DES. Unfortunately for the MF3, this protection was not sufficient to prevent the attack.

Figure from the Oswald-Paar paper.  These charts show the operation of the DES cipher  on several input values.  The authors used this information to reverse-engineer the operation of the device.

So what’s next for MF3?

A quick glance at the MF3 website will tell you that the manufacturer “has begun phasing out the MF3ICD40”, but that “orders … will continue to be taken until December 31st, so start planning your new year’s party now!” (Ok, I made up the last bit.) From here on out, it looks like it’s going to be AES-based cards all the way.

The note doesn’t specifically mention this attack, but I assume that’s what’s behind it.  If so, that’s a pretty good response for a manufacturer.  To pull a profitable card off of the market probably cost them quite a bit, but it protects their reputation, not to mention their customers.  Unfortunately, there’s no indication that the newer AES-based cards actually have stronger protections against side channel analysis; they’re just using a different cipher.  But I guess that’s what next year’s CHES is for.

Notes:

* This post brought to you by David Oswald and Christof Paar.  Breaking Mifare DESFire MF3ICD40: Power Analysis and Templates in the Real World.  In CHES ’11.

** This is an extended version of the conference paper that I read. Thanks to Bart Coppens for finding the link.

*** If you want to hear from people who do know their side channels, see Luke Mather’s excellent post on DPA and distinguishers from the Bristol Cryptography Blog.

What is the Random Oracle Model and why should you care? (Part 2)

This is part 2 of a series on the Random Oracle Model.  See “Part 1: An introduction” for the previous post.

In a previous post I promised to explain what the 4430509846_717f9aab2b_zRandom Oracle Model is, and — more importantly — why you should care about it.  I think I made a good start, but I realize that I haven’t actually answered either of these questions.

Instead what I did is talk a lot about hash functions. I described what an ideal hash function would look like: namely, a random function.  Then I spent most of the post explaining why random functions were totally impractical (recap: too big to store, too slow to compute).

At the end of it all, I left us with the following conclusions: random functions would certainly make great hash functions, but unfortunately we can’t use them in real life. Still, if we can’t get a security proof any other way, it might just be useful to pretend that a hash function can behave like a random function. Of course, when we actually deploy the scheme we’ll have to use a real hash function like SHA or RIPEMD — definitely not random functions — and it’s not clear what our proof would mean in that case.

Still, this isn’t totally useless. If we can accomplish such a security proof, we can at least rule out the “obvious” attacks (based on the kind of glitches that real scheme designers tend to make). Any specific attack on the scheme would essentially have to have something to do with the properties of the hash function. This is still a good heuristic.

As you can see, this has gotten somewhat technical, and it’s going to get a bit more technical through to the end of the post. If you’ve gotten enough background to get the gist of the thing, feel free to skip on to the next post in this series, in which I’ll talk about where this model is used (hint all modern RSA signing and encryption!), and what it means for cryptography in practice.

Brass tacks

A security proof typically considers the interaction of two or more parties, one of whom we refer to as the adversary. In a typical proof, most of the players are “honest”, meaning they’ll operate according to a defined cryptographic protocol. The adversary, on the other hand, can do anything she wants. Her goal is to break the scheme.

Usually we spell out a very specific game that the parties will play.  For an encryption scheme, the rules of this game allow the adversary to do things that she might be able to do in a “real world” environment. For example: she might be able to obtain the encryption of chosen plaintexts and/or the decryption of chosen ciphertexts. In most cases she’ll intercept a legitimate ciphertext transmitted by an honest encryptor. Typically her goal is to learn (something about) the underlying plaintext.

It’s standard to assume that the parties will all know the scheme or protocol, including how to compute the hash function if one is used. This is a common-sense assumption, sometimes referred to as Kerckhoffs’s Principle.  Moreover, we generally assume that the adversary is limited in some way — she doesn’t have unlimited computational power or time, which means she can’t just brute force the decryption key.

Security proof in the random oracle model.  All parties (Encryptor, Decryptor and Adversary) can call out to the Random Oracle, which computes the hash function for them and provides consistent responses to all parties.

A Random Oracle security proof is just like this, but we make one further tweak. Even though everyone is allowed to compute the hash function, they can’t do it by themselves. Hashing is done by a new magical party called the “oracle”. If any party needs to hash something, they ship the message (over an imaginary secure channel) to the oracle, who computes the hash and sends it back to them.

Why the oracle? Well, since the oracle is “outside” of the parties, they no longer need to carry around a description of the hash function, and they don’t need to do the work of computing it.  This is good because — as I mentioned in the previous post — the full description of a random function is ridiculously large, and consequently takes an insane of amount of time to compute. By sticking it into the oracle, we can handwave away that ugliness.

We’ll place two restrictions on the random oracle. First, it must implement a genuine random function (sampled at the beginning of the game), with the same input/output profile as a realistic hash function we might use (e.g., 256-bit output strings if we’re eventually going to implement with SHA256).

Secondly, the responses given by the oracle must be consistent.  In other words, just as hashing “Peter Piper” through SHA1 will always produce 1c6244929dc8216f5c1024eebefb5cb73c378431,* regardless of who does it, sending a given string through the random oracle will produce the same result, regardless of which party asks for it.

Lazy Afternoon

There’s one last (and kind of neat) trick related to how we model the Random Oracle. Internally a random oracle just implements a huge table mapping input values to random output strings.  Since this is the case, our security proofs can fill this table in “lazily”. Rather than starting with an entire table all filled in, we can instead begin our analysis with an empty table, and generate it as we go. It works like this:
  1. At the beginning of the game the oracle’s table is empty.
  2. Whenever some party asks the oracle to hash a message, the oracle first checks to see if that input value is already stored in the table.
  3. If not, it generates a random output string, stores the input message and the new output string in its table, and returns the output string to the caller.
  4. If the oracle does find the input value in the table, it returns output value that’s already stored there.
If you think about this for a minute, you’ll realize that, to the various parties, this approach “looks” exactly the same as if the oracle had begun with a complete table. But it makes our lives a little easier.

Back to Delphia

In the last post I proposed to build a stream cipher out of a hash function H(). To get a little bit more specific, here’s a version of what that algorithm might look like:
1. Pick a random secret key (say 128 bits long) and share it with both the sender and receiver.

    Set IV to 1.

2. Cut a plaintext message into equal-length blocks M1, …, MN,

    where every block is exactly as long as the output of the hash function.

3. Using || to indicate concatenation and ⊕ for exclusive-OR, encrypt the message as follows:

      C1 = H(key || IV)       ⊕ M1

      C2 = H(key || IV+1)   ⊕ M2

      …

      CN = H(key || IV+N) ⊕ MN

4. Output (IV, C1, C2, ..., CN) as the ciphertext.

5. Make sure to set IV = (IV+N+1) before we encrypt the next message.
If you’re familiar with the modes of operation, you might notice that this scheme looks a lot like CTR mode encryption, except that we’re using a hash function instead of a block cipher.

A proof sketch!

Let’s say that I want to prove that this scheme is secure in the Random Oracle Model. Normally we’d use a specific definition of “secure” that handles many things that a real adversary could do.  But since this is a blog, we’re going to use the following simplified game:

Some honest party — the encryptor — is going to encrypt a message.  The adversary will intercept the ciphertext.  The adversary “wins” if he can learn any information about the underlying plaintext besides its length.

Recall that we’re in the magical Random Oracle Model. This is just like the real world, except whenever any party computes the hash function H(), s/he’s actually issuing a call out to the oracle, who will do all the work of hashing using a random function. Everyone can call the oracle, even the adversary.

Following the scheme description above, the encryptor first chooses a random key (step 1). He divides his plaintext up into blocks (step 2). For each block i = 1 to N, he queries the oracle to obtain H(key || i).  Finally, he XORs the received hashes with the blocks of the plaintext (step 3), and sends the ciphertext (C1, …, CN) out such that the adversary intercepts it (step 4).

We can now make the following observations.
  1. Internally, the oracle starts with an empty table.
  2. Each time the oracle receives a new query of the form “key || i” from the encryptor, it produces a new, perfectly random string for the output value. It stores both input and output in its table, and sends the output back to the encryptor.
  3. The encryptor forms the ciphertext blocks (C1, …, CN) by XORing each message block with one of these random strings.
  4. The encryptor never asks for the same input string twice, because the IV is always incremented.And finally, we make the most important observation:
  5. XORing a message with a secret, perfectly random string is a very effective way to encrypt it. In fact, it’s a One Time Pad. As long as the adversary does not know the random string, the resulting ciphertext is itself randomly-distributed, and thus reveals no information about the underlying message (other than its length).
Based on observation (5), all we have to do is show that the values returned when the oracle hashes “key || i” are (a) each perfectly random, and (b) that the adversary does not learn them. If we can do this, then we’re guaranteed that the ciphertext (C1, …, CN) will be a random string, and the adversary can learn nothing about the underlying plaintext!

The Final Analysis

Demonstrating part (a) from above is trivial. When the encryptor asks the oracle to compute H(key || i), provided that this value hasn’t been asked for previously, the oracle will generate a perfectly random string — which is all spelled out in the definition of how the oracle works. Also, the encryptor never asks for the same input value twice, since it always updates its IV value.

Part (b) is a little harder.  Is there any way the Adversary could learn the oracle’s output for H(key || i) for even one value of i from 1 to N? The oracle does not make its table public, and the adversary can’t “listen in” on the hash calls that the encryptor made.

Thinking about this a little harder, you’ll realize that there’s exactly one way that the adversary could learn one of those strings: she can query the oracle too. Specifically, all she has to do is query the oracle on “key || i” for any 1 <= i <= N and she’ll get back one of the strings that the encryptor used to encrypt the ciphertext. If the adversary learns one of these strings, she can easily recover (at least part of) the plaintext, and our argument doesn’t work anymore.

There’s one obvious problem with this, at least from the adversary’s point of view: she doesn’t know the key.

So in order to query on “key || i”, the adversary would have to guess the key.  Every time she submits a guess “guess || i” to the oracle she’ll either be wrong — in which case she gets back a useless random string. Or she’ll be right, in which case our scheme won’t be secure anymore.

But fortunately for us, the key is random, and quite big (128 bits). This tells us something about the adversary’s probability of success.

Specifically: if she makes exactly one guess, her probability of guessing the right key is quite small: one in the number of possible keys.  For our specific example, her success probability would be 1/(2^128), which is astronomically small. Or course, she could try guessing more than once; if makes, say q guesses, her success probability rises to q/(2^128).

The basic assumption we started with is that the adversary is somehow limited in her computational power. This means she can only perform a reasonable number of hashes. In a formal security proof this is where we might invoke polynomial vs. exponential time. But rather than do that, let’s just plug in some concrete numbers.

Let’s assume that the adversary only has time to compute 2^40 (1.1 trillion) hashes. Assuming we used a 128-bit key, then her probability of guessing the key is at most (2^40)/(2^128) = 1/(2^88).  This is still a pretty tiny number.

Think a real-world adversary might be able to compute more than 2^40 hashes? No problem.  Plug in your own numbers — the calculation stays the same. And if you think the scheme isn’t secure enough, you can always try making the key longer.

The point here is that we’ve demonstrated that in the Random Oracle Model the above scheme is ‘as secure as the key’.** In other words, the adversary’s best hope to attack the scheme is to find the key via brute-force guessing. As long as the key is large enough, we can confidently conclude that a “realistic” (i.e., time-bounded) adversary will have a very low — or at least calculable — probability of learning anything about the message.

In Summary

Depending on your point of view, this may have seemed a bit technical. Or maybe not. In any case, it definitely went a bit past the “layperson’s introduction” I promised in the previous post. Moreover, it wasn’t technical enough, in that it didn’t capture all of the ways that cryptographers can “cheat” when using the random oracle model.***

I’m going to do my best to up for that in the next post, where we’ll talk a bit about some real schemes with proofs in the random oracle model, and — more importantly — why cryptographers are willing to put up with this kind of thing.

See the next post: Part 3, in which the true nature of the Random Oracle Model is revealed.

Notes:

* I got this result from an online SHA1 calculator of untrusted veracity.  YMMV.

** Of course, in a real proof we’d be more formal, and we’d also consider the possibility that an adversary might at least be able to obtain the encryption of some chosen plaintexts.  This would be captured in the security game we use.

*** Some will note that I didn’t exactly need to model the hash function as a random oracle to make this example work.  Had I invoked a keyed hash function, I might have used a weaker assumption (and in this case, weak is good!) that H() is pseudo-random.  But this is really a topic for another post.  I chose this example for pedagogical reasons (it’s easy to understand!) rather than its unique need for a random oracle analysis.

How standards go wrong: constructive advice edition

The other day I poked a little fun at people who use non-standard encryption schemes in major Internet standards.  I admit that my flowchart was a tad snarky.  Moreover you could have gotten essentially the same advice by traveling back to 1998 and talking to Bruce Schneier.*

So to atone for my snark, I wanted to offer some more serious thoughts on this problem.  This is important, since TLS 1.0 (which I discussed the other day) is hardly the first widely-deployed spec to use bogus cryptography.  In my mind, the prevalence of this kind of thing raises two important questions:

  1. How do these bugs creep into major protocol specs?
  2. What can can specification designers do stop it?

Below I’m going to speculate wildly about the first question, and hopefully in the process generate some answers to the second.  But first, an important caveat: I’m far too lazy to join any standards committees myself,** so feel free to take my advice with appropriate salt.

Problem #1: Too much committee specialization and delegation.

A few years ago I became involved in a legal dispute related to an IEEE wireless security standard.  My role in this case required me to read the minutes of every single meeting held by the standards committee.  (This is a funny thing about litigation — lawyers will often read these documents more carefully than any committee member ever will.)

The good news: this committee did a very nice job, especially given the magnitude of the problem they were trying to solve — which I will analogize to fixing a levee breach with four sandbags and a staple remover.

What struck me about the process, however, is that from a large committee, most of the critical security decisions were ultimately delegated to one or two domain experts, with the rest of the committee assuming a passive approach.

“How should we arrange the sandbags? Ok, Bob’s our sandbag expert. Jerry, should the staple remover be red? Ok, let’s vote on it. Excellent.”

The committee got lucky in that it had some fantastic domain experts.  But not every committee gets this lucky.  Furthermore, in this case the arrangement of the sandbags was critical to the stability of the entire system, so there was pressure to get it right.

On the other hand, many standards have little used features that nobody much cares about.  When the details of this feature are farmed out, the pressure is rarely as high, and this is when things can go poorly.

Problem #2: A lack of appreciation for security proofs.

Provable security is probably the neatest development in the history of computer security.  Rather than simply conjecture that an encryption scheme is secure, we can often prove it exactly.  Even when our proofs rest on unrealistic assumptions (like the existence of ideal ciphers or random oracles), they can still be effective at ruling out many “obvious” attacks on a cryptosystem.

With this power at our fingertips, we should never have to worry about something like the recent BEAST attack on SSL and TLS 1.0.  And yet stuff like this still happens all the time.

By way of a reminder: BEAST relies on a glitch in the way that SSL/TLS 1.0 implements CBC-mode encryption.  In normal CBC-mode, every ciphertext is constructed using a random Initialization Vector (IV).  In SSL/TLS 1.0, the designers “optimized” by setting the IV to be the final block of the previous ciphertext, sometimes known as a “residue”.

Presumably the designers thought this was ok.  It wasn’t in line with best practices, but it saved a few bytes.  Moreover, nobody had proposed any practical attacks on this approach (and wouldn’t, until nearly five years after the TLS 1.0 spec was finalized).

Still, what if the designers had been obliged to offer a proof the security of this approach?  If they had, they would have quickly realized that there was a problem.  You see, it’s quite possible to prove that standard CBC mode is secure under chosen plaintext attack.***  But you can’t do the same thing with the residue optimization.  Any reasonable proof attempt will crash and burn.

Requiring a proof would have immediately knocked this optimization out of the protocol, long before anyone had thought of a specific attack on it.

Now, you might argue that TLS 1.0 was written way back in 1999, and that it inherited this nonsense from an even older protocol (SSL).  Furthermore, back then “practice oriented” provable security wasn’t well understood; people were too busy buying tech stocks.  If you were really pedantic, you might even remind me that things eventually got better in 2006 when TLS 1.1 came out.

And yet…  Even in TLS 1.1, two whole years after an attack on TLS 1.0 was pointed out, the designers were still putting goofy, unproven stuff into the spec:

      The following alternative procedure MAY be used; however, it has
not been demonstrated to be as cryptographically strong
as the
above procedures
.  The sender prepends a fixed block F to the
plaintext (or, alternatively, a block generated with a weak PRNG).
He then encrypts as in (2), above, using the CBC residue from the
previous block as the mask for the prepended block.  Note that in
this case the mask for the first record transmitted by the
application (the Finished) MUST be generated using a
cryptographically strong PRNG.

This kind of thing may be right, it may be wrong.  But that’s my point.  If you don’t know, don’t put it in the spec.

Problem #3: What, no point releases?

A couple of months ago I installed MacOS 10.7 on my Macbook Pro.  Just as I was getting used to all the new UI “improvements”, a software update arrived.  Now I’m running MacOS 10.7.1.  Even at this moment, my Mac would really like to talk to me about another security update — which I’m too lazy to install (cobbler’s children, shoes, etc.).

And this is the Macintosh experience.  I shudder to think what it’s like on Windows.****

On the flipside, some major Internet standards seem too respectable to patch.  Take our friend TLS 1.0, which “shipped” in 1999 with a bad cipher mode inherited from SSL.  In 2002, the OpenSSL project had the foresight to apply a band-aid patch.  In 2004 some specific attacks were identified.  And it took still another two years for TLS version 1.1 to propose a formal patch to these issues.

How would you feel if your operating system went seven years between security patches?  Especially when experts “in the know” already recognized the problems, and had functional workarounds.  Where was TLS 1.0.1?  It would have been nice to get that to the folks working on NSS.

Problem #4: Too many damn options.

I mentioned this once before on this blog, but a good point can never be made too many times.  It’s very common for major protocol specifications to include gobs and gobs of options.  In one sense this makes sense — you never know precisely what your users will require.  If you don’t cater to them, they’ll hack in their own approach, or worse: design a totally new protocol.

But every option you add to a protocol spec implicitly creates a new protocol.  This protocol has to be analyzed.  And this analysis takes time and adds complexity.  Add this to all the other problems I mention here, you can have some interesting times.

Problem #5: Don’t worry, implementers won’t follow the spec!

Anyone who’s ever written a specification knows that it’s like a platonic form.  Just as your kitchen table is never quite as good as the “perfect” table (mine has three Splenda packets propping up one leg), the implementation of your spec will never be quite exact.

Usually this is a bad thing, and it’s frequently how protocols break in practice. However, there are some very strange cases — such as when the specification is broken — where incorrect implementation may actually be perceived as a good thing.

Case in point: a few years back I was asked to look at one portion of a new proposed standard for securing consumer electronics devices.  The spec was already public, and some devices had even been manufactured to support it.

The bug I found was not a showstopper, but it did completely negate the effectiveness of one particular aspect of the protocol.  The discussion I had with the standards body, however, was quite strange.  Good news, they told me, none of the existing devices actually implement the protocol as written!  They’re not vulnerable!  

Ok, that’ is good news — for now.  But really, what kind of solution is this?  If a protocol spec is bad, then fix it!  You’re much better off fixing the spec once than hoping that none of the eventual implementers will do what you tell them to do.

In summary

This has been a long post.  There’s more I could probably say, but I’m exhausted. The truth is, snark is a lot more fun than “real advice”.

I have no idea if these thoughts really sum up the totality of how standards go wrong.  Maybe I’m still being a bit unfair — many security standards are extremely resilient.  It’s not like you hear about a “0-day” attack on TLS every other day.

But committees do screw up, and there are a few simple things that should happen in order to reduce the occurrence of such problems.  Mandate security proofs where possible.  Get outside advice, if you need it.  Reduce the number of features you support.  Update your specifications aggressively.  Make sure your implementers understand what’s important.  Specification design is hard work, but fixing a bad spec is a million times worse.

Anyway, maybe one of these days I’ll be crazy enough to get involved with a standards committee, and then you can all laugh at how badly I screw it up.

Notes:

If you do this, please call 1998-me and tell him not to waste two years trying to sell music over the Internet.  Also, tell him to buy Apple stock.

** I can’t handle the meetings.

*** A common approach is to assume that the block cipher is a pseudo-random permutation.  See this work for such an analysis.

**** Cruel Windows joke.  Honestly, I hear it’s gotten much better for you guys.

Bram Cohen Corrected

Update 11/24/11: Bram has responded my criticisms in this post. See the comments section below for his thoughts and my response. Also, see this new post for my more detailed response.

I was innocently Googling the title of my blog (yes, I do this, it’s sad) when I came across this May 2011 post by Bram Cohen entitled “Practical Cryptography Corrected“.

Ostensibly the post is a criticism of Bruce Schneier’s book Practical Cryptography. Instead, it’s proof that inventing the world’s most useful peer-to-peer protocol does not make you a cryptographer.  More worrisome than that, it’s an example of someone with a big megaphone using it to give really bad advice.

Bram’s post has ten points, of which four are (somewhat) problematic.  Let’s take them one at a time.

Bram’s Practical Advice #1: For an RSA exponent, you should always use 2. Technically that’s Rabin-Williams, and requires slightly different implementation, but that actually works in its favor. Rabin-Williams has a reduction to factoring, RSA does not.

No, no, no.  A thousand times no.

If you’re going to encrypt with RSA, you should encrypt with RSA.  That means using a standard, well-tested RSA implementation with the traditional RSA public exponent (e = 0x10001).*

Unless you’re an expert, do not go switching your exponents around.  Rabin-Williams (public exponent 2) looks like RSA, but it’s a very different cryptosystem — both in how decryption works and in what goes wrong if you screw up your implementation.

For one thing, when you implement Rabin-Williams incorrectly, it becomes vulnerable to a chosen ciphertext attack that completely recovers the scheme’s secret key.  RSA does not have this problem.  This issue can be managed through skillful use of padding**, but your typical from-scratch implementer is not skillful.  Even bad implementations of good padding can lead to problems.  Why chance it?

Which brings me to my first law: The probability that your RSA ciphertext will be broken via some number-theoretical attack is vanishingly small.  The probability that it will be broken because you screwed up your implementation, on the other hand, is high.  All current approaches to solving the RSA problem work by attacking the factoring problem anyway, so it’s unclear what advantage we’re buying ourselves here.

Bram’s Practical Advice #2: You should always do encryption as a layer outside of authentication.

Cryptographers have been arguing the order of encryption and authentication for nearly thirty years.  Should you encrypt first, then apply authentication (a MAC)? Or should you authenticate first, then encrypt?

By 2000 we had basically put this question to bed.  The recommended best practice? Always encrypt first (with a strong cipher and mode of operation), then apply a MAC to the resulting ciphertext.  This isn’t just easy to remember, it’s provably secure against adaptive chosen ciphertext attacks.  That means you’re protected against very real issues like padding oracle attacks.  Bram’s technique doesn’t give you this, and that can spell disaster.

Bram’s Practical Advice #3: For an encryption mode, you should always use CTR, and always use a nonce of zero, and never reuse keys.

Bram’s Practical Advice #4: You should never use the same key for both encryption and authentication. If you need both encryption and authentication, you should use two keys. One for encryption, and one for authentication.

This one’s a twofer, and technically none of it is bad advice.  I’m a fan of CTR mode, and zero IVs are just fine in CTR (but never in CBC!) mode.

I have only two points to make: first, why not use a random IV?  What’s going to happen to you if you hard-code a zero IV, and then the next (idiot) developer switches your mode of operation to CBC?  This shouldn’t happen, but it might.

But more generally, my advice is not to use CTR at all.  Use a provably-secure authenticated mode of operation like CCM, OCB or (best) GCM, in the way recommended by standards.  These modes handle both encryption and authentication for you, and better yet — they all use a single key (by design). You’re much better off using these standards than hand-rolling your own encrypt/MAC approach (especially if — per Bram’s Advice #2) you plan on doing it wrong.

In Summary

I have a lot of respect for Bram, and if I come off a bit critical in the post above, it’s isn’t because I dislike the guy.  In fact, many of his remaining points are good.

But the only thing worse than bad advice is bad advice from a voice of authority.  And Bram has a lot of authority, which could lead people to trust him over some goofy, non-famous cryptographer (or even Bruce Schneier, for some crazy reason).

Fixing people’s cryptographic mistakes has given me a good living so far, but that doesn’t mean I like it.  So please, please don’t listen to Bram.  Do things right, follow standards, and most importantly: if you don’t know exactly what you’re doing, don’t do it at all.

Notes:

* I initially wrote “e = 100001” which is totally bogus.  Goes to show you shouldn’t trust a blog post.  Thanks to GSBabil.

** Use of padding, incidentally, completely breaks the reduction to factoring unless you conduct the proof in the Random Oracle Model.  And as I discuss here, there are some serious problems with that.