Saturday, October 22, 2011

Attack of the week: XML Encryption

CC photo by Flickr user RalphTQ.
Unfortunately I had to skip this year's CCS because I was traveling for work. This means I missed out on a chance to go to my favorite Chicago restaurant and on the architecture cruise. I'm told that there were also some research papers being given somewhere.

The Internet can't make up for every missed opportunity, but it can help with the research. In fact, today's news is all about one of those papers. I'm talking about Tibor Jager and Juraj Somorovsky's work entitled "How To Break XML Encryption".*

In case I had any doubts, it only took one glance at the first page to confirm that we're back in "someone screwed up basic symmetric encryption" territory. But really, to describe Jager and Somorovsky's findings like this does them no justice. It's like comparing the time that kid built a nuclear reactor in his garage to, say, the incident at Fukushima.

Some background is in order.
What is XML Encryption and why should I care?
You probably already know that XML is the world's most popular way to get structured data from point A to point B.** Following the success of HTML, the computing world decided that we needed a single "flexible" and ASCII-based format for handling arbitrary data types. (We've spent much of the subsequent period regretting this decision.)

The one upshot of this XML epidemic was the emergence of some standards and libraries, which -- among other things -- were supposed to help application developers process data in a reliable and secure way. One of those standards is SOAP, which is used to transport data in web services frameworks.

Another standard is the W3C XML Encryption Standard, which was dropped like a log in 2002 and doesn't seem to have been updated since. The point of this standard was to give developers a uniform (and secure!) way to protect their XML documents so that we wouldn't wind up with five thousand incompatible, insecure ways of doing it. (Oh the irony.)

Finally, a very common implementation of both standards can be found in the Apache Axis2 web services framework and in the RedHat JBoss framework. These are probably the most common open-source SOAP frameworks. They power a number of systems you don't necessarily think about, but you probably should because they carry a lot of your personal information.
So what about the encryption?
To make a very long story short, the W3C standard recommends encrypting messages with a block cipher running in (our old friend) CBC mode. Here's a quick recap of CBC mode, courtesy of Wikipedia:

CBC mode encryption. The message is first subdivided into equal-length blocks and encrypted as in the diagram.  The circular plus symbol denotes XOR.
There are two basic things you need to know about CBC mode, and ought to know if you ever plan to use it.***

First: CBC requires that every plaintext be an exact multiple of the block size. In the case of AES, this means 16 bytes. If the message is not the right length, then CBC implementations will pad the message with additional bytes. Of course it must be possible to recognize this padding so that it can be stripped off after decryption. There are various schemes for doing this.

The W3C standard uses the following padding approach. Let "MM" indicate message bytes, and let "NN" be the total number of padding bytes being added. "XX" represents arbitrary padding bytes, which can hold any value you want. A padded block will look like this:

0x MM MM MM MM MM MM MM MM MM MM XX XX XX XX XX NN

When a decryptor encounters a padded message, it looks at the last byte to figure out how many bytes to strip off. If it encounters padding that doesn't make sense, i.e., NN > 16 or NN < 1, it should recognize that something is wrong and reject the message.
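To make the check concrete, here's a minimal Python sketch of what a decryptor's padding-stripping routine might look like. This is my own illustration of the rule above, not code from any actual implementation:

    def strip_padding(padded, block_size=16):
        # W3C-style padding: the last byte (NN) gives the total number of
        # padding bytes; the bytes before it can hold anything.
        nn = padded[-1]
        if nn < 1 or nn > block_size:      # padding that doesn't make sense
            raise ValueError("invalid padding")
        return padded[:-nn]                # strip NN bytes off the end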

Second: CBC ciphertexts are malleable. This means that you can modify a CBC-encrypted ciphertext such that your modifications will carry through decryption, and have a meaningful effect on the resulting decrypted plaintext.

In the simplest case, you can truncate the message by stripping blocks off the end. This does the same thing to the resulting plaintext.

More interestingly, you can flip any bit (or set of bits) in the IV of a CBC-encrypted message, and upon decryption you'll find that the same bits have been flipped in the first block of the decrypted plaintext. You can also do a lot more, but it's not that important right now.
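If you want to see the malleability for yourself, here's a small Python sketch (using pycryptodome's AES purely for illustration). Flipping one bit of the IV flips the same bit in the first block of the decrypted plaintext:

    import os
    from Crypto.Cipher import AES   # pycryptodome, just for this demonstration

    key = os.urandom(16)
    iv = os.urandom(16)
    plaintext = b"attack at dawn!!"                  # exactly one 16-byte block

    ciphertext = AES.new(key, AES.MODE_CBC, iv).encrypt(plaintext)

    # Flip the low bit of the first IV byte...
    tampered_iv = bytes([iv[0] ^ 0x01]) + iv[1:]

    # ...and the same bit flips in the first byte of the decrypted plaintext.
    tampered = AES.new(key, AES.MODE_CBC, tampered_iv).decrypt(ciphertext)
    assert tampered[0] == plaintext[0] ^ 0x01
    assert tampered[1:] == plaintext[1:]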
Anything else I should know?
Yup. You need to know how XML messages are formatted. They're encoded using UTF-8 -- essentially ASCII -- with some special rules.

These are described in the paper. Briefly: any XML message that contains bytes ranging from 0x00-0x1F (with the exception of tabs, LF and CR) may be considered invalid. Similarly, the ampersand (&) character is used as an escape character, and must be followed by a valid string. Lastly, open brackets "<" must be properly closed. Messages that don't meet this standard should be rejected.  ****

Of course, the message will have to be decrypted (using the appropriate key) before any of these checks can be run.
This is all fascinating, but how does it lead to an attack? 
There's one more detail I haven't given you. You see, in addition to the details above, the Axis2 server (ditto JBoss) is kind enough to let you know when you haven't met its standards for an XML message.

Specifically, after it decrypts a message, it kicks back an error if the message either: (1) has bad padding, or (2) contains invalid characters. In both cases, the error is the same. And this error is your way in the door.
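Conceptually, the server behaves something like the sketch below. Here decrypt_cbc() and contains_invalid_xml_chars() are hypothetical placeholders (and strip_padding() is the sketch from earlier); the important point is that both failure modes produce the very same error:

    def server_oracle(key, iv, ciphertext):
        # Hypothetical sketch of the Axis2-style behavior the attacker observes.
        plaintext = decrypt_cbc(key, iv, ciphertext)   # placeholder
        try:
            message = strip_padding(plaintext)         # (1) padding check
        except ValueError:
            return "ERROR"
        if contains_invalid_xml_chars(message):        # (2) character check; placeholder
            return "ERROR"                             # same error either way
        return "OK"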

I'm not going to completely describe the attack, but I'll try to give an overview.

Imagine that you've intercepted a legitimately-encoded, encrypted XML message (IV, C1, ..., CN) and you want to know what it says. You also have access to an Axis2 or JBoss server that knows the key and can decrypt it. The server won't tell you what the message says, but it will give you an error if the encoding doesn't meet its standards.

Sending the original, intercepted message won't tell you much. We know that it's encoded correctly. But what if you tamper with the message? This is precisely what Jager and Somorovsky proceed to do.

Step 1: Truncate the ciphertext. A good way to start is to chop off everything after the first ciphertext block, and send the much-shorter message consisting of (IV, C1) in to be decrypted. Chances are good that this new message will produce a decryption error, since the server will interpret the last byte of the decrypted message as padding -- a number that should be between 0x01 and 0x10.

Step 2: Tweak the padding. I already said that if you flip bits in the IV, this will result in a similar change to the decrypted plaintext. Using this concept, you can force the last byte of the IV through all possible values and ask the server to decrypt each version of the ciphertext. More formally, the 'new' last IV byte can be computed as ('original last IV byte' ⊕ i) for i from 0 to 255.

In the best case, 16 of these test ciphertexts will decrypt without error. These correspond to the decrypted values 0x01 through 0x10, i.e., valid padding. If fewer than 16 of your test ciphertexts decrypt, this means that there's something wrong with the message: probably it contains an unclosed "<" tag.
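In code, Steps 1 and 2 boil down to a loop like this sketch. Here oracle() is a hypothetical stand-in for "send (IV, C1) to the server and report whether it complained":

    def find_valid_paddings(iv, c1, oracle):
        # Step 2: XOR every possible value into the last IV byte and record
        # which tweaked ciphertexts the server accepts (up to 16 of them).
        accepted = []
        for i in range(256):
            tweaked_iv = iv[:-1] + bytes([iv[-1] ^ i])
            if oracle(tweaked_iv, c1):      # hypothetical padding/character oracle
                accepted.append(i)
        return accepted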

Step 3: Squish the bug(s). This is no problem. If exactly one of your ciphertexts decrypts successfully, that means the open "<" character must be in the first byte of the message. You caused the last byte of the message to decrypt to 0x10 (16 decimal), and the decryptor treated the whole block as padding. There are no errors in an empty message.

If two ciphertexts decrypt successfully, then the "<" character must be in the second position of the message, because now there are only two padding lengths (16 and 15) that would cause it to be stripped off. And so on. The number of successful decryptions tells you where the "<" is.

Now that you know where it is, kill it by flipping the last bit of the appropriate byte in the IV. This turns "<" into a harmless "=".

Step 4: Learn the last byte of the block. You should now be able to come up with 16 test ciphertexts that decrypt successfully. This means you have 16 values of "i" such that 'last byte of the original IV' ⊕ i results in a successful decryption. One of these "i" values will differ from the others in the fourth most significant bit. This one will correspond to the padding value 0x10.

If "x" is the original plaintext byte and "i" is the value you just identified, you know now the last byte of the block. Since we have x ⊕ i = 0x10, then through the very sophisticated process of rearranging a simple equation we have 0x10 ⊕ i = x.

Now, using our knowledge of the last byte of the plaintext, manually tweak the IV so that the last byte (padding) of the plaintext will decrypt to 0x01.

Step 5: Learn everything else. The rest of the attack hinges on the fact that all of the bytes in the message should be 'acceptable' UTF-8 characters. Thanks to our trick with the IV, we can now flip arbitrary bits in any given byte, and see whether or not that leads to another 'acceptable' character.

Believe it or not, the results of this experiment tell us enough to figure out each character in turn. I won't belabor this, because it takes a little bit of squinting at the ASCII table, but fundamentally there's not much magic beyond that. It's very simple and elegant, and it works.

Step 6: Finish it. You've done this for one block. Now do it for the rest. Just set your 'ciphertext' to be the next block in the message, and your 'IV' to be the ciphertext block immediately preceding it. Now go back to Step 1.
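Structurally, the whole attack is just the single-block procedure run once per ciphertext block, something like this sketch (recover_block() is a placeholder standing in for Steps 1 through 5):

    def recover_plaintext(iv, blocks, oracle):
        # Step 6: attack each block in turn, using the preceding ciphertext
        # block as the 'IV'. recover_block() stands in for Steps 1-5.
        plaintext = b""
        previous = iv
        for block in blocks:                 # blocks = [C1, ..., CN]
            plaintext += recover_block(previous, block, oracle)
            previous = block
        return plaintext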

And that's the ballgame.

Obviously all of this takes a lot of tests, which you can reduce a bit using some of the optimizations suggested in the paper. Jager and Somorovsky are able to recover plaintexts with "14 requests per plaintext byte on average." But at the end of the day, who cares if it takes 50? The box is sitting there ready and willing to decrypt ciphertexts. And it's incredibly unlikely that anyone is paying attention.
Ok. Can't you fix this by authenticating the ciphertexts?
This entire attack is a special example of an adaptive chosen ciphertext attack. (Specifically, it's a super-duper variation of Vaudenay's padding oracle attack, which he discovered in 2002, the same year the W3C standard hit!)

These attacks can almost always be prevented with proper authentication of the ciphertexts. If the decryptor checks that the ciphertexts are valid before decrypting them, the attacker can't tamper with them. Hence, no useful information should leak out to permit these attacks at all.

The designers could have added a mandatory MAC or even a signature on the ciphertext, or they could have used an authenticated mode of operation.
But the W3C standard does provide for MACs and signatures!
Indeed. As the authors point out, there's an optional field in the specification where you can add a signature or a MAC over the ciphertext and its meta data. But there's one tiny, hilarious thing about that signature...

Did I mention it's optional?

You can quite easily take a signed/MACed ciphertext and 'convert' it into one that's not signed or MACed at all, simply by stripping the ciphertext contents out and placing them into a section that does not claim to have a signature. Since the spec doesn't mandate that the authentication be on every message, the decryptor will be totally cool with this.

So in summary: optional MACs and signatures don't buy you much.
Is there anything else to say about this?
Lots. Stop using unauthenticated CBC mode. Stop using any encryption standard designed before 2003 that hasn't been carefully analyzed and recently updated. Stop using any unauthenticated encryption at all. Wrap your SOAP connections with TLS (a recent version!) if you can.

People didn't really think much about active chosen ciphertext attacks on symmetric encryption before 2000, but now they're the new hip thing. If your system is online and doesn't have a solid, well-analyzed protection against them, don't pretend that you're doing anything at all to secure your data.

I wish I had a funny, pithy way to sum this all up. But honestly, I'm just a little depressed. It's not Jager and Somorovsky's fault at all -- this is a great paper. It's just that there's too much embarrassingly bad crypto going around.

While I love reading these papers, I'm starting to feel that this one-off approach is not sufficient to the problem. Maybe what the community needs is a centralized clearinghouse for this stuff, a website where suspect crypto standards (and implementations) can be identified and listed. Crowdsourced and analyzed. Probably many are just fine, but I doubt most have been looked at (or even thought about in a while).

Anyone looking for a project?
  
Notes:
  
* Tibor Jager and Juraj Somorovsky, "How To Break XML Encryption". In ACM CCS 2011. ACM Press.

** Someone is inevitably going to tell me that JSON is the world's most popular way to move structured data. Ok. I don't care.

*** But better yet, don't use it. Use a standard authenticated encryption scheme like CCM, OCB or GCM, and use someone else's implementation according to the standards. These modes should prevent all of this nonsense.

**** These details are quoted straight from the Jager and Somorovsky paper. I'm not 100% sure if all implementations enforce this, but Axis2 does.

Thursday, October 20, 2011

What is the Random Oracle Model and why should you care? (Part 3)

This is part 3 of a series on the Random Oracle Model.  See here for the previous posts:

Part 1: An introduction
Part 2: The ROM formalized, a scheme and a proof sketch

Photo by Flickr user greeblie, used under a CC license.
I began this series in the hope of writing a true layperson's introduction to the Random Oracle Model. At the outset I figured I could probably knock this thing out in a couple of not-too-technical posts. Yet, here I am at post three, and there's still a big chunk of technical material left to go.

Please bear with me. We're going to get back to some solid, practical crypto with the next post in the series.

A quick recap

In the last two posts we've covered a bit of ground. We talked about cryptographic hash functions and their properties, and discussed why the cryptographer's 'ideal' hash function would be a random function. We noted that random functions would never work in real life (too big to store, too slow to compute), so we came to a strange resolution: we would pretend that our hashes were random functions, just for the purposes of our security analysis. Then of course we would use real hash functions like SHA.

Of course, even this required us to tamper with the fabric of the universe, since we want to restrict our 'adversaries' to running in a reasonable amount of time, and they can't do that if they're busy evaluating random functions. So we created a new model of computation (aka 'thought experiment') called the Random Oracle Model. In this model all hashing is computed by a magical third party called an oracle.

If you know a little about the random oracle model, say from reading the previous posts, you could be forgiven for thinking that it isn't all that bad. Basically we're assuming that hash functions are super-duper "random". Maybe a little more so than a real function can be, but big deal.

But unfortunately it's much more than that.

The world's shortest lesson on provable security

To explain what I mean, we first need to understand a little about provable security, and particularly, a type of proof called a reduction proof.

When we say that an encryption scheme is "provably secure", we usually mean that it's secure under some assumption. For most practical schemes the assumption is mathematical in nature. For example, we might base an encryption scheme on the assumption that it's hard to factor large composite numbers, or to find the shortest vector in a lattice. In this case 'hard' doesn't mean 'I can't figure out how to do it'. It means: there exists no (efficient) algorithm that can do it.1

Now, I mentioned that these are assumptions. We don't absolutely know that these problems are hard. However, we like to think that if some bright people have looked at the problem for a while, and nobody has made any progress towards solving it, that means it's probably not easy. In some cases the problem may even fall into a useful class of problems, i.e., it may be NP-Hard.

Even if you're not willing to blindly trust in these assumptions, security proofs are still useful. Analyzing a new cryptosystem can soak up huge amounts of time and energy. If we can at least 'reduce' each new scheme to one of a small number of mathematical problems, we can pour our brainpower into attacking just those problems, knowing that our work will apply to every scheme that uses them.

All of this brings me to my main point, namely, how these proofs work. Getting specific, the general flow of such a proof is as follows:
  1. Assume that problem X is hard.1
  2. Show that if there exists an (efficient) algorithm that "breaks" our cryptographic scheme, then there exists an (efficient) algorithm that solves problem X. 
  3. Snottily point out that (2) means if the scheme can be broken, then problem X can't be hard.
Clearly the conclusion of point (3) -- that problem X isn't hard -- cannot be true if point (1) is. Following the logic, this allows us to conclude that if Problem X is hard, then there can't be an efficient algorithm that breaks our encryption scheme.

Of course I haven't actually explained how to accomplish step (2), which is pretty important. It's also the beautiful part. You see, in most reductions, step (2) consists of actually writing an efficient algorithm that solves problem X.

You probably think I'm crazy. I just claimed that problem X can't be efficiently solved, now I'm telling you that we write an algorithm for solving it. But the genius of it is that our solver algorithm won't quite work. It's missing an important subroutine. 

That subroutine is identical -- at the API level -- to an algorithm that 'breaks' our encryption scheme. 

So here is how the argument breaks down: if there was, hypothetically, an algorithm that efficiently 'broke' our encryption scheme, then we could plug it into the hole we left in our 'solver' algorithm. This would yield an algorithm that efficiently solves problem X. Hence, it completely satisfies what we're asking for in point (2).

Black Boxes

With this foundation laid, I only need to make one more point. There's a basic restriction that standard reduction proofs must obey. Our 'Problem X solver' algorithm must use the 'encryption scheme breaker' subroutine as a black box. It cannot make assumptions about how the breaker algorithm works internally.

Let me be more clear about this. We cannot, for example, have a step in our 'solver' algorithm where we decompile the breaker algorithm, or peek inside its internal memory registers, and then assume that they'll contain a partial result we need.

This is because our reduction only works if it works with every possible algorithm that breaks the scheme. Since we can't guess a priori how such an algorithm might work internally, we can't depend on its internal workings. The breaker algorithm could hypothetically be horrifically obfuscated, to the point where we couldn't understand it even if we held the code in our hands.

Thus in a traditional black box reduction, we can count on the breaker algorithm having a standard "API" (which we'll define at some point). This typically takes in public keys and/or ciphertexts (or signatures, etc.). It outputs some useful information that constitutes our definition of 'breaking'. And though it doesn't always need to succeed, we expect that it does so with reasonable probability. But everything that happens inside the algorithm must be hidden from us.

Dirty laundry

You wouldn't think this has anything to do with provable security.
Let's pause for a moment to consider some related problems in the area of celebrity journalism.

Imagine you're a muckraking celebrity reporter who's been assigned to get a scoop on Brad Pitt and Angelina Jolie's latest adopted baby. The Pitt/Jolies are extremely private, and won't even acknowledge that they're adopting. You're a clever reporter, and more importantly, you have no morals.

You know that if you could just get a peek at the family laundry, you'd have your whole story. Sex, size, weight, favorite color. Even diet (Angelina insists on eco-safe cloth diapers). In a perfect world you could drive over to Pitt/Jolie's laundry company, slip someone $500, and lo -- your whole story would be there in the clothes. If you were really devious, you could start your own celebrity laundry service and trick them into sending it right to you.

But this is all wishful thinking. Pitt and Jolie don't send out their laundry -- they have it done at home.  In truth, you're not sure that they even do laundry. You heard a rumor that Angelina has her clothes ritually burned after each use.

In case you don't see the relevance, this is an analogy to reduction proofs. To you, the 'breaker' algorithm is a black box. You can see inputs go in, and see what comes out, but you can't see what it's doing in between. If your reduction depends on peeking at its dirty laundry, you'd better come up with a better reduction. It does its own laundry.

Back to the Random Oracle Model

The Random Oracle Model is different. In my previous post I explained that it's just like the real world, with one exception: nobody can compute hashes by themselves. Every party -- including the adversary (synonymous with 'breaker' algorithm) -- must hash by sending messages to a special outside party. This party computes the hashes using a random function, and sends them back to the caller.

To make this abundantly clear, in the Random Oracle Model, everyone sends out their laundry.

The Random Oracle Model. The oracle computes hashes for all parties, including the adversary.
Speaking more formally, in the Random Oracle Model, an adversary (aka our 'breaker' algorithm) is no longer quite a black box. In one very important way we have visibility into it. It must call outside of itself if it ever wants to hash something.

If we're building a reduction, that means we can take advantage of the 'extra' info that leaks from these calls, just as our gossip reporter could take advantage of celebrities' laundry. We know that the algorithm has to make these hash calls, so there must exist a 'port' in the API for it to make them. Once we make this stipulation, our reduction can tap into this port and intercept the adversary's hash calls to see which messages it's asking for, and possibly even tamper with the results it gets back.

In an extreme case, our reduction can even run its own fake 'oracle' (aka laundry service), as long as we always hand back convincing responses.

And while this may not seem a big deal, it really is. Just as Brangelina's dirty laundry could reveal major details about the stars' private life, the choice of which messages a 'breaker' algorithm chooses to hash can reveal major details about what it's thinking internally. By tampering with those responses we can also fundamentally change its operation. These details can be the difference between a reduction proof that works, and one that doesn't.

Taking Stock

At this point, I think it's worth taking a breath and asking what we've gotten ourselves into here. When we started with this nonsense, all we wanted was a better hash function. One that looked slightly more 'random' than any real function we could construct.

Along the way we realized that random functions were totally impractical. Even to pretend we could use them, we had to tamper with our computational model and "pull the random function" outside of the parties. After all, they run in polynomial time and couldn't possibly evaluate such a function.

Each of these steps seemed perfectly reasonable, and yet somewhere we jumped the shark. We took the most sacrosanct principle of reductionist security and violated it. Instead of "our argument works with any adversary", we now say "our argument works only with adversaries who kindly tell us whatever and whenever they need to hash".

Real adversaries don't work like this. If someone comes up with an algorithm that breaks an implementation of RSA-OAEP that uses a real hash function (say SHA256), that algorithm won't need to tell us what it's hashing. It will just do it internally. And if its code is obfuscated, we won't even be able to connect it to our reduction, if our reduction depends on knowing (or tampering with) the hash calls.

An Example: RSA Signing

At this point I've run out of ways to talk about the Random Oracle Model without actually invoking a real scheme. So let's bring this back to something concrete.

The RSA scheme is probably the most famous cryptosystem of the modern age. Many people mistakenly believe that RSA is based on the hardness of factoring large integers. In fact, that's not quite accurate. RSA is actually based on a slightly different, but related assumption not-creatively named the "RSA assumption".

This assumption holds that the following problem is fundamentally hard, at least when the numbers get sufficiently large.1 Given three values (N, e, M), where:
  1. N is the product of two prime numbers (p, q) that you don't know.
  2. 2 < e < N and gcd(e, (p-1)(q-1)) = 1.
  3. 0 <= M < N.
The problem is to find a value "S" such that S^e mod N = M.  2

It's conjectured that this problem is hard1 if you only have the values given above. However, if you know (p, q) or an additional 'secret key' derived from those values, it's quite easy to solve.

The RSA signature scheme depends on this. In a textbook RSA signature, you "sign" a message by first encoding it as an integer m, such that 0 <= m < N.  Then, using your secret key you compute a 'signature' "s", such that s^e mod N = m. (See here for the exact algorithm.)

The intuitive argument for RSA signatures rests on three legs:
  1. Since only you know the secret key, it's easy for you to "sign" messages of your choice. 
  2. Anybody else can verify your signatures given (m, s) and the "public key" (N, e). 
  3. Intuitively, nobody else can sign messages on your behalf (this is known as 'forging'), because doing so without the secret key would be equivalent to solving the RSA problem.
This is, of course, total nonsense.

If you look carefully, what the RSA assumption actually implies is that it's hard to come up with a "signature" (S) on a particular message (M). If the adversary has freedom to choose the message himself, then it's easy for him to come up with a whole bunch of 'signed messages'. This makes textbook RSA signatures pretty useless.3
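To see why, here's a quick sketch of the forgery described in footnote 3. With nothing but the public key (N, e), anyone can conjure up a valid-looking message/signature pair -- they just don't get to choose which message it is:

    import secrets

    def forge_textbook_rsa(N, e):
        # Existential forgery on textbook RSA: pick the 'signature' first,
        # then work backwards to the message it signs.
        s = secrets.randbelow(N)      # any value will do
        m = pow(s, e, N)              # by construction, s^e mod N = m
        return m, s                   # verifies fine; m is gibberish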

More importantly, to prove that RSA signatures are secure under the RSA assumption, we can't just have the adversary spitting out any old message/signature pair. To make a reduction work, we must use the adversary to solve a specific RSA problem instance (N, e, M). That means whatever the adversary outputs, we must be able to use that value in order to compute an "S" that satisfies S^e mod N = M.

And now here, finally, is where the hash functions come through for us.

Adding a Hash

I've mentioned previously that most signature schemes don't literally sign a raw message. Rather, they sign the hash of a message. This is convenient, since messages can be large and hashes are usually small. But in the Random Oracle Model it has a special significance.

Let's say that we have a hash function H() with the odd property that it outputs an integer between 0 and N-1. This is called a "full domain hash". This is a little unusual for a hash function, but I promise that such things can be built out of more "traditional" hash functions like SHA.

Instead of signing the message directly, we could have our signer first hash the message (which can now be an arbitrary string), and sign the resulting hash. More formally, a signature "s" must now satisfy:
s^e mod N = H(message)
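In code, the signer and verifier would look roughly like the sketch below. Here d is the usual RSA secret exponent, and full_domain_hash() is a placeholder for the 'hash onto the range [0, N)' construction mentioned above:

    def sign(message, d, N):
        # Hash the message into [0, N), then take an e-th root mod N
        # (easy if you know the secret exponent d).
        return pow(full_domain_hash(message, N), d, N)

    def verify(message, s, e, N):
        # Check the verification equation: s^e mod N = H(message).
        return pow(s, e, N) == full_domain_hash(message, N)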
Can we prove that this scheme is secure under the RSA assumption?

A proof sketch

Imagine that A is some algorithm that breaks the signature scheme, meaning that it takes in a public key (N, e), and eventually outputs an "existential forgery", i.e., any pair (message, s') that satisfies the verification equation above.

Our goal is to design a solving algorithm that can take in an arbitrary RSA problem instance (N, e, M) and somehow force A to help us solve it -- that is, to eventually spit out a value S where S^e mod N = M.

First, our solver will send A a "public key" consisting of (N, e). Since we're in the Random Oracle Model, A will eventually have to hash something. Whenever A asks to hash a message, we'll intercept its attempt to call the oracle, pick a random value 0 <= r < N and answer its query as follows:

    H(message) = M * r^e mod N

I will claim that since "r" is random, then our "hash" result is also randomly-distributed, and thus "looks right" to A. So he won't twig to what we're doing. Most importantly, we've managed to tie every single hash value back to the one value "M" that we took from the RSA problem instance. Of course, we'll also store each value "r" so we can retrieve it later.

We assumed that eventually A will output a valid "forged" message/signature pair of the form (message, s'), where this pair satisfies s'^e mod N = H(message). If this is the case, then all we have to do is look up the "r" value used to compute H(message) and output:

    S = s' / r mod N         (or more accurately, s' * inverse(r) mod N)

This is the answer to our RSA problem instance, since S^e mod N = M. 4, 5

Why does this work? Well, let's simplify the equation "s'^e mod N = H(message)" a bit:

        s'^e mod N  = H(message)
        s'^e mod N  = M * r^e mod N        (this is the hash we computed)
        s'          = M^{1/e} * r mod N    (raising both sides to the inverse of e)
        s'          = S * r mod N          (because S = M^{1/e})
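For the programmatically inclined, here's a rough Python sketch of the reduction's two jobs -- playing the oracle for A, and extracting the RSA answer from A's forgery. The structure follows the proof above; the class and method names are my own invention:

    import secrets

    class Reduction:
        # Plays the random oracle for adversary A, folding the RSA challenge
        # M into every hash answer, then extracts S from A's forgery.
        def __init__(self, N, e, M):
            self.N, self.e, self.M = N, e, M
            self.r = {}                          # remembers the r used per message

        def hash_query(self, message):
            # Answer H(message) = M * r^e mod N for a fresh random r.
            if message not in self.r:
                # a random nonzero r is invertible mod N with overwhelming probability
                self.r[message] = secrets.randbelow(self.N - 1) + 1
            r = self.r[message]
            return (self.M * pow(r, self.e, self.N)) % self.N

        def extract(self, message, s_prime):
            # Given A's forgery (message, s'), output S = s' * r^{-1} mod N,
            # which satisfies S^e mod N = M.
            r = self.r[message]
            return (s_prime * pow(r, -1, self.N)) % self.N   # modular inverse (Python 3.8+)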

A full version of this proof can be found here. None of it works in the real world, of course. The minute you sub in a real hash function, this whole proof is bunk.

A few notes and a conclusion

This has been a long post; even my footnotes are getting out of control. I do hope I was able to make my main point, which is that the Random Oracle Model is both deeply unnatural, and deeply useful to cryptographers writing security proofs.

This is a dangerous combination, since it allows us to take shortcuts and make assumptions that simply can't be supported in the real world we live in. And yet, it's so powerful that we can find it hard to stop.

More to the point, even when cryptographers stop relying on this crutch (and it's started to happen), we have a hard time convincing end users to get rid of schemes that have been proven in the ROM. Heck, we have a hard time convincing our friends in the systems security community. They think we're nuts.

And it's hard to blame them. Even when we have proper replacements, they're usually funny looking, almost always more "costly", and sometimes rest on bizarre new mathematical assumptions that people don't trust. Plus, it's not like the random oracle schemes are broken.

In my next post I'm going to talk about all this, pull back from the crypto a bit, and focus on this dispute. I'll also talk a bit about the state of the art in ROM/non-ROM schemes.

And oh yes, I'll explain why cryptographers have started to get religion on this point.  Hint: someone proved something. Something very bad for the ROM.

Click here for the next post in this series.

Notes:

1. More formally, saying that a problem is "hard" means that all probabilistic polynomial time Turing machines solve it with at most negligible probability.

2. For a quick explanation of modular arithmetic, see here.

3. For example, he can pick an arbitrary number as "S", then -- using only the public key -- work backwards to compute "M" as M = S^e mod N. This pair (M, S) is a valid message/signature pair. Similarly, he can mangle legitimate message/signature pairs in various ways, to get new ones.

4. In practice, there's no division mod N. Instead, we compute the inverse of "r" mod N, then multiply by that number.

5. In a real proof, we would also consider the possibility that the adversary can ask for and obtain signatures on chosen messages. Our reduction would have to answer these requests. This can be done, but it adds a bit of complexity. See the full proof for some details on this process.

Wednesday, October 12, 2011

Oh my...

Neal Koblitz is not a happy camper:
At Crypto 2011 the Program Chair, Phil Rogaway, explained that the paper [2] was chosen for the Best Paper Award “overwhelmingly” by the Program Committee, which “praised the work for its broad appeal...and its potential impact.” 
And, indeed, the treatment of hashed ElGamal encryption in [2] is in some sense a remarkable achievement. Usually one has to leave the sciences entirely if one wishes to find works of scholarship — for example, Michel Foucault’s turgid 760-page three-volume philosophical treatise on sex [14] — that have been so successful in turning something that should be interesting and accessible to everyone into something lengthy, unreadable, and boring.

Monday, October 10, 2011

DESFire

If you skipped today's crypto news, you missed some good press for David Oswald and Christof Paar's recent side-channel attack on the Mifare DESFire MF3ICD40 chip.* I'm hardly an expert on side-channel attacks, but being uninformed has never stopped me before, and I figured it might be fun to read the paper and talk about this very neat attack.
What's the DESFire MF3ICD40 and where is it used?
I'm guessing MF3ICD40 probably isn't a household name where you live, but you might be familiar with one of the systems that use it.  These include the San Francisco Clipper card, the Czech "in-karta", and a cute Australian transit card known as the myki.  It also powers a number of access control systems, like the prox cards that let you into some high security buildings.

Clipper card.
The MF3ICD40, henceforth "MF3", is a capable little chip.  It's a Radio Frequency Identification (RFID) device, meaning that it communicates wirelessly with a reader --- e.g., a Clipper card turnstile.  It can store a certain amount of information and retrieve it on demand.  Like the DST chip I mentioned in an earlier post, every MF3 contains a cryptographic key and a cipher, which it can use to (among other things) perform a challenge/response authentication protocol.

Unlike the DST, the MF3 uses a pretty reasonable cipher for its authentication protocol. Specifically, it employs Triple-DES with a 112-bit key.  While 3DES is definitely getting a bit long in the tooth (it's been deprecated in FIPS), it's not thought to be vulnerable to any truly practical attacks.  Nor, to the best of my knowledge, is there anything wrong with the way that the MF3 uses 3DES.
So if the crypto is ok, what's the problem?
Just like the DST, the MF3 chip contains no internal power source.  Instead, power for the chip is extracted from a magnetic field generated by the reader.  This is mostly a good thing --- by powering the chip this way, the designers were able to dramatically reduce its size and cost.  Moreover, you don't need to replace your MF3 when some crappy battery dies.

Unfortunately, the use of an external power source opens the device up to a special class of side-channel attack known as Correlation Power Analysis, or CPA.  Since the MF3 draws power from the reader, a malicious reader can actually tell --- from moment to moment --- how much power the device is using.

Ordinarily this wouldn't be very interesting. However, if you know exactly how the device performs the computation, and you can repeatedly measure the device's power consumption while sending it carefully crafted input values, you can actually learn quite a lot.  Specifically, when the device encrypts (or decrypts) your chosen inputs, those power fluctuations can actually leak the bits of the cryptographic key.  In the case of MF3, you can obtain the full 112-bit key in about seven hours.
Isn't side channel analysis pretty well known?
Yes, and that's what makes this interesting.  With caveats.

Side channel attacks themselves are hardly a new thing.  The concept of using power analysis to attack cipher implementations was first proposed in CRYPTO '99 by Paul Kocher, Joshua Jaffe and Benjamin Jun, all of whom worked for Paul's company, Cryptography Research.  In a truly inspired move, Paul and his researchers turned around and patented the standard set of countermeasures to the very attack they had invented.  These countermeasures can be found in just about every high-security smart card manufactured today.

This story is important because the MF3 clearly didn't have the latest and greatest hardware countermeasures built into it.  If it had, the Oswald and Paar attacks wouldn't have gone anywhere. (Though interestingly, it did have one countermeasure, see below.)

This seems to be a common situation for a certain class of RFID devices, of which the MF3 is one.  These devices toe a funny line between "contactless smart card" (where SCA countermeasures would be expected) and "plain-old RFID" where security isn't thought to be as important. Unfortunately, the people who select these systems may not realize this, and that's when bad things can happen.

I should also add that --- unlike a recent side-channel attack on KeeLoq --- this attack does not require the adversary to physically crack open the device. It's implemented purely via the EM field.
So how does the attack work?
This is clearly the crux of the matter, and unfortunately this is where I'm going to have to let you off with a warning.  Side channel analysis is just not my area, and there's little I can offer here that would improve on reading the paper itself.**, ***

But to make a very short go of it, DES works by breaking the key up into smaller pieces called "subkeys", each of which is used in a different round of the cipher.  Portions of these subkeys are combined with bits of an attacker-supplied input and sent to a series of small functions called S-boxes.  Since only a few bits go into each S-box, there is a relatively small set of possible input values, each of which results in a slightly different power consumption profile.

By observing the power consumed when calculating these S-boxes on different inputs -- both for known and unknown keys, and over many, many experiments -- the attacker can correlate these traces to figure out the bits that come from an unknown key.  The full key recovery attack takes ~250,000 traces, which require about 7 hours to collect.
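To give a flavor of what "correlating traces" means, here's a toy numpy sketch of the core step: guess a few key bits, predict the power consumption of one S-box computation (here, crudely, as the Hamming weight of its output), and keep the guess whose predictions correlate best with the measured traces. The sbox table and trace data are placeholders, and the real attack is considerably more involved than this:

    import numpy as np

    def hamming_weight(x):
        return bin(x).count("1")

    def cpa_best_guess(inputs, traces, sbox):
        # inputs: the 6-bit values entering one S-box, one per trace
        # traces: one power measurement per trace (heavily simplified)
        # sbox:   placeholder S-box lookup table
        best_guess, best_corr = None, 0.0
        for guess in range(64):                  # DES S-boxes take 6 key bits
            predicted = np.array([hamming_weight(sbox[x ^ guess]) for x in inputs])
            corr = abs(np.corrcoef(predicted, traces)[0, 1])
            if corr > best_corr:
                best_guess, best_corr = guess, corr
        return best_guess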

The most interesting aspect of the paper is that the authors didn't actually know how the DES implementation in the MF3 worked.  Much of the paper is devoted to reverse-engineering the necessary details of the operation by examining power traces.  What's most interesting about this discussion is that along the way, they discovered that the MF3 actually does implement a countermeasure to CPA -- specifically, it randomly adds up to eight "dummy rounds" to DES. Unfortunately for the MF3, this protection was not sufficient to prevent the attack.

Figure from the Oswald-Paar paper.  These charts show the operation of the DES cipher  on several input values.  The authors used this information to reverse-engineer the operation of the device.
So what's next for MF3?
A quick glance at the MF3 website will tell you that the manufacturer "has begun phasing out the MF3ICD40", but that "orders ... will continue to be taken until December 31st, so start planning your new year's party now!" (Ok, I made up the last bit.) From here on out, it looks like it's going to be AES-based cards all the way.

The note doesn't specifically mention this attack, but I assume that's what's behind it.  If so, that's a pretty good response for a manufacturer.  To pull a profitable card off of the market probably cost them quite a bit, but it protects their reputation, not to mention their customers.  Unfortunately, there's no indication that the newer AES-based cards actually have stronger protections against side channel analysis; they're just using a different cipher.  But I guess that's what next year's CHES is for.

Notes:

* This post brought to you by David Oswald and Christof Paar.  Breaking Mifare DESFire MF3ICD40: Power Analysis and Templates in the Real World.  In CHES '11.

** This is an extended version of the conference paper that I read. Thanks to Bart Coppens for finding the link.

*** If you want to hear from people who do know their side channels, see Luke Mather's excellent post on DPA and distinguishers from the Bristol Cryptography Blog.

Saturday, October 8, 2011

What is the Random Oracle Model and why should you care? (Part 2)

This is part 2 of a series on the Random Oracle Model.  See "Part 1: An introduction" for the previous post.

In a previous post I promised to explain what the Random Oracle Model is, and -- more importantly -- why you should care about it.  I think I made a good start, but I realize that I haven't actually answered either of these questions.

Instead what I did is talk a lot about hash functions. I described what an ideal hash function would look like: namely, a random function.  Then I spent most of the post explaining why random functions were totally impractical (recap: too big to store, too slow to compute).

At the end of it all, I left us with the following conclusions: random functions would certainly make great hash functions, but unfortunately we can't use them in real life. Still, if we can't get a security proof any other way, it might just be useful to pretend that a hash function can behave like a random function. Of course, when we actually deploy the scheme we'll have to use a real hash function like SHA or RIPEMD -- definitely not random functions -- and it's not clear what our proof would mean in that case.

Still, this isn't totally useless. If we can accomplish such a security proof, we can at least rule out the "obvious" attacks (based on the kind of glitches that real scheme designers tend to make). Any specific attack on the scheme would essentially have to have something to do with the properties of the hash function. This is still a good heuristic.

As you can see, this has gotten somewhat technical, and it's going to get a bit more technical through to the end of the post. If you've gotten enough background to get the gist of the thing, feel free to skip on to the next post in this series, in which I'll talk about where this model is used (hint: all modern RSA signing and encryption!), and what it means for cryptography in practice.

Brass tacks

A security proof typically considers the interaction of two or more parties, one of whom we refer to as the adversary. In a typical proof, most of the players are "honest", meaning they'll operate according to a defined cryptographic protocol. The adversary, on the other hand, can do anything she wants. Her goal is to break the scheme.

Usually we spell out a very specific game that the parties will play.  For an encryption scheme, the rules of this game allow the adversary to do things that she might be able to do in a "real world" environment. For example: she might be able to obtain the encryption of chosen plaintexts and/or the decryption of chosen ciphertexts. In most cases she'll intercept a legitimate ciphertext transmitted by an honest encryptor. Typically her goal is to learn (something about) the underlying plaintext.

It's standard to assume that the parties will all know the scheme or protocol, including how to compute the hash function if one is used. This is a common-sense assumption, sometimes referred to as Kerckhoffs's Principle.  Moreover, we generally assume that the adversary is limited in some way --- she doesn't have unlimited computational power or time, which means she can't just brute force the decryption key.

Security proof in the random oracle model.  All parties (Encryptor, Decryptor and Adversary) can call out to the Random Oracle, which computes the hash function for them and provides consistent responses to all parties.
A Random Oracle security proof is just like this, but we make one further tweak.  Even though everyone is allowed to compute the hash function, they can't do it by themselves. Hashing is done by a new magical party called the "oracle". If any party needs to hash something, they ship the message (over an imaginary secure channel) to the oracle, who computes the hash and sends it back to them.

Why the oracle? Well, since the oracle is "outside" of the parties, they no longer need to carry around a description of the hash function, and they don't need to do the work of computing it.  This is good because -- as I mentioned in the previous post -- the full description of a random function is ridiculously large, and consequently takes an insane amount of time to compute. By sticking it into the oracle, we can handwave away that ugliness.

We'll place two restrictions on the random oracle. First, it must implement a genuine random function (sampled at the beginning of the game), with the same input/output profile as a realistic hash function we might use (e.g., 256-bit output strings if we're eventually going to implement with SHA256).

Secondly, the responses given by the oracle must be consistent.  In other words, just as hashing "Peter Piper" through SHA1 will always produce 1c6244929dc8216f5c1024eebefb5cb73c378431,* regardless of who does it, sending a given string through the random oracle will produce the same result, regardless of which party asks for it.

Lazy Afternoon

There's one last (and kind of neat) trick related to how we model the Random Oracle. Internally a random oracle just implements a huge table mapping input values to random output strings.  Since this is the case, our security proofs can fill this table in "lazily". Rather than starting with an entire table all filled in, we can instead begin our analysis with an empty table, and generate it as we go. It works like this:
  1. At the beginning of the game the oracle's table is empty.
  2. Whenever some party asks the oracle to hash a message, the oracle first checks to see if that input value is already stored in the table.  
  3. If not, it generates a random output string, stores the input message and the new output string in its table, and returns the output string to the caller.
  4. If the oracle does find the input value in the table, it returns the output value that's already stored there.
If you think about this for a minute, you'll realize that, to the various parties, this approach "looks" exactly the same as if the oracle had begun with a complete table. But it makes our lives a little easier.
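If it helps, a lazily-sampled oracle is about ten lines of Python (my own sketch, with a dictionary as the table):

    import os

    class LazyRandomOracle:
        # Answers hash queries with fresh random strings, generated on demand,
        # and gives the same answer if the same input is ever asked again.
        def __init__(self, output_bytes=32):        # e.g. 32 bytes, like SHA256
            self.table = {}
            self.output_bytes = output_bytes

        def query(self, message):
            if message not in self.table:           # new input: sample a fresh output
                self.table[message] = os.urandom(self.output_bytes)
            return self.table[message]              # repeat input: consistent answer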

Back to Delphia

In the last post I proposed to build a stream cipher out of a hash function H(). To get a little bit more specific, here's a version of what that algorithm might look like:

1. Pick a random secret key (say 128 bits long) and share it with both the sender and receiver.
    Set IV to 1.
2. Cut a plaintext message into equal-length blocks M1, …, MN,
    where every block is exactly as long as the output of the hash function.
3. Using || to indicate concatenation and ⊕ for exclusive-OR, encrypt the message as follows:
      C1 = H(key || IV)       ⊕ M1
      C2 = H(key || IV+1)   ⊕ M2
      …
      CN = H(key || IV+N-1) ⊕ MN

4. Output (IV, C1, C2, ..., CN) as the ciphertext.
5. Make sure to set IV = (IV+N) before we encrypt the next message.
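Here's a minimal Python rendering of the scheme above, with SHA256 standing in for H() just to make it runnable (in the analysis, of course, H() is the random oracle). Decryption is the identical operation applied to the ciphertext:

    import hashlib

    def encrypt(key, iv, message):
        # 'CTR mode with a hash': XOR each message block with H(key || counter).
        block_len = hashlib.sha256().digest_size         # 32 bytes
        ciphertext = b""
        counter = iv
        for start in range(0, len(message), block_len):
            block = message[start:start + block_len]
            pad = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
            ciphertext += bytes(m ^ p for m, p in zip(block, pad))
            counter += 1
        return iv, ciphertext, counter   # send (iv, ciphertext); counter is the next IV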

If you're familiar with the modes of operation, you might notice that this scheme looks a lot like CTR mode encryption, except that we're using a hash function instead of a block cipher.  

A proof sketch!

Let's say that I want to prove that this scheme is secure in the Random Oracle Model. Normally we'd use a specific definition of "secure" that handles many things that a real adversary could do.  But since this is a blog, we're going to use the following simplified game:
Some honest party -- the encryptor -- is going to encrypt a message.  The adversary will intercept the ciphertext.  The adversary "wins" if he can learn any information about the underlying plaintext besides its length.
Recall that we're in the magical Random Oracle Model. This is just like the real world, except whenever any party computes the hash function H(), s/he's actually issuing a call out to the oracle, who will do all the work of hashing using a random function. Everyone can call the oracle, even the adversary.

Following the scheme description above, the encryptor first chooses a random key (step 1). He divides his plaintext up into blocks (step 2). For each block i = 1 to N, he queries the oracle to obtain H(key || i).  Finally, he XORs the received hashes with the blocks of the plaintext (step 3), and sends the ciphertext (IV, C1, ..., CN) out such that the adversary intercepts it (step 4).

We can now make the following observations.
  1. Internally, the oracle starts with an empty table.
  2. Each time the oracle receives a new query of the form "key || i" from the encryptor, it produces a new, perfectly random string for the output value. It stores both input and output in its table, and sends the output back to the encryptor.
  3. The encryptor forms the ciphertext blocks (C1, ..., CN) by XORing each message block with one of these random strings.
  4. The encryptor never asks for the same input string twice, because the IV is always incremented.

    And finally, we make the most important observation:
     
  5. XORing a message with a secret, perfectly random string is a very effective way to encrypt it. In fact, it's a One Time Pad. As long as the adversary does not know the random string, the resulting ciphertext is itself randomly-distributed, and thus reveals no information about the underlying message (other than its length).
Based on observation (5), all we have to do is show (a) that the values returned when the oracle hashes "key || i" are each perfectly random, and (b) that the adversary does not learn them. If we can do this, then we're guaranteed that the ciphertext (C1, ..., CN) will be a random string, and the adversary can learn nothing about the underlying plaintext!

The Final Analysis

Demonstrating part (a) from above is trivial. When the encryptor asks the oracle to compute H(key || i), provided that this value hasn't been asked for previously, the oracle will generate a perfectly random string -- which is all spelled out in the definition of how the oracle works. Also, the encryptor never asks for the same input value twice, since it always updates its IV value.

Part (b) is a little harder.  Is there any way the Adversary could learn the oracle's output for H(key || i) for even one value of i from 1 to N? The oracle does not make its table public, and the adversary can't "listen in" on the hash calls that the encryptor made.

Thinking about this a little harder, you'll realize that there's exactly one way that the adversary could learn one of those strings: she can query the oracle too. Specifically, all she has to do is query the oracle on "key || i" for any 1 <= i <= N and she'll get back one of the strings that the encryptor used to encrypt the ciphertext. If the adversary learns one of these strings, she can easily recover (at least part of) the plaintext, and our argument doesn't work anymore.

There's one obvious problem with this, at least from the adversary's point of view: she doesn't know the key.

So in order to query on "key || i", the adversary would have to guess the key.  Every time she submits a guess "guess || i" to the oracle she'll either be wrong -- in which case she gets back a useless random string -- or she'll be right, in which case our scheme won't be secure anymore.

But fortunately for us, the key is random, and quite big (128 bits). This tells us something about the adversary's probability of success.

Specifically: if she makes exactly one guess, her probability of guessing the right key is quite small: one in the number of possible keys.  For our specific example, her success probability would be 1/(2^128), which is astronomically small. Of course, she could try guessing more than once; if she makes, say, q guesses, her success probability rises to q/(2^128).

The basic assumption we started with is that the adversary is somehow limited in her computational power. This means she can only perform a reasonable number of hashes. In a formal security proof this is where we might invoke polynomial vs. exponential time.  But rather than do that, let's just plug in some concrete numbers.

Let's assume that the adversary only has time to compute 2^40 (1.1 trillion) hashes. Assuming we used a 128-bit key, then her probability of guessing the key is at most (2^40)/(2^128) = 1/(2^88).  This is still a pretty tiny number.

Think a real-world adversary might be able to compute more than 2^40 hashes? No problem.  Plug in your own numbers -- the calculation stays the same. And if you think the scheme isn't secure enough, you can always try making the key longer.

The point here is that we've demonstrated that in the Random Oracle Model the above scheme is 'as secure as the key'.** In other words, the adversary's best hope to attack the scheme is to find the key via brute-force guessing. As long as the key is large enough, we can confidently conclude that a "realistic" (i.e., time-bounded) adversary will have a very low -- or at least calculable -- probability of learning anything about the message.

In Summary

Depending on your point of view, this may have seemed a bit technical. Or maybe not. In any case, it definitely went a bit past the "layperson's introduction" I promised in the previous post. Moreover, it wasn't technical enough, in that it didn't capture all of the ways that cryptographers can "cheat" when using the random oracle model.***

I'm going to do my best to make up for that in the next post, where we'll talk a bit about some real schemes with proofs in the random oracle model, and -- more importantly -- why cryptographers are willing to put up with this kind of thing.


Notes:

* I got this result from an online SHA1 calculator of untrusted veracity.  YMMV.

** Of course, in a real proof we'd be more formal, and we'd also consider the possibility that an adversary might at least be able to obtain the encryption of some chosen plaintexts.  This would be captured in the security game we use.

*** Some will note that I didn't exactly need to model the hash function as a random oracle to make this example work.  Had I invoked a keyed hash function, I might have used a weaker assumption (and in this case, weak is good!) that H() is pseudo-random.  But this is really a topic for another post.  I chose this example for pedagogical reasons (it's easy to understand!) rather than its unique need for a random oracle analysis.

Tuesday, October 4, 2011

How standards go wrong: constructive advice edition

The other day I poked a little fun at people who use non-standard encryption schemes in major Internet standards.  I admit that my flowchart was a tad snarky.  Moreover you could have gotten essentially the same advice by traveling back to 1998 and talking to Bruce Schneier.*

So to atone for my snark, I wanted to offer some more serious thoughts on this problem.  This is important, since TLS 1.0 (which I discussed the other day) is hardly the first widely-deployed spec to use bogus cryptography.  In my mind, the prevalence of this kind of thing raises two important questions:
  1. How do these bugs creep into major protocol specs?
  2. What can specification designers do to stop it?
Below I'm going to speculate wildly about the first question, and hopefully in the process generate some answers to the second.  But first, an important caveat: I'm far too lazy to join any standards committees myself,** so feel free to take my advice with appropriate salt.
Problem #1: Too much committee specialization and delegation.
A few years ago I became involved in a legal dispute related to an IEEE wireless security standard.  My role in this case required me to read the minutes of every single meeting held by the standards committee.  (This is a funny thing about litigation --- lawyers will often read these documents more carefully than any committee member ever will.)

The good news: this committee did a very nice job, especially given the magnitude of the problem they were trying to solve -- which I will analogize to fixing a levee breach with four sandbags and a staple remover.

What struck me about the process, however, is that despite the size of the committee, most of the critical security decisions were ultimately delegated to one or two domain experts, with the rest of the committee taking a largely passive role.

"How should we arrange the sandbags? Ok, Bob's our sandbag expert. Jerry, should the staple remover be red? Ok, let's vote on it. Excellent."

The committee got lucky in that it had some fantastic domain experts.  But not every committee gets this lucky.  Furthermore, in this case the arrangement of the sandbags was critical to the stability of the entire system, so there was pressure to get it right.

On the other hand, many standards have little-used features that nobody much cares about.  When the details of such a feature are farmed out, the pressure is rarely as high, and this is when things can go poorly.
Problem #2: A lack of appreciation for security proofs.
Provable security is probably the neatest development in the history of computer security.  Rather than simply conjecture that an encryption scheme is secure, we can often prove it exactly.  Even when our proofs rest on unrealistic assumptions (like the existence of ideal ciphers or random oracles), they can still be effective at ruling out many "obvious" attacks on a cryptosystem.

With this power at our fingertips, we should never have to worry about something like the recent BEAST attack on SSL and TLS 1.0.  And yet stuff like this still happens all the time.

By way of a reminder: BEAST relies on a glitch in the way that SSL/TLS 1.0 implements CBC-mode encryption.  In normal CBC-mode, every ciphertext is constructed using a random Initialization Vector (IV).  In SSL/TLS 1.0, the designers "optimized" by setting the IV to be the final block of the previous ciphertext, sometimes known as a "residue".

Presumably the designers thought this was ok.  It wasn't in line with best practices, but it saved a few bytes.  Moreover, nobody had proposed any practical attacks on this approach (and wouldn't, until nearly five years after the TLS 1.0 spec was finalized).
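To make the problem concrete, here's a toy sketch of the underlying weakness (not the BEAST exploit itself, just the core observation it builds on). It assumes the third-party pycryptodome package, and the variable names are mine:

    # Toy demonstration of why a predictable CBC IV is dangerous.
    import os
    from Crypto.Cipher import AES   # pycryptodome

    key = os.urandom(16)            # the session key (unknown to the attacker)

    def cbc_encrypt_block(iv, block):
        # Encrypt a single 16-byte block in CBC mode under the session key.
        return AES.new(key, AES.MODE_CBC, iv=iv).encrypt(block)

    # The victim encrypts a secret block; the attacker sees the IV and ciphertext.
    secret = b"top-secret-data!"    # 16 bytes, unknown to the attacker
    iv_victim = os.urandom(16)
    c_victim = cbc_encrypt_block(iv_victim, secret)

    # Under the "residue" rule, the next record's IV is the last ciphertext
    # block of the previous record -- so the attacker knows it in advance.
    iv_next = c_victim

    # The attacker injects a chosen block that cancels out both IVs, since
    #   E(probe XOR iv_next) = E(guess XOR iv_victim).
    guess = b"top-secret-data!"     # the attacker's guess at the secret
    probe = bytes(a ^ b ^ c for a, b, c in zip(guess, iv_victim, iv_next))

    print(cbc_encrypt_block(iv_next, probe) == c_victim)  # True iff guess == secret

With a truly random IV the attacker can't compute the probe block in advance, and the whole trick falls apart -- which is exactly the property that a chosen-plaintext security proof for standard CBC mode pins down.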

Still, what if the designers had been obliged to offer a proof of the security of this approach?  If they had, they would have quickly realized that there was a problem.  You see, it's quite possible to prove that standard CBC mode is secure under chosen plaintext attack.***  But you can't do the same thing with the residue optimization.  Any reasonable proof attempt will crash and burn.

Requiring a proof would have immediately knocked this optimization out of the protocol, long before anyone had thought of a specific attack on it.

Now, you might argue that TLS 1.0 was written way back in 1999, and that it inherited this nonsense from an even older protocol (SSL).  Furthermore, back then "practice oriented" provable security wasn't well understood; people were too busy buying tech stocks.  If you were really pedantic, you might even remind me that things eventually got better in 2006 when TLS 1.1 came out.

And yet...  Even in TLS 1.1, two whole years after an attack on TLS 1.0 was pointed out, the designers were still putting goofy, unproven stuff into the spec:

      The following alternative procedure MAY be used; however, it has
      not been demonstrated to be as cryptographically strong as the
      above procedures.  The sender prepends a fixed block F to the
      plaintext (or, alternatively, a block generated with a weak PRNG).
      He then encrypts as in (2), above, using the CBC residue from the
      previous block as the mask for the prepended block.  Note that in
      this case the mask for the first record transmitted by the
      application (the Finished) MUST be generated using a
      cryptographically strong PRNG.
This kind of thing may be right, or it may be wrong.  But that's my point.  If you don't know, don't put it in the spec.
Problem #3: What, no point releases?
A couple of months ago I installed MacOS 10.7 on my Macbook Pro.  Just as I was getting used to all the new UI "improvements", a software update arrived.  Now I'm running MacOS 10.7.1.  Even at this moment, my Mac would really like to talk to me about another security update --- which I'm too lazy to install (cobbler's children, shoes, etc.).

And this is the Macintosh experience.  I shudder to think what it's like on Windows.****

On the flipside, some major Internet standards seem too respectable to patch.  Take our friend TLS 1.0, which "shipped" in 1999 with a bad cipher mode inherited from SSL.  In 2002, the OpenSSL project had the foresight to apply a band-aid patch.  In 2004 some specific attacks were identified.  And it took still another two years for TLS version 1.1 to propose a formal patch to these issues.

How would you feel if your operating system went seven years between security patches?  Especially when experts "in the know" already recognized the problems, and had functional workarounds.  Where was TLS 1.0.1?  It would have been nice to get that to the folks working on NSS.
Problem #4: Too many damn options.
I mentioned this once before on this blog, but a good point can never be made too many times.  It's very common for major protocol specifications to include gobs and gobs of options.  In one sense this is understandable --- you never know precisely what your users will require.  If you don't cater to them, they'll hack in their own approach, or worse: design a totally new protocol.

But every option you add to a protocol spec implicitly creates a new protocol.  This protocol has to be analyzed.  And this analysis takes time and adds complexity.  Add this to all the other problems I mention here, and you can have some interesting times.
Problem #5: Don't worry, implementers won't follow the spec!
Anyone who's ever written a specification knows that it's like a platonic form.  Just as your kitchen table is never quite as good as the "perfect" table (mine has three Splenda packets propping up one leg), the implementation of your spec will never be quite exact.

Usually this is a bad thing, and it's frequently how protocols break in practice.  However, there are some very strange cases --- such as when the specification is broken --- where incorrect implementation may actually be perceived as a good thing.

Case in point: a few years back I was asked to look at one portion of a new proposed standard for securing consumer electronics devices.  The spec was already public, and some devices had even been manufactured to support it.

The bug I found was not a showstopper, but it did completely negate the effectiveness of one particular aspect of the protocol.  The discussion I had with the standards body, however, was quite strange.  Good news, they told me, none of the existing devices actually implement the protocol as written!  They're not vulnerable!  

Ok, that is good news --- for now.  But really, what kind of solution is this?  If a protocol spec is bad, then fix it!  You're much better off fixing the spec once than hoping that none of the eventual implementers will do what you tell them to do.

In summary

This has been a long post.  There's more I could probably say, but I'm exhausted. The truth is, snark is a lot more fun than "real advice".

I have no idea if these thoughts really sum up the totality of how standards go wrong.  Maybe I'm still being a bit unfair --- many security standards are extremely resilient.  It's not like you hear about a "0-day" attack on TLS every other day.

But committees do screw up, and there are a few simple things that should happen in order to reduce the occurrence of such problems.  Mandate security proofs where possible.  Get outside advice, if you need it.  Reduce the number of features you support.  Update your specifications aggressively.  Make sure your implementers understand what's important.  Specification design is hard work, but fixing a bad spec is a million times worse.

Anyway, maybe one of these days I'll be crazy enough to get involved with a standards committee, and then you can all laugh at how badly I screw it up.

Notes:

* If you do this, please call 1998-me and tell him not to waste two years trying to sell music over the Internet.  Also, tell him to buy Apple stock.


** I can't handle the meetings.

*** A common approach is to assume that the block cipher is a pseudo-random permutation.  See this work for such an analysis.

**** Cruel Windows joke.  Honestly, I hear it's gotten much better for you guys.

Sunday, October 2, 2011

Should I use a non-standard encryption scheme?

On the heels of the BEAST exploit on TLS 1.0, I've prepared a helpful flowchart to assist future standards committees.  You can click to enlarge.

Update: I felt bad that this post was a little snarky, so I've written another post in which I try to provide some constructive advice.