In defense of Applied Cryptography

Over the weekend I found myself re-reading a few bits of Bruce Schneier’s Applied Cryptography for a historical research project, and it reminded me what a fantastically wonderful, completely insane piece of writing it is. I’m sure Bruce Schneier needs no additional validation in his life, but I do think it’s worth saying a few words about the book — and why we need more works like it in our field.

I should preface this all by saying that Applied Cryptography is probably one of the most influential crypto books ever written. It certainly played a part in getting me interested in the field. If you don’t own a copy, you should, if only to be awed by Bruce’s knowledge of bizarre, historical ciphers and all of the ways they’ve been broken. (Though the most recent edition is from 1996, so don’t go looking for up-to-date information in it.)

Much the way a hard-core Nirvana fan will roll their eyes if you mention “Nevermind”, people in the crypto research community tend to get a little bent out of shape if someone references Applied Cryptography. This is not because AC is a lousy book, or because the material is inaccessible or wrong. Quite the contrary. Rather, the case against Applied Cryptography is that it made cryptography too damned accessible. And that has led to a whole lot of fail, some on a pretty grand scale.

The detailed argument goes something like this: Applied Cryptography demystified cryptography for a lot of people. By doing so, it empowered them to experiment with crypto techniques, and to implement their own code. No problem so far.

Unfortunately, some readers, abetted by Bruce’s detailed explanations and convenient source code examples, felt that they were now ready to implement crypto professionally. Inevitably their code made its way into commercial products, which shipped full of horribly ridiculous, broken crypto implementations. This is the part that was not so good. We’re probably still dealing with the blowback today.

Just for one modest example, take this fragment of code spotted in a Diebold voting machine, circa 2003:

// LCG – Linear Conguential Generator – used to generate ballot serial numbers 
// A psuedo-random-sequence generator  
// (per Applied Cryptography, by Bruce Schneier, Wiley, 1996) 
#define LCG_MULTIPLIER 1366 
#define LCG_INCREMENTOR 150889 …

Thanks to Applied Cryptography, the Diebold coders were able to write a perfectly functional Linear Congruential Generator in no time at all. You certainly can’t blame Bruce for anything here — the LCG code is fine. It’s certainly not his fault that Diebold missed the part where he warned never to use LCGs for security applications. Whoops!
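(If you’re wondering what an LCG actually does, here’s a minimal Python sketch using the multiplier and increment from the fragment above; the modulus is my own assumption for illustration, since it isn’t shown in the fragment. The point is that every output completely determines the one that follows, which is exactly why you don’t use an LCG for anything security-related.)

    # A minimal LCG sketch (not Diebold's actual code). The multiplier and
    # increment come from the fragment above; the modulus is assumed here
    # purely for illustration.
    LCG_MULTIPLIER = 1366
    LCG_INCREMENTOR = 150889
    LCG_MODULUS = 714025   # assumed value, not taken from the Diebold source

    def lcg_next(state):
        """One step of the generator: state -> (a*state + c) mod m."""
        return (LCG_MULTIPLIER * state + LCG_INCREMENTOR) % LCG_MODULUS

    # Generate a few "ballot serial numbers" from some seed.
    state = 20031104
    serials = []
    for _ in range(5):
        state = lcg_next(state)
        serials.append(state)

    # The problem: each output *is* the generator's next internal state, so
    # anyone who observes a single serial number can predict all the rest.
    predicted, s = [], serials[0]
    for _ in range(4):
        s = lcg_next(s)
        predicted.append(s)
    assert predicted == serials[1:]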

Although it’s all said with love, some people really do blame Applied Cryptography for this sort of thing. Even Bruce himself has, at various points, apologized for this aspect of the book.

(Not coincidentally, you’ll notice that his more recent books are nowhere near as brazenly useful as AC. Where Practical Cryptography is all crapped up with grave warnings about the dangers of rolling your own crypto implementations, Applied Cryptography just laid it all out there sans apology, like a copy of the Anarchist Cookbook left open in a middle school library.)

But I said that I’m writing this post to praise the book, not to damn it with faint praise.

What’s magical about Applied Cryptography is really two things. First of all, it’s an incredible historical document. If there’s a cipher that was used in the period 1970-1996, you’ll read about it in Applied Cryptography. Even if the cipher was based on the cryptographic equivalent of an abacus, even if it was broken at the very conference where it was published, Bruce will still give you a full design description and the address of the guy who owns the patent.

And you have to love the understated way in which he sums up his opinions. Take, for example, the section on the FEAL cipher. After four dense paragraphs in which he lays out not one, but eleven devastating attacks, culminating in the total demolition of FEAL (and possibly, the ritual suicide of its designers), he could have written something scathing and capped it with three exclamation points. Instead, he leaves it with the following dry remark:

Whenever someone discovers a new cryptanalytic attack, he always seems to try it out on FEAL first.

All joking aside, I’ve found Applied Cryptography (and its massive bibliography) to be incredibly useful, particularly when examining historical crypto software, and (most importantly) when finding prior art to current crypto patents. Applied Cryptography is like a key to a time period that’s not well covered by Google Scholar or other Internet search tools. It’s helped me through more than one trip to the library.

The other magical thing about AC is that you really feel that after writing it, Bruce must have been a few pounds lighter — and some of that weight was his soul. I’m not saying he’s soulless! (Though he does work for BT now, doesn’t he?) It’s just that there’s no way to read through this monster of a book, with its enormous compendium of bibliographies and source code, and imagine someone writing this thing just for the money or even for the fame. In my opinion it’s an achievement of practical cryptography that has not been matched either before or since.

And that’s too bad, because someone really should try.

How standards go wrong: constructive advice edition

The other day I poked a little fun at people who use non-standard encryption schemes in major Internet standards.  I admit that my flowchart was a tad snarky.  Moreover, you could have gotten essentially the same advice by traveling back to 1998 and talking to Bruce Schneier.*

So to atone for my snark, I wanted to offer some more serious thoughts on this problem.  This is important, since TLS 1.0 (which I discussed the other day) is hardly the first widely-deployed spec to use bogus cryptography.  In my mind, the prevalence of this kind of thing raises two important questions:

  1. How do these bugs creep into major protocol specs?
  2. What can specification designers do to stop it?

Below I’m going to speculate wildly about the first question, and hopefully in the process generate some answers to the second.  But first, an important caveat: I’m far too lazy to join any standards committees myself,** so feel free to take my advice with appropriate salt.

Problem #1: Too much committee specialization and delegation.

A few years ago I became involved in a legal dispute related to an IEEE wireless security standard.  My role in this case required me to read the minutes of every single meeting held by the standards committee.  (This is a funny thing about litigation — lawyers will often read these documents more carefully than any committee member ever will.)

The good news: this committee did a very nice job, especially given the magnitude of the problem they were trying to solve — which I will analogize to fixing a levee breach with four sandbags and a staple remover.

What struck me about the process, however, is that despite the size of the committee, most of the critical security decisions were ultimately delegated to one or two domain experts, while the rest of the committee took a passive role.

“How should we arrange the sandbags? Ok, Bob’s our sandbag expert. Jerry, should the staple remover be red? Ok, let’s vote on it. Excellent.”

The committee got lucky in that it had some fantastic domain experts.  But not every committee gets this lucky.  Furthermore, in this case the arrangement of the sandbags was critical to the stability of the entire system, so there was pressure to get it right.

On the other hand, many standards have little-used features that nobody much cares about.  When the details of such a feature are farmed out, the pressure is rarely as high, and this is when things can go poorly.

Problem #2: A lack of appreciation for security proofs.

Provable security is probably the neatest development in the history of computer security.  Rather than simply conjecturing that an encryption scheme is secure, we can often prove it outright.  Even when our proofs rest on unrealistic assumptions (like the existence of ideal ciphers or random oracles), they can still be effective at ruling out many “obvious” attacks on a cryptosystem.

With this power at our fingertips, we should never have to worry about something like the recent BEAST attack on SSL and TLS 1.0.  And yet stuff like this still happens all the time.

By way of a reminder: BEAST relies on a glitch in the way that SSL/TLS 1.0 implements CBC-mode encryption.  In normal CBC-mode, every ciphertext is constructed using a random Initialization Vector (IV).  In SSL/TLS 1.0, the designers “optimized” by setting the IV to be the final block of the previous ciphertext, sometimes known as a “residue”.

Presumably the designers thought this was ok.  It wasn’t in line with best practices, but it saved a few bytes.  Moreover, nobody had proposed any practical attacks on this approach (and wouldn’t, until nearly five years after the TLS 1.0 spec was finalized).
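To make the residue trick concrete, here’s a rough sketch (assuming Python’s “cryptography” package; the keys and messages are made up) of textbook CBC with a fresh random IV, next to the SSL/TLS 1.0 behavior where the IV for the next record is just the last ciphertext block of the previous one:

    import os
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    def cbc_encrypt_block(key, iv, block):
        # Encrypt a single 16-byte block in CBC mode (no padding, for brevity).
        enc = Cipher(algorithms.AES(key), modes.CBC(iv)).encryptor()
        return enc.update(block) + enc.finalize()

    key = os.urandom(16)

    # Textbook CBC: a fresh, unpredictable IV for every message.
    iv1 = os.urandom(16)
    c1 = cbc_encrypt_block(key, iv1, b"sixteen byte msg")

    # The SSL/TLS 1.0 "optimization": the next record's IV is the last
    # ciphertext block (the "residue") of the previous record. The attacker
    # has already seen c1 on the wire, so this IV is entirely predictable,
    # which is the toehold that BEAST exploits.
    iv2 = c1[-16:]
    c2 = cbc_encrypt_block(key, iv2, b"another 16 bytes")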

Still, what if the designers had been obliged to offer a proof of the security of this approach?  If they had, they would have quickly realized that there was a problem.  You see, it’s quite possible to prove that standard CBC mode is secure under chosen plaintext attack.***  But you can’t do the same thing with the residue optimization.  Any reasonable proof attempt will crash and burn.

Requiring a proof would have immediately knocked this optimization out of the protocol, long before anyone had thought of a specific attack on it.

Now, you might argue that TLS 1.0 was written way back in 1999, and that it inherited this nonsense from an even older protocol (SSL).  Furthermore, back then “practice oriented” provable security wasn’t well understood; people were too busy buying tech stocks.  If you were really pedantic, you might even remind me that things eventually got better in 2006 when TLS 1.1 came out.

And yet…  Even in TLS 1.1, two whole years after an attack on TLS 1.0 was pointed out, the designers were still putting goofy, unproven stuff into the spec:

      The following alternative procedure MAY be used; however, it has
      not been demonstrated to be as cryptographically strong as the
      above procedures.  The sender prepends a fixed block F to the
      plaintext (or, alternatively, a block generated with a weak PRNG).
      He then encrypts as in (2), above, using the CBC residue from the
      previous block as the mask for the prepended block.  Note that in
      this case the mask for the first record transmitted by the
      application (the Finished) MUST be generated using a
      cryptographically strong PRNG.

This kind of thing may be right, or it may be wrong.  But that’s my point.  If you don’t know, don’t put it in the spec.

Problem #3: What, no point releases?

A couple of months ago I installed MacOS 10.7 on my Macbook Pro.  Just as I was getting used to all the new UI “improvements”, a software update arrived.  Now I’m running MacOS 10.7.1.  Even at this moment, my Mac would really like to talk to me about another security update — which I’m too lazy to install (cobbler’s children, shoes, etc.).

And this is the Macintosh experience.  I shudder to think what it’s like on Windows.****

On the flipside, some major Internet standards seem too respectable to patch.  Take our friend TLS 1.0, which “shipped” in 1999 with a bad cipher mode inherited from SSL.  In 2002, the OpenSSL project had the foresight to apply a band-aid patch.  In 2004 some specific attacks were identified.  And it took still another two years for TLS version 1.1 to propose a formal patch to these issues.

How would you feel if your operating system went seven years between security patches?  Especially when experts “in the know” already recognized the problems, and had functional workarounds.  Where was TLS 1.0.1?  It would have been nice to get that to the folks working on NSS.

Problem #4: Too many damn options.

I mentioned this once before on this blog, but a good point can never be made too many times.  It’s very common for major protocol specifications to include gobs and gobs of options.  In one sense this makes sense — you never know precisely what your users will require.  If you don’t cater to them, they’ll hack in their own approach, or worse: design a totally new protocol.

But every option you add to a protocol spec implicitly creates a new protocol.  This protocol has to be analyzed.  And this analysis takes time and adds complexity.  Add this to all the other problems I mention here, and you can have some interesting times.

Problem #5: Don’t worry, implementers won’t follow the spec!

Anyone who’s ever written a specification knows that it’s like a Platonic form.  Just as your kitchen table is never quite as good as the “perfect” table (mine has three Splenda packets propping up one leg), the implementation of your spec will never be quite exact.

Usually this is a bad thing, and it’s frequently how protocols break in practice. However, there are some very strange cases — such as when the specification is broken — where incorrect implementation may actually be perceived as a good thing.

Case in point: a few years back I was asked to look at one portion of a new proposed standard for securing consumer electronics devices.  The spec was already public, and some devices had even been manufactured to support it.

The bug I found was not a showstopper, but it did completely negate the effectiveness of one particular aspect of the protocol.  The discussion I had with the standards body, however, was quite strange.  Good news, they told me, none of the existing devices actually implement the protocol as written!  They’re not vulnerable!  

Ok, that is good news — for now.  But really, what kind of solution is this?  If a protocol spec is bad, then fix it!  You’re much better off fixing the spec once than hoping that none of the eventual implementers will do what you tell them to do.

In summary

This has been a long post.  There’s more I could probably say, but I’m exhausted. The truth is, snark is a lot more fun than “real advice”.

I have no idea if these thoughts really sum up the totality of how standards go wrong.  Maybe I’m still being a bit unfair — many security standards are extremely resilient.  It’s not like you hear about a “0-day” attack on TLS every other day.

But committees do screw up, and there are a few simple things that should happen in order to reduce the occurrence of such problems.  Mandate security proofs where possible.  Get outside advice, if you need it.  Reduce the number of features you support.  Update your specifications aggressively.  Make sure your implementers understand what’s important.  Specification design is hard work, but fixing a bad spec is a million times worse.

Anyway, maybe one of these days I’ll be crazy enough to get involved with a standards committee, and then you can all laugh at how badly I screw it up.

Notes:

* If you do this, please call 1998-me and tell him not to waste two years trying to sell music over the Internet.  Also, tell him to buy Apple stock.

** I can’t handle the meetings.

*** A common approach is to assume that the block cipher is a pseudo-random permutation.  See this work for such an analysis.

**** Cruel Windows joke.  Honestly, I hear it’s gotten much better for you guys.

Bram Cohen Corrected

Update 11/24/11: Bram has responded to my criticisms in this post. See the comments section below for his thoughts and my response. Also, see this new post for my more detailed response.

I was innocently Googling the title of my blog (yes, I do this, it’s sad) when I came across this May 2011 post by Bram Cohen entitled “Practical Cryptography Corrected”.

Ostensibly the post is a criticism of Bruce Schneier’s book Practical Cryptography. Instead, it’s proof that inventing the world’s most useful peer-to-peer protocol does not make you a cryptographer.  More worrisome than that, it’s an example of someone with a big megaphone using it to give really bad advice.

Bram’s post has ten points, of which four are (somewhat) problematic.  Let’s take them one at a time.

Bram’s Practical Advice #1: For an RSA exponent, you should always use 2. Technically that’s Rabin-Williams, and requires slightly different implementation, but that actually works in its favor. Rabin-Williams has a reduction to factoring, RSA does not.

No, no, no.  A thousand times no.

If you’re going to encrypt with RSA, you should encrypt with RSA.  That means using a standard, well-tested RSA implementation with the traditional RSA public exponent (e = 0x10001).*

Unless you’re an expert, do not go switching your exponents around.  Rabin-Williams (public exponent 2) looks like RSA, but it’s a very different cryptosystem — both in how decryption works and in what goes wrong if you screw up your implementation.

For one thing, when you implement Rabin-Williams incorrectly, it becomes vulnerable to a chosen ciphertext attack that completely recovers the scheme’s secret key.  RSA does not have this problem.  This issue can be managed through skillful use of padding**, but your typical from-scratch implementer is not skillful.  Even bad implementations of good padding can lead to problems.  Why chance it?
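If you want to see just how badly this can go, here’s a toy sketch of that attack against unpadded, textbook Rabin (tiny made-up primes, exponent 2, no padding; this is emphatically not a real implementation). Each decryption query hands the attacker a factor of the modulus with probability one half:

    import math, random

    # Toy textbook Rabin (exponent 2, no padding) with tiny made-up primes,
    # purely to illustrate the chosen-ciphertext key-recovery attack.
    p, q = 499, 547            # both are 3 mod 4, so square roots are easy
    n = p * q

    def decryption_oracle(c):
        """Naively return *a* square root of c modulo n (the private-key operation)."""
        rp = pow(c, (p + 1) // 4, p)               # square root mod p
        rq = pow(c, (q + 1) // 4, q)               # square root mod q
        # Combine the two roots with the Chinese Remainder Theorem.
        return (rp * q * pow(q, -1, p) + rq * p * pow(p, -1, q)) % n

    # The attack: pick a random r and ask the oracle to "decrypt" r^2 mod n.
    # Half the time the root s that comes back is neither r nor -r, and then
    # gcd(r - s, n) is a nontrivial factor of n, i.e. the whole private key.
    while True:
        r = random.randrange(2, n)
        if math.gcd(r, n) != 1:
            continue
        s = decryption_oracle(pow(r, 2, n))
        if s not in (r, n - r):
            factor = math.gcd((r - s) % n, n)
            print("recovered factors:", factor, n // factor)
            break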

Which brings me to my first law: The probability that your RSA ciphertext will be broken via some number-theoretical attack is vanishingly small.  The probability that it will be broken because you screwed up your implementation, on the other hand, is high.  All current approaches to solving the RSA problem work by attacking the factoring problem anyway, so it’s unclear what advantage we’re buying ourselves here.
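For contrast, here’s what the boring, standard approach looks like in practice. This is just an illustrative sketch assuming Python’s “cryptography” package, but the shape is the same in any decent library: e = 65537, OAEP padding, no cleverness anywhere.

    from cryptography.hazmat.primitives.asymmetric import rsa, padding
    from cryptography.hazmat.primitives import hashes

    # Standard, boring RSA: e = 65537 (0x10001) and OAEP padding.
    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = private_key.public_key()

    oaep = padding.OAEP(
        mgf=padding.MGF1(algorithm=hashes.SHA256()),
        algorithm=hashes.SHA256(),
        label=None,
    )
    ciphertext = public_key.encrypt(b"attack at dawn", oaep)
    assert private_key.decrypt(ciphertext, oaep) == b"attack at dawn"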

Bram’s Practical Advice #2: You should always do encryption as a layer outside of authentication.

Cryptographers have been arguing the order of encryption and authentication for nearly thirty years.  Should you encrypt first, then apply authentication (a MAC)? Or should you authenticate first, then encrypt?

By 2000 we had basically put this question to bed.  The recommended best practice? Always encrypt first (with a strong cipher and mode of operation), then apply a MAC to the resulting ciphertext.  This isn’t just easy to remember, it’s provably secure against adaptive chosen ciphertext attacks.  That means you’re protected against very real issues like padding oracle attacks.  Bram’s technique doesn’t give you this, and that can spell disaster.
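In case the ordering still seems abstract, here’s a rough encrypt-then-MAC sketch (again assuming Python’s “cryptography” package; the key handling and framing are simplified for illustration): encrypt, MAC the resulting ciphertext, and on the way back verify the MAC before you so much as look at the ciphertext.

    import os
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
    from cryptography.hazmat.primitives import hashes, hmac

    def seal(enc_key, mac_key, plaintext):
        # Encrypt first (AES-CTR with a fresh random nonce)...
        nonce = os.urandom(16)
        enc = Cipher(algorithms.AES(enc_key), modes.CTR(nonce)).encryptor()
        ciphertext = nonce + enc.update(plaintext) + enc.finalize()
        # ...then MAC the resulting ciphertext (nonce included).
        h = hmac.HMAC(mac_key, hashes.SHA256())
        h.update(ciphertext)
        return ciphertext + h.finalize()

    def open_sealed(enc_key, mac_key, blob):
        ciphertext, tag = blob[:-32], blob[-32:]
        # Verify the MAC before touching the ciphertext...
        h = hmac.HMAC(mac_key, hashes.SHA256())
        h.update(ciphertext)
        h.verify(tag)            # raises InvalidSignature if anything was tampered with
        # ...and only then decrypt.
        nonce, body = ciphertext[:16], ciphertext[16:]
        dec = Cipher(algorithms.AES(enc_key), modes.CTR(nonce)).decryptor()
        return dec.update(body) + dec.finalize()

    enc_key, mac_key = os.urandom(32), os.urandom(32)
    assert open_sealed(enc_key, mac_key, seal(enc_key, mac_key, b"hello")) == b"hello"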

Bram’s Practical Advice #3: For an encryption mode, you should always use CTR, and always use a nonce of zero, and never reuse keys.

Bram’s Practical Advice #4: You should never use the same key for both encryption and authentication. If you need both encryption and authentication, you should use two keys. One for encryption, and one for authentication.

This one’s a twofer, and technically none of it is bad advice.  I’m a fan of CTR mode, and zero IVs are just fine in CTR (but never in CBC!) mode.
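The reason the “never reuse keys” caveat matters so much is that CTR mode is really a keystream generator, and reusing a (key, nonce) pair reuses the keystream. A quick sketch (Python’s “cryptography” package again, with made-up messages) of what happens if you ignore the caveat:

    import os
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    def ctr_encrypt(key, nonce, plaintext):
        enc = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
        return enc.update(plaintext) + enc.finalize()

    key = os.urandom(32)
    zero_nonce = b"\x00" * 16

    # A zero nonce is fine in CTR mode, provided this key encrypts exactly one message...
    c1 = ctr_encrypt(key, zero_nonce, b"first message, please keep secret")

    # ...because a second message under the same (key, nonce) pair reuses the
    # keystream, and XORing the two ciphertexts cancels it out entirely:
    c2 = ctr_encrypt(key, zero_nonce, b"second message, also very secret!")
    leak = bytes(a ^ b for a, b in zip(c1, c2))
    # 'leak' is now plaintext1 XOR plaintext2; no key required to compute it.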

I have only two points to make: first, why not use a random IV?  What’s going to happen to you if you hard-code a zero IV, and then the next (idiot) developer switches your mode of operation to CBC?  This shouldn’t happen, but it might.

But more generally, my advice is not to use CTR at all.  Use a provably-secure authenticated mode of operation like CCM, OCB or (best) GCM, in the way recommended by standards.  These modes handle both encryption and authentication for you, and better yet — they all use a single key (by design). You’re much better off using these standards than hand-rolling your own encrypt/MAC approach (especially if, per Bram’s Advice #2, you plan on doing it wrong).
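If it helps, here’s roughly what “just use GCM” looks like with Python’s “cryptography” package (the key, nonce and messages are made up for illustration); note that the nonce still must never repeat under the same key:

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    key = AESGCM.generate_key(bit_length=256)
    aead = AESGCM(key)

    nonce = os.urandom(12)   # 96-bit nonce; must never repeat under the same key
    ct = aead.encrypt(nonce, b"attack at dawn", b"optional associated data")
    assert aead.decrypt(nonce, ct, b"optional associated data") == b"attack at dawn"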

In Summary

I have a lot of respect for Bram, and if I come off a bit critical in the post above, it isn’t because I dislike the guy.  In fact, many of his remaining points are good.

But the only thing worse than bad advice is bad advice from a voice of authority.  And Bram has a lot of authority, which could lead people to trust him over some goofy, non-famous cryptographer (or even Bruce Schneier, for some crazy reason).

Fixing people’s cryptographic mistakes has given me a good living so far, but that doesn’t mean I like it.  So please, please don’t listen to Bram.  Do things right, follow standards, and most importantly: if you don’t know exactly what you’re doing, don’t do it at all.

Notes:

* I initially wrote “e = 100001” which is totally bogus.  Goes to show you shouldn’t trust a blog post.  Thanks to GSBabil.

** Use of padding, incidentally, completely breaks the reduction to factoring unless you conduct the proof in the Random Oracle Model.  And as I discuss here, there are some serious problems with that.