Tuesday, July 24, 2012

Four theories on the cryptography of Star Trek

"I'm sorry Captain. They rotated by fourteen."
Over on ZDNet they're asking why cybersecurity is like Star Trek. I think this is the wrong question. A better one is: why is cybersecurity so bad on Star Trek?

Please don't take this the wrong way. I'm a huge Trek fan. I've watched every episode ever made, and I'd do it again if I had time. Even the Holodeck ones.

But I also teach computer security, and specifically, cryptography. Which is ruining the show for me! How can I buy into a universe where the protagonists have starships, transporters and dorky positronic robots, but still can't encrypt an email to save their lives? The Trek crew has never encountered an encryption scheme that didn't crack like an egg when faced with an 'adaptive algorithm' (whatever that is), or -- worse -- just a dude doing math in his head.

But there's no reason to take my word for this. Thanks to the miracle of searchable Star Trek, you can see for yourself.

Cryptographers deserve better. Viewers deserve better. And while I can't fix bad screenwriting, I can try to retcon us an explanation. And that will be the subject of this post: four scientifically credible explanations why 24th century crypto could legitimately be so awful.

Theory #1: A quantum leap

One answer to the mystery of Trek's bad crypto is so obvious it's mundane. It's the 24th century, and of course all the computers are quantum. Everyone knows that quantum computers are super-duper-powerful, and would blow through traditional encryption like a knife through butter.

But not so fast! As I've written before on this blog, quantum computers are actually quite limited in what (we think) they can do. This even goes for quantum computers enhanced with bio-neural gel packs, whatever the hell those are.

Specifically: while QCs are very good at solving certain number-theoretic problems -- including the ones that power RSA and most public-key encryption schemes -- theorists don't believe that they can efficiently solve NP-complete problems, which should still leave an opening for complexity-theoretic crypto to thrive in the 24th century. And yet we never hear about this in Trek.

Of course it's always possible that the theorists are wrong. But quantum computers still don't explain why Spock can apparently crack encryption codes in his head. (And no, 'Vulcans are really good at math' is not a theory.)

Theory #2: It's the warp drive, stupid  

If there's a single technology that makes the Star Trek universe different from ours, it's the Warp drive. And this tees up our next theory:
Could it be that there's a conflict between faster-than-light travel and secure cryptography? Could Zefram Cochrane have done crypto in?
Shockingly, there might actually be something to this. Exhibit A is this paper by Scott Aaronson and John Watrous -- two honest-to-god complexity theorists -- on the implications of a physical structure called a 'closed timelike curve' (CTC) and what would happen if you used one to go back in time and kill your grandfather.

Aaronson and Watrous aren't really interested in killing anyone. What they're interested in is paradoxes, and particularly, what it means if the Universe resolves paradoxes. It turns out that this resolution power has huge implications for computing.

It seems that computers with access to paradox-resolving time travel would be dramatically more powerful than any of the computers we can envision today, regardless of whether they're quantum or classical. In fact, CTC-enhanced computers would be powerful enough to efficiently solve problems in the complexity class PSPACE. This would utterly doom the type of complexity-theoretic crypto we rely on today.

But this still leaves a question: does the Warp drive necessarily imply the existence of CTCs?

One clue comes from Einstein's special theory of relativity, which tells us that faster-than-light travel would imply a violation of causality. For those without the physics background: Star Trek IV.

Theory #3: Complexity theory is dead

Do you remember the episode in Deep Space Nine where O'Brien and Bashir discussed the latest developments in Ferengi computer science? How about the episode that took place at a Vulcan complexity theory conference? No, I don't either. These things never happened.

This all by itself is suspicious. Trek characters could waste hours blabbering about subspace fields or trying to convince Data he's a real boy. But something as central as the computers that run their ship and keep them alive? Not a peep, not even in a "TECH" scene.

It's almost as though by the end of the 24th century, complexity theory has fallen off of the list of things people care about. Which brings me to my next theory:
In the Star Trek Universe, P = NP.
In one sense this would be huge and mostly great news for computer scientists. But it would be a disaster for the efficient (complexity-theoretic) encryption we use on a daily basis. For things like RSA and AES to be truly secure, we require the existence of 'one-way functions'. And those can only exist if P does not equal NP (P != NP).

Fortunately for cryptography, most computer scientists are convinced that P != NP. They just haven't been able to prove it. The most recent attempt was made by Vinay Deolalikar of HP Labs, and his proof foundered on subtleties just like every one before it. This means the problem is still open, and technically could go either way.

If P did turn out to be equal to NP, it's conceivable that the result would look exactly like Star Trek! A few algorithms could still be quite difficult to break (i.e., the attacks would have huge polynomial runtimes). But maybe not. People might instead fall back on obscurity to overcome the mathematical impossibility of building strong complexity-theoretic encryption. One-time pads would still work, of course, and quantum key distribution might allow for point-to-point transmission. Everything else would become a massive joke.
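Just to underline why the one-time pad gets a pass: its security is information-theoretic rather than complexity-theoretic, so it holds up no matter what P equals. Here's a quick Python sketch (my own toy illustration, not production code):

    import os

    def otp_encrypt(message):
        # One-time pad: XOR the message with a truly random key of equal length.
        # Security rests on no hardness assumption at all -- but the key must be
        # as long as the message and must never be reused.
        key = os.urandom(len(message))
        ciphertext = bytes(m ^ k for m, k in zip(message, key))
        return key, ciphertext

    def otp_decrypt(key, ciphertext):
        return bytes(c ^ k for c, k in zip(ciphertext, key))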

Now, this theory still doesn't explain the 'breaking crypto in your head' thing, or why it takes like six hours to change the Enterprise's command codes. But it would go a long way to repairing the damage wrought by years of bad scriptwriting.

Theory #4: The Stallman effect

Live long and publish your source.
This last theory is the most mindbending. It's also not mine (I ripped it off from Chris Long).

To get a fix on it, you first have to think about this Federation we hold so dear. Here we have a society where the cost of making something is simply the marginal cost of replicating a copy. Money isn't necessary, and people are free to devote themselves to activities that are fun, after spending the necessary ten hours a week on required tasks such as legislation, family counseling, robot repair and asteroid prospecting.

Does any of this sound familiar to you? Yes. The Federation was founded on the teachings of Richard M. Stallman.

A society based on the teachings of RMS can't possibly get security right. To such a society, security is simply a tool that prevents you from accessing the full capabilities of your computer replicator. How could we expect serious crypto in a society that worships the legacy of RMS?

A minor problem with this theory is that it doesn't explain why bad cryptography crosses species lines: even the Romulans have terrible encryption. Of course, the Romulans have frigging cloaking devices and still haven't managed to wipe us out. So maybe we can just chalk that one up to incompetence.

In conclusion

I admit that there's only so far you can go with all of this. At a certain point you have to give in and admit that the Trek screenwriters don't know encryption from a Chronoton field. And honestly, what they've done with cryptography is nothing compared to what they've done to physics, electronics, and historical drama.

And please don't get me started on the Holodeck. Can't they just fit that thing with an OFF switch?

Still, if nothing else, this post has given me another forum to bitch about my favorite grievance: bad cryptography in movies and TV. And a chance to remind Hollywood (should any representatives be reading) that I am ready and willing to help you with your cryptographic script writing problems for a very reasonable fee. Just don't expect anyone to do crypto in their head.

Tuesday, July 17, 2012

Indifferentiability

Delicious.
After umpteen weeks writing about broken stuff, I'm thrilled to tell you that for once, nothing awful is happening in the crypto world. It won't last. But while it does, let's grab the opportunity to talk about something constructive.

Now a word of warning: what I'm going to talk about today is fairly wonky and (worse) involves hash functions. If you're not feeling up for this, this is your cue to bail and go read something nice about buffer overflows.

For those still with me, the subject of this post is the design of hash functions, and more specifically: the indifferentiability proofs that designers write to argue for their security. I was surprised to find that most people have never heard of these proofs, and thus have no idea why they're useful. That's too bad, since they're extremely important to the way we analyze hash functions today.

Merkle-Damgård

This is not Ivan Damgård. (Seriously Google?)
The best way to begin any discussion of hash function design is to take a quick peek inside of the hash functions we actually use. Since the most popular hashes today are MD5 (ugh) and SHA, the right place to start is with the 'Merkle-Damgård' paradigm.

To understand Merkle-Damgård, you need to understand that cryptographers love to build complicated things out of simpler components. Under the hood of most block ciphers you'll find S-boxes. Similarly, if you take the lid off a Merkle-Damgård hash function -- surprise! -- you find block ciphers. Or at least, something very much like them.

This approach dates to a 1979 proposal by a young cryptographer named Ralph Merkle. What Merkle showed is a way to build hash functions with a variable-length input, using any fixed-length one-way compression function (a one-way function that spits out fewer bits than it takes in). While Merkle wasn't specific about the function, he suggested that DES might be a good candidate.

Expressed as a colorful diagram, the Merkle construction looks something like this:
Merkle-Damgård construction (source: Wikipedia, because I'm too lazy to draw my own diagrams). IV is a fixed value; f is a one-way compression function.
The beauty of Merkle's proposal is that it's relatively simple to understand. You simply chop your message into blocks, then feed each block into the function f along with the output of the previous function evaluation. Throw in a finalization stage and you're done.
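If you'd like to see the chaining spelled out, here's a toy Python sketch of the idea (my own illustration, not any standardized hash -- and I'm lazily using SHA-256 as a stand-in for the compression function f, where a real design would use a dedicated primitive):

    import hashlib

    BLOCK_SIZE = 32      # bytes per message block (toy parameter)
    IV = b"\x00" * 32    # fixed initialization vector

    def compress(chaining_value, block):
        # Stand-in compression function: maps 64 bytes down to 32.
        return hashlib.sha256(chaining_value + block).digest()

    def pad(message):
        # Merkle-Damgård strengthening: append a 1 bit, zeros, then the length.
        length = len(message).to_bytes(8, "big")
        message += b"\x80"
        while (len(message) + 8) % BLOCK_SIZE != 0:
            message += b"\x00"
        return message + length

    def md_hash(message):
        # Chain the compression function over the padded message blocks.
        state = IV
        padded = pad(message)
        for i in range(0, len(padded), BLOCK_SIZE):
            state = compress(state, padded[i:i + BLOCK_SIZE])
        return state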

Of course there's a difference between proposing a technique and showing that it actually works. It would take ten more years, but at CRYPTO 1989, Merkle and another cryptographer named Ivan Damgård independently submitted formal analyses of Merkle's proposal. What they showed is that as long as the function f has certain ideal properties, the resulting hash function is guaranteed to be collision-resistant.  The rest, as they say, is history.

The popularity of Merkle-Damgård can be attributed in part to its security proof. But it also owes something to some major practical advantages:
  1. You can use any secure block cipher as the function f, with just a few tweaks.
  2. M-D hash functions can be pretty darn fast, again depending on f and how you use it.
  3. M-D hashes allow you to digest 'live' data streams, where you don't know in advance how much data you're going to be hashing. 
Of course, Merkle-Damgård hashes also have serious weaknesses. The most famous is the 'length extension attack', in which an attacker, given only H(M) and the length of some unknown message M, can 'tack on' additional blocks of her own choosing. This issue spells big trouble for people who think that H(key || message) is a good Message Authentication Code.
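Using the toy construction above, the attack is almost embarrassingly short. This sketch (again, just an illustration of the principle, with made-up message contents) forges a valid tag for an extended message without ever learning the key:

    def padding_for(length):
        # The padding bytes md_hash would append to a message of this length.
        return pad(b"\x00" * length)[length:]

    def length_extend(leaked_digest, original_length, suffix):
        # Given only H(secret || msg) and len(secret || msg), compute
        # H(secret || msg || glue || suffix) by resuming the chain from the
        # leaked digest, which is just the hash's internal chaining value.
        glue = padding_for(original_length)
        total_len = original_length + len(glue) + len(suffix)
        tail = suffix + padding_for(total_len)
        state = leaked_digest
        for i in range(0, len(tail), BLOCK_SIZE):
            state = compress(state, tail[i:i + BLOCK_SIZE])
        return state, glue

    # The victim computes a naive MAC: tag = md_hash(secret || message).
    secret  = b"supersecretkey"
    message = b"amount=100&to=alice"
    tag = md_hash(secret + message)

    # The attacker knows only the tag and the length of (secret || message).
    forged_tag, glue = length_extend(tag, len(secret) + len(message), b"&to=eve")
    forged_message = message + glue + b"&to=eve"
    assert md_hash(secret + forged_message) == forged_tag   # the forgery verifies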

What's interesting about the length-extension issue is not that it leads to broken MACs. I mean, that is interesting, and it's why you should use HMAC. But what's really interesting is that this flaw doesn't represent a violation of the collision-resistance guarantee. The two issues are in fact completely orthogonal. And this tells us something fundamental. Namely: collision-resistance is not enough.

Today's implementers do all kinds of crazy things with hash functions, and many of those applications require much more than collision-resistance. To achieve the necessary properties, we first need to figure out what they are. And that requires us to think hard about the following question:

What the heck is a secure hash function?

If you crack a typical security textbook (or visit the Wikipedia page on hash functions), you'll see a long list of things a hash function 'must' accomplish. The list usually starts with these:
  1. Collision resistance. It should be hard to find any pair of messages M1, M2 such that H(M1) == H(M2).
  2. Pre-image resistance. Given only h, it should be hard to find a 'pre-image' M2 such that H(M2) == h.
Now leave aside the technical fact that none of the unkeyed hash functions we use today are 'truly' collision-resistant. Or that the above definition of pre-image resistance implies that I can hash my cat's name ('fluffy') and nobody can invert the hash (note: not true. Go ask LinkedIn if you don't believe me.) The real problem is that these definitions don't cover the things that people actually do with hash functions.
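(In case the LinkedIn jab seems unfair, here's all it takes to 'invert' an unsalted hash of a low-entropy input -- guessing, not math. A throwaway sketch:

    import hashlib

    # A site stores the unsalted SHA-1 of each password, as LinkedIn famously did.
    leaked = hashlib.sha1(b"fluffy").hexdigest()

    # 'Inverting' a low-entropy input is just a matter of guessing and checking.
    for guess in [b"password", b"123456", b"fluffy", b"tribble"]:
        if hashlib.sha1(guess).hexdigest() == leaked:
            print("recovered password:", guess.decode())
            break

Nothing in the formal definition of pre-image resistance is violated here; the definition just doesn't cover inputs drawn from a small set.)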

For example, take the construction of PRNGs. A common PRNG design hashes together large pools of gathered entropy in the hope that the result will be sufficiently uniform for cryptographic work. This is so common that it's probably happening somewhere on your computer right now. And yet, absolutely nothing in the definitions above implies that this technique is safe!* Similar problems exist for key derivation functions, and even for signature schemes like ECDSA which clearly require hash functions that are more than just collision-resistant.
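To make the PRNG example concrete, here's roughly what such a design looks like -- a bare-bones sketch of the general pattern, not any specific standard (a real system would use something like NIST's Hash_DRBG):

    import hashlib

    class ToyHashPRNG:
        # Bare-bones entropy-pool PRNG: fold raw samples into a pool, then
        # derive output by hashing the pool with a counter. Note that nothing
        # in 'collision-resistant' or 'pre-image resistant' promises that these
        # outputs look uniform -- that's exactly the gap in the definitions.

        def __init__(self):
            self.pool = b""
            self.counter = 0

        def add_entropy(self, raw_bytes):
            # Mix in new (possibly biased, low-quality) samples.
            self.pool = hashlib.sha256(self.pool + raw_bytes).digest()

        def get_bytes(self, n):
            out = b""
            while len(out) < n:
                self.counter += 1
                out += hashlib.sha256(self.pool + self.counter.to_bytes(8, "big")).digest()
            return out[:n]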

The more you look into the way that people use hash functions, the more you realize that they really need something that produces 'random-looking' output. Unfortunately, this notion is surprisingly hard to formalize. Hash functions are unkeyed, so they're not pseudo-random functions. What in the world are people asking for?

Random oracles and indifferentiability

The answer, if you dig hard enough, is that people want hash functions to be random oracles.

Random oracles are cryptographers' conception of what an 'ideal' hash function should be. Put succinctly, a random oracle is a perfectly random function that you can evaluate quickly. Random functions are beautiful not just because the output is random-looking (of course), but also because they're automatically collision-resistant and pre-image resistant. In fact, 'behaves like a random oracle' is the only requirement you ever need.

The problem with random functions is that you just can't evaluate them efficiently in the real world: merely writing one down requires exponential storage, so there's no hope of implementing one for real. Moreover, we know of nothing in the 'real' world that can approximate them. When cryptographers try to analyze their schemes with random functions, they have to go off into an imaginary fantasy world that we call the 'random oracle model'.
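If it helps, here's how a random oracle is typically modeled inside a proof: as a function sampled lazily, one answer per fresh query. (A sketch of the bookkeeping only -- the whole point is that you can't actually ship this.)

    import os

    class RandomOracle:
        # Lazily sampled random function: each new query gets an independent,
        # uniformly random answer; repeated queries get the same answer back.
        # The table grows with every query made, which is exactly why nothing
        # like this exists in the real world.

        def __init__(self, output_len=32):
            self.table = {}
            self.output_len = output_len

        def query(self, x):
            if x not in self.table:
                self.table[x] = os.urandom(self.output_len)
            return self.table[x]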

But ok, this post is not to judge. For the moment, let's imagine that we are willing to visit this fantasy world. An obvious question is: what would it take to build a random oracle? If we had a compression function that was good enough -- itself a random function -- could we use a technique like Merkle-Damgård to get the rest of the way?

In 2004, Maurer, Renner and Holenstein gave us a powerful tool for answering this question. What they showed is that it's always possible to replace functionality A (e.g., a random oracle) with another functionality B (e.g., an ideal compression function) provided that the following rules are satisfied:
  1. There exists a way to 'construct' something 'like' A out of B.
  2. There exists a way to 'simulate' something 'like' B using A.
  3. An attacker who interacts with {constructed A-like thing, B} can't tell it apart (i.e., can't differentiate it) from {A, simulated B-like thing}.
The definition of simulation gets a bit wonky, but expressed in simpler language all this means is: if you can show that your hash function, instantiated with an 'ideal' compression function, looks indistinguishable from a random oracle, and you can show that a manufactured compression function, built using a random oracle as an ingredient, looks indistinguishable from an ideal compression function, then you can always replace one with the other. That is, your hash function is 'good enough' to be a random oracle.

The following year, Coron, Dodis, Malinaud and Puniya applied this framework to Merkle-Damgård hash functions. Their first result was immediate: such a proof does not work for Merkle-Damgård.

Of course this shouldn't actually surprise us. We already know that Merkle-Damgård doesn't behave like a random oracle, since random oracles don't exhibit length-extension attacks. Still, it's one thing to know this, and another to see a known problem actually turn up and screw up a proof. So far, no problem.

What Coron et al. showed next is much more interesting:
  • They proved formally that Merkle-Damgård can be made indifferentiable from a random oracle, as long as you apply a prefix-free encoding to the input before hashing it. Prefix-free encodings prevent length extension by ensuring that no encoded message can ever be a prefix of another.
  • Next, they proved the security of HMAC applied to a Merkle-Damgård hash.
  • Finally, and best of all, they showed that if you simply drop some bits from the last output block -- something called a 'chop' construction -- you can make Merkle-Damgård hashes secure with much less work.
The best part of Coron et al.'s findings is that the chop construction is already (inadvertently) in place on SHA384, which is constructed by dropping some output bits from its big-brother hash SHA512. The modern hash variants SHA512/224 and SHA512/256 also have this property.** So this theoretical work already has one big payoff: we know that (under certain assumptions) these hashes may be better than some of the others.
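In terms of the toy construction from earlier, the chop idea is essentially a one-liner (the real SHA variants also tweak internal details, but the truncation is what matters here): dropping part of the output means an attacker never sees the full chaining value, so the length-extension trick sketched above has nothing to resume from.

    def chop_hash(message, chop_bytes=16):
        # Compute the full Merkle-Damgård digest, then throw part of it away.
        # An attacker who sees only the truncated digest is missing part of the
        # internal state needed to continue the chain.
        return md_hash(message)[:-chop_bytes]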

And these results have bigger implications. Now that we know how to do this, we can repeat the process for just about every candidate hash function anyone proposes. This lets us immediately weed out obvious bugs, and avoid standardizing another hash with problems like the length extension attack. This process has become so common that all of the SHA3 candidates now sport exactly such an indifferentiability proof.

Of course, in the real world, indifferentiability only takes you so far. It does tell us something, but it doesn't tell us everything. Sure, if the compression function is perfect, you obtain a strong result about the hash function. But compression functions are never perfect. Real compression functions have glitches and oddities that can make these theoretical results irrelevant. This is why we'll always need smart people to arm wrestle over which hash we get to use next.

In conclusion

If I had it in me, I'd go on to talk about the SHA3 candidates, and the techniques that each uses to achieve security in this model. But this has already been a long post, so that will have to wait for another time.

I want to say only one final thing.

This is a practical blog, and I admit that I try to avoid theory. What fascinates me about this area is that it's a great example of a place where theory has directly come to the aid of practice. You may think of hash functions as whizzing little black boxes of ad-hoc machinery, and to some extent they are. But without theoretical analysis like this, they'd be a whole lot more ad-hoc. They might not even work.

Remember this when NIST finally gets around to picking Keccak or BLAKE.

Notes:

* For a ridiculous example, imagine that you have a secure (collision-resistant, pre-image resistant) hash function H. Now construct a new hash function H' such that H'(M) = {"long string of 0s" || H(M)}. This function is as collision-resistant as the original, but won't be very useful if you're generating keys with it.
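In code, that ridiculous example might look like this (with SHA-256 standing in for the 'secure' hash H):

    import hashlib

    def H_prime(message):
        # Any collision in H_prime is a collision in SHA-256, so it's just as
        # collision-resistant -- but the first 32 bytes of every output are
        # fixed, which is a disaster if you're using the output as key material.
        return b"\x00" * 32 + hashlib.sha256(message).digest()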

** Thanks to Paulo Barreto for fixing numerous typos and pointing out that SHA512/256 and /224 make excellent candidates for chop hashes!