Liveblogging WWII: December 12, 1941

In mid-December 1941, Driscoll finally sent the British some information on her special method, with only cursory answers to the few questions Denniston had posed four months before. Driscoll again declared her faith in her approach, but GCCS concluded that it “apparently failed.” For one thing, it could not, as she had claimed, overcome the problem of turnover — that is, the tumbling of the Enigma’s wheels before enough letters had been enciphered to identify the wheel being used. And as Turing pointed out to her in a letter in October 1941, her method would take seventy-two thousand hours — more than eight years — to find a solution. Given Driscoll’s obstinacy, Bletchley Park began to have second thoughts about providing more technical information.

As luck would have it, an apparent mix-up in the mail delivery between OP20G and Bletchley Park soon brought the simmering distrust and jealousies between the two agencies flaring to the surface. Denniston’s early October dispatch — a bag of materials containing detailed answers to all but one of Driscoll’s questions — never reached OP20G, the Navy claimed. It didn’t take long for Safford, who feared the British were breaking their promises, to push Leigh Noyes into firing off a series of complaints to the British. Through November and December 1941, angry memos and accusations flew across the Atlantic. Noyes didn’t mince his words: Britain had broken its promise to OP20G; America had no use for the Bombe; and if GCCS cooperated, Driscoll could have her method working on real problems. …

Noyes, who was unaware of the complexities of the mail mix-up, continued to fire off angry memos to the British, some of them clearly threatening. The U.S. Navy, he said, had never agreed to confine itself to Enigma research. It had always intended to be “operational” — that is, intercepting and decoding messages on its own. He told Hastings that all the Navy wanted from the British was the information on the Enigma and the codebooks and Enigma machine that Safford and Driscoll had requested.

Then, belying later histories of GCCS and OP20G relations, Noyes apologized to the British, twice. On December 10 and again on the twelfth, he declared that British explanations and actions since his outbursts had satisfied him and “everyone” at OP20G. The missing package, of course, was found. On December 13, Bletchley received a cryptic yet pointed message from someone in the U.S. Navy Department: “Luke Chapter 15, v 9: And she found it. She calleth together her friends and neighbors saying: Rejoice with me for I have found the piece which we lost”.

— Jim DeBrosse and Colin Burke, The Secret in Building 26

Is there an Enigma bubble?

First it was .com stocks, then it was housing. Now it’s WWII-era German Enigma machines. From a recent CNN story:

An Enigma machine which featured in a Hollywood movie about the codebreakers of World War II has smashed auction estimates and sold for a world record price. 

The encoding device sparked a three-way bidding war when it went under the hammer at Christie’s in London Thursday, selling for £133,250 ($208,137) — more than double the upper estimate of £50,000. 

Christie’s said the previous record for an Enigma machine was £67,250, at the same auction house, in November 2010.

I for one would love to own an Enigma. But unless it’ll lead me to a cache of buried Nazi gold I have to draw the line at $100,000. It’s not like the Enigma algorithm is getting better.

But lack of funding doesn’t mean you shouldn’t be creative.

When I worked at AT&T I was told an (apocryphal?) story about a noted cryptographer who couldn’t afford to purchase an Enigma for himself, so he set out instead to blackmail one out of the NSA. Allegedly it took him only four conference submissions before they gave in. The last paper described how to attack a significant cryptosystem with paper and pencil.

This sounds so improbable that I can’t believe it really happened — which means that it probably did. If anyone knows the story and has a source for it, please drop me a line.

Matt Green smackdown watch (Are AEAD modes more vulnerable to side-channel attacks?)

Apropos of my last post, Colin Percival tweets:

I still think encrypt+MAC trumps AEAD because of side channel attacks on block ciphers.

AEAD stands for Authenticated Encryption with Associated Data, and it describes several new modes of operation that perform encryption and authentication all in one go, using a block cipher and a single key, rather than a separate MAC.* In my last post I recommended using one of these things, mostly because it’s simpler.

Colin thinks you’re better off using traditional encryption plus a separate hash-based MAC, e.g., HMAC (using a separate key), rather than one of these fancy new modes. This is because, at least in theory, using a block cipher for authentication could make you more vulnerable to side channel attacks on the block cipher.**

This tweet is followed by some back and forth, which becomes amusing when I fail to read a simple diagram and make a fool of myself. Also, I step on a rake.

My foolishness aside, this point deserves some unpacking. Let’s grant — without comment — the proposition that block-ciphers are vulnerable to side-channel analysis and HMAC-SHAx isn’t. Or at very least, if it is vulnerable, we’re only going to expose the MAC key, and we don’t care about that.***

Let’s also grant that our implementation is free of better vulnerabilities, and that side-channel attacks on block ciphers are actually where our attacker’s going to hit us.

So now we’re talking about three separate questions:

    1. Will using Encrypt + MAC (vs an AEAD) protect you from side-channel attacks on encryption? Trivial answer: no. You’re using the block cipher to encrypt, so who cares what authentication you perform afterwards.

But ok, maybe your side-channel attack requires the device to encrypt known, or chosen plaintexts, and it won’t do that for you. So you need something stronger.

    2. Will using Encrypt + MAC (vs an AEAD) save you from side-channel attacks that work against the decryption of arbitrary ciphertexts? Answer: again, probably not. If you can get your hands on some legit ciphertexts (replays, for example), you can probably exercise the block cipher. At least in theory, this should be enough to implement your attack.
    3. Will using Encrypt + MAC (vs an AEAD) save you from side-channel attacks that require decryption of known, or chosen ciphertexts? Answer: this may be the case where the means of authentication really matters.

So let’s drill into case (3) a bit. If you’re using Encrypt + MAC with a hash-based MAC (and a separate key), then the block cipher only comes into play for legitimate messages. Your decryption process simply terminates when it encounters an invalid MAC. This should prevent you from submitting chosen or mauled ciphertexts — you’re limited to whatever legit ciphertexts you can get your hands on.

On the other hand, if you’re using an AEAD mode of operation — which typically uses a single key for both authentication and encryption — then technically your block cipher (and key) come into play for every ciphertext received, even the invalid ones.

Alright, let’s dig into the modes a bit to see what kinds of encipherment/decipherment actually happens when you submit a ciphertext (and its IV, associated data) to be decrypted:

  • GCM mode: at very minimum, the decryptor will encipher the supplied IV. Technically, this is the only encipherment that needs to happen in order to check that the authentication tag is valid.**** If it’s not valid, GCM decryption can reject the ciphertext before going further.

Since the IV is typically adversarially-selected (in part), this gives the adversary a chance to exercise the encipherment mode of the cipher on a single partially-chosen block. One block doesn’t sound bad — however, he may be able to submit many such chosen ciphertexts.

  • CCM mode: CCM is a combination of CTR-mode encryption with CBC-MAC. Since the MAC is computed over the plaintext, the CCM decryptor can’t verify (and hence reject a bad ciphertext) until after he’s completely decrypted it. Decryption is actually encipherment of a set of known counter values, the first of which is (partly) chosen by the adversary.
  • OCB mode: Difficult to say. Like CCM, it decrypts before testing the MAC. However, this is complicated by the fact that the cipher is ‘tweaked’ in a DES-X-type construction. And honestly, nobody uses it anyway.
  • EAX mode: the good news is that the MAC is computed on the ciphertext, so the decryptor shouldn’t decrypt the ciphertext unless the MAC is valid. The bad news is that the MAC is a CBC-style MAC (OMAC), using the same key as for decryption, so authentication will encipher at least some chosen values.

So where are we? Pretty far down the rabbit hole.

I’m going to go with the following: if you’re deeply afraid of side-channel attacks on your block cipher, you might feel marginally better with Encrypt + MAC if your application is definitely not vulnerable to cases (1) and (2) above and you’re positive that HMAC-SHAx isn’t vulnerable to side-channel attacks. Otherwise I’d use an AEAD just to make my life simpler.

But I’m not passing judgement on this. It’s a new perspective to me, and I’m genuinely curious to see what others have to say. Can people recommend any practical attacks, papers, opinions on this subject?

Notes:

* Includes GCM, CWC, OCB, EAX and CCM modes.

** Attacks like the recent attack on the Mifare DESFire, or timing/cache timing attacks on AES.

*** Of course, if you do expose the MAC key through one side-channel attack, then you might be able to do something more sophisticated against the encryption algorithm. But my head is already hurting too much.

**** Technically, checking GHASH also requires you to encipher the 0 message under the cipher key, but that can be done beforehand. It doesn’t need to happen each time you decrypt.

How (not) to use symmetric encryption

This is supposed to be a blog about cryptographic engineering. That means crypto, but it also means software engineering, protocol design, HCI and hardware engineering (fair warning: my hardware experience mostly consists of solder burns).

And yet, in describing the attacks of the past few weeks, I’ve mostly been talking about basic symmetric encryption. This seems to be an area that people get wrong, no matter how straightforward it sounds.

So at the risk of boring my readership to tears — I’m going to spend the next two posts talking about the dos and don’ts of symmetric encryption, particularly symmetric encryption with block ciphers. I may also branch out a little into key derivation and management, but I know you’ll be understanding.

I realize that for some of you this may not make for scintillating reading, but here’s my thinking: we do it once now, we’ll never have to discuss it again. Right?

Excellent. In this first post I’m going to start with the don’ts.

Symmetric Encryption Don’t #1: Don’t encrypt with ECB mode

Block ciphers are designed to process discrete chunks of data. For example, AES works on 128-bit blocks. To encrypt longer messages with them, we use one of several “modes of operation”. These modes are not all created equal.

(Tux image: Larry Ewing. I will not thank the GIMP, no matter what his license says.)

ECB (Electronic Codebook) mode is by far the stupidest. Essentially you’re just chopping the message up into blocks, then using the raw block cipher to encipher each block individually. There are two problems with ECB mode:

  1. It’s not randomized. This means that anytime you encrypt a given message under a key, you’ll get exactly the same ciphertext. This goes for substrings of the message as well.
  2. It treats each block of the message independently. As a result, some of the structure of the message can leak through. This includes things like padding, which will produce predictable patterns in the ciphertext.

The first point can be a problem in some circumstances. Imagine, for example, that you’re sending a relatively small number of messages (e.g., commands for a remote system). Every time you send a given command, you’re sending exactly the same ciphertext. This gets obvious pretty fast.
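That determinism is easy to see in code. Here’s a sketch using a toy four-round Feistel network as a stand-in for a real block cipher (it’s hypothetical and completely insecure; it just has the right shape): under ECB, identical plaintext blocks yield identical ciphertext blocks.

```python
import hashlib

def _round(half, key, i):
    # Toy round function: hash of key, round number, and half-block
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]

def toy_block_encrypt(block, key):
    # Tiny 4-round Feistel network over a 16-byte block (illustrative only)
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, bytes(a ^ b for a, b in zip(left, _round(right, key, i)))
    return left + right

def ecb_encrypt(msg, key):
    # ECB: each 16-byte block enciphered independently; no IV, no chaining
    assert len(msg) % 16 == 0
    return b"".join(toy_block_encrypt(msg[i:i + 16], key)
                    for i in range(0, len(msg), 16))

key = b"demo key"
msg = b"ATTACK AT DAWN!!" * 3     # three identical 16-byte blocks
ct = ecb_encrypt(msg, key)
blocks = [ct[i:i + 16] for i in range(0, len(ct), 16)]
assert blocks[0] == blocks[1] == blocks[2]   # the repetition leaks right through
```

Swap in AES and the property is exactly the same: the repetition in the plaintext shows up verbatim in the ciphertext.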

I would say that the second problem is the more serious one. Perhaps you’ve seen Wikipedia’s classic image of an ECB-mode encrypted TIFF file (right). But probably this seemed a little contrived to you — after all, who uses TIFF files anymore?

So allow me to give my favorite example of why ECB mode is problematic. This image comes from a software packaging system that used ECB mode to encrypt executable files. All I’ve done is open one of those encrypted files as a raw bitmap image. You’ll have to squint a little.

An executable file encrypted using ECB mode. Click to see a larger version.

This doesn’t give away the contents of the executable, but it gives a pretty good picture of where different segments are. Just look for the funny patterns and tire tracks. Just having this little bit of information might give you a nice head start on finding those segments when they’re in memory, which is potentially what you’re going to do next.

Symmetric Encryption Don’t #2: Don’t re-use your IVs

Every block cipher mode of operation except for ECB (which you shouldn’t use!) employs a special per-message nonce called an Initialization Vector, or IV. The basic purpose of an IV is to ensure that the encryption function works differently every time; it adds an element of randomness, or at least unpredictability to your ciphertexts.

Unfortunately, developers seem genuinely stumped by IVs. Maybe this isn’t their fault. Every mode of operation has slightly different rules about how to pick IVs, and a slightly different set of consequences for when you screw it up.

So let’s start with something simple. No matter what mode you use, this kind of thing is never ok:

This chunk of bad advice comes from an ancient (and hopefully obsolete) version of the AACS specification. But it’s hardly the exception. Grep for “IV” in the source repositories of just about any major software house, and I guarantee you’ll find plenty of stuff like this.

Why is this a problem? Let me count the ways:

  1. At a minimum, it eliminates any random behavior in the encryption scheme. With a fixed IV, a given message will always encrypt to the same ciphertext (if you’re using the same key). This goes for two messages that are the same up to a certain point. See my discussion of ECB mode above for why this can be a problem.
  2. If you’re using a stream cipher mode of operation like CTR or OFB, it’s a disaster. If you encrypt two different messages with the same key, and the IV is fixed, then an attacker can XOR two ciphertexts together. This will give them the XOR of the two underlying plaintexts. (Think this will be useless? I doubt it, especially if they’re clever.)

    By the way, this kind of thing also happens when people forget to change the IV when encrypting multiple versions of a file. Don’t do that either.

  3. If you’re using a chaining mode like CBC, use of a fixed IV can still lead to plaintext recovery. See, for example, this chosen plaintext attack on TLS, which only requires the adversary know which IV is being used. This type of attack is pretty tricky to implement, but it’s definitely possible.
  4. It will make you look stupid and embarrass you when a professional audits your code.
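The stream-cipher disaster in point 2 is worth seeing concretely. This sketch fakes a CTR-style keystream with a hash (a hypothetical stand-in, not a real cipher); with a fixed IV the keystreams cancel, handing the attacker the XOR of the two plaintexts.

```python
import hashlib

def keystream(key, iv, n):
    # Toy CTR-style keystream: hash(key || iv || counter), illustrative only
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def ctr_encrypt(key, iv, plaintext):
    return bytes(a ^ b for a, b in zip(plaintext, keystream(key, iv, len(plaintext))))

key = b"k" * 16
iv = b"\x00" * 16                              # the mistake: a fixed IV
p1 = b"transfer $100 to alice"
p2 = b"transfer $999 to mallo"
c1 = ctr_encrypt(key, iv, p1)
c2 = ctr_encrypt(key, iv, p2)

# The attacker XORs the two ciphertexts; the shared keystream cancels out.
xor_of_ciphertexts = bytes(a ^ b for a, b in zip(c1, c2))
xor_of_plaintexts = bytes(a ^ b for a, b in zip(p1, p2))
assert xor_of_ciphertexts == xor_of_plaintexts
```

From the XOR of two plaintexts, known structure (headers, protocol fields, common words) lets an attacker peel out both messages.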

Clearly some of these issues are application-specific. Maybe you don’t think anyone will be able to leverage them. You might even be right — in this version of the application. But sooner or later, you or a future developer will bolt on new features, deploy it in the cloud, or make it into a browser applet. When they do, all of these issues will magically go from theoretically vulnerable to stupidly vulnerable.

And people will blame you.

So how do you use IVs correctly? I’ll talk about this more in my next post. But if you’re really chomping at the bit, my advice is to take a look at the FIPS specification for block cipher modes. (I must warn you, however: please don’t operate heavy machinery while reading these documents.)

Symmetric Encryption Don’t #3: Don’t encrypt your IVs

So you’ve generated your IV correctly, you’ve used it correctly, but now you’re hung up on a final question: what do I do with this darn thing?As I’ve said, IVs make people nervous. People know they’re not keys, but they’re not ciphertexts either. They wonder: is this an important value? Should I just send it over the wire as it is? Hmm, just to be safe, I’d better encrypt it. Even if I’m wrong, it can’t hurt.

As this Reddit commenter can attest, what you don’t know can hurt you.

Here’s a simple rule of thumb. IVs are not keys. They’re not secret. If you’ve chosen the IV correctly, you can send it along with your ciphertext in the clear. You should authenticate it (see below), but you should never encrypt it.

The worst thing you can do is encrypt your IVs using the same key that you’re using to encrypt messages. The absolute worst example is when you’re using CTR mode encryption, and you make the mistake of encrypting your IV using ECB mode. When you do this, anyone can XOR the first block of ciphertext with the encrypted IV, and obtain the plaintext of that block.
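Here’s why that CTR-plus-ECB-encrypted-IV combination fails, sketched with a hypothetical toy Feistel cipher standing in for the real block cipher: the ECB-encrypted IV is exactly the first CTR keystream block.

```python
import hashlib

def _round(half, key, i):
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]

def toy_encipher(block, key):
    # Toy 4-round Feistel over a 16-byte block, standing in for AES (NOT secure)
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, bytes(a ^ b for a, b in zip(left, _round(right, key, i)))
    return left + right

key = b"sixteen byte key"
iv = b"\x07" * 16
p0 = b"SECRET BLOCK ONE"                     # first 16-byte plaintext block

# CTR mode: the first keystream block is the encipherment of the IV/counter.
c0 = bytes(a ^ b for a, b in zip(p0, toy_encipher(iv, key)))

# The mistake: "protecting" the IV by ECB-encrypting it under the SAME key
# and sending that along with the ciphertext.
encrypted_iv = toy_encipher(iv, key)

# Anyone on the wire XORs the two values and gets the first block for free:
recovered = bytes(a ^ b for a, b in zip(c0, encrypted_iv))
assert recovered == p0
```

No key material is touched; the “extra protection” is literally the one-time pad for the first block.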

These problems aren’t limited to CTR. My advice: have faith in your IVs, and they’ll have faith in you.

Symmetric Encryption Don’t #4: Don’t forget to authenticate your ciphertexts

Once upon a time cryptographers looked at encryption and authentication as two unrelated operations. Encryption was for protecting the confidentiality of your data, and authentication was used to keep people from tampering with it.*

Nowadays we know that the two are much more tightly linked. You may not care about people tampering with your data, but your encryption scheme might well. The problem is active attacks. See here and here for just a couple of examples.

To make a long story short, many of the clever attacks on symmetric encryption schemes these days require an attacker to tamper with ciphertexts, then submit them to be decrypted. Even if the decryptor leaks just a tiny bit of information (e.g., is the padding correctly formatted, is the message properly formatted), that can be enough to gradually peel away the encryption and recover the plaintext.

Obviously you don’t want this.

The very elegant solution is to authenticate your ciphertexts, and not just in any willy-nilly fashion. There are basically two approaches that won’t lead to heartburn down the road:

  1. Best: use an authenticated mode of operation, such as GCM, CCM, OCB or EAX. These modes handle encryption and authentication in one go (and they can even authenticate some optional unencrypted ‘associated’ data for you). Best yet, they use a single key.
  2. Almost as good: first encrypt your message using a secure mode of operation (say, CTR), then compute a Message Authentication Code (e.g., HMAC-SHA1) on the resulting ciphertext and its IV. Use two totally different keys to do this. And please, don’t forget to MAC the darn IV!

What you should not do is apply the MAC to the plaintext. First of all, this won’t necessarily prevent active attacks. Padding oracle attacks, for example, can still be leveraged against a scheme that authenticates the message (but not the padding). Furthermore, even if you MAC the padding, there’s still a slim possibility of timing attacks against your implementation.

Symmetric Encryption Don’t #5: Don’t CBC-encrypt in real time

Let me use this space to reiterate that there’s nothing wrong with CBC mode, provided that you use it correctly. The unfortunate thing about CBC mode is that there are many ways to use it incorrectly.

Knowing this, you shouldn’t be surprised to hear that CBC is the most popular mode of operation.

This ‘don’t’ is really a variant of point #2 above. CBC mode can be insecure when an attacker has the ability to submit chosen plaintexts to be encrypted, and if the encryption is on a live stream of data where the adversary can see the ciphertext blocks immediately after they come out (this is called ‘online’ encryption). This is because the adversary may learn the encryption of the previous plaintext block he submitted, which can allow him to craft the next plaintext block in a useful way.

If he can do this, he might be able to take some other ciphertext that he’s intercepted, maul it, and feed it through the encryptor. This kind of attack is challenging, but given the right circumstances it’s possible to decrypt the original message. This attack is called a blockwise chosen plaintext attack, and it’s essentially what the BEAST attack does.

Symmetric Encryption Don’t #6: Don’t share a single key across many devices

A wise man once said that a secret is something you tell one other person. I’m not sure he realized it, but what he was saying is this: don’t put the same symmetric key into a large number of devices (or software instances, etc.) if you want it to remain secret.

About the fastest way to lose security in a system is to spread a single key broadly and widely. It doesn’t matter if you’ve embedded that key inside of a tamper-resistant chip, buried in a block of solid concrete, and/or placed it in a locked file cabinet with a sign saying ‘beware of leopard‘.

The probability of your system being compromised goes up exponentially with each additional copy of that key. If you’re doing this in your current design, think hard about not doing it.
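A back-of-the-envelope model makes the point. Assume (hypothetically) that each deployed copy of the key is independently extracted with some small probability p; then the chance the shared key survives decays geometrically with the number of copies:

```python
def p_key_compromised(p, n):
    # Probability that at least one of n independent copies is extracted,
    # i.e., 1 minus the probability that every single copy survives.
    return 1 - (1 - p) ** n

# Even a 1% per-device extraction risk is near-certain doom at scale:
assert abs(p_key_compromised(0.01, 1) - 0.01) < 1e-12
assert p_key_compromised(0.01, 100) > 0.6
assert p_key_compromised(0.01, 1000) > 0.99
```

Real devices aren’t independent (one published extraction technique hits them all at once), which only makes the picture worse.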

Symmetric Encryption Don’t #7: Don’t pluralize your keys using XOR


This one is really a flavor of #6, but a more subtle and stupid one.

Key ‘pluralization’ refers to a process where you obtain multiple distinct keys from a single master key, or ‘seed’. Usually this is done using some strong cryptographic function, for example, a pseudo-random function.

This happens all over the place. For example: TLS does it to derive separate MAC and encryption keys from a master secret. But an extreme type of pluralization often occurs in large-scale systems that provide unique keys to a large number of users.

Think of a cellular provider distributing SIM cards, for example. A provider could generate millions of random authentication keys, distribute them to individual SIM cards, and then store the whole package in a back-end database. This works fine, but they’d have to store this database and do a lookup every time someone contacts them to authorize a phone call.

Alternatively, they could start with one short master seed, then pluralize to derive each of the SIM keys on demand. This process would take as input the seed plus some auxiliary value (like the subscriber ID), and would output a key for that user. The advantage is that you no longer need to keep a huge database around — just the tiny initial seed.

This approach is so tempting that sometimes people forget about the ‘strong cryptographic function’ part, and they derive their keys using tools that aren’t so secure. For example, they might just XOR the master seed with the subscriber or device ID.

No, you say, nobody could be that dumb. And yet KeeLoq was. Millions of car keys were provisioned with cryptographic keys that were generated this way. It turns out that if you can extract any one of those per-car keys, and if you know the device’s serial number, you can easily recover the master key — which means you can break into any other car.
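The KeeLoq-style failure is one line of algebra. Here’s a sketch contrasting XOR “derivation” with a PRF-based one (the master key and serial format are made up for illustration): XOR is its own inverse, so one leaked per-device key plus a public serial number gives back the master.

```python
import hashlib
import hmac

MASTER = bytes.fromhex("00112233445566778899aabbccddeeff")

def bad_derive(master, device_id):
    # XOR "pluralization": trivially invertible
    return bytes(a ^ b for a, b in zip(master, device_id))

def good_derive(master, device_id):
    # PRF-based derivation: one leaked key reveals nothing about the master
    return hmac.new(master, b"device-key:" + device_id, hashlib.sha256).digest()[:16]

device_id = (1234).to_bytes(16, "big")      # a public serial number
leaked_key = bad_derive(MASTER, device_id)  # attacker extracts one device key

# One XOR with the known serial and the master key falls out:
assert bytes(a ^ b for a, b in zip(leaked_key, device_id)) == MASTER

# The PRF-derived key doesn't unravel this way:
assert bytes(a ^ b for a, b in zip(good_derive(MASTER, device_id), device_id)) != MASTER
```

With the PRF version, inverting the derivation means breaking HMAC, not undoing an XOR.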

Symmetric Encryption Don’t #8: Don’t use DES or RC4 or @#(*&@!

Ok, I said this was mostly going to be about block ciphers. DES fits that category, and I hope you know why not to use it. But RC4 also deserves a special mention just for being the world’s most popular dubious stream cipher.

RC4 shows up everywhere. It shows up in products. It shows up in malware. Basically, it shows up anywhere someone needed crypto, but was too lazy to download a copy of AES. Hell, I’ve used it myself — um, for personal reasons, not for work, mind you.

If you use RC4 correctly, it’s probably ok. For now. The problem is twofold. First of all, cryptanalysts are slowly chipping away at it — sooner or later they’re going to make some serious progress.

The second problem is that it’s not always used securely. Why not? You might as well ask why meth labs explode at a disproportionate rate. My guess is that the set of people who use caution when mixing volatile chemicals simply doesn’t overlap well with the set of people who cook methamphetamine. Ditto RC4 and proper usage.

I could waste a lot of time going on about all of this, but instead I’m just going to quote Thomas Ptacek:

if you see a bespoke crypto design, and it dates from after 2000, and it uses RC4, that’s an audit flag.

Now if Thomas says this about RC4, what do you think he’s going to say about your homebrew protocol based on the Russian GOST cipher? That’s right: nothing nice. Don’t let it happen to you.

In Summary

So this is where I leave you. I doubt this list is complete — I’ll try to update it if I think of anything else. At the very least, if we could fix these issues it would knock out a healthy chunk of the silly crypto issues I see on a day to day basis.

Oh, and a pony too.

Ok, so, I’m a little skeptical that all of these problems will go away that easily, but I’d be content with even just one or two of the points. So if you’re designing a new crypto product and could spare a minute just to glance at the above, you would certainly make my day.

Notes:

* Honestly, there was even a certain amount of confusion on this point. If you look at old protocols like Needham-Schroeder, you’ll see that they basically treat encryption as authentication. Don’t do this. Most common modes of operation are malleable, meaning that you can mess with the ciphertext, cut and paste different ones, etc.

Academic vs. commercial cryptographers

Luther Martin has a post on the difference between academic and commercial cryptographers. You should read the whole thing, but I wanted to add my $.02 on this part:

I don’t have any firsthand experience with this, but I’ve heard stories of how people in the academic world who try to approach cryptography from the point of view of commercial cryptographers also encounter problems. The places where they work typically put more value on inventing new things, so practical implementations are often considered not as good as less practical, yet new, inventions.

I do have some firsthand experience with this. And Luther’s basically right.

I’m fortunate to have a foot in both the commercial and academic worlds. On the one hand, this means that I get to spend my days working with real products, which is fascinating because, well, it’s fascinating. And it’s relevant.

Unfortunately, from a technological point of view, commercial crypto doesn’t exactly set the world on fire. Once in a while you get to work with a company which is doing something interesting, like Voltage or PGP. But for the most part you’re playing with the same basic set of tools.

Therefore, when I have a chance to do research, I tend to gravitate to the purely academic. This includes protocols that enhance user privacy — stuff like this.

I will cheerfully admit that there’s about a 1% chance that any of my academic work will be deployed in this decade. And I’m ok with that! I enjoy solving problems, and I like that in crypto research, at least, we’re not slaves to the immediate practical.

But maybe as academics, we take this too far.

I advise some grad students, and one of my sad duties is to inculcate them with this understanding: you can do whatever you want in your career, but if you want to get published, the absolute worst thing is to be too practical. Don’t kill yourself implementing some cryptosystem that’s practical and deployable, unless there’s something extremely sexy and new in it. And if that’s the case, try not to waste all that time implementing it in the first place! The reviewers (mostly) don’t care.

This is problematic, since in my opinion there’s a huge gap between commercial work and academic crypto. This includes a big category of technologies that we need in order to make (secure) crypto easier to deploy. But none of the incentives are there to support this kind of research.

Despite this, I’m trying to shift some of my work in that direction. This means a whole lot of time-consuming work writing tools like this one. Building this kind of tool is a pre-requisite to doing real research. Unfortunately it requires a lot of scut work, and that’s not going to get anyone a ton of sexy research publications. Still someone needs to do it.

I’m not really sure what to do about this, and I sure hope it changes at some point.

Human error is something to be engineered around, not lamented

Random thought of the day, apropos of this comment by Jon Callas:

We know that the attack against EMC/RSA and SecureID was done with a vuln in a Flash attachment embedded in an Excel spreadsheet. According to the best news I have heard, the Patient Zero of that attack had had the infected file identified as bad! They pulled it out of the spam folder and opened it anyway. That attack happened because of a security failure on the device that sits between the keyboard and chair, not for any technology of any sort.

Quite frankly, if this is what qualifies as human error in a security system, then we’re all in deep trouble. We’re stuck with it. We’re born to it.

I’ll assume one of two things happened here:

  1. An AV scanning system identified a known signature inside of an attachment, recognized that this could be an exploit, and responded to this very serious issue by moving the file into the SPAM folder, where it joined many other legitimate messages that were improperly marked as spam.
  2. A Spam filter noticed something funny about a header, and moved the file into the SPAM folder, something it probably does eight times per week for no reason at all.

Unless your users are superhuman, the problem here is not the user. It’s the system. If the file legitimately contained a vulnerability, it shouldn’t have been moved into the spam folder where it could easily be mistaken for a random false positive.

If, on the other hand, the problem was just something to do with the headers, then maybe the user was just doing what was normal — pulling a probable false positive out of their spam folder, just like they did every day.

People are not superhuman. They react to the inputs you give them: GIGO applies. If security systems give people crap inputs, then they’ll make crap decisions. Fixing this problem is our job. We don’t get to complain every time a user does something perfectly understandable in response to bad data that we (security system designers) give them.

And of course, this leaves aside the basic fact that the master seed was available to this attack in the first place, something that boggles the mind… But I guess that’s all been said.

Non-governmental crypto attacks

Over on Web 1.0, Steve Bellovin is asking an interesting question:

Does anyone know of any (verifiable) examples of non-government enemies exploiting flaws in cryptography?  I’m looking for real-world attacks on short key lengths, bad ciphers, faulty protocols, etc., by parties other than governments and militaries.  I’m not interested in academic attacks — I want to be able to give real-world advice — nor am I looking for yet another long thread on the evils and frailties of PKI.

The responses vary from the useful to the not-so-useful, occasionally punctuated by an all-out flamewar — pretty much par for the course in these things.

Here are a few of the responses that sound pretty reasonable. They’re (mostly) not mine, and I’ve tried to give credit where it’s due:

  1. Cases of breached databases where the passwords were hashed and maybe salted, but with an insufficient work factor enabling dictionary attacks.*
  2. NTLMv1/MSCHAPv1 dictionary attacks.*
  3. NTLMv2/MSCHAPv2 credentials forwarding/reflection attacks.*
  4. The fail0verflow break of poorly-nonced ECDSA as used in the Sony PlayStation 3.*
  5. DeCSS.*
  6. Various AACS reverse-engineering efforts.
  7. The HDCP master key leak.*
  8. Various attacks on pay satellite TV services.****
  9. GSM decryption, which seems to have gone beyond the academic and into commercial products.
  10. Factoring of the Texas Instruments 512-bit firmware signing key for calculators, and Elcomsoft’s factoring of the Quicken backup key.**
  11. Key recovery in WEP.
  12. Exploits on game consoles: the original XBox,*** Wii software signing.
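To make the first item on the list concrete, here’s a toy sketch of why an unsalted, fast hash with no work factor is so friendly to dictionary attacks. The hashes and passwords below are invented for illustration; they don’t come from any real breach:

```python
import hashlib

# Hypothetical leaked table of unsalted, fast password hashes (MD5 here).
# The passwords are invented; this models item 1 above, not a real breach.
leaked = {
    hashlib.md5(b"hunter2").hexdigest(),
    hashlib.md5(b"letmein").hexdigest(),
}

# With no salt and no work factor, a single cheap pass over a wordlist
# tests every leaked hash at once. Slow, salted hashes (bcrypt, scrypt)
# make each guess expensive and force per-user cracking effort.
wordlist = [b"password", b"hunter2", b"qwerty", b"letmein"]
cracked = [w for w in wordlist if hashlib.md5(w).hexdigest() in leaked]
```

Scale the wordlist up to a few billion entries and a modern GPU, and you can see why an insufficient work factor turns a breached database into a pile of plaintext passwords.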

There’s also some debate about recent claims that 512-bit RSA certificate signing keys were factored and used to sign malware. As much as I’d like to believe this, the evidence isn’t too solid. Some posters claim that there were also 1024-bit keys used in these attacks. If that’s true, it points more to key theft (aka Steve’s ‘evils and frailties of PKI’).

You’ll also notice I’m leaving lots of stuff off of this list, only because I don’t know of any specific attacks based on it. That would include all the padding oracle attacks of late, the BEAST attack on TLS, bad Debian keys, and so on.

So what’s the takeaway from all of this? Well, it’s complicated. A quick glance at the list is enough to tell us that there are plenty of ‘real people’ (aka non-professional cryptographers) out there with the skills to exploit subtle crypto flaws. That definitely supports my view that proper crypto implementation is important, and that your code will be exploited if you screw it up.

Some people may take comfort from the fact that there’s no crypto ‘Pearl Harbor’ on this list, i.e., the cryptographic equivalent of a Conficker or Stuxnet. I would say: don’t get too cocky. Sure, software security is a mess, and it’s a whole lot easier to set up a dumb fuzzer than to implement sophisticated crypto exploits. (No offense to dumb fuzzers — I’m friends with several.)

But on the other hand, maybe this is misleading. We mostly learn about software 0days from mass malware, which is relatively easy to catch. If sophisticated crypto exploits are being implemented, I would guess that they’re not going into retail worms and trojans — they’re being very quietly applied against high-value targets. Banking systems, for example.

But again, this is just speculation. What do you think?

Notes:

* Marsh Ray.

** Solar Designer.

*** Tom Ritter.

**** commenter “Swiss Made”, below.

Bram Cohen corrects me?

Bram Cohen has responded to my criticism of his comments on Practical Cryptography. For those who didn’t read my original post, here are my comments in a nutshell:

  1. No, you shouldn’t use RSA with public exponent 2 (i.e., Rabin-Williams). You’re way too likely to screw it up.
  2. No, you should not authenticate (MAC) before you encrypt. You should encrypt first, then you should add a MAC on the ciphertext. Or better yet, use a standard authenticated encryption scheme.
  3. You’re almost always better off using a standard encryption scheme that’s already been implemented and tested, no matter how much ‘better’ an alternative might sound.

You can see Bram’s response in the comments section, down at the bottom of the original post. I’ll try to hit the high points here without doing him an injustice:

Bram: Matthew, first of all, you’re being extremely dismissive of any practical concerns here. Practical crypto deployments bottleneck on the server doing modular exponentiations, and if you use the standard exponent over plain old 2, then you need *five times* as much computing power, which frequently translates into five times as many real machines consuming five times the power in five times as large of a colo space. 

My response? Fair enough. Rabin encryption is a little bit faster. If you’re writing an application that needs to perform enormous amounts of encryption, there’s a colorable argument that Rabin might be a good choice. But as Bram points out in his own response, if that’s the case, you’re probably better off using Elliptic Curve cryptography.

In general, I’m leery of efficiency arguments. Standard RSA encryption is extremely fast. The 0x10001 exponent was chosen because its low Hamming weight minimizes the number of multiplications required in the encryption process. According to this benchmark, taken on an Intel Core 2 1.83 GHz processor, 1024-bit encryption takes 0.08 milliseconds. That’s without hardware acceleration.
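As a quick back-of-the-envelope check on why 0x10001 is so cheap, look at its binary structure (this is a sketch of the square-and-multiply cost model, not a benchmark):

```python
# 0x10001 = 2**16 + 1, which in binary is a 1, fifteen 0s, and a final 1.
# Square-and-multiply exponentiation does one squaring per bit position
# and one multiplication per set bit (beyond the leading one), so this
# exponent costs 16 squarings plus a single extra multiplication.
e = 0x10001
assert e == 2**16 + 1
squarings = e.bit_length() - 1            # 16 squarings
multiplications = bin(e).count("1") - 1   # 1 extra multiplication
```

So the ‘standard’ exponent is nearly as cheap as a tiny one like 3, while being large enough to sidestep the classic small-exponent pitfalls.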

If you’re building a massive server farm and performance is an issue, perhaps there’s reason to worry about the encryption time. If you’re doing something with a very limited processor, and you really know what you’re doing, perhaps there’s some reason to be concerned about performance.

But as a general recommendation? No way. You’re just encouraging people to adopt schemes they don’t really understand, in order to realize a minor performance benefit they won’t notice.

Bram: Doing encrypt-then-authenticate looks a lot like authenticate-then-encrypt except that the authentication is plaintext, which obviously opens it to attacks which involve guessing the plaintext and verifying it because the authentication is right there, and you can extend those to modifying the ciphertext and replacing the authentication. I simply do not believe that making yourself susceptible to those problems is a win. 

Ok, I admit that I legitimately don’t understand Bram’s argument here. If anything, this sounds like the argument for encrypt-then-authenticate.

For the benefit of people who are just joining the discussion, there are basically two points of view. Bram says you should authenticate your plaintext (using a MAC) then you should encrypt it. I (and many other cryptographers) say you should encrypt first, then authenticate the ciphertext.

In a nutshell, the advantage of my approach is what happens at decryption time: when a ciphertext comes in, you first verify a MAC to ensure that the ciphertext is correct and hasn’t been tampered with. You do this with a key that’s not related to the encryption key. If the MAC verification fails, you stop. You simply don’t decrypt.

This ensures that no information about the plaintext ever leaks back to the attacker. If the attacker is tampering with ciphertexts (e.g., in an attempt to implement a padding oracle attack), he gets nothing. This approach has all kinds of nice advantages, one of which is its modularity.

You see, provided that you’re MACing the ciphertext, it doesn’t matter how the encryption scheme works. You don’t have to worry about whether it uses a funny padding scheme. You don’t have to worry about any encoding or compression that might take place during the encryption process. If someone upgrades your encryption to a different mode of operation, you’re still fine — at least as far as active attacks are concerned.
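To make the pattern concrete, here’s a minimal encrypt-then-authenticate sketch using only Python’s standard library. Since the stdlib ships no block cipher, a toy SHA-256-based keystream stands in for the real cipher; this illustrates the composition (and the verify-before-decrypt rule), and is emphatically not production code:

```python
import hashlib
import hmac
import os

def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    # Toy stream cipher: SHA-256 run in counter mode. Illustration only;
    # real code should use a vetted cipher or an AEAD mode.
    out = b""
    ctr = 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def encrypt_then_mac(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    nonce = os.urandom(16)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ct = bytes(a ^ b for a, b in zip(plaintext, stream))
    # MAC the ciphertext (and nonce) with a key unrelated to enc_key.
    tag = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag

def decrypt(enc_key: bytes, mac_key: bytes, blob: bytes) -> bytes:
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    # Verify the MAC first, in constant time. On failure we stop:
    # the ciphertext is never decrypted, so nothing leaks back.
    if not hmac.compare_digest(expected, tag):
        raise ValueError("MAC check failed; discarding ciphertext")
    stream = _keystream(enc_key, nonce, len(ct))
    return bytes(a ^ b for a, b in zip(ct, stream))
```

Notice that the decryptor touches the ciphertext only after the tag verifies, so a tampered message produces exactly one observable behavior: rejection.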

Even better, the encrypt-then-authenticate approach can be proven IND-CCA secure (semantically secure against adaptive chosen ciphertext attacks). A proof can be found in this paper.

Bram: The other issue with encrypt-then-authenticate is that it makes the code more complicated and just plain uglier, which directly makes you more susceptible to the #1 cause of actual breaks, implementation error.

Ok, here again I disagree. Let’s take a look at the most common example of authenticate-then-encrypt that I can think of. It’s CBC-mode encryption as defined in the TLS specification. This has been standard going back to SSL 3.0, but it’s still used in current versions — due to backwards-compatibility issues.

So with that in mind, see this note included in the TLS 1.2 spec:

   Implementation note: Canvel et al. [CBCTIME] have demonstrated a
   timing attack on CBC padding based on the time required to compute
   the MAC.  In order to defend against this attack, implementations
   MUST ensure that record processing time is essentially the same
   whether or not the padding is correct.  In general, the best way to
   do this is to compute the MAC even if the padding is incorrect, and
   only then reject the packet.  For instance, if the pad appears to be
   incorrect, the implementation might assume a zero-length pad and then
   compute the MAC.  This leaves a small timing channel, since MAC
   performance depends to some extent on the size of the data fragment,
   but it is not believed to be large enough to be exploitable, due to
   the large block size of existing MACs and the small size of the
   timing signal.

Allow me to translate this to plain English: “because we chose to authenticate our plaintext, rather than our ciphertext, we had to (a) MAC both the message and the padding, and (b) even though we did that, it turned out that attackers could still hoover out a little bit of information based on the time that it takes to check the ciphertext (basically, implementations would immediately throw an error after seeing invalid padding, and this gave away the game). So implementers now have to do all this extra nonsense if they want their implementation to be secure.”

None of this is necessary in encrypt-then-authenticate. It’s simpler. Really!
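For contrast, here’s roughly what the spec’s ‘MAC anyway’ defense looks like in code. The function name and structure are mine, not TLS’s actual record processing:

```python
import hashlib
import hmac

def process_record(mac_key: bytes, body: bytes, tag: bytes, padding_ok: bool) -> bool:
    # A naive decryptor returns as soon as it sees bad padding, which
    # makes "bad padding" measurably faster than "bad MAC". Following
    # the implementation note above, we compute the MAC unconditionally
    # (as if assuming a zero-length pad when the padding looked wrong)
    # and render a single verdict only at the end.
    computed = hmac.new(mac_key, body, hashlib.sha256).digest()
    mac_ok = hmac.compare_digest(computed, tag)
    return padding_ok and mac_ok  # one combined decision, no early exit
```

Every authenticate-then-encrypt implementation has to carry this kind of timing-equalization baggage; MACing the ciphertext makes it unnecessary, because the padding check never runs on unauthenticated data.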

I haven’t touched on every single point that Bram makes — you can see his argument in the original post if you want.

The only final point I would like to make is that this stuff matters. If you get it wrong, you can attack an implementation. And too many people — professionals, even — get it wrong.

In conclusion: I certainly didn’t mean to bash Bram, but I’m happy to bash the advice itself. We can do better.

Digital Fortress: I read it so you don’t have to

Once in a while I run into a work of dramatic fiction that takes such a powerful, realistic look at modern cryptography — and its implications for our national security — that we should all be grateful to the author. They’ve made us all a little bit smarter.

Needless to say, Dan Brown’s Digital Fortress is not one of those books.

What is Digital Fortress? I’m not sure. It may be a practical joke. I’m hoping so, anyway, because the alternative — that Dan Brown spent time learning about cryptography and this is what came out — is too terrible to contemplate.

Miraculously, the end result is so ridiculous that it’s almost tolerable. Almost.

(Before I go further, I should warn you that there are huge spoilers below. But don’t worry about it because (a) you shouldn’t read this book, and (b) the plot is so predictable that I doubt it can really be spoiled.)

Where to begin? Let me just hit some of the high notes:

  • Matt Blaze gets whacked. Ok, Brown doesn’t call him Matt Blaze. The character in the book is named Greg Hale (single syllables, get it?) But we know he’s Matt Blaze because he single-handedly discovered a backdoor (“a few lines of cunning programming”) in the Skipjack cipher, one that would have let the NSA “read the world’s email”. For his efforts, Blaze/Hale is rewarded with a thankless job at the NSA, which he hates. And not without reason! Everyone suspects him of being a turncoat (while simultaneously giving him access to their most important secrets, go figure.) Then to cap off the job experience, he’s horrifically murdered. I personally would have ended the book halfway through, and just made Hale the bad guy. Who could blame him?
  • The EFF are the bad guys. Did you know that the National Security Agency (2011 ops budget: $9 bazillion) lives in constant terror of the “sharks” over at the Electronic Frontier Foundation (2011 ops budget: $67.34, mostly spent on construction paper)? I did not know this. I sure hope it’s true. Near the end of the book the NSA’s “firewall” goes down for a few minutes, and within seconds the EFF sharks are circling. This does not appear to be a metaphor — they’re actually swimming in literal circles around the NSA’s cyber-perimeter, trying to ferret out our nation’s secrets. Only our hero Susan Fletcher can stop them, thanks to her long, willowy legs… er, staggering intellect.
  • The NSA has a secret supercomputer named TRANSLTR that can brute-force any cryptosystem. Hmm. Ok. This one is actually pretty accurate.
  • TRANSLTR is stumped by a new cipher that uses “rotating cleartext” to resist brute-force attacks. No, seriously. “Rotating cleartext”. Brilliant! This is such a good idea that I hereby offer to make this scheme a reality for only $800. My only condition is that you must never, ever try to decrypt anything with it.

I suppose I could also mention the NSA’s recruiting program for cryptographers, which would make a Big Ten football coach blush. Or the way that brilliant, righteous people always seem to be beautiful and athletic, while stupid evil people look like trolls (have you ever been to a crypto conference, Dan?)

These and other details are embedded in a stew of unintentional comic awfulness that really defies summarization. But it’s good. Not to read, mind you. Just to know about, so you have a reason to chuckle while you’re lamenting the damage that Brown’s writing has done to your temporal lobe.

I really could not recommend this book less, and am considering sending Dan Brown a bill for the time I spent reading it. But if you’re afflicted by a terrible case of Seasonal Affective Disorder and just need some levity in your life, it might keep you giggling through the long dark months ahead.

Happy Thanksgiving everyone!