A bad couple of years for the cryptographic token industry

SafeNet eToken PRO Anywhere

There was a time just a few weeks ago when it seemed like the CRYPTO 2012 accepted papers list might not be posted in time for, well, CRYPTO 2012. Fortunately the list is public now, which means (1) we’ve avoided the longest rump session in the history of cryptography, and (2) I get to tell you about a particularly neat paper.

The paper in question is ‘Efficient Padding Oracle Attacks on Cryptographic Hardware’ by Bardou, Focardi, Kawamoto, Simionato, Steel and Tsay. This is a typically understated academic title for a surprisingly nifty result.

Here’s the postage stamp version: due to a perfect storm of (subtle, but not novel) cryptographic flaws, an attacker can extract sensitive keys from several popular cryptographic token devices. This is obviously not good, and it may have big implications for people who depend on tokens for their day-to-day security. If that describes you, I suggest you take a look at this table:

Tokens affected by the Bardou et al. attacks. A checkmark means “vulnerable”. (source).

That’s the headline news, anyway. The more specific (and important) lesson for cryptographic implementers is: if you’re using PKCS#1v1.5 padding for RSA encryption, cut it out. Really. This is the last warning you’re going to get.

So much for the short version. Keep reading for the long one.

What are cryptographic tokens?

If you’ve ever developed cryptographic software, you know that general-purpose computers aren’t the safest place to store keys. Leaving aside stuff like this, all it takes is one nasty software bug to send all your important key material (e.g., TLS server keys, certificate signing keys, authentication keys) into someone else’s hands.

Since it seems increasingly difficult to harden standard computers, a popular solution is to simply put the keys somewhere else. The most popular ‘somewhere’ is onboard a cryptographic co-processor. When the co-processor is expensive and powerful, we often refer to it as a Hardware Security Module (HSM). When it’s removable and consumer-grade we call it a ‘token‘. Most tokens are used for authentication purposes (think SecurID), but they’re useful for other things like disk encryption. Modern tokens typically look like USB sticks.

Tokens operate under some of the most challenging conditions you can imagine. The computer they’re connected to may be completely pwned. An attacker could steal a token and physically dismember it in an attempt to extract its secrets. None of this is easy to deal with, and manufacturers have invested a lot of time and money getting this right.

Unfortunately, what Bardou et al. are telling us is: maybe they didn’t invest enough. Several popular token brands are still vulnerable to cryptographic attacks that have been well-known since the 1990s. And that is bad news.

But this paper is brand new. Why are you talking about ‘a bad couple of years’?

While this new paper is a bad result for the token industry, it’s not the first bad news we’ve had about these devices.

Let me explain. You see, while tokens won’t let you extract sensitive keys, they make an exception if the sensitive keys are encrypted (aka: ‘wrapped’). This makes a lot of sense. How else are you supposed to import/export/back up keys from a token? In theory, this encryption (wrapping) should keep the keys safe even if an attacker has totally compromised your computer.

An older problem — discovered by a series of researchers in 2009 and 2010 — is that token manufacturers were careless in the way they implemented the PKCS#11 Cryptographic Token Interface Standard. The spec says that encryption keys are to be used for either (a) encryption/decryption of normal data, OR (b) for wrapping sensitive keys. But never both. If you allow users to perform both operations with a given key, you enable the following attack:

  1. Ask the token to encrypt a sensitive key K and export the ciphertext.
  2. Ask the token to decrypt the ciphertext and hand you the result.
  3. Congratulations, you have K!
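
In pseudocode, the whole attack amounts to two API calls. (This is a sketch against a hypothetical Python ‘token’ handle; the method names are made up, but they map onto the PKCS#11 C_WrapKey and C_Decrypt operations.)

    # Sketch of the wrap-then-decrypt attack. 'token' is a hypothetical handle;
    # wrap_key() and decrypt() stand in for the PKCS#11 C_WrapKey and C_Decrypt calls.
    def extract_sensitive_key(token, wrapping_key, sensitive_key_handle):
        wrapped = token.wrap_key(wrapping_key, sensitive_key_handle)  # step 1: export the ciphertext
        return token.decrypt(wrapping_key, wrapped)                   # steps 2-3: the token hands back K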

It’s hard to believe that something this ‘stupid’ actually worked, but for some tokens it did. The whole mess reached a head in 2010 when Bortolozzo, Focardi, Centenaro and Steel automated the process of finding API bugs like this, and were able to ‘break’ a whole bunch of tokens in one go.

(Token manufacturers were not happy with this, by the way. At CCS 2010, Graham Steel described being invited to a major French token manufacturer where the CEO gently tried to persuade him not to publish his results. Such are the hazards of responsible disclosure.)

The above issues should be long fixed by now. Still, they raised questions! If token manufacturers can’t implement a secure API, how well are they doing with things like, say, complicated cryptographic algorithms? And that’s the question this new paper tries to answer.

So what’s wrong with the crypto?

What the CRYPTO paper shows is that several commercial tokens (but not all of them!) are vulnerable to practical padding oracle attacks on both symmetric (AES-CBC) and asymmetric (RSA PKCS #1v1.5) encryption. These attacks are a huge problem, since they give an attacker access to keys that otherwise should be safely encrypted outside of the token.

Padding attacks take different forms, but the general intuition is this:

  1. Many encryption schemes apply padding to the plaintext before encryption.
  2. After decrypting a given ciphertext, most implementations check to see if the padding is valid. Some output a recognizable error (or timing differential) when it isn’t.
  3. By mauling a legitimate (intercepted) ciphertext in various ways, the attacker can submit many modified versions of the ciphertext to be decrypted. Some of these will have valid padding, others will produce error results. Surprisingly, just these bits of information can be sufficient to gradually reveal the underlying plaintext.

For tokens, you start with an encrypted key you want to learn, and you submit it to the token for import. Normally the token will just accept it, which isn’t very helpful. So instead, you modify the ciphertext in some way, and send the mangled ciphertext to be imported. If the token responds with something like ‘could not import key: the ciphertext had bad padding’, you may be able to run a padding oracle attack.

(Note that it doesn’t take an explicit error: if decryption is simply faster when the padding is bad, you’ve got everything you need.)

The most famous (and efficient) padding oracle attack is Vaudenay’s attack on padded CBC-mode encryption, proposed back in 2002. (See here for an idea of what the attack looks like.) The first finding of Bardou et al. is that many tokens are vulnerable to Vaudenay’s attack. Yikes.
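
To give a flavor of the core step, here’s a minimal sketch of recovering the last plaintext byte of one CBC block. It assumes a hypothetical padding_oracle() function that returns True whenever the token accepts the padding; everything else is plain Python.

    # Toy sketch of the last-byte step of Vaudenay's CBC padding oracle attack.
    # 'padding_oracle(prev, block)' is a hypothetical callback that returns True
    # when the decryptor reports valid padding for the two-block ciphertext.
    def recover_last_byte(prev_block, target_block, padding_oracle):
        for guess in range(256):
            mauled = bytearray(prev_block)
            mauled[-1] ^= guess ^ 0x01        # last plaintext byte becomes 0x01 iff guess is right
            if padding_oracle(bytes(mauled), target_block):
                return guess                  # ignoring rare false positives from longer valid padding
        raise ValueError("no valid padding found")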

But what about RSA encryption? This is important, since many tokens only support RSA to import secret keys.

Fortunately (for the researchers) RSA encryption also uses padding, and even better: these tokens use a scheme that’s long been known to have padding oracle problems of its own. That scheme is known as RSA-PKCS#1v1.5, and it was famously broken by Daniel Bleichenbacher in the late 1990s. As long as these implementations produce obvious errors (or timing differences) when they encounter bad padding, there should be an ‘off the shelf’ attack that works against them.

But here the researchers ran into a problem. Bleichenbacher’s attack is beautiful, but it’s too slow to be useful against tokens. When attacking a 1024-bit RSA key, it can require millions of decryption attempts! Your typical consumer-grade token can only compute a few decryptions per second.

No problem! To make their attack work, Bardou et al. ‘simply’ optimized Bleichenbacher’s attack. The new version requires mere thousands (or tens of thousands) of decryption attempts, which means you can unwrap sensitive keys in just a few minutes on some tokens.

This table tells us how long the attacks actually took to run against real tokens:

Timing results for the optimized Bleichenbacher attacks (on 1024-bit RSA encryptions). ‘Oracle’ indicates the nature of the RSA decryption/padding check implementation. (source)

Is my HSM secure?

If you’re an HSM manufacturer I have ‘good’ news for you. You’re ok. For now.

Oh, not because your HSM is secure or anything. Hilariously, the researchers were unable to run their attacks on a commercial HSM because they couldn’t afford one. (They can cost upwards of EUR 20,000.) Don’t get complacent: they’re working on it.

How does the RSA attack work?

For the real details you should see the paper. Here I can offer only the tiniest little intuition.

The Bleichenbacher attack relies on the fact that RSA is homomorphic with respect to multiplication. That means you can multiply a hidden RSA plaintext (the “m” in “m^e mod N”) by some known value “s” of your choosing. What you do is compute “m^e * s^e mod N”. The decryption of this value is “(m*s) mod N”.
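
If you want to see the algebra in action, here’s a toy example with a deliberately tiny (and utterly insecure) key, just to show the multiplication trick:

    # Toy illustration of RSA's multiplicative homomorphism. Real keys are
    # 1024+ bits; these numbers exist purely to make the algebra visible.
    p, q, e = 61, 53, 17
    N = p * q                                   # the public modulus
    d = pow(e, -1, (p - 1) * (q - 1))           # the private exponent

    m, s = 65, 7                                # hidden plaintext and chosen multiplier
    c = pow(m, e, N)                            # ciphertext of m
    c_mauled = (c * pow(s, e, N)) % N           # multiply the ciphertext by s^e mod N

    assert pow(c_mauled, d, N) == (m * s) % N   # ...and it decrypts to (m*s) mod N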

Now consider the details of RSA-PKCS#1v1.5. To encrypt a legitimate message, the encryptor first pads it with the following bytes, and treats the padded result as the integer “m” to be RSA encrypted.

0x 00 02 { at least 8 non-zero random bytes } 00 { message }

So you’ve intercepted a valid, PKCS#1v1.5 padded RSA ciphertext, and you have access to a decryptor who knows the secret key. The trick now is to ‘mangle’ the legitimate ciphertext in various clever ways and submit these new results to the decryptor.

The basic approach is to multiply the ciphertext by different chosen values “s^e mod N” (for known “s”) until eventually you find one that does not produce a padding error. This means that the decrypted value “(m*s mod N)” is itself a PKCS-padded message. This seems inconsequential, but actually it tells you quite a lot. Presumably you know “s”, since you chose it. You also know that “m*s mod N” is an integer of a very specific form, one that satisfies the padding checks. By adaptively choosing new “s” values and submitting many, many ciphertexts for decryption, these tiny bits of information will allow you to ‘zero in’ on the underlying plaintext.

The authors of this paper describe several optimizations to Bleichenbacher’s attack. What I find most interesting is the way they exploited differences in the way that tokens actually implement their padding checks. It turns out that the nature of this implementation makes a huge difference to the efficiency of the attack.

Five possible ‘implementations’ are described below.

  1. FFF: padding is ‘ok’ only if correctly padded and plaintext is of a specific length (e.g., it’s a 128-bit AES key)
  2. FFT: padding is ‘ok’ only if correctly padded, but plaintext can be any length.
  3. FTT: same as above, but also allows 0s in the “non-zero random bytes”.
  4. TFT: same as above, but ‘ok’ even if there are no zeros after the first byte.
  5. TTT: padding is ‘ok’ as long as it starts with ‘0x 00 02’.

These all look quite similar, but have very different practical implications!
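
To make the differences concrete, here’s roughly what two of those checks might look like in code. (This is just my reading of the paper’s oracle taxonomy, not any vendor’s actual implementation.)

    # Two extremes of the PKCS#1 v1.5 padding check, run on the decrypted RSA block.
    def check_TTT(m: bytes) -> bool:
        # 'ok' as long as the block starts with 0x00 0x02
        return m[:2] == b"\x00\x02"

    def check_FFT(m: bytes) -> bool:
        # correctly padded: 0x00 0x02, at least 8 non-zero random bytes, then a
        # 0x00 separator; the payload after the separator may be any length
        if m[:2] != b"\x00\x02":
            return False
        sep = m[2:].find(b"\x00")     # index of the first zero byte after the header
        return sep >= 8               # FFF would additionally check the payload length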

You see, intuitively, the Bleichenbacher attack only ‘works’ if sometimes the decryptor says ‘hey, this mauled ciphertext has good padding!’. As a general rule, the more often that happens, the faster the attack works.

It goes without saying that the TTT padding check is a lot more likely to produce ‘good padding’ results than the ‘FFF’, and correspondingly for the intermediate ones. FFF is so challenging that it’s basically impractical. The ‘good’ (uh, ‘bad’) news is that many tokens tend towards the more useful implementations. And as you can see in the table below, little details like this have huge impacts.

Total number of decryption queries required to decrypt an RSA-PKCS#1v1.5-encrypted symmetric (AES) key.  ‘Original algorithm’ (blue box) shows the mean/median performance of the standard Bleichenbacher attack. ‘Modified algorithm’ (red box) is the mean/median performance of the new, optimized algorithm. Note that the specific implementation of the padding check (FFF/TTT etc.) makes a huge difference in the cost. (source) 

What can I do to protect my own code from padding oracle attacks?

If you’re using symmetric encryption, that’s easy. Simply use authenticated encryption, or put a MAC on all of your ciphertexts. Do this correctly and Vaudenay padding oracles will not trouble you.
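
For instance, here’s what that looks like with an off-the-shelf AEAD mode, assuming the Python cryptography package is available (a sketch, not a complete key-wrapping protocol):

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    # Authenticated encryption: any tampering with the ciphertext makes
    # decryption fail outright, so there is no padding error to observe.
    key = AESGCM.generate_key(bit_length=128)
    aead = AESGCM(key)
    nonce = os.urandom(12)
    ct = aead.encrypt(nonce, b"wrapped key material", None)
    pt = aead.decrypt(nonce, ct, None)   # raises InvalidTag if ct was mauled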

RSA encryption is more difficult. The ‘best practice’ in implementing RSA is: don’t implement RSA. Other people have done it better than you can. Go find a good implementation of RSA-OAEP and use that. End of story.
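
Again assuming the Python cryptography package, asking for OAEP instead of PKCS#1v1.5 is about this much work:

    from cryptography.hazmat.primitives.asymmetric import rsa, padding
    from cryptography.hazmat.primitives import hashes

    # RSA-OAEP via an off-the-shelf library, instead of hand-rolled PKCS#1 v1.5.
    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)

    ct = private_key.public_key().encrypt(b"session key", oaep)
    pt = private_key.decrypt(ct, oaep)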

Unfortunately, even with RSA-OAEP, this can be difficult. There are attacks on OAEP encryption too (some are mentioned in this paper). And if you’re working in hardware, life is particularly difficult. Even small timing differentials can leak details about your padding check. If this describes your use case: find a professional and be very, very careful.

Are there any broad security lessons, any wisdom, to be drawn from this result?

Usually the lesson I glean from these attacks is: people make mistakes. Which is barely a lesson, and it’s certainly not wisdom. For some of the issues in this paper, namely the symmetric padding oracle attacks, that’s still about what I get. Manufacturers made unforced errors.

But the RSA-PKCS#1v1.5 attacks are, to me, a bit more meaningful. I run into bad implementations of this obsolete standard all the time. Most of the time when I bring it up, nobody really cares. The typical response is something like: it’s called the ‘million message attack’ for a reason. There’s no way anyone would run that thing against our system.

Bruce Schneier says ‘attacks only get better, they never get worse’. In my opinion, the lesson of this paper is that attacks not only get better, but if you go around tempting fate, they’re quite capable of getting better by precisely the amount necessary to break your (high-value) system. You should never use a broken primitive just because the ‘known’ attack seems impractical. 

And while this may not be ‘wisdom’ either, I think it’s advice to live by.

Flame, certificates, collisions. Oh my.

Update 6/6: Microsoft has given us more details on what’s going on here. If these collision attacks are truly novel, this tells us a lot about who worked on Flame and how important it is. 

Update 6/7: Yes, it’s officially a novel MD5 collision attack, with the implication that top cryptographers were involved in Flame’s creation. We truly are living in the future.

See detailed updates (and a timeline) at the bottom of this post.

If you pay attention to these things, you’ve heard that there’s a new piece of government-issued malware making the rounds. It’s called Flame (or sometimes Skywiper), and it’s basically the most sophisticated — or most bloated — piece of malicious software ever devised. Kaspersky gets the credit for identifying Flame, but the task of tearing it to pieces has fallen to a whole bunch of different people.

Normally I don’t get too excited about malware, cause, well, that kind of thing is somebody else’s problem. But Flame is special: if reports are correct, this is the first piece of malware ever to take advantage of an MD5 certificate collision to enable code signing. Actually, it may be the first real anything to use MD5 certificate collisions. Neat stuff.

What we know is pretty sketchy, and is based entirely on the information-sparse updates coming from Microsoft’s security team. Here’s the story so far:

Sometime yesterday (June 3), Microsoft released an urgent security advisory warning administrators to revoke two Intermediate certificates hanging off of the Microsoft root cert. These belong to the Terminal Services Licensing service. Note that while these are real certificates — with keys and everything — they actually have nothing to do with security: their sole purpose was to authorize, say, 20 seats on your Terminal Services machine.

Despite the fact that these certs don’t serve any real security purpose, and shouldn’t be confused with anything that matters, Microsoft hung them off of the same root as the real certificates it uses for, well, important stuff. Stuff like signing Windows Update packages. You see where this is going?

Actually, this shouldn’t have been a real problem, because these licensing certs (and any new certificates beneath them) should have been recognizably different from code-signing certificates. Unfortunately, someone in Product Licensing appears to have really screwed the pooch. These certificates were used to generate Licensing certificates for customers. And each of those certificates had the code signing bit set.

The result? Anyone who paid for a Terminal Services license could (apparently) sign code as if they were Microsoft. If you’re wondering what that looks like, it looks like this.

This is obviously a bad thing. For many reasons. Mikko Hypponen points out that many organizations whitelist software signing by Microsoft, so even if institutions were being paranoid, this malware basically had carte blanche.

Ok, so far this is really bad, but just a garden-variety screwup. I mean, a huge one, to be sure. But nothing really interesting.

Here’s where that changes. Just today, Microsoft released a new update. This is the new piece:

The Flame malware used a cryptographic collision attack in combination with the terminal server licensing service certificates to sign code as if it came from Microsoft. However, code-signing without performing a collision is also possible.

Now, I’m not sure why the ‘collision attack’ is necessary if code-signing is possible without a collision. But who cares! A collision attack! In the wild! On top-secret government-issued Malware! Dan Brown couldn’t hold a candle to this.

If we look at the Licensing certificates in question, we do indeed see that at least one — created in 2009 — uses the MD5 hash function, something everyone knows is a bad idea:

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            3a:ab:11:de:e5:2f:1b:19:d0:56
    Signature Algorithm: md5WithRSAEncryption
        Issuer: OU=Copyright (c) 1997 Microsoft Corp., OU=Microsoft Corporation, CN=Microsoft Root Authority
        Validity
            Not Before: Dec 10 01:55:35 2009 GMT
            Not After : Oct 23 08:00:00 2016 GMT
        Subject: C=US, ST=Washington, L=Redmond, O=Microsoft Corporation, OU=Copyright (c) 1999 Microsoft Corp., CN=Microsoft Enforced Licensing Intermediate PCA
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)

And so… Well, I don’t really know. That’s where we run out of information. And end up with a lot of questions.

For one thing, why did the Flame authors need a collision? What extra capabilities did it get them? (Update: the best theory I’ve heard comes from Nate Lawson, who theorizes that the collision might have allowed the attackers to hide their identity. Update 6/6: The real reason has to do with certificate extensions and compatibility with more recent Windows versions. See the end of this post.)

More importantly, what kind of resources would it take to find the necessary MD5 collision? That would tell us a lot about the capabilities of the government contractors who write malware like this. In 2008 it took one day on a supercomputer, i.e., several hundred PS3s linked together.

And just out of curiosity, what in the world was Microsoft doing issuing MD5-based certificates in 2009? I mean, the code signing bit is disastrous, but at least that’s a relatively subtle issue — I can understand how someone might have missed it. But making MD5 certificates one year after they were definitively shown to be insecure, that’s hard to excuse. Lastly, why do licensing certificates even need to involve a Certificate Signing Request at all, which is the vector you’d use for a collision attack?

I hope that we’ll learn the answers to these questions over the next couple of days. In the mean time, it’s a fascinating time to be in security. If not, unfortunately, a very good time for anyone else.

***

Update 6/5: For those who aren’t familiar with certificate collision attacks, I should clear up the basic premise. Most Certificate Authorities sign a Certificate Signing Request (CSR) issued by a possibly untrustworthy customer. The idea of an MD5 collision-finding attack is to come up with two different CSRs that hash to the same thing. One would be legitimate and might contain your Terminal Services licensing number. The other would contain… something else. By signing the first CSR, Microsoft would be implicitly creating a certificate on the second batch of information.

Finding collisions is a tricky process, since it requires you to muck with the bits of the public key embedded in the certificate (see this paper for more details). Also, some CAs embed a random serial number into the certificate, which really messes with the attack. Microsoft did not.

Finally, some have noticed that Microsoft is still using a root certificate based on MD5, one that doesn’t expire until 2020, and hasn’t been revoked. What makes this ok? The simple answer is that it’s not ok. However, Microsoft probably does not sign arbitrary CSRs with that root certificate, meaning that collision attacks are not viable against it. It’s only the Intermediate certs, the ones that actually sign untrusted CSRs, that you need to worry about — today.

***

Update 6/6: Microsoft has finally given us some red meat. Short summary: the Terminal Services Licensing certs do work to sign code on versions of Windows prior to Vista, with no collision attack needed. All in all, that’s a heck of a bad thing. But it seems that more recent versions of Windows objected to the particular X.509 extensions that were in the TS licensing certs. The government (or their contractors) therefore used a collision attack to make a certificate that did not have these extensions. From what I’m reading on mailing lists, the collision appears to be “new, and of scientific interest”.

If this pans out, it’s a big deal. Normally the government doesn’t blow a truly sophisticated cryptanalytic attack on some minor spying effort. Getting MD5 collisions up and running is a big effort; it’s not something you can do from a cookbook. But developing novel collision techniques is a step beyond that. I take this to mean that Flame’s authors were breaking out the big guns, which tells us something about its importance to our government. It also tells us that the creation of Flame may have involved some of the top mathematicians and cryptographers in (our?) intelligence services. This is an important datapoint.

***

Update 6/7: It looks like it does pan out. Marc Stevens — one of the authors of the earlier MD5 certificate collision papers — confirms that Flame’s certificate uses a novel collision-finding technique.

“Flame uses a completely new variant of a ‘chosen prefix collision attack’ to impersonate a legitimate security update from Microsoft. The design of this new variant required world-class cryptanalysis”

We can only speculate about why this would be necessary, but a good guess is that Flame is old (or, at very least, the crypto techniques are). If the collision technique was in the works prior to 2009 (and why wouldn’t it be?), the Stevens et al. work wouldn’t have had much relevance to its developers. For those who like details, a (very incomplete) timeline looks something like this:

  1. Mid-1990s: MD5 is effectively obsolete. Switch to a better hash function!
  2. August 2004: Wang and Yu present first (‘random’), manually-found collision on MD5.
  3. 2004-2007: Collisions extended to ‘meaningful’ documents. Lots of stuff omitted here.
  4. May 2007: (CWI) Stevens et al. publish a first impractical technique for colliding certificates.
  5. Late 2008: (CWI) Stevens et al. develop a practical certificate collision.
  6. December 2008: Vendors notified.
  7. February 2009: Paper submitted to CRYPTO.
  8. Summer 2009: Source code released.

Even assuming that the CWI team kept their result perfectly secret, conference program committees are hardly built for secrecy — at least, not from governments. So it’s reasonable to assume that the Stevens et al. technique was known by February 2009, possibly earlier. Which means that the Flame developers may have had their own approach under development sometime before that date.

The really interesting question is: when did secret collision research get ahead of the public, academic stuff? Post-2007? Post-2004? Pre-2004? I doubt we’ll ever know the answer to this question. I sure would love to.

Update 6/12: Alex Sotirov has a great Summercon presentation with many of the details. The word from those who know is: don’t take his $20,000 figure too literally. We know nothing about the techniques used to find this collision.

RSA keys: no insight whatsoever

I have a deadline coming up so (substantial) posting will be light this week.

For those of you who don’t read the New York Times, the big story of the week is this paper by Lenstra, Hughes, Augier, Bos, Kleinjung and Wachter:

Ron was wrong, Whit is right

We performed a sanity check of public keys collected on the web. Our main goal was to test the validity of the assumption that different random choices are made each time keys are generated. We found that the vast majority of public keys work as intended. A more disconcerting finding is that two out of every one thousand RSA moduli that we collected offer no security. Our conclusion is that the validity of the assumption is questionable and that generating keys in the real world for “multiple-secrets” cryptosystems such as RSA is significantly riskier than for “single-secret” ones such as ElGamal or (EC)DSA which are based on Diffie-Hellman.

Lots of people have written insightfully on this topic. See Dan Kaminsky’s post here, for example, or Thomas Ptacek’s excellent multi-part Twitter musing. (Update: much better, see Nadia Heninger’s explanation at the end of this post.)

There must be something wrong with me, because I find it almost impossible to draw any deep insight at all from this work. Don’t get me wrong: the paper itself is a fantastic piece of research; it sets a new standard for data analysis on public keys and certs. I hope we see more like it.

But what’s the takeaway? That two-key systems are insecure? That intelligence agencies have known this for years? Maybe. Whatever. The takeaway to me is that one (or more) RSA keygen implementations had a crappy RNG, or didn’t properly seed its PRG.
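
For anyone wondering why a bad RNG is so catastrophic for RSA in particular: if two key-generation runs ever spit out the same prime, a single GCD computation factors both moduli. A toy sketch:

    from math import gcd

    # If two RSA moduli share a prime factor (because a bad RNG produced the
    # same prime twice), a simple GCD recovers it instantly.
    p, q1, q2 = 101, 103, 107                  # tiny primes, purely illustrative
    n1, n2 = p * q1, p * q2                    # two 'public' moduli sharing the prime p

    shared = gcd(n1, n2)                       # = p
    print(shared, n1 // shared, n2 // shared)  # both moduli fully factored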

That’s really good to know about, but it isn’t the big news that the paper’s title would imply. It doesn’t have any implications for the use of RSA or any other cryptosystem. I’d sure like to solve the mystery of which implementations we need to look out for, and how to make sure this doesn’t happen again, but that’s literally the only thing I take away from this — so far.

I don’t mean to sound like a curmudgeon. Really, I want to believe. Please help me!

Update: Mystery solved! Nadia Heninger has a post at Freedom to Tinker explaining that most of these keys were generated by embedded devices, and that — through a parallel research effort — they actually know which devices. Once again extremely nice work. Even nicer than Lenstra et al., since it’s actually useful. (I can only imagine how Nadia and her team have been feeling the past two days, seeing ‘their’ result all over the New York Times. That’s responsible disclosure for you.)

Tor and the Great Firewall of China

Here’s a fascinating post by Tim Wilde over at the Tor Project blog, discussing China’s growing sophistication in blocking Tor connections:

In October 2011, ticket #4185 was filed in the Tor bug tracker by a user in China who found that their connections to US-based Tor bridge relays were being regularly cut off after a very short period of time. At the time we performed some basic experimentation and discovered that Chinese IPs (presumably at the behest of the Great Firewall of China, or GFW) would reach out to the US-based bridge and connect to it shortly after the Tor user in China connected, and, if successful, shortly thereafter the connection would be blocked by the GFW.

we discovered two types of probing. First, “garbage binary” probes, containing nothing more than arbitrary (but sometimes repeated in later probes) binary data, were experienced by the non-China side of any connection that originated from China to TCP port 443 (HTTPS) in which an SSL negotiation was performed. … The purpose of these probes is unknown …

The second type of probe, on the other hand, is aimed quite directly at Tor. When a Tor client within China connected to a US-based bridge relay, we consistently found that at the next round 15 minute interval (HH:00, HH:15, HH:30, HH:45), the bridge relay would receive a probe from hosts within China that not only established a TCP connection, but performed an SSL negotiation, an SSL renegotiation, and then spoke the Tor protocol sufficiently to build a one-hop circuit and send a BEGIN_DIR cell. No matter what TCP port the bridge was listening on, once a Tor client from China connected, within 3 minutes of the next 15 minute interval we saw a series of probes including at least one connection speaking the Tor protocol.

Obviously this is disturbing. And unlike previous, passive efforts to block Tor, these active attacks are tough to defend against. After all, Tor was designed to be a public service. If the general public can download a Tor client and connect to a bridge, so can the Chinese government. This means that protocol-level workarounds (obfuscators, for example) will only work until China cares enough to stop them.

The situation isn’t hopeless: proposed workarounds include password protecting Tor bridges, which might solve the problem to an extent — though it seems to me that this is just kicking the problem down the road a bit. As with the bridge security model, it embeds the assumption that Chinese users can find bridges/passwords, but their government can’t. More to the point, any password protocol is going to have to work hard to look ‘innocent’ (i.e., not Tor-like) to someone who doesn’t know the password. There are a lot of ways this could go wrong.

On the research side there are ideas like Telex which would eliminate the need for bridges by embedding anti-censorship into the network. Chinese Tor clients would make TLS connections to arbitrary US websites; the connections would be monitored by special Telex routers along the way; any TLS connection with a special steganographic marking would get transparently re-routed to the nearest Tor node. Unfortunately, while the crypto in Telex is great, actually deploying it would be a nightmare — and would almost certainly require government cooperation. Even if Western governments were game, the Chinese government could respond by banning overseas TLS connections altogether.

One last note: I love a good mystery, so does anyone care to speculate about those “garbage probes”? What are they — a test? Automated fuzzing? Most likely they’re an attempt to provoke a response from some other TLS server that the Chinese government cares about, but if it’s not Tor then what is it?

Tim’s full investigation can be found here.

Update 1/26: Emily Stark points me to the Flash Proxies project out of Stanford. This would put Tor proxies in individual client machines, thus massively increasing the number of bridges available and eliminating the outgoing client->bridge connection. They even have an implementation, though I warn you: running untrusted traffic through your Flash plugin is not for the faint of heart!

EAX’, Knight Rider, and an admission of defeat

A few weeks back I found myself on the wrong side of Daniel Bernstein over something I’d tweeted the week before. This was pretty surprising, since my original tweet hadn’t seemed all that controversial.

What I’d said was that cryptographic standards aren’t always perfect, but non-standard crypto is almost always worse. Daniel politely pointed out that I was nuts — plenty of bad stuff appears in standards, and conversely, plenty of good stuff isn’t standardized. (As you can see, the conversation got a little weirder after that.)

Today I’m here to say that I’ve found religion. Not only do I see where Daniel’s coming from, I’m here to surrender, throw down my hat and concede defeat. Daniel: you win. I still think standards are preferable in theory, but only if they’re promulgated by reasonable standards bodies. And we seem to have a shortage of those.

My new convictions are apropos of an innocuous-looking ePrint just posted by Kazuhiko Minematsu, Hiraku Morita and Tetsu Iwata. These researchers have found serious flaws in an authenticated block cipher mode of operation called EAX’ (henceforth: EAXprime). EAXprime was recently adopted as the encryption mode for ANSI’s Smart Grid standard, and (until today) was practically a shoo-in to become a standalone NIST-certified mode of operation.

Ok, so standards get broken. Why am I making such a big deal about this one? The simple reason is that EAXprime isn’t just another standard. It’s a slightly-modified version of EAX mode, which was proposed by Bellare, Rogaway and Wagner. And the important thing to know about EAX (non-prime) is that it comes with a formal proof of security.

It’s hard to explain how wonderful this is. The existence of such a proof means that (within limits) a vulnerability in EAX mode would indicate a problem with the underlying cipher (e.g., AES) itself. Since we’re pretty confident in the security of our standard block ciphers, we can extend that confidence to EAX. And the best part: this wonderful guarantee costs us almost nothing — EAX is a very efficient mode of operation.
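
For reference, the original (proven) mode is readily available off the shelf. Here’s a minimal sketch using PyCryptodome, assuming that library:

    from Crypto.Cipher import AES
    from Crypto.Random import get_random_bytes

    # Encrypt-and-authenticate with the original EAX mode (the one with the proof).
    key = get_random_bytes(16)
    cipher = AES.new(key, AES.MODE_EAX)
    ciphertext, tag = cipher.encrypt_and_digest(b"meter reading: 42 kWh")

    # Decrypt and verify; a forged ciphertext or tag raises ValueError.
    verifier = AES.new(key, AES.MODE_EAX, nonce=cipher.nonce)
    plaintext = verifier.decrypt_and_verify(ciphertext, tag)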

But not efficient enough for ANSI, which decided to standardize on a variant called EAXprime. EAXprime is faster: it uses 3-5 fewer block cipher calls to encrypt each message, and (in the case of AES) about 40 bytes less RAM to store scheduled keys. (This is presumably important when your target is a tiny little embedded chip in a smart meter.)

Unfortunately, there’s a cost to that extra speed: EAXprime is no longer covered by the original EAX security proof. Which brings us towards the moral of the story, and to the Minematsu, Morita and Iwata paper.

Did you ever see that old episode of Knight Rider where the bad guys figure out how to neutralize KITT’s bulletproof coating? Reading this paper is kind of like watching the middle part of that episode. Everything pretty much looks the same but holy crap WTF the bullets aren’t bouncing off anymore.

The MMI attacks allow an adversary to create ciphertexts (aka forgeries) that seem valid even though they weren’t created by the actual encryptor. They’re very powerful in that sense, but they’re limited in others (they only work against very short messages). Still, at the end of the day, they’re attacks. Attacks that couldn’t possibly exist if the standards designers had placed a high value on EAX’s security proof, and had tried to maintain that security in their optimized standard.

And this is why I’m admitting defeat on this whole standards thing. How can I advocate for crypto standards when standards bodies will casually throw away something as wonderful as a security proof? At least when KITT lost his bulletproof coating it was because of something the bad guys did to him. Can you imagine the good guys doing that to him on purpose?

Attack of the week: Datagram TLS

Nadhem Alfardan and Kenny Paterson have a paper set to appear in NDSS 2012 entitled ‘Plaintext-Recovery Attacks Against Datagram TLS‘.* This is obviously music to my ears. Plaintext recovery! Datagram TLS! Padding oracles. Oh my.

There’s just one problem: the paper doesn’t seem to be online yet (Update 1/10: It is now. See my update further below.) Infoworld has a few vague details, and it looks like the vulnerability fixes are coming fast and furious. So let’s put on our thinking caps and try to puzzle this one out.

What is Datagram TLS?

If you’re reading this blog, I assume you know TLS is the go-to protocol for encrypting data over the Internet. Most people associate TLS with reliable transport mechanisms such as TCP. But TCP isn’t the only game in town.

Audio/video streaming, gaming, and VoIP apps often use unreliable datagram transports like UDP. These applications don’t care if packets arrive out of order, or if they don’t arrive at all. The biggest priority is quickly handling the packets that do arrive.

Since these applications need security too, TLS tries to handle them via an extension called Datagram TLS (DTLS). DTLS addresses the two big limitations that make TLS hard to use on an unreliable datagram transport:

  1. TLS handshake messages need to arrive whole and in the right order. This is easy when you’re using TCP, but doesn’t work so well with unreliable datagram transport. Moreover, these messages are bigger than the typical 1500 byte Ethernet frame, which means fragmentation (at best), and packet loss (at worst).
  2. Ordinarily TLS decryption assumes that you’ve received all the previous data. But datagrams arrive when they want to — that means you need the ability to decrypt a packet even if you haven’t received its predecessor.

There are various ways to deal with these problems; DTLS mounts a frontal assault. The handshake is made reliable by implementing a custom ack/re-transmit framework. A protocol-level fragmentation mechanism is added to break large handshake messages up over multiple datagrams. And most importantly: the approach to encrypting records is slightly modified.

So what’s the problem?

To avoid radical changes, DTLS inherits most of the features of TLS. That includes its wonky (and obsolete) MAC-then-Encrypt approach to protecting data records. Encryption involves three steps:

  1. Append a MAC to the plaintext to prevent tampering.
  2. Pad the resulting message to a multiple of the cipher block size (16 bytes for AES). This is done by appending bytes of padding, each of which must contain the length of the padding itself.
  3. Encrypt the whole mess using CBC mode.
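
Here’s a rough sketch of those three steps in Python, using HMAC-SHA256 and the cryptography package. (Real TLS also MACs the record header and sequence number, which I’m omitting.)

    import os, hmac, hashlib
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    # MAC-then-pad-then-CBC-encrypt, TLS style. Illustrative only.
    def protect(record, mac_key, enc_key, iv):
        tagged = record + hmac.new(mac_key, record, hashlib.sha256).digest()
        pad_len = 16 - (len(tagged) + 1) % 16
        padded = tagged + bytes([pad_len] * (pad_len + 1))   # every padding byte holds pad_len
        enc = Cipher(algorithms.AES(enc_key), modes.CBC(iv)).encryptor()
        return enc.update(padded) + enc.finalize()

    ct = protect(b"GET / HTTP/1.1", os.urandom(32), os.urandom(16), os.urandom(16))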

Cryptographers have long known that this kind of encryption can admit padding oracle attacks. This happens when a decryptor does something obvious (throw an error, for example) when it encounters invalid padding, i.e., padding that doesn’t meet the specification above.

CBC Mode encryption, courtesy Wikipedia.

This wouldn’t matter very much, except that CBC mode is malleable. This means an attacker can flip bits in an intercepted ciphertext, which will cause the same bits to be flipped when the ciphertext is ultimately decrypted. Padding oracle attacks work by carefully tweaking a ciphertext in specific ways before sending it to an honest decryptor. If the decryptor returns a padding error, then the adversary knows something about the underlying plaintext. Given enough time the attacker can use these errors to completely decrypt the message.

I could spend a lot of time describing padding oracle attacks, but it’s mostly beside the point.** Standard TLS implementations know about this attack and deal with it in a pretty effective way. Whenever the decryptor encounters bad padding, it just pretends that it hasn’t. Instead, it goes ahead with the rest of the decryption procedure (i.e., checking the MAC) even if it knows that the message is already borked.

This is extra work, but it’s extra work with a purpose. If a decryptor doesn’t perform the extra steps, then messages with bad padding will get rejected considerably faster than other messages. A clever attacker can detect this condition by carefully timing the responses. Performing the unnecessary steps (mostly) neutralizes that threat.

Ok, so you say these attacks are already mitigated. Why are we still talking about this?

Before I go on, I offer one caveat: what I know about this attack comes from speculation, code diffs and some funny shapes I saw in the clouds this afternoon. I think what I’m saying is legit, but I won’t swear to it until I read Alfardan and Paterson’s paper.

But taking my best guess, there are two problems here. One is related to the DTLS spec, and the second is just an implementation problem. Either one alone probably wouldn’t be an issue; the two together spell big trouble.

The first issue is in the way that the DTLS spec deals with invalid records. Since standard TLS works over a reliable connection, the application should never receive invalid or out-of-order data except when packets are being deliberately tampered with. So when standard TLS encounters a bad record MAC (or padding) it takes things very seriously — in fact, it’s required to drop the connection.

This necessitates a new handshake, a new key, and generally makes it hard for attackers to run an honest padding oracle attack, since these attacks typically require hundreds or thousands of decryption attempts on a single key.***

DTLS, on the other hand, runs over an unreliable datagram transport, which may not correct for accidental packet errors. Dropping the connection for every corrupted packet just isn’t an option. Thus, the standard is relaxed. An invalid MAC (or padding) will cause a single record to be thrown away, but the connection itself goes on.

This still wouldn’t matter much if it wasn’t for the second problem, which is specific to the implementation of DTLS in libraries like OpenSSL and GnuTLS.

You see, padding oracle vulnerabilities in standard TLS are understood and mitigated. In OpenSSL, for example, the main decryption code has been carefully vetted. It does not return specific padding errors, and to avoid timing attacks it performs the same (unnecessary) operations whether or not the padding checks out.

In a perfect world DTLS decryption would do all the same things. But DTLS encryption is subtly different from standard TLS encryption, which means it’s implemented in separate code. Code that isn’t used frequently, and doesn’t receive the same level of scrutiny as the main TLS code. Thus — two nearly identical implementations, subject to the same attacks, with one secure and one not. (Update 1/11: There’s a decent explanation for this, see my update below.)

And if you’re the kind of person who needs this all tied up with a bow, I would point you to this small chunk of the diff just released for the latest OpenSSL fix. It comes from the DTLS-specific file d1_pkt.c:

+ /* To minimize information leaked via timing, we will always
+  * perform all computations before discarding the message.
+  */
+ decryption_failed_or_bad_record_mac = 1;
I guess that confirms the OpenSSL vulnerability. Presumably with these fixes in place, the MAC-then-Encrypt usage in DTLS will now go back to being, well, just theoretically insecure. But not actively so.

Update 1/11/2012: Kenny Paterson has kindly sent me a link to the paper, which wasn’t available when I wrote the original post. And it turns out that while the vulnerability is along the lines above, the attack is much more interesting.

An important aspect that I’d missed is that DTLS does not return error messages when it encounters invalid padding — it just silently drops the packet. This helps to explain the lack of countermeasures in the DTLS code, since the lack of responses would seem to be a padding oracle attack killer.

Alfardan and Paterson show that this isn’t the case. They’re able to get the same information by timing the arrival of ‘heartbeat’ messages (or any predictable responses sent by an upper-level protocol). Since DTLS decryption gums up the works, it can slightly delay the arrival of these packets. By measuring this ‘gumming’ they can determine whether padding errors have occurred. Even better, they can amplify this gumming by sending ‘trains’ of valid or invalid packets.

All in all, a very clever attack. So clever, in fact, that it makes me despair that we’ll ever have truly secure systems. I guess I’ll have to be satisfied with one less insecure one.

Notes:

* N.J.A. Alfardan and K.G. Paterson, Plaintext-Recovery Attacks Against Datagram TLS, To appear in NDSS 2012.

** See here for one explanation. See also a post from this blog describing a padding oracle attack on XML encryption.

*** There is one very challenging padding oracle attack on standard TLS (also mitigated by current implementations). This deals with the problem of session drops/renegotiation by attacking data that remains constant across sessions — things like passwords or cookies.

Non-governmental crypto attacks

Over on Web 1.0, Steve Bellovin is asking an interesting question:

Does anyone know of any (verifiable) examples of non-government enemies exploiting flaws in cryptography?  I’m looking for real-world attacks on short key lengths, bad ciphers, faulty protocols, etc., by parties other than governments and militaries.  I’m not interested in academic attacks — I want to be able to give real-world advice — nor am I looking for yet another long thread on the evils and frailties of PKI.

The responses vary from the useful to the not-so-useful, occasionally punctuated by an all-out flamewar — pretty much par for the course in these things.

Here are a few of the responses that sound pretty reasonable. They’re (mostly) not mine, and I’ve tried to give credit where it’s due:

  1. Cases of breached databases where the passwords were hashed and maybe salted, but with an insufficient work factor enabling dictionary attacks.*
  2. NTLMv1/MSCHAPv1 dictionary attacks.*
  3. NTLMv2/MSCHAPv2 credentials forwarding/reflection attacks.*
  4. The fail0verflow break of poorly-nonced ECDSA as used in the Sony PlayStation 3.*
  5. DeCSS.*
  6. Various AACS reverse-engineering efforts.
  7. The HDCP master key leak.*
  8. Various attacks on pay satellite TV services.****
  9. GSM decryption, which seems to have gone beyond the academic and into commercial products.
  10. Factoring of the Texas Instruments 512-bit firmware signing key for calculators, and Elcomsoft’s factoring of the Quicken backup key.**
  11. Key recovery in WEP.
  12. Exploits on game consoles: the original XBox,*** Wii software signing.

There’s also some debate about recent claims that 512-bit RSA certificate signing keys were factored and used to sign malware. As much as I’d like to believe this, the evidence isn’t too solid. Some posters claim that there were also 1024-bit keys used in these attacks. If that’s true, it points more to key theft (aka Steve’s ‘evils and frailties of PKI’).

You’ll also notice I’m leaving lots of stuff off of this list, only because I don’t know of any specific attacks based on it. That would include all the padding oracle attacks of late, the BEAST attack on TLS, bad Debian keys, and so on.

So what’s the takeaway from all of this? Well, it’s complicated. A quick glance at the list is enough to tell us that there are plenty of ‘real people’ (aka non-professional cryptographers) out there with the skills to exploit subtle crypto flaws. That definitely supports my view that proper crypto implementation is important, and that your code will be exploited if you screw it up.

Some people may take comfort from the fact that there’s no crypto ‘pearl harbor’ on this list, i.e., the cryptographic equivalent of a Conficker or Stuxnet. I would say: don’t get too cocky. Sure, software security is a mess, and it’s a whole lot easier to set up a dumb fuzzer than to implement sophisticated crypto exploits. (No offense to dumb fuzzers — I’m friends with several.)

But on the other hand, maybe this is misleading. We mostly learn about software 0days from mass malware, which is relatively easy to catch. If sophisticated crypto exploits are being implemented, I would guess that they’re not going into retail worms and trojans — they’re being very quietly applied against high-value targets. Banking systems, for example.

But again, this is just speculation. What do you think?

Notes:

* Marsh Ray.

** Solar Designer.

*** Tom Ritter.

**** commenter “Swiss Made”, below.

Attack of the week: XML Encryption

Unfortunately I had to skip this year’s CCS because I was traveling for work. This means I missed out on a chance to go to my favorite Chicago restaurant and on the architecture cruise. I’m told that there were also some research papers being given somewhere.

The Internet can’t make up for every missed opportunity, but it can help with the research. In fact, today’s news is all about one of those papers. I’m talking about Tibor Jager and Juraj Somorovsky’s work entitled “How To Break XML Encryption“.*

In case I had any doubts, it only took one glance at the first page to confirm that we’re back in “someone screwed up basic symmetric encryption” territory. But really, to describe Jager and Somorovsky’s findings like this does them no justice. It’s like comparing the time that kid built a nuclear reactor in his garage to, say, the incident at Fukushima.

Some background is in order.

What is XML Encryption and why should I care?

You probably already know that XML is the world’s most popular way to get structured data from point A to point B.** Following the success of HTML, the computing world decided that we needed a single “flexible” and ASCII-based format for handling arbitrary data types. (We’ve spent much of the subsequent period regretting this decision.)

The one upshot of this XML epidemic was the emergence of some standards and libraries, which — among other things — were supposed to help application developers process data in a reliable and secure way. One of those standards is SOAP, which is used to transport data in web services frameworks.

Another standard is the W3C XML Encryption Standard, which was dropped like a log in 2002 and doesn’t seem to have been updated since. The point of this standard was to give developers a uniform (and secure!) way to protect their XML documents so that we wouldn’t wind up with five thousand incompatible, insecure ways of doing it. (Oh the irony.)

Finally, a very common implementation of both standards can be found in the Apache Axis2 web services framework and in the RedHat JBoss framework. These are probably the most common open-source SOAP frameworks. They power a number of systems you don’t necessarily think about, but you probably should because they carry a lot of your personal information.

So what about the encryption?

To make a very long story short, the W3C standard recommends encrypting messages using a block cipher running in (our old friend) CBC mode. Here’s a quick recap on CBC mode, courtesy Wikipedia:

CBC mode encryption. The message is first subdivided into equal-length blocks and encrypted as in the diagram.  The circular plus symbol denotes XOR.

There are two basic things you need to know about CBC mode, and ought to know if you ever plan to use it.***

First: CBC requires that every plaintext be an even multiple of the block size. In the case of AES, this means 16 bytes. If the message is not the right length, then CBC implementations will pad the message with additional bytes. Of course it must be possible to recognize this padding so that it can be stripped off after decryption. There are various schemes for doing this.

The W3C standard uses the following padding approach. Let “MM” indicate message bytes, and let “NN” be the total number of padding bytes being added. “XX” represents arbitrary padding bytes, which can hold any value you want. A padded block will look like this:

0x MM MM MM MM MM MM MM MM MM MM XX XX XX XX XX NN

When a decryptor encounters a padded message, it looks at the last byte to figure out how many bytes to strip off. If it encounters padding that doesn’t make sense, i.e., NN > 16 or NN < 1, then it should see something funny going on and reject the message.

Second: CBC ciphertexts are malleable. This means that you can modify a CBC-encrypted ciphertext such that your modifications will carry through decryption, and have a meaningful effect on the resulting decrypted plaintext.

In the simplest case, you can truncate the message by stripping blocks off the end. This does the same thing to the resulting plaintext.

More interestingly, you can flip any bit (or set of bits) in the IV of a CBC-encrypted message, and upon decryption you’ll find that the same bits have been flipped in the first block of the decrypted plaintext. You can also do a lot more, but it’s not that important right now.

Anything else I should know?

Yup. You need to know how XML messages are formatted. They’re encoded using UTF-8 — essentially ASCII — with some special rules.

These are described in the paper. Briefly: any XML message that contains bytes ranging from 0x00-0x1F (with the exception of tabs, LF and CR) may be considered invalid. Similarly, the ampersand (&) character is used as an escape character, and must be followed by a valid string. Lastly, open brackets “<” must be properly closed. Messages that don’t meet this standard should be rejected.  ****

Of course, the message will have to be decrypted (using the appropriate key) before any of these checks can be run.

This is all fascinating, but how does it lead to an attack?

There’s one more detail I haven’t given you. You see, in addition to the details above, the Axis2 server (ditto JBoss) is kind enough to let you know when you haven’t met its standards for an XML message.

Specifically, after it decrypts a message, it kicks back an error if the message either: (1) has bad padding, or (2) contains invalid characters. In both cases, the error is the same. And this error is your way in the door.

I’m not going to completely describe the attack, but I’ll try to give an overview.

Imagine that you’ve intercepted a legitimately-encoded, encrypted XML message (IV, C1, …, CN) and you want to know what it says. You also have access to an Axis2 or JBoss server that knows the key and can decrypt it. The server won’t tell you what the message says, but it will give you an error if the encoding doesn’t meet its standards.

Sending the original, intercepted message won’t tell you much. We know that it’s encoded correctly. But what if you tamper with the message? This is precisely what Jager and Somorovsky proceed to do.

Step 1: Truncate the ciphertext. A good way to start is to chop off everything after the first ciphertext block, and send the much-shorter message consisting of (IV, C1) in to be decrypted. Chances are good that this new message will produce a decryption error, since the server will interpret the last byte of the decrypted message as padding — a number that should be between 0x01 and 0x10.

Step 2: Tweak the padding. I already said that if you flip bits in the IV, this will result in a similar change to the decrypted plaintext. Using this concept, you can force the last byte of the IV through all possible values and ask the server to decrypt each version of the ciphertext. More formally, the ‘new’ last IV byte can be computed as (‘original last IV byte’ ⊕ i) for i from 0 to 255.
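
In code, Step 2 boils down to a loop like this, assuming a hypothetical oracle(iv, block) that returns True whenever the server decrypts the two-block message without complaint:

    # Sketch of Step 2: cycle the last IV byte through all 256 values and record
    # which tweaks the server accepts. 'oracle' is a hypothetical callback.
    def find_valid_tweaks(orig_iv, c1, oracle):
        hits = []
        for i in range(256):
            iv = bytearray(orig_iv)
            iv[-1] ^= i                    # flip bits in the IV's last byte
            if oracle(bytes(iv), c1):      # server raised no padding/encoding error
                hits.append(i)
        return hits                        # ideally 16 values, one per padding length 0x01..0x10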

In the best case, 16 of these test ciphertexts will decrypt without error. These correspond to the decrypted values 0x01 through 0x10, i.e., valid padding. If fewer than 16 of your test ciphertexts decrypt, this means that there’s something wrong with the message: probably it contains an unclosed “<” tag.

Step 3: Squish the bug(s). This is no problem. If exactly one of your ciphertexts decrypts successfully, that means the open “<” character must be in the first byte of the message. You caused the last byte of the message to decrypt to 0x10 (16 decimal), and the decryptor treated the whole block as padding. There are no errors in an empty message.

If two ciphertexts decrypt successfully, then the “<” character must be in the second position of the message, because now there are only two padding lengths (16 and 15) that would cause it to be stripped off. And so on. The number of successful decryptions tells you where the “<” is.

Now that you know where it is, kill it by flipping the last bit of the appropriate byte in the IV. This turns “<” into a harmless “=”.

Step 4: Learn the last byte of the block. You should now be able to come up with 16 test ciphertexts that decrypt successfully. This means you have 16 values of “i” such that ‘last byte of the original IV’ ⊕ i results in a successful decryption. One of these “i” values will differ from the others in the fourth most significant bit. This one will correspond to the padding value 0x10.

If “x” is the original plaintext byte and “i” is the value you just identified, you now know the last byte of the block. Since we have x ⊕ i = 0x10, then through the very sophisticated process of rearranging a simple equation we have 0x10 ⊕ i = x.

Now, using our knowledge of the last byte of the plaintext, manually tweak the IV so that the last byte (padding) of the plaintext will decrypt to 0x01.
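Written out as code (again a hedged continuation of the same sketch — special_i is simply whichever sweep value differs from the rest in its fourth most significant bit):

```python
def recover_last_byte(iv: bytes, special_i: int) -> tuple[int, bytes]:
    """special_i forced the last plaintext byte to decrypt to 0x10, so the
    original byte is x = 0x10 ^ special_i. Also return an IV adjusted so the
    last plaintext byte now decrypts to the padding value 0x01."""
    x = 0x10 ^ special_i
    new_iv = iv[:-1] + bytes([iv[-1] ^ (x ^ 0x01)])
    return x, new_iv
```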

Step 5: Learn everything else. The rest of the attack hinges on the fact that all of the bytes in the message should be ‘acceptable’ UTF-8 characters. Thanks to our trick with the IV, we can now flip arbitrary bits in any given byte and see whether or not that leads to another ‘acceptable’ character.

Believe it or not, the results of this experiment tell us enough to figure out each character in turn. I won’t belabor this, because it takes a little bit of squinting at the ASCII table, but fundamentally there’s not much magic beyond that. It’s very simple and elegant, and it works.

Step 6: Finish it. You’ve done this for one block. Now do it for the rest. Just set your ‘ciphertext’ to be the next block in the message, and your ‘IV’ to be the ciphertext block immediately preceding it. Now go back to Step 1.

And that’s the ballgame.

Obviously all of this takes a lot of tests, which you can reduce a bit using some of the optimizations suggested in the paper. Jager and Somorovsky are able to recover plaintexts with “14 requests per plaintext byte on average.” But at the end of the day, who cares if it takes 50? The box is sitting there ready and willing to decrypt ciphertexts. And it’s incredibly unlikely that anyone is paying attention.

Ok. Can’t you fix this by authenticating the ciphertexts?

This entire attack is a special example of an adaptive chosen ciphertext attack. (Specifically, it’s a super-duper variation of Vaudenay’s padding oracle attack, which he discovered in 2002, the same year the W3C standard hit!)

These attacks can almost always be prevented with proper authentication of the ciphertexts. If the decryptor checks that the ciphertexts are valid before decrypting them, the attacker can’t tamper with them. Hence, no useful information should leak out to permit these attacks at all.

The designers could have added a mandatory MAC or even a signature on the ciphertext, or they could have used an authenticated mode of operation.
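For illustration, here’s a minimal encrypt-then-MAC sketch in Python (my choice of the pyca/cryptography package and the standard hmac module — nothing here is mandated by the XML Encryption spec). The essential point is that the receiver verifies a tag over the IV and ciphertext, in constant time, before it ever touches the CBC padding, so there is no error behavior for an attacker to probe.

```python
import hmac, hashlib, os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.primitives import padding

def encrypt_then_mac(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    iv = os.urandom(16)
    padder = padding.PKCS7(128).padder()
    padded = padder.update(plaintext) + padder.finalize()
    enc = Cipher(algorithms.AES(enc_key), modes.CBC(iv)).encryptor()
    ct = enc.update(padded) + enc.finalize()
    tag = hmac.new(mac_key, iv + ct, hashlib.sha256).digest()
    return iv + ct + tag

def verify_then_decrypt(enc_key: bytes, mac_key: bytes, blob: bytes) -> bytes:
    iv, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(mac_key, iv + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):   # reject BEFORE decrypting
        raise ValueError("invalid message")      # one uniform error, no oracle
    dec = Cipher(algorithms.AES(enc_key), modes.CBC(iv)).decryptor()
    padded = dec.update(ct) + dec.finalize()
    unpadder = padding.PKCS7(128).unpadder()
    return unpadder.update(padded) + unpadder.finalize()
```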

But the W3C standard does provide for MACs and signatures!

Indeed. As the authors point out, there’s an optional field in the specification where you can add a signature or a MAC over the ciphertext and its meta data. But there’s one tiny, hilarious thing about that signature…

Did I mention it’s optional?

You can quite easily take a signed/MACed ciphertext and ‘convert’ it into one that’s not signed or MACed at all, simply by stripping the ciphertext contents out and placing them into a section that does not claim to have a signature. Since the spec doesn’t mandate that the authentication be on every message, the decryptor will be totally cool with this.

So in summary: optional MACs and signatures don’t buy you much.

Is there anything else to say about this?

Lots. Stop using unauthenticated CBC mode. Stop using any encryption standard designed before 2003 that hasn’t been carefully analyzed and recently updated. Stop using any unauthenticated encryption at all. Wrap your SOAP connections with TLS (a recent version!) if you can.

People didn’t really think much about active chosen ciphertext attacks on symmetric encryption before 2000, but now they’re the new hip thing. If your system is online and doesn’t have a solid, well-analyzed protection against them, don’t pretend that you’re doing anything at all to secure your data.

I wish I had a funny, pithy way to sum this all up. But honestly, I’m just a little depressed. It’s not Jager and Somorovsky’s fault at all — this is a great paper. It’s just that there’s too much embarrassingly bad crypto going around.

While I love reading these papers, I’m starting to feel that this one-off approach is not sufficient to address the problem. Maybe what the community needs is a centralized clearinghouse for this stuff: a website where suspect crypto standards (and implementations) can be identified, listed, crowdsourced and analyzed. Probably many are just fine, but I doubt most have been looked at (or even thought about) in a while.

Anyone looking for a project?

Notes:
  
* Tibor Jager and Juraj Somorovsky, “How To Break XML Encryption”. In ACM CCS 2011. ACM Press.

** Someone is inevitably going to tell me that JSON is the world’s most popular way to move structured data. Ok. I don’t care.

*** But better yet, don’t use it. Use a standard authenticated encryption scheme like CCM, OCB or GCM, and use someone else’s implementation according to the standards. These modes should prevent all of this nonsense.

**** These details are quoted straight from the Jager and Somorovsky paper. I’m not 100% sure if all implementations enforce this, but Axis2 does.

DESFire

If you skipped today’s crypto news, you missed some good press for David Oswald and Christof Paar’s recent side-channel attack on the Mifare DESFire MF3ICD40 chip.* I’m hardly an expert on side-channel attacks, but being uninformed has never stopped me before, and I figured it might be fun to read the paper and talk about this very neat attack.

What’s the DESFire MF3ICD40 and where is it used?

I’m guessing MF3ICD40 probably isn’t a household name where you live, but you might be familiar with one of the systems that use it.  These include the San Francisco Clipper card, the Czech “in-karta”, and a cute Australian transit card known as the myki.  It also powers a number of access control systems, like the prox cards that let you into some high-security buildings.
Clipper card.

The MF3ICD40, henceforth “MF3”, is a capable little chip.  It’s a Radio Frequency Identification (RFID) device, meaning that it communicates wirelessly with a reader — e.g., a Clipper card turnstile.  It can store a certain amount of information and retrieve it on demand.  Like the DST chip I mentioned in an earlier post, every MF3 contains a cryptographic key and a cipher, which it can use to (among other things) perform a challenge/response authentication protocol.

Unlike the DST, the MF3 uses a pretty reasonable cipher for its authentication protocol. Specifically, it employs Triple-DES with a 112-bit key.  While 3DES is definitely getting a bit long in the tooth (it’s been deprecated in FIPS), it’s not thought to be vulnerable to any truly practical attacks.  Nor, to the best of my knowledge, is there anything wrong with the way that the MF3 uses 3DES.

So if the crypto is ok, what’s the problem?

Just like the DST, the MF3 chip contains no internal power source.  Instead, power for the chip is extracted from a magnetic field generated by the reader.  This is mostly a good thing — by powering the chip this way, the designers were able to dramatically reduce its size and cost.  Moreover, you don’t need to replace your MF3 when some crappy battery dies.

Unfortunately, the use of an external power source opens the device up to a special class of side-channel attack known as Correlation Power Analysis, or CPA.  Since the MF3 draws power from the reader, a malicious reader can actually tell — from moment to moment — how much power the device is using.

Ordinarily this wouldn’t be very interesting. However, if you know exactly how the device performs the computation, and you can repeatedly measure the device’s power consumption while sending it carefully crafted input values, you can actually learn quite a lot.  Specifically, when the device encrypts (or decrypts) your chosen inputs, those power fluctuations can actually leak the bits of the cryptographic key.  In the case of MF3, you can obtain the full 112-bit key in about seven hours.

Isn’t side channel analysis pretty well known?

Yes, and that’s what makes this interesting.  With caveats.

Side channel attacks themselves are hardly a new thing.  The concept of using power analysis to attack cipher implementations was first proposed in CRYPTO ’99 by Paul Kocher, Joshua Jaffe and Benjamin Jun, all of whom worked for Paul’s company, Cryptography Research.  In a truly inspired move, Paul and his researchers turned around and patented the standard set of countermeasures to the very attack they had invented.  These countermeasures can be found in just about every high-security smart card manufactured today.

This story is important because the MF3 clearly didn’t have the latest and greatest hardware countermeasures built into it.  If it had, the Oswald and Paar attacks wouldn’t have gone anywhere. (Though interestingly, it did have one countermeasure, see below.)

This seems to be a common situation for a certain class of RFID devices, of which the MF3 is one.  These devices toe a funny line between “contactless smart card” (where SCA countermeasures would be expected) and “plain-old RFID” where security isn’t thought to be as important. Unfortunately, the people who select these systems may not realize this, and that’s when bad things can happen.

I should also add that — unlike a recent side-channel attack on KeeLoq — this attack does not require the adversary to physically crack open the device. It’s implemented purely via the EM field.

So how does the attack work?

This is clearly the crux of the matter, and unfortunately this is where I’m going to have to let you off with a warning.  Side channel analysis is just not my area, and there’s little I can offer here that would improve on reading the paper itself.**, ***

But to make a very short go of it, DES works by breaking the key up into smaller pieces called “subkeys”, each of which is used in a different round of the cipher.  Portions of these subkeys are combined with bits of an attacker-supplied input and sent to a series of small functions called S-boxes.  Since only a few bits go into each S-box, there is a relatively small set of possible input values, each of which results in a slightly different power consumption profile.

By observing the power consumed when calculating these S-boxes on different inputs — both for known and unknown keys, and over many, many experiments — the attacker can correlate these traces to figure out the bits that come from an unknown key.  The full key recovery attack takes ~250,000 traces, which require about 7 hours to collect.
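To give a rough flavor of what “correlate these traces” means, here’s a toy CPA sketch in Python/NumPy. It assumes you already have a matrix of power traces plus the known inputs fed to one S-box, and it uses a generic Hamming-weight leakage model — a textbook CPA recipe, not the specific procedure from the Oswald-Paar paper.

```python
import numpy as np

def hamming_weight(x: int) -> int:
    return bin(x).count("1")

def cpa_best_subkey(traces: np.ndarray, sbox_inputs: np.ndarray, sbox) -> int:
    """traces:      (n_traces, n_samples) measured power consumption
    sbox_inputs: (n_traces,) known 6-bit values entering one DES S-box
    sbox:        callable mapping a 6-bit input to its 4-bit output
    Returns the 6-bit subkey guess whose predicted leakage correlates best."""
    best_guess, best_corr = 0, 0.0
    for guess in range(64):                       # 6 key bits feed each DES S-box
        model = np.array([hamming_weight(sbox(x ^ guess)) for x in sbox_inputs],
                         dtype=float)
        m = model - model.mean()
        t = traces - traces.mean(axis=0)
        # Pearson correlation of the model against every sample point
        corr = np.abs(m @ t) / (np.linalg.norm(m) * np.linalg.norm(t, axis=0) + 1e-12)
        if corr.max() > best_corr:
            best_guess, best_corr = guess, corr.max()
    return best_guess
```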

The most interesting aspect of the paper is that the authors didn’t actually know how the DES implementation in the MF3 worked.  Much of the paper is devoted to reverse-engineering the necessary details of the operation by examining power traces.  Notably, along the way they discovered that the MF3 actually does implement a countermeasure to CPA — specifically, it randomly adds up to eight “dummy rounds” to DES. Unfortunately for the MF3, this protection was not sufficient to prevent the attack.

Figure from the Oswald-Paar paper.  These charts show the operation of the DES cipher  on several input values.  The authors used this information to reverse-engineer the operation of the device.

So what’s next for MF3?

A quick glance at the MF3 website will tell you that the manufacturer “has begun phasing out the MF3ICD40”, but that “orders … will continue to be taken until December 31st, so start planning your new year’s party now!” (Ok, I made up the last bit.) From here on out, it looks like it’s going to be AES-based cards all the way.

The note doesn’t specifically mention this attack, but I assume that’s what’s behind it.  If so, that’s a pretty good response for a manufacturer.  To pull a profitable card off of the market probably cost them quite a bit, but it protects their reputation, not to mention their customers.  Unfortunately, there’s no indication that the newer AES-based cards actually have stronger protections against side channel analysis; they’re just using a different cipher.  But I guess that’s what next year’s CHES is for.

Notes:

* This post brought to you by David Oswald and Christof Paar.  Breaking Mifare DESFire MF3ICD40: Power Analysis and Templates in the Real World.  In CHES ’11.

** This is an extended version of the conference paper that I read. Thanks to Bart Coppens for finding the link.

*** If you want to hear from people who do know their side channels, see Luke Mather’s excellent post on DPA and distinguishers from the Bristol Cryptography Blog.

A diversion: BEAST Attack on TLS/SSL Encryption

Update 9/22/11: It appears that OpenSSL may have actually written a patch for the problem I describe below, all the way back in 0.9.6d (2002), so this attack may not be relevant to OpenSSL systems.  But I haven’t reviewed the OpenSSL code to verify this.  More on the patch at the bottom of this post.

Update 10/2/11: Yes, the attack works pretty much as described below (and see comments).  To advise future standards committees considering non-standard crypto, I have also prepared this helpful flowchart.

There’s some news going around about a successful attack on web browsers (and servers) that support TLS version 1.0.  Since this is ostensibly a blog, I figured this subject might be good for a little bit of real-world speculation.

My first thought is that “attacks” on TLS and SSL need to be taken with a grain of salt.  The protocol has been around for a long time, and new attacks typically fall into one of two categories: either they work around SSL altogether (e.g., tricking someone into not using SSL, or going to an untrusted site), or they operate on some obscure, little-used feature of SSL.

(Please don’t take this as criticism of the attacks I link to above.  At this point, any attack against TLS/SSL is a big deal, and those are legitimate vulnerabilities.  I’m trying to make a point about the strength of the protocol.)

This attack is by researchers Juliano Rizzo and Thai Duong, and if we’re to believe the press, it’s a bit more serious than usual.  Ostensibly it allows a man-in-the-middle to decrypt HTTPS-protected session cookies, simply by injecting a little bit of Javascript into the user’s web browser.  Since we’re constantly running third-party Javascript due to web advertising, this might not be a totally unrealistic scenario.

The attack is claimed to make use of a theoretical vulnerability present in all versions of SSL and TLS 1.0 (but not later versions of TLS).  Unfortunately there are few details in the news reports, so until the researchers present their findings on Friday, we have to rely on some educated guesses.  And thus, here’s a big caveat:

Every single thing I write below may be wrong!

What the heck is a Blockwise-Adaptive Chosen Plaintext Attack?

Based on a quick Google search, the attack may be based on a paper written by Gregory Bard at the University of Maryland.  The basic idea of this attack is that it exploits the particular way that TLS encrypts long messages using block ciphers.  To do this, Bard proposes using Javascript.

Besides being incredibly annoying, Javascript is a huge potential security hole for web browsers.  This is mostly mitigated by the browsers themselves, which place tight restrictions on the sorts of things that a Javascript program can do.  In terms of network communication, Javascript programs can barely do anything.  About the only exception to this rule is that they can communicate back to the server from which they were served.  If the page is HTTPS-protected, then this communication gets bundled under the same TLS session used to secure the rest of the web page.

In practice, this means that any data sent by the Javascript program gets encrypted under the same key that is also used to ship sensitive data (e.g., cookies) up from the web browser to the server.  Thus, the adversary now has a way to encrypt any message she wants, under a key that matters.  All she has to do is sneak a Javascript program onto the user’s web page, then somehow intercept the encrypted data sent to the server by the browser.  This type of thing is known as a chosen plaintext attack.

Normally chosen plaintext attacks are not a big deal.  In fact, most encryption schemes are explicitly designed to deal with them.  But here is where things get interesting.

TLS 1.0 uses a block cipher to encrypt data, and arranges its use of the cipher using a variant of the Cipher Block Chaining mode of operation.  CBC mode is essentially a means of encrypting long messages using a cipher that only encrypts short, fixed-size blocks.

CBC mode encryption (diagram: Wikipedia).  The circular cross denotes the XOR operation.  The message is first divided into even-sized blocks.  A random Initialization Vector (IV) is used for the first block.  Each subsequent block is “chained” to the message by XORing the next block of plaintext with the previous block of ciphertext.

CBC mode encryption is illustrated above.  Note that it uses a block cipher as an ingredient.  It also depends on a random per-message nonce called an Initialization Vector, or IV for short.
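If the picture isn’t enough, here’s the chaining written out as a few lines of Python — a bare-bones sketch that builds CBC by hand from AES in raw (ECB) mode via the pyca/cryptography package, just to make the structure C_i = E(P_i ⊕ C_{i-1}) explicit. Real code should use a vetted, and preferably authenticated, implementation instead.

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def cbc_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Toy CBC encryption; plaintext must already be a multiple of 16 bytes."""
    aes = Cipher(algorithms.AES(key), modes.ECB()).encryptor()  # the raw block cipher
    iv = os.urandom(16)            # fresh, random IV for each message (this turns out to matter)
    prev, blocks = iv, [iv]
    for i in range(0, len(plaintext), 16):
        xored = bytes(a ^ b for a, b in zip(plaintext[i:i + 16], prev))
        prev = aes.update(xored)   # C_i = E(key, P_i XOR C_{i-1}), with C_0 = IV
        blocks.append(prev)
    return b"".join(blocks)
```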

If you asked me to describe the security properties of CBC, I would recite the following mantra: as long as you use a secure block cipher, and as long as you generate a new, random IV for each message, this mode is very secure against eavesdroppers — even when the adversary can obtain the encryption of chosen plaintexts.  If you doubted me, I would point you here for a formal proof.

You may have noticed that I emphasized the phrase “as long as”.  This caveat turns out to be important.

In the implementation of CBC mode, the TLS designers made one, tiny optimization.  Rather than generating a fresh IV for each message, the protocol re-uses the last block of the previous ciphertext message that was sent.  This value becomes the IV for the new message.

In practice, this tiny modification has huge implications.  If we assume that the adversary — let’s call her Sue — is sniffing the encrypted data sent by the browser, then she can obtain the IV that will be used for the next message.  Now let’s say she has also identified a different block of ciphertext C that she wants to decrypt, maybe because it contains the session cookie.  While she’s at it, she can easily obtain the block of ciphertext C’ that immediately precedes C.

None of this requires more than simple eavesdropping.  Sue doesn’t know what message C encrypts — that value, which we’ll call M, is what she wants to learn — but she does know that C can be expressed using the CBC encryption formula as:

      C = AES-ENC(key, M ⊕ C’)

Sue also knows that she can get her Javascript program to transmit any chosen block of data M* that she wants.  If she does that, it will produce a ciphertext C* that she can sniff.

Now let’s make one more huge assumption.  Imagine that Sue has a few guesses for what that unknown value of M might be.  It’s 16 bytes long, but maybe it only contains a couple of bytes worth of unknown data, such as a short PIN number.  If Sue generates her value M* based on one of those guesses, she can confirm whether C actually encrypts that value.

To cut through a lot of nonsense, let’s say that she guesses M correctly.  Then she’ll generate M* as:

      M* = IV ⊕ M ⊕ C’

When her Javascript program sends out M* to be encrypted, the TLS layer will encrypt it as:

      C* = AES-ENC(key, IV ⊕ M*)                      (which we can re-write as…)
         = AES-ENC(key, IV ⊕ IV ⊕ M ⊕ C’)             (and XORing cancels…)
         = AES-ENC(key, M ⊕ C’)

All I’ve done above is write out the components of M* in long form, and simplify the equation based on the fact that the IV ⊕ IV term cancels out (a nice property of XOR).  The important thing to notice is that if Sue guessed correctly, the ciphertext C* that comes out of the browser will be identical to the captured ciphertext C she wants to decrypt!  If she guessed wrong, it won’t.

So assuming that Sue has time on her hands, and that there are only a few guesses, she can repeat this technique over and over again until she figures out the actual value of M.
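Here’s that guess-check written out as a sketch (and remember, this whole mechanism is my speculation about the attack). Both helpers are hypothetical: encrypt_next_block() stands in for coaxing the browser’s TLS layer into encrypting one attacker-chosen block and sniffing the result, and next_iv is the last ciphertext block Sue saw on the wire.

```python
def xor(*parts: bytes) -> bytes:
    out = bytes(parts[0])
    for p in parts[1:]:
        out = bytes(a ^ b for a, b in zip(out, p))
    return out

def check_guess(guess: bytes, c_target: bytes, c_prev: bytes,
                next_iv: bytes, encrypt_next_block) -> bool:
    """Sue suspects c_target encrypts `guess`. She submits
    M* = next_iv XOR guess XOR c_prev through the Javascript channel; TLS 1.0
    computes C* = E(key, next_iv XOR M*) = E(key, guess XOR c_prev), which
    equals c_target exactly when the guess was right."""
    m_star = xor(next_iv, guess, c_prev)
    c_star = encrypt_next_block(m_star)    # hypothetical: inject and sniff
    return c_star == c_target
```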

But this guessing stuff is crazy!

If you’re paying attention, you’ll already have twigged to the one huge problem with this attack.  Yes, it works — but only if she has a small number of guesses for the value M.  In practice, M is 16 bytes long.  If she somehow knows all but two bytes of that value, she might be able to guess the remaining bytes in about 2^15 (32,768) attempts on average.  But in the more likely case that she doesn’t know any of the plaintext, she has to try about 2^127 possible values on average.

In other words, she’ll be guessing until the sun goes out.

Let’s get aligned

And this is where the cleverness comes in.  I don’t know exactly what Rizzo and Duong did, but the paper by Bard hints at the answer.  Recall that when the browser encrypts a CBC-mode message, the first step is to chop the message into equal-length blocks.  If it’s using AES, these will each be 16 bytes long.

If Sue needs to guess the full contents of a random 16-byte block, she’s hosed — there are too many possibilities.  The only way this technique works in general is if she knows most of a given block, but not all of it.

But what if Sue could convince the browser to align the TLS messages in a manner that’s useful to her?  For example, she could fix things so that when the browser sends the cookie to the server, the cookie would be prepended with something that she knows — say, a fixed 15-byte padding.  That would mean that the stuff she wants to learn — the start of the cookie — takes up only the last byte of the block.

If Sue knows the padding, all she has to guess is the last byte.  A mere 256 possible guesses!

Now what if she can repeat this trick, but this time using a 14-byte known padding?  The block would now end with two bytes of the cookie, the first of which she already knows.  So again, 256 more guesses and she now knows two bytes of the cookie!  Rinse, repeat.
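A sketch of that byte-at-a-time bookkeeping, under the big (and speculative) assumption that the attacker really can control the alignment and reuse the guess-check above — capture_block() is a hypothetical stand-in for making the browser send the padded cookie and collecting whatever check() needs:

```python
def recover_secret(block_len: int, secret_len: int, capture_block, check) -> bytes:
    """Recover up to one block of secret, one byte per round, 256 guesses per byte.
    capture_block(pad_len) -- hypothetical: make the browser transmit
        (known padding of pad_len bytes || secret) and return what check() needs.
    check(candidate_block, capture) -- True iff the guess-check succeeds."""
    known = b""
    for k in range(min(secret_len, block_len)):
        pad_len = block_len - 1 - k            # 15, then 14, then 13, ...
        capture = capture_block(pad_len)
        prefix = b"A" * pad_len + known        # everything but the last byte is known
        for b in range(256):
            if check(prefix + bytes([b]), capture):
                known += bytes([b])
                break
    return known
```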

This technique could be quite effective, but notice that it relies on some strong assumptions about the way Sue can get the browser to format the data it sends.  Does the Rizzo and Duong attack do this?  I have absolutely no idea.  But based on my understanding of the attack and the descriptions I’ve read, it represents my best speculation.

I guess pretty soon we’ll all know how good I am at guessing.

Update 9/22/11: This post by James Yonan offers the first piece of confirmation suggesting that BEAST is based on the Bard attack.  Yonan also points out that OpenSSL has included a “patch” to defend against this issue, all the way back to version 0.9.6d (2002).  In a nutshell, the OpenSSL fix encrypts random message fragments at the start of each new message sent to the server.  This additional message fragment works like an IV, ensuring that M* is no longer structured as I describe above.  Yonan also notes that NSS very recently added a similar patch, which indicates that NSS-based browsers were the problem (this includes Firefox, Chrome).

I’m usually critical of OpenSSL for being a code nightmare.  But (pending verification) I have to give those guys a lot of credit for this one.