In memoriam: Tim Hartnell

Last week the students and I went looking for our long-lost GnuRadio USRP in a dusty hardware security lab down the hall. This particular USRP hasn’t been seen in about five years (I suspect it may have been deported with the lab’s previous occupant) so the whole thing was kind of a long shot.

Sometimes best part of a treasure hunt is what you find along the way. The students didn’t find the USRP, but they did uncover a fog machine and laser that someone had tucked under a workbench. This kept them happy ’til we got a whiff of the “fog” it was making. I scored something even better: a mint copy of Tim Hartnell‘s 1985 masterpiece, the Giant Book of Computer Games.

If you’re just a few years younger than me, you might think Games is a book about games. But of course, it literally is games: dozens of all-caps BASIC listings, printed in a font that was probably old when Wargames was new. Each game sits there on the page, pregnant with potential, waiting for a bored 9-year old to tap it into his C64 or Apple ][ and hit “RUN”. (Sadly, this could be a long wait.)

Flipping through Games brings back memories. The Chess AI was a bastard, routinely cheating even if you implemented it properly. And you never implemented anything properly, at least not on the first pass. This was part of the fun. Between typos and the fact that Hartnell apparently coded to his own BASIC standard, the first play usually went like this:

WELCOME TO SNARK HUNT
ENTER 1 FOR SNARK, 2 FOR HUNTER
> 2
YOU CHOSE SNARK
?SYNTAX ERROR AT LINE 3980
READY

You learned debugging fast. When that didn’t work, your last, desperate move was simply to delete the offending lines — ’til the program either (a) worked, or (b) got so crazy that you deleted it and loaded Bruce Lee off a cassette. Sometimes you hit the sweet spot between the two: my “Chess” AI would grab control of my pieces Agent Smith-style and send them hurtling towards my undefended King. I never saw this as a bug, though; I just thought it had style.

When I started writing this post I intended to make a broader point about how my experience with Games mirrors the way that modern implementers feel when faced with a mysterious, unjustified cryptographic standard. I think there is a point to be made here, and I’ll make it. Another day.

But when I googled to see what Hartnell is up to, I was saddened to learn that he died all the way back in 1991, only a few years after Games was published. He was only a few years older than I am today. So on reflection, I think I’ll just let this post stand as it is, and I’ll go spend some time with my kid.

Tim, wherever you are, please accept this belated tribute. Your book meant a lot to me.

EAX’, Knight Rider, and an admission of defeat

A few weeks back I found myself on the wrong side of Daniel Bernstein over something I’d tweeted the week before. This was pretty surprising, since my original tweet hadn’t seemed all that controversial.

What I’d said was that cryptographic standards aren’t always perfect, but non-standard crypto is almost always worse. Daniel politely pointed out that I was nuts — plenty of bad stuff appears in standards, and conversely, plenty of good stuff isn’t standardized. (As you can see, the conversation got a little weirder after that.)

Today I’m here to say that I’ve found religion. Not only do I see where Daniel’s coming from, I’m here to surrender, throw down my hat and concede defeat. Daniel: you win. I still think standards are preferable in theory, but only if they’re promulgated by reasonable standards bodies. And we seem to have a shortage of those.

My new convictions are apropos of an innocuous-looking ePrint just posted by Kazuhiko Minematsu, Hiraku Morita and Tetsu Iwata. These researchers have found serious flaws in an authenticated block cipher mode of operation called EAX’ (henceforth: EAXprime). EAXprime was recently adopted as the encryption mode for ANSI’s Smart Grid standard, and (until today) was practically a shoo-in to become a standalone NIST-certified mode of operation.

Ok, so standards get broken. Why I am I making such a big deal about this one? The simple reason is that EAXprime isn’t just another standard. It’s a slightly-modified version of EAX mode, which was proposed by Bellare, Rogaway and Wagner. And the important thing to know about EAX (non-prime) is that it comes with a formal proof of security.

It’s hard to explain how wonderful this is. The existence of such a proof means that (within limits) a vulnerability in EAX mode would indicate a problem with the underlying cipher (e.g., AES) itself. Since we’re pretty confident in the security of our standard block ciphers, we can extend that confidence to EAX. And the best part: this wonderful guarantee costs us almost nothing — EAX is a very efficient mode of operation.

But not efficient enough for ANSI, which decided to standardize on a variant called EAXprime. EAXprime is faster: it uses 3-5 fewer block cipher calls to encrypt each message, and (in the case of AES) about 40 bytes less RAM to store scheduled keys. (This is presumably important when your target is a tiny little embedded chip in a smart meter.)

Unfortunately, there’s a cost to that extra speed: EAXprime is no longer covered by the original EAX security proof. Which brings us towards the moral of the story, and to the Minematsu, Morita and Iwata paper.

Did you ever see that old episode of Knight Rider where the knight-rider bad guys figure out how to neutralize KITT‘s bulletproof coating? Reading this paper is kind of like watching the middle part of that episode. Everything pretty much looks the same but holy crap WTF the bullets aren’t bouncing off anymore.

The MMI attacks allow an adversary to create ciphertexts (aka forgeries) that seem valid even though they weren’t created by the actual encryptor. They’re very powerful in that sense, but they’re limited in others (they only work against very short messages). Still, at the end of the day, they’re attacks. Attacks that couldn’t possibly exist if the standards designers had placed a high value on EAX’s security proof, and had tried to maintain that security in their optimized standard.

And this is why I’m admitting defeat on this whole standards thing. How can I advocate for crypto standards when standards bodies will casually throw away something as wonderful as a security proof? At least when KITT lost his bulletproof coating it was because of something the bad guys did to him. Can you imagine the good guys doing that to him on purpose?

Useful cryptography resources

Over the past few weeks I’ve been scouring the Internet for a good list of crypto and security resources to link to from this blog. The ideal list would include technical blogs, of course, but also textbooks, courses and other helpful websites.

There are a few good resource lists out on the web. Some are even pretty good, but none of them meet all of my criteria. Per XKCD, I decided that the solution here is to obviate them all and create an even better list of my own. You can see my first draft here (it’s also linked from the sidebar at right, under ‘Useful crypto resources‘). This is very much a work in progress, and I hope that you’ll all help me to flesh it out.

Attack of the week: Datagram TLS

Nadhem Alfardan and Kenny Paterson have a paper set to appear in NDSS 2012 entitled ‘Plaintext-Recovery Attacks Against Datagram TLS‘.* This is obviously music to my ears. Plaintext recovery! Datagram TLS! Padding oracles. Oh my.

There’s just one problem: the paper doesn’t seem to be online yet (Update 1/10: It is now. See my update further below.) Infoworld has a few vague details, and it looks like the vulnerability fixes are coming fast and furious. So let’s put on our thinking caps and try to puzzle this one out.

What is Datagram TLS?

If you’re reading this blog, I assume you know TLS is the go-to protocol for encrypting data over the Internet. Most people associate TLS with reliable transport mechanisms such as TCP. But TCP isn’t the only game in town.

Audio/video streaming, gaming, and VoIP apps often use unreliable datagram transports like UDP. These applications don’t care if packets arrive out of order, or if they don’t arrive at all. The biggest priority is quickly handling the packets that do arrive.

Since these applications need security too, TLS tries to handle them via an extension called Datagram TLS (DTLS). DTLS addresses the two big limitations that make TLS hard to use on an unreliable datagram transport:

TLS handshake messages need to arrive whole and in the right order. This is easy when you’re using TCP, but doesn’t work so well with unreliable datagram transport. Moreover, these messages are bigger than the typical 1500 byte Ethernet frame, which means fragmentation (at best), and packet loss (at worst).
Ordinarily TLS decryption assumes that you’ve received all the previous data. But datagrams arrive when they want to — that means you need the ability to decrypt a packet even if you haven’t received its predecessor.

There are various ways to deal with these problems; DTLS mounts a frontal assault. The handshake is made reliable by implementing a custom ack/re-transmit framework. A protocol-level fragmentation mechanism is added to break large handshake messages up over multiple datagrams. And most importantly: the approach to encrypting records is slightly modified.

So what’s the problem?

To avoid radical changes, DTLS inherits most of the features of TLS. That includes its wonky (and obsolete) MAC-then-Encrypt approach to protecting data records. Encryption involves three steps:

Append a MAC to the plaintext to prevent tampering.
Pad the resulting message to a multiple of the cipher block size (16 bytes for AES). This is done by appending X bytes of padding, where each byte must contain the value X.
Encrypt the whole mess using CBC mode.

Cryptographers have long known that this kind of encryption can admit padding oracle attacks. This happens when a decryptor does something obvious (throw an error, for example) when it encounters invalid padding, i.e., padding that doesn’t meet the specification above.

CBC Mode encryption, courtesy Wikipedia.

This wouldn’t matter very much, except that CBC mode is malleable. This means an attacker can flip bits in an intercepted ciphertext, which will cause the same bits to be flipped when the ciphertext is ultimately decrypted. Padding oracle attacks work by carefully tweaking a ciphertext in specific ways before sending it to an honest decryptor. If the the decryptor returns a padding error, then the adversary knows something about the underlying plaintext. Given enough time the attacker can use these errors to completely decrypt the message.

I could spend a lot of time describing padding oracle attacks, but it’s mostly beside the point.** Standard TLS implementations know about this attack and deal with it in a pretty effective way. Whenever the decryptor encounters bad padding, it just pretends that it hasn’t. Instead, it goes ahead with the rest of the decryption procedure (i.e., checking the MAC) even if it knows that the message is already borked.

This is extra work, but it’s extra work with a purpose. If a decryptor doesn’t perform the extra steps, then messages with bad padding will get rejected considerably faster than other messages. A clever attacker can detect this condition by carefully timing the responses. Performing the unnecessary steps (mostly) neutralizes that threat.

Ok, so you say these attacks are already mitigated. Why are we still talking about this?

Before I go on, I offer one caveat: what I know about this attack comes from speculation, code diffs and some funny shapes I saw in the clouds this afternoon. I think what I’m saying is legit, but I won’t swear to it until I read Alfardan and Paterson’s paper.

But taking my best guess, there are two problems here. One is related to the DTLS spec, and the second is just an implementation problem. Either one alone probably wouldn’t be an issue; the two together spell big trouble.

The first issue is in the way that the DTLS spec deals with invalid records. Since standard TLS works over a reliable connection, the application should never receive invalid or out-of-order data except when packets are being deliberately tampered with. So when standard TLS encounters a bad record MAC (or padding) it takes things very seriously — in fact, it’s required to drop the connection.

This necessitates a new handshake, a new key, and generally makes it hard for attackers to run an honest padding oracle attack, since these attacks typically require hundreds or thousands of decryption attempts on a single key.***

DTLS, on the other hand, runs over an unreliable datagram transport, which may not correct for accidental packet errors. Dropping the connection for every corrupted packet just isn’t an option. Thus, the standard is relaxed. An invalid MAC (or padding) will cause a single record to be thrown away, but the connection itself goes on.

This still wouldn’t matter much if it wasn’t for the second problem, which is specific to the implementation of DTLS in libraries like OpenSSL and GnuTLS.

You see, padding oracle vulnerabilities in standard TLS are understood and mitigated. In OpenSSL, for example, the main decryption code has been carefully vetted. It does not return specific padding errors, and to avoid timing attacks it performs the same (unnecessary) operations whether or not the padding checks out.

In a perfect world DTLS decryption would do all the same things. But DTLS encryption is subtly different from standard TLS encryption, which means it’s implemented in separate code. Code that isn’t used frequently, and doesn’t receive the same level of scrutiny as the main TLS code. Thus — two nearly identical implementations, subject to the same attacks, with one secure and one not. (Update 1/11: There’s a decent explanation for this, see my update below.)

And if you’re the kind of person who needs this all tied up with a bow, I would point you to this small chunk of the diff just released for the latest OpenSSL fix. It comes from the DTLS-specific file d1_pkt.c:

+ /* To minimize information leaked via timing, we will always

+        * perform all computations before discarding the message.

+        */

+ decryption_failed_or_bad_record_mac = 1;

I guess that confirms the OpenSSL vulnerability. Presumably with these fixes in place, the MAC-then-Encrypt usage in DTLS will now go back to being, well, just theoretically insecure. But not actively so.

Update 1/11/2012: Kenny Paterson has kindly sent me a link to the paper, which wasn’t available when I wrote the original post. And it turns out that while the vulnerability is along the lines above, the attack is much more interesting.

An important aspect that I’d missed is that DTLS does not return error messages when it encounters invalid padding — it just silently drops the packet. This helps to explain the lack of countermeasures in the DTLS code, since the lack of responses would seem to be a padding oracle attack killer.

Alfardan and Paterson show that this isn’t the case. They’re able to get the same information by timing the arrival of ‘heartbeat’ messages (or any predictable responses sent by an upper-level protocol). Since DTLS decryption gums up the works, it can slightly delay the arrival of these packets. By measuring this ‘gumming’ they can determine whether padding errors have ocurred. Even better, they can amplify this gumming by sending ‘trains’ of valid or invalid packets.

All in all, a very clever attack. So clever, in fact, that it makes me despair that we’ll ever have truly secure systems. I guess I’ll have to be satisfied with one less insecure one.

Notes:

* N.J.A. Alfardan and K.G. Paterson, Plaintext-Recovery Attacks Against Datagram TLS, To appear in NDSS 2012.

** See here for one explanation. See also a post from this blog describing a padding oracle attack on XML encryption.

*** There is one very challenging padding oracle attack on standard TLS (also mitigated by current implementations). This deals with the problem of session drops/renegotiation by attacking data that remains constant across sessions — things like passwords or cookies.

A very casual introduction to Fully Homomorphic Encryption

Craig Gentry on board the mothership. (credit)

A couple of weeks ago I polled readers for the subjects that they were interested in. You gave me some excellent responses, and I promise they’re all in the hopper.

By far the most popular request was for some background on the recent results in computing on encrypted data, or ‘Fully-Homomorphic Encryption’. Even though the current techniques are still in the research phase — way outside the domain of the ‘practical’ crypto I usually talk about — this topic is so neat that it deserves a few words.

Before I get started, I want to make a few important stipulations. First, I’m hardly the world’s leading expert on the subject. Moreover, plenty of real experts have already published highly accessible introductory pieces. If you’re interested, you should check out Craig Gentry’s fantastic intro paper, or even his (surprisingly readable) PhD thesis. Alternatively, you can go directly to some of the recent papers on FHE.

My last warning is that this subject is kind of involved. I’m going to do my best to keep this explanation relatively non-technical (see the papers above if you want the gory details), but it could still get fairly long.

In this first post I’m going to cover some of the background behind FHE, and explain why it’s such a neat problem.

**Why encryption is not like a safe**

(credit)

People love to use analogies to talk about encryption. Sometimes these are helpful, sometimes they’re just limiting. Consider this one:

Encrypting a document is like placing it inside of a locked safe.

The locked safe is a great teaching example because cryptography and physical safes (usually) serve the same purpose: they ensure the confidentiality of sensitive data. In practice, they also share many of the same drawbacks.

If you’ve ever worked in an environment where safe-storage is required (e.g., a bank or intelligence agency) you probably know what I’m talking about. Once you lock a document into a safe, your document is locked inside of a damn safe.

Consequently, people tend to remove useful documents from safe storage at the first chance they get. This exposes them to all the usual threats, and explains why so few cases of document theft involve safecracking. Typically the same principle holds for encryption. People decrypt their data so they can use it.

But analogies are never perfect. Encrypting a document isn’t the same as putting it into a physical lockbox. And this is a good thing! Because in fact, there is a kind of encryption that allows us to bypass some of these limitations. We refer to this as homomorphic encryption, and its defining characteristic is this: you can perform useful operations on encrypted values without decrypting them first.

This may seem like an exotic property. Trust me, it’s not. In fact, cryptographers have put a lot of effort into removing the homomorphic properties from common public-key schemes like Elgamal and RSA. Without those protections, both schemes are homomorphic with respect to (modular) multiplication. This means you can multiply together any two Elgamal ciphertexts, and upon decryption you’ll find that the (single) resulting ciphertext now embeds the product of the two original plaintexts. Neat!

Homomorphic encryption has some immediate practical applications. Consider the Paillier scheme that’s used in several electronic voting protocols. Paillier is homomorphic with respect to addition. Now imagine: each voter encrypts their their ballot as a number (0 or 1) and publishes it to the world. Anyone can now tally up the results into a final ciphertext, which makes it hard for a corrupt election judge to throw away legitimate votes. Decrypting the final ciphertext reveals only the total.*

A few bits of history

Homomorphic encryption is hardly a new discovery, and cryptographers have long been aware of its promise. Way back in 1978 (about five seconds after the publication of RSA), Rivest, Adleman and Dertouzos proposed homomorphic encryption schemes that supported interesting functions on encrypted data. Regrettably, those first attempts kind of sucked.** Thus, the agenda for researchers was twofold: (1) come up with secure encryption schemes that could handle useful homomorphisms, and (2) figure out how to do interesting things with them.

To be interesting, a homomorphic encryption scheme should at very least permit the evaluation of useful mathematical functions, e.g., polynomials. But no computer scientist in history has ever been satisfied with mere polynomials. The holy grail was something much neater: a scheme that could handle arbitrary computations — embodied as real computer programs! — on securely encrypted inputs.

This idea — sometimes called ‘cryptocomputing’, or ‘computing on encrypted data‘ — has a way of capturing the imagination. There’s something fascinating about a computer that works on data it can’t see. More practically, a technology like this would eliminate a very real weakness in many security systems — the need to decrypt before processing data. It could even spawn a whole business based on outsourcing your computations to outside parties. (Something you obviously wouldn’t do without strong cryptographic protections.)

Anyway, it was a beautiful dream. There was just one problem: it didn’t work.

To explain why, let’s go back to some of the encryption schemes I mentioned above. Throughout the ’80s and ’90s researchers came up with these, and many more interesting schemes. Quite a few supported some kind of homomorphism, usually multiplication or addition. However, none seemed capable of handling even both operations simultaneously — at least not without serious limitations.

For researchers this was frustrating. Coming up with such a ‘doubly homomorphic’ scheme was an obvious first step towards the higher purpose. Even better, they quickly realized, this ‘first step’ was also the last step they’d need to achieve arbitrary computation.

How’s that? Well, imagine that you have a doubly homomorphic encryption scheme that encrypts bits, meaning that every plaintext is either 0 or 1. Given ciphertexts encrypting bits A and B, we could use this scheme to compute the simple function 1+A*B. Keeping in mind that all arithmetic is binary (i.e., modulo 2), such a function would produce the following truth table:

A B : 1+A*B

0 0 1

0 1 1

1 0 1

1 1 0

Why the excitement? Well, this table describes a NAND gate. And any computer engineer can tell you that NAND is a big deal: once you’ve got it, you can derive all of the other useful boolean logic gates: AND, OR, NOT, XOR and XNOR.*** And that means you can implement circuits.

To a theoretical computer scientist this is a Big Deal. Given an encryption scheme like this, we could encrypt our input one bit at a time, then send the encrypted values to a third party for processing. This party would run an arbitrary program just by rendering it into a huge circuit — a series of connected boolean logic gates — and evaluating the result one gate at a time. At the end of the process we’d get back a bunch of ciphertexts containing the (bit) results.

In theory, the existence of an appropriate encryption scheme would give us everything we need to, for example, play Halo on encrypted inputs. This would obviously be a poor gaming experience. But it would be possible. If only we had such an encryption scheme.

A brief note

At this point I’d like to take a quick break to address the more practical kind of reader, who (I suspect) is recoiling in horror. I know what you’re thinking: I came here for computing, and this is what you’re giving me? Break the input into single bits and process them one gate at a time?

Well, yes. That’s exactly how it’s going to work — at least, if we want general computation. And I stipulate that in many ways it’s going to suck. Consider, for example, a loop like this one:

while (encrypted_value < 100) {  perform_some_operation_on(&encrypted_value); }

Just try converting that into a circuit. I mean, it’s not impossible to unroll loops (if you know the maximum number of iterations), but the resulting circuit is not likely to be practical. Moreover, this isn’t purely an issue with the use of circuits, but rather with the use of encrypted data. No matter what computational model you employ, you’re always going to have difficulty with things like control flow changes that depend on input data that the executing party can’t see.

This makes it tough to implement the efficient programs that we’re accustomed to running on typical random access machines. Simply writing a bit to encrypted ‘RAM’ might require you to recalculate every bit in memory, at least, if the write location is dependent on the input data.

And no, I’m not going to reassure you that it gets better from here. Actually it’s going to get a lot worse once cryptography comes into the picture. That’s because each of these ‘bits’ is actually going to become a ciphertext — potentially hundreds or thousands of bits in length. Not to mention that evaluating those logic gates is going to require some pretty serious computing.

I’m pointing this out not to dismiss the research — which we’ll get to, and is pretty amazing — but rather, to point out that it is research. We aren’t going to be outsourcing general programs with this anytime soon — and in fact, we may never do so. What we might do is find ways to implement specialized subroutines with very high sensitivity requirements: e.g., stock trading models, proprietary bioinformatics processes, etc. By combining these with other less-general techniques, we could accomplish something pretty useful.

In Summary

I’ve written just about all I can fit in a reasonable blog post, and I realize that I’ve barely covered any of the actual research.

What I did accomplish was to lay out some of the background behind the recent developments in fully-homomorphic encryption. In the next post we’ll talk about the search for an appropriate encryption scheme, some of the failures, and Gentry’s eventual success.

Notes:

* Obviously there’s more to this. See, for example, this paper for some of the complexity.

** This might sound insulting, but it’s not. As I’ve said before, ‘suck’ is a purely technical term for schemes that aren’t semantically secure, i.e., indistinguishable under chosen plaintext attack.

*** Two notes here: First, you can obviously derive these gates more directly. For example, AND is (A*B). Second, while I’ve used the example of a scheme that encrypts only bits (meaning that addition and multiplication are always mod 2), the encryption scheme doesn’t have to be limited this way. For example, consider a scheme that encrypts arbitrary integers (say, a finite ring). As long as you know that the inputs (A, B) are both in {0, 1}, you can implement the NAND gate as 1-(A*B). This is a more common description and you’ll see it in most papers on the subject.

OpenSSL and NSS are FIPS 140 certified. Is the Internet safe now?

People like standards. The more important something is, the greater the likelihood that someone, somewhere has drawn up a standard for how it should be done. Although this can get out of hand (viz: the EU standard for cucumbers), I do think that standards can serve a useful purpose. This is particularly true when you’re talking about something as complicated (and easy to screw up) as encrypting data on the Internet.

Given all this, you’d expect that I’d be excited, nay thrilled, that two of the major crypto libraries in use today — OpenSSL and Mozilla’s NSS — offer a FIPS-approved mode. Our children can sleep safely at night! And yet, having spent part of the weekend poring over nightmarish NSS code, I feel obliged to offer the following opinion: FIPS compliance, while certainly not a bad thing, doesn’t really have anything to do with the security of a cryptographic library. Certainly not a library like OpenSSL or NSS.

For those who have no clue what FIPS is, let me start from the beginning.

If there’s one organization that really likes standards, it’s the US Federal government. For IT alone the US has an entire fleet of standards known collectively as FIPS — the Federal Information Processing Standards. These deal with lots of boring stuff like the standard codes to be used when referencing US states. More interestingly, FIPS also includes the standards for designing cryptographic modules. These are laid out in a series of documents headlined by FIPS 140-2 (obligatory warning: do not drive or operate heavy machinery while reading FIPS publications).

FIPS 140-2 outlines a set of requirements that must be adhered to by any US-government certified cryptographic module. This applies to products purchased by Federal agencies, and implicitly to outside systems that exchange sensitive-but-unclassified data with Federal computers (think: banks). To give these standards some teeth, the US NIST and Canada’s CSEC jointly run something called the Cryptographic Module Validation Program (CMVP). Only serious cryptographic modules survive CMVP, at least in theory. So NSS and OpenSSL’s validation is a Big Deal.

Like I said, that’s the theory.

To understand why I’m not so excited about FIPS, you have to understand the what, who and why. Specifically, what the heck a ‘cryptographic module’ is, who does the validation, and why we’re validating these modules in the first place.

What is a cryptographic module?

A 1970s-era hardware voice encryption
module.

To understand the FIPS viewpoint on this, you need to hark back to the early 90s when the FIPS 140-1 standard was first being hashed out. You also need to pretend that you worked for the US DoD in the previous decades. At this point you should have a very clear picture of a cryptographic module in your head. It looks something like the picture on the right.

Ok, so that’s a Dutch voice encryption module, and it dates from 1973. The point I’m trying to make is that a cryptographic module is a piece of hardware. It’s contained within some kind of metal enclosure. Plaintext data comes in on a physical port, encrypted data comes out of another. Keys are entered by punching them into the front, or alternatively, by inserting some kind of key storage token.

In fact, to achieve “FIPS 140 Level 2” (or higher) certification, you still need hardware. But what does FIPS 140-2 tell us about evaluating pure software cryptographic modules? How do we map these hardware rules onto something like OpenSSL?

Mostly by doing a lot of creative re-interpretation. Module boundaries have to become machine boundaries. Logical boundaries have to be defined around the software itself. The flow of data across these boundaries has to be relentlessly mapped into a hardware metaphor, even if using that metaphor doesn’t provide any tangible security benefit.

(To give a concrete example: FIPS key management at higher levels requires entering only encrypted keys. Neat. But if the module is a piece of software, what’s the point? You might find yourself encrypting keys on one machine, so that they can be decrypted in another process on the same machine. This is silly. But if that’s how NIST will interpret the standard, then that’s what we’ll do.)

In fairness, a new iteration of the standard — FIPS 140-3 — is supposed to make validation of software modules a whole lot clearer. Unfortunately the finalization date is currently listed as TBD. Moreover, when it comes to government standards, sometimes you’re better off with the devil you know.

Who does the validation?

Here’s another thing about CMVP to keep in mind. It’s not a free service offered by the US government to those modules it finds really important.

Yes, final approval of modules is performed by a team at NIST. However, this is only the tip of the iceberg. In order to reach NIST, you need to first retain a private testing laboratory, who will actually do most of the validation. NIST mostly evaluates the results of this process. You have a choice of labs, and yes, they compete on price. I’ve had the good fortune of working with excellent testing labs, but that doesn’t mean that my results are typical.

Why (and what) are we validating?

Finally, we get to the most important question: why are you evaluating the module. Actually, this is more of a what question — what does the evaluation look for, what kind of bugs is it going to turn up, and how much assurance do you really have at the end of the process.

And this is the depressing part. FIPS is not about detailed software evaluation. It’s not code review, at least not the kind of rigorous code review that will find the inevitable (and devastating) software vulnerabilities that your development team missed. It’s certainly not going to detect subtle cryptographic attacks that stem from bad use of padding or the presence of useful timing channels. And it’s definitely not going to detect clever backdoors.

What it is going to tell you is whether you have a solid checklist describing how the system should be used. It’ll tell whether you’re using some sort of authentication to launch the module. It’ll ensure that you’re using only FIPS-approved algorithms, and only in a way that’s specified by NIST. Finally, validation will tell you whether you’re implementing those algorithms correctly — or more accurately, whether your outputs match a set of test vectors provided by your testing lab, which is mostly, but not entirely the same thing.

These are all good things, and they probably serve to lop off the low-hanging fruit — the genuinely dumb bugs that lead to obvious compromise. (As a side benefit, FIPS prohibits the use of RC4.) Moreover, the process of getting through CMVP is so time-consuming that it probably increases the likelihood that your engineers will find bugs in the code along the way. This is a good thing, but it’s hardly a first order effect.

So what does this have to do with NSS and OpenSSL?

Good question. What are the problems in these libraries, and how is FIPS supposed to solve them?

The big problem that I see with both NSS and OpenSSL is not that they use bad algorithms, or that they have bad security policies. Rather, it’s that their code is uncommented and hard to read, which occasionally hides serious (but extremely subtle) vulnerabilities. Moreover, even when the implementation is faithful, they each support legacy standards that contain a lot of questionable nonsense (e.g., obsolete padding schemes) which needs to be hacked around. The above issues would be academic, except that to a surprising extent these two libraries are responsible for securing the Internet.

Even more problematic is that between OpenSSL and NSS we have two separate and competing libraries, in a world with enough interested brains and eyeballs to review at most one. Optimistically.

This is how NSS managed to stay vulnerable to the BEAST attack until this year, despite the fact that OpenSSL had ‘empty fragment‘ mitigations in place since 2002. Similarly, it’s how OpenSSL managed all of this. It’s why OpenSSL code often looks like this, and why NSS is no better.

Unfortunately the problems that FIPS evaluation solves are mostly orthogonal to these.

Now, to be fair, nobody in either the OpenSSL project or Mozilla is claiming that FIPS compliance makes these libraries magically secure. I’m sure they not only know better, but they curse FIPS privately behind closed doors because it requires a whole lot of extra code in exchange for a dubious security benefit. Moreover, both libraries are developed by smart people and have done fairly well considering what they’re up against.

Still, there’s a huge need for a real security standard for cryptographic modules, one that takes into account all of the things we know can go wrong. At very least we could mandate a coding standard with requirements for commenting, indentation, sensible error handling, and variable names that aren’t plain crazy. (Even if a single review doesn’t find the bugs, mandating a coding standard could increase the probability that others do.)

The whole point of this post is to remind readers that FIPS 140-2 is not that standard. Encrypt with caution.

2011 Redux

I started this blog just a few months ago and it’s been something of a learning experience. Someday maybe I’ll talk about that, but not in this post. What I’m here to talk about is regret — regret that there are so many things I missed the opportunity to blog about in 2011.

Since I’m finally done traveling and the alternative is to actually be productive, I thought this might be a good opportunity to make up for some of that lost blogging ground. Hence this post: my attempt to sum up what’s happened in practical cryptography in 2011.

Before anyone objects, I’m going to clarify the ground rules. This list is naturally incomplete, and moreover, I’m going to take ‘practical‘ seriously. That rules out reduced-round attacks (on anything); improvements in Fully-Homomorphic Encryption (getting faster!); and any paper with ‘composable’ in the title. I will cover implementation and usability issues. But I don’t really plan to take any of my own rules that seriously. If you think I missed something important, please feel free to follow up in comments.

Phishers don’t mess with the
imaginary SecurID key storage facility.

SecurID not so secure after all. RSA’s SecurID has become practically synonymous with two-factor authentication. And it’s not a bad system. Back in 2010 if you’d asked me about the most likely avenue for a compromise, I would have guessed (a) theft of secrets from a local SecurID server, or (b) some kind of bug in the authentication software.What I wouldn’t have guessed was (c) compromise of RSA’s master seed data. Obviously that wasn’t going to happen — given the sensitivity of this information, RSA naturally took good care of it. Remember the underground research facility in Resident Evil? My vision of the RSA master seed storage facility was kind of like that, only without the friendly computer or Milla Jovovich.
In retrospect this seems a bit naive, but really: SecurID was everywhere. Even military contractors used it. If local banks had learned to store their secrets in hardware security modules, then certainly RSA would take modest precautions with something as important as their master seed database.

What 2011 taught us is that just because you think something will be done a certain way, it won’t necessarily be so. Perhaps 2012 will be different.
Still more attacks on AES. I promised to keep this to the practical, and the latest attacks against AES certainly aren’t. The best of them still only shave a couple of bits off of the security of an AES key (although the new attacks don’t depend on related keys).Still, AES is such an important cipher that any reduction in its security margin is going to have some kind of practical impact. The question now is what lies down the road: (a) an improved AES beefed up with additional rounds, (b) a new standards competition, (c) apathy, or (d) something even weirder. Maybe 2012 will put us on a path to finding that answer.
D-Wave actually builds a quantum computer. With two caveats: it’s not really a computer, and it may or may not be quantum.In case you’ve never heard of D-Wave, they’re a private company that purports to sell the first working adiabatic quantum computer, complete with “128 pair-wise coupled superconducting flux qubits”. If you’re not sure what to make of this, you’re not the only one. Thanks to D-Wave’s NDA-heavy business model, there’s been some doubt in learned quarters as to whether the D-Wave device actually performs useful quantum computation at all.
But D-Wave has an active and respectable research department. Just this past May they published an article in Nature demonstrating apparent quantum tunneling amongst eight qubits. This is a far cry from the 128 qubits claimed in their commercial products, it doesn’t demonstrate entanglement, and of course, it doesn’t result in computation. Still it’s not nothing. So does this mean practical implementations of Shor’s algorithm are just around corner? Probably not.
TLS falls to the BEAST. This fall Thai Duong and Juliano Rizzo raised the pulse of the security industry by demonstrating the most exciting practical attack against SSL/TLS in years. This attack is dear to my heart because it’s one of the first attacks I wrote about on this blog.BEAST is based on a five year-old academic attack on a twelve year old standard. Now there’s a bit more to the attack that Rizzo/Duong implemented, but still — how in the world does such a thing stay in the wild so long? A colleague describes Rizzo/Duong’s research strategy as ‘working their way through the Eurocrypt archives, actually exploiting the theoretical vulnerabilities that academics couldn’t be bothered to follow through on’. If this sounds dismissive, it wasn’t — he only wished that he’d done it first.
Lesson: there’s a hell of a gap between academic crypto and ‘the real world’. We’re only safe as long as you believe in attackers who are sophisticated enough to spear-phish SecurID, but not bright enough to read a crypto paper. Time will tell if that belief is a good one.
Nothing much happens with hash functions. Well, that’s a bit unfair. Things did happen, and there are some papers to show for it. The SHA3 finalists, announced late last year, continued to percolate through the system. (Go BLAKE!) Still, 2011 was a huge disappointment for those of us you who thought that it would be the year of SHA1 collisions. I guess there’s always 2012.
Unsigned code running on iOS. Many cryptographers would exclude this from a list of ‘crypto’ happenings. Still, a theme of this blog is that your crypto doesn’t matter if your implementation undermines it.This is exactly what Charlie Miller demonstrated when he showed how to execute malicious application code on an iPhone. Even though iOS devices are only supposed to run signed code, it turns out that a few bad checks created a tiny section of memory in which these requirements were irrelevant. To thank him, Apple created a tiny section of their developer program in which Charlie Miller is irrelevant.
Certificate Authorities still a giant mess. Some will say I’m exaggerating — there were only a few major hacks this year, notably Dutch CA DigiNotar, and a smaller CA called Gemnet. But don’t forget, Microsoft also revoked Digicert Malaysia — not because they were hacked, but for using 512-bit RSA keys to sign certificates (!!) And this is only a partial list of the problems we know about. All in all, it was not a good year to be a CA.
Secure police radios not all that secure. APCO Project 25 is a digital radio standard used by law enforcement. Naturally it supports encryption — a valuable tool when you’re discussing a confidential source or razzing those Occupy protestors. And the P25 encryption itself isn’t half bad (it’s unauthenticated AES). Rather, the problem with P25 is that simply getting encryption turned on is a daunting prospect — to the point where it often doesn’t happen.

To illustrate the problem, a team from UPenn spent two years snarfing up confidential traffic. The agents generating most of this traffic either thought it was encrypted, or would have encrypted it if only they could’ve figured out how to set the keys. Still more evidence that usability is every bit as important as crypto, implementation, or any other issue that we come across in our field.
XML Encryption nearly as awful as XML itself. This fall brought news of a clever attack on the unauthenticated W3C XML Encryption standard. The attack, by researchers Tibor Jager and Juraj Somorovsky, used various features of XML encoding to implement what I would call a ‘padding oracle attack-plus-plus’ — which completely decrypts protected XML on Axis2 and JBoss systems.The W3C response: “We are so sorry for promulgating this terrible standard and recommend that you never, ever listen to anything we say regarding security every again.” No, of course I’m just kidding. Actually, they’re adding AES-GCM as an option. Should take care of everything.
Quantum key distribution still a work in progress. Quantum Key Distribution (QKD), sometimes known as ‘quantum crypto’ is a promising technique for distributing keys over geographic distances without the need to rely on complicated stuff like hard mathematical problems. Instead, you just need to run direct fiber-optic cables to the person you want to communicate with (much easier). On the bright side, the security of QKD is supposedly based on fundamental quantum-physical properties.In general, the caveat in any QKD design is that security only holds if the physical device works as expected. This year researchers from Singapore and Norway showed that this is a big assumption: by exploiting certain limitations of existing detector equipment, they were able to extract an entire secret key without anyone being the wiser. None of these attacks are fundamental ‘breaks’ of QKD, and indeed they’ve already been mitigated. But it goes to show how you should never trust a system just because someone tells you it’s ‘theoretically unbreakable’.

And that’s 2011. I’m sure I’m missing tons of important stuff — and that’s what comments are for. Please enjoy the rest of your holidays and be safe in the new year.

What’s TLS Snap Start?

Google is promoting a new TLS extension which is designed to speed sugar-snap-peas up the negotiation of TLS handshakes. The extension is called called Snap Start, and it’s already been deployed in Chrome (and presumably throughout Google’s back end.)

So what is Snap Start, why is it different/better than simply resuming a TLS session, and most importantly: how snappy is it?

Background: The TLS Handshake

To understand Snap Start we need to quickly review the TLS handshake. Snap Start only supports the classic (non-ephemeral) handshake protocol, so that’s what we’ll be talking about here. Leaving out a lot of detail, that handshake can be reduced to the following four messages:

C -> S: Client sends a random nonce CR.
C <- S: Server responds with its public key (certificate) and another nonce SR.
C -> S: Client generates a pre-master secret PMS, encrypts it with the server’s public key and sends the ciphertext to the server.
C <- S: The client and server both derive a ‘master secret’ by hashing* together (CR, SR, PMS). The server and client also MAC all previous handshake messages to make sure nothing’s been tampered with. Finally the server notifies the client that it’s ready to communicate.

I’m ignoring many, many important details here. These include ciphersuite negotiation, parameter data, and — most importantly — client authentication, which we’ll come back to later. But these four messages are what we need to think about right now.

So why Snap Start?

It only takes a glance at the above protocol to identify the problem with the classic handshake: it’s frickin’ slow. Every new TLS connection requires two latency-inducing round-trips. This can be a drag when you’re trying to deploy TLS everywhere.

Now technically you don’t need to run the whole handshake each time you connect. Once you’ve established a TLS session you can resume it anytime you want — provided that both client and server retain the master secret.** Session resumption reduces communication overhead, but it isn’t the answer to everything. Most people will balk at the idea of hanging onto secret keys across machine reboots, or even browser restarts. Moreover, it’s not clear that a busy server can afford to securely cache the millions of keys it establishes every day.

What’s needed is a speedy handshake protocol that doesn’t rely on caching secret information. And that’s where Snap Start comes in.

The intuition behind Snap Start is simple: if TLS requires too many communication rounds, then why not ditch some. In this case, the target for ditching is the server’s message in step (2). Of course, once you’ve done that you may as well roll steps (1) and (3) into one mutant mega-step. This cuts the number of handshake messages in half.

There are two wrinkles with this approach — one obvious and one subtle. The obvious one is that step (2) delivers the server’s certificate. Without this certificate, the client can’t complete the rest of the protocol. Fortunately server certificates don’t change that often, so the client can simply cache one after the first time it completes the full handshake.***

Replays

The subtle issue has to do with the reason those nonces are there in the first place. From a security perspective, they’re designed to prevent replay attacks. Basically, this is a situation where an attacker retransmits captured data from one TLS session back to a server in the hopes that the server will accept it as fresh. Even if the data is stale, there are various scenarios in which replaying it could be useful.

Normally replays are prevented because the server picks a distinct (random) nonce SR in step (2), which has implications throughout the rest of the handshake. But since we no longer have a step (2), a different approach is required. Snap Start’s solution is simple: let the client propose the nonce SR, and just have the server make sure that value hasn’t been used before.

Obviously this requires the server to keep a list of previously-used SR values (called a ‘strike list’), which — assuming a busy server — could get out of control pretty fast.

The final Snap Start optimization is to tie proposed SR values with time periods. If the client suggests an SR that’s too old, reject it. This means that the server only has to keep a relatively short strike list relating to the last few minutes or hours. There are other optimizations to deal with cases where multiple servers share the same certificate, but it’s not all that interesting.

So here’s the final protocol:

The client generates a random nonce CR and a proposed nonce SR. It also generates a pre-master secret PMS, encrypts it with the server’s public key and sends the ciphertext and nonces to the server.
The server checks that SR hasn’t been used before/recently. Both the client and server both derive a ‘master secret’ by hashing* together (CR, SR, PMS). The server notifies the client that it’s ready to communicate.

The best part is that if anything goes wrong, the server can always force the client to engage in a normal TLS handshake.

Now for the terrible flaw in Snap Start

No, I’m just kidding. There doesn’t seem to be anything wrong with Snap Start. With caveats. It requires some vaguely synchronized clocks. Moreover, it’s quite possible that a strike list could get wiped out in a server crash, which would open the server up to limited replays (a careful implementation could probably avoid this). Also, servers that share a single certificate could wind up vulnerable to cross-replays if their administrator forgets to configure them correctly.

One last thing I didn’t mention is that Snap Start tries to use as much of the existing TLS machinery as possible. So even though the original step (2) (‘Server Hello’ message) no longer exists, it’s ‘virtually’ recreated for the purposes of computing the TLS Finished hash check, which hashes over all preceding handshake messages. Ditto for client authentication signatures. Some new Snap-specific fields are also left out of these hashes.

As a consequence, I suppose there’s a hypothetical worry that an attack on Snap Start (due to a bad server implementation, for example) could be leveraged into an attack that works even when the client requests a normal TLS handshake. The basic idea would be to set up a man-in-the-middle that converts the client’s standard TLS handshake request into a Snap Start request, and feeds convincing lies back to the client. I’m fairly certain that the hash checks and extra Snap Start messages will prevent this attack, but I’m not 100% sure from reading the spec.

Beyond that, all of this extra logic opens the door for implementation errors and added complexity. I haven’t looked at any server-side implementations, but I would definitely like to. Nonetheless, for the moment Snap Start seems like a darned good idea. I hope it means a lot more TLS in 2012.

Notes:

* Technically this is a denoted as a PRF, but it’s typically implemented using hash functions.

** At session resumption the master secret is ‘updated’ by combining it with new client and server randomness.

*** Certificate revocation is still an issue, so Snap Start also requires caching of ‘stapled’ OCSP messages. These are valid for a week.

A brief note on end-of-year giving

I wanted to take a quick break from the technical to bug you about something important.

The end of the year is coming up and no doubt there are some folks thinking about last minute charitable donations. There are many, many worthy causes you can support. All things being equal, I’d give to my local food bank first, given how much need there is, and how far these institutions can stretch a charitable dollar ($1 at a food bank buys the equivalent of $20 at a retail supermarket).

But if you have something left over I’d strongly recommend that you give to the Electronic Frontier Foundation. In case you haven’t noticed, there’s a lot of crazy stuff going on with technology and the law these days. I recently poked fun at how small the EFF’s budget is, but I meant it with love (and with reason!). They’re fighting a tough uphill battle with minimal resources.

I have a personal reason for supporting the EFF. Back when I was a grad student, some colleagues and I reverse-engineered a commercial device as part of a research project. This is something that security researchers do from time to time, and it’s something we should be able to do. Our goal was expose flaws in industrial security systems, and hopefully to spur the adoption of better technology. (Note: better technology is now out there, and no, I’m not taking credit. But scrutiny is generally a good thing.)

Anyway, we knew that there were legal obstacles related to this work, we just didn’t realize how significant they’d be. When we first disclosed our findings, there were some… unpleasant phone calls at high levels. The University’s legal counsel politely informed us that in the event of a lawsuit — even a frivolous one — we’d be bearing the expense on our own. This is not a pleasant prospect for a newly-married grad student who’s just signed mortgage papers.

It’s possible that without the EFF we’d have called the whole thing off right then. But the EFF did support us. They took our case (for free!), and worked miracles.

While our story has a happy ending, white hat security research in the US is still a minefield. Sadly this state of affairs doesn’t seem to be improving. The EFF is just about the only group I know of that stands up for security researchers. Even if you’re not a researcher, you probably benefit indirectly from their work.

So please take a minute to donate. It’s tax deductible and some employers will match. If you donate at least $65 and become a member, they’ll even send you an awesome T-shirt (I have one from 1999 that’s still going strong — it’s ugly as sin but damn, the build quality is high.) Again, I’m not saying this should be the only donation you make this year, but it certainly would be a good one.

A question for you

This post is addressed to you, my patient and wonderful audience.

I realize that my interests may not be entirely representative of yours. So what would you like to read about? I don’t guarantee that I’ll write on any of your suggestions, but I do promise to at least think about the topics you propose.

Feel free to make your suggestions in comments or Twitter. If you don’t, be warned: you’ll be as much to blame for that future post on Leakage Resilient Cryptography as I will.

A Few Thoughts on Cryptographic Engineering

Some random thoughts about crypto. Notes from a course I teach. Pictures of my dachshunds.

In memoriam: Tim Hartnell

EAX’, Knight Rider, and an admission of defeat

Useful cryptography resources

Attack of the week: Datagram TLS

What is Datagram TLS?

A very casual introduction to Fully Homomorphic Encryption

**Why encryption is not like a safe**

A few bits of history

A brief note

In Summary

OpenSSL and NSS are FIPS 140 certified. Is the Internet safe now?

What is a cryptographic module?

Who does the validation?

Why (and what) are we validating?

So what does this have to do with NSS and OpenSSL?

2011 Redux

What’s TLS Snap Start?

Background: The TLS Handshake

So why Snap Start?

Replays

Now for the terrible flaw in Snap Start

A brief note on end-of-year giving

A question for you

Menu