Falling through the KRACKs

The big news in crypto today is the KRACK attack on WPA2 protected WiFi networks. Discovered by Mathy Vanhoef and Frank Piessens at KU Leuven, KRACK (Key Reinstallation Attack) leverages a vulnerability in the 802.11i four-way handshake in order to facilitate decryption and forgery attacks on encrypted WiFi traffic.

The paper is here. It’s pretty easy to read, and you should.

I don’t want to spend much time talking about KRACK itself, because the vulnerability is pretty straightforward. Instead, I want to talk about why this vulnerability continues to exist so many years after WPA was standardized. And separately, to answer a question: how did this attack slip through, despite the fact that the 802.11i handshake was formally proven secure?

A quick TL;DR on KRACK

For a detailed description of the attack, see the KRACK website or the paper itself. Here I’ll just give a brief, high level description.

The 802.11i protocol (also known as WPA2) includes two separate mechanisms to ensure the confidentiality and integrity of your data. The first is a record layer that encrypts WiFi frames, to ensure that they can’t be read or tampered with. This encryption is (generally) implemented using AES in CCM mode, although there are newer implementations that use GCM mode, and older ones that use RC4-TKIP (we’ll skip these for the moment.)

The key thing to know is that AES-CCM (and GCM, and TKIP) is a stream cipher, which means it’s vulnerable to attacks that re-use the same key and “nonce”, also known as an initialization vector. 802.11i deals with this by constructing the initialization vector using a “packet number” counter, which initializes to zero after you start a session, and always increments (up to 2^48, at which point rekeying must occur). This should prevent any nonce re-use, provided that the packet number counter can never be reset.

The second mechanism you should know about is the “four way handshake” between the AP and a client (supplicant) that’s responsible for deriving the key to be used for encryption. The particular message KRACK cares about is message #3, which causes the new key to be “installed” (and used) by the client.

393px-4-way-handshake-svg — I’m a four-way handshake. Client is on the left, AP is in the right. (courtesy Wikipedia, used under CC).

The key vulnerability in KRACK (no pun intended) is that the acknowledgement to message #3 can be blocked by adversarial nasty people.* When this happens, the AP re-transmits this message, which causes (the same) key to be reinstalled into the client (note: see update below*). This doesn’t seem so bad. But as a side effect of installing the key, the packet number counters all get reset to zero. (And on some implementations like Android 6, the key gets set to zero — but that’s another discussion.)

The implication is that by forcing the AP to replay this message, an adversary can cause a connection to reset nonces and thus cause keystream re-use in the stream cipher. With a little cleverness, this can lead to full decryption of traffic streams. And that can lead to TCP hijacking attacks. (There are also direct traffic forgery attacks on GCM and TKIP, but this as far as we go for now.)

How did this get missed for so long?

If you’re looking for someone to blame, a good place to start is the IEEE. To be clear, I’m not referring to the (talented) engineers who designed 802.11i — they did a pretty good job under the circumstances. Instead, blame IEEE as an institution.

One of the problems with IEEE is that the standards are highly complex and get made via a closed-door process of private meetings. More importantly, even after the fact, they’re hard for ordinary security researchers to access. Go ahead and google for the IETF TLS or IPSec specifications — you’ll find detailed protocol documentation at the top of your Google results. Now go try to Google for the 802.11i standards. I wish you luck.

The IEEE has been making a few small steps to ease this problem, but they’re hyper-timid incrementalist bullshit. There’s an IEEE program called GET that allows researchers to access certain standards (including 802.11) for free, but only after they’ve been public for six months — coincidentally, about the same time it takes for vendors to bake them irrevocably into their hardware and software.

This whole process is dumb and — in this specific case — probably just cost industry tens of millions of dollars. It should stop.

The second problem is that the IEEE standards are poorly specified. As the KRACK paper points out, there is no formal description of the 802.11i handshake state machine. This means that implementers have to implement their code using scraps of pseudocode scattered around the standards document. It happens that this pseudocode leads to the broken implementation that enables KRACK. So that’s bad too.

And of course, the final problem is implementers. One of the truly terrible things about KRACK is that implementers of the WPA supplicant (particularly on Linux) managed to somehow make Lemon Pledge out of lemons. On Android 6 in particular, replaying message #3 actually sets an all-zero key. There’s an internal logic behind why this happens, but Oy Vey. Someone actually needs to look at this stuff.

What about the security proof?

The fascinating thing about the 802.11i handshake is that despite all of the roadblocks IEEE has thrown in people’s way, it (the handshake, at least) has been formally analyzed. At least, for some definition of the term.

(This isn’t me throwing shade — it’s a factual statement. In formal analysis, definitions really, really matter!)

A paper by He, Sundararajan, Datta, Derek and Mitchell (from 2005!) looked at the 802.11i handshake and tried to determine its security properties. What they determined is that yes, indeed, it did produce a secret and strong key, even when an attacker could tamper with and replay messages (under various assumptions). This is good, important work. The proof is hard to understand, but this is par for the course. It seems to be correct.

wifihandshake — Representation of the 4-way handshake from the paper by He *et al.* Yes, I know you’re like “*what?*“. But that’s why people who do formal verification of protocols don’t have many friends.

Even better, there are other security proofs showing that — provided the nonces are never repeated — encryption modes like CCM and GCM are highly secure. This means that given a secure key, it should be possible to encrypt safely.

So what went wrong?

The critical problem is that while people looked closely at the two components — handshake and encryption protocol — in isolation, apparently nobody looked closely at the two components as they were connected together. I’m pretty sure there’s an entire geek meme about this.

czx0o-twqaaeali — Two unit tests, 0 integration tests, thanks Twitter.

Of course, the reason nobody looked closely at this stuff is that doing so is just plain hard. Protocols have an exponential number of possible cases to analyze, and we’re just about at the limit of the complexity of protocols that human beings can truly reason about, or that peer-reviewers can verify. The more pieces you add to the mix, the worse this problem gets.

In the end we all know that the answer is for humans to stop doing this work. We need machine-assisted verification of protocols, preferably tied to the actual source code that implements them. This would ensure that the protocol actually does what it says, and that implementers don’t further screw it up, thus invalidating the security proof.

This needs to be done urgently, but we’re so early in the process of figuring out how to do it that it’s not clear what it will take to make this stuff go live. All in all, this is an area that could use a lot more work. I hope I live to see it.

===

* Update: An early version of this post suggested that the attacker would replay the third message. This can indeed happen, and it does happen in some of the more sophisticated attacks. But primarily, the paper describes forcing the AP to resend it by blocking the acknowledgement from being received at the AP. Thanks to Nikita Borisov and Kyle Birkeland for the fix!

Beyond public key encryption

One of the saddest and most fascinating things about applied cryptography is how little cryptography we actually use. This is not to say that cryptography isn’t widely used in industry — it is. Rather, what I mean is that cryptographic researchers have developed so many useful technologies, and yet industry on a day to day basis barely uses any of them. In fact, with a few minor exceptions, the vast majority of the cryptography we use was settled by the early-2000s.*

Most people don’t sweat this, but as a cryptographer who works on the boundary of research and deployed cryptography it makes me unhappy. So while I can’t solve the problem entirely, what I can do is talk about some of these newer technologies. And over the course of this summer that’s what I intend to do: talk. Specifically, in the next few weeks I’m going to write a series of posts that describe some of the advanced cryptography that we don’t generally see used.

Today I’m going to start with a very simple question: what lies beyond public key cryptography? Specifically, I’m going to talk about a handful of technologies that were developed in the past 20 years, each of which allows us to go beyond the traditional notion of public keys.

This is a wonky post, but it won’t be mathematically-intense. For actual definitions of the schemes, I’ll provide links to the original papers, and references to cover some of the background. The point here is to explain what these new schemes do — and how they can be useful in practice.

Identity Based Cryptography

In the mid-1980s, a cryptographer named Adi Shamir proposed a radical new idea. The idea, put simply, was to get rid of public keys.

To understand where Shamir was coming from, it helps to understand a bit about public key encryption. You see, prior to the invention of public key crypto, all cryptography involved secret keys. Dealing with such keys was a huge drag. Before you could communicate securely, you needed to exchange a secret with your partner. This process was fraught with difficulty and didn’t scale well.

Public key encryption (beginning with Diffie-Hellman and Shamir’s RSA cryptosystem) hugely revolutionized cryptography by dramatically simplifying this key distribution process. Rather than sharing secret keys, users could now transmit their public key to other parties. This public key allowed the recipient to encrypt to you (or verify your signature) but it could not be used to perform the corresponding decryption (or signature generation) operations. That part would be done with a secret key you kept to yourself.

While the use of public keys improved many aspects of using cryptography, it also gave rise to a set of new challenges. In practice, it turns out that having public keys is only half the battle — people still need to use distribute them securely.

For example, imagine that I want to send you a PGP-encrypted email. Before I can do this, I need to obtain a copy of your public key. How do I get this? Obviously we could meet in person and exchange that key on physical media — but nobody wants to do this. It would much more desirable to obtain your public key electronically. In practice this means either (1) we have to exchange public keys by email, or (2) I have to obtain your key from a third piece of infrastructure, such as a website or key server. And now we come to the problem: if that email or key server is untrustworthy (or simply allows anyone to upload a key in y our name), I might end up downloading a malicious party’s key by accident. When I send a message to “you”, I’d actually be encrypting it to Mallory.

f64f315ec47f0b041e3d881177039414 — Mallory.

Solving this problem — of exchanging public keys and verifying their provenance — has motivated a huge amount of practical cryptographic engineering, including the entire web PKI. In most cases, these systems work well. But Shamir wasn’t satisfied. What if, he asked, we could do it better? More specifically, he asked: could we replace those pesky public keys with something better?

Shamir’s idea was exciting. What he proposed was a new form of public key cryptography in which the user’s “public key” could simply be their identity. This identity could be a name (e.g., “Matt Green”) or something more precise like an email address. Actually, it didn’t realy matter. What did matter was that the public key would be some arbitrary string — and not a big meaningless jumble of characters like “7cN5K4pspQy3ExZV43F6pQ6nEKiQVg6sBkYPg1FG56Not”.

Of course, using an arbitrary string as a public key raises a big problem. Meaningful identities sound great — but I don’t own them. If my public key is “Matt Green”, how do I get the corresponding private key? And if I can get out that private key, what stops some other Matt Green from doing the same, and thus reading my messages? And ok, now that I think about this, what stops some random person who isn’t named Matt Green from obtaining it? Yikes. We’re headed straight into Zooko’s triangle.

Shamir’s idea thus requires a bit more finesse. Rather than expecting identities to be global, he proposed a special server called a “key generation authority” that would be responsible for generating the private keys. At setup time, this authority would generate a single master public key (MPK), which it would publish to the world. If you wanted to encrypt a message to “Matt Green” (or verify my signature), then you could do so using my identity and the single MPK of an authority we’d both agree to use. To decrypt that message (or sign one), I would have to visit the same key authority and ask for a copy of my secret key. The key authority would compute my key based on a master secret key (MSK), which it would keep very secret.

With all algorithms and players specified, whole system looks like this:

Overview of an Identity-Based Encryption (IBE) system. The **Setup** algorithm of the Key Generation Authority generates the master public key (MPK) and master secret key (MSK). The authority can use the **Extract** algorithm to derive the secret key corresponding to a specific ID. The encryptor (left) encrypts using only the identity and MPK. The recipient requests the secret key for her identity, and then uses it to decrypt. (Icons by Eugen Belyakoff)

This design has some important advantages — and more than a few obvious drawbacks. On the plus side, it removes the need for any key exchange at all with the person you’re sending the message to. Once you’ve chosen a master key authority (and downloaded its MPK), you can encrypt to anyone in the entire world. Even cooler: at the time you encrypt, your recipient doesn’t even need to have contacted the key authority yet. She can obtain her secret key after I send her a message.

Of course, this “feature” is also a bug. Because the key authority generates all the secret keys, it has an awful lot of power. A dishonest authority could easily generate your secret key and decrypt your messages. The polite way to say this is that standard IBE systems effectively “bake in” key escrow.**

Putting the “E” in IBE

All of these ideas and more were laid out by Shamir in his 1984 paper. There was just one small problem: Shamir could only figure out half the problem.

Specifically, Shamir’s proposed a scheme for identity-based signature (IBS) — a signature scheme where the public verification key is an identity, but the signing key is generated by the key authority. Try as he might, he could not find a solution to the problem of building identity-based encryption (IBE). This was left as an open problem.***

It would take more than 16 years before someone answered Shamir’s challenge. Surprisingly, when the answer came it came not once but three times.

The first, and probably most famous realization of IBE was developed by Dan Boneh and Matthew Franklin much later — in 2001. The timing of Boneh and Franklin’s discovery makes a great deal of sense. The Boneh-Franklin scheme relies fundamentally on elliptic curves that support an efficient “bilinear map” (or “pairing”).**** The algorithms needed to compute such pairings were not known when Shamir wrote his paper, and weren’t employed constructively — that is, as a useful thing rather than an attack — until about 2000. The same can be said about a second scheme called Sakai-Kasahara, which would be independently discovered around the same time.

(For a brief tutorial on the Boneh-Franklin IBE scheme, see this page.)

The third realization of IBE was not as efficient as the others, but was much more surprising. This scheme was developed by Clifford Cocks, a senior cryptologist at Britain’s GCHQ. It’s noteworthy for two reasons. First, Cocks’ IBE scheme does not require bilinear pairings at all — it is based in the much older RSA setting, which means in principle it spent all those years just waiting to be found. Second, Cocks himself had recently become known for something even more surprising: discovering the RSA cryptosystem, nearly five years before RSA did. To bookend that accomplishment with a second major advance in public key cryptography was a pretty impressive accomplishment.

In the years since 2001, a number of additional IBE constructions have been developed, using all sorts of cryptographic settings. Nonetheless, Boneh and Franklin’s early construction remains among the simplest and most efficient.

Even if you’re not interested in IBE for its own sake, it turns out that this primitive is really useful to cryptographers for many things beyond simple encryption. In fact, it’s more helpful to think of IBE as a way of “pluralizing” a single public/secret master keypair into billions of related keypairs. This makes it useful for applications as diverse as blocking chosen ciphertext attacks, forward-secure public key encryption, and short signature schemes.

Attribute Based Encryption

Of course, if you leave cryptographers alone with a tool like IBE, the first thing they’re going to do is find a way to ~~make things more complicated~~ improve on it.

One of the biggest such improvements is due to Sahai and Waters. It’s called Attribute-Based Encryption, or ABE.

The origin of this idea was not actually to encrypt with attributes. Instead Sahai and Waters were attempting to develop an Identity-Based encryption scheme that could encrypt using biometrics. To understand the problem, imagine I decide to use a biometric like your iris scan as the “identity” to encrypt you a ciphertext. Later on you’ll ask the authority for a decryption key that corresponds to your own iris scan — and if everything matches up and you’ll be able to decrypt.

The problem is that this will almost never work.

The issue here is that biometric readings (like iris scans or fingerprint templates) are inherently error-prone. This means every scan will typically be very close, but often there will be a few bits that disagree. With standard IBE

this is fatal: if the encryption identity differs from your key identity by even a single bit, decryption will not work. You’re out of luck.

Sahai and Waters decided that the solution to this problem was to develop a form of IBE with a “threshold gate”. In this setting, each bit of the identity is represented as a different “attribute”. Think of each of these as components you’d encrypt under — something like “bit 5 of your iris scan is a 1” and “bit 23 of your iris scan is a 0”. The encrypting party lists all of these bits and encrypts under each one. The decryption key generated by the authority embeds a similar list of bit values. The scheme is defined so that decryption will work if and only if the number of matching attributes (between your key and the ciphertext) exceeds some pre-defined threshold: e.g., any 2024 out of 2048 bits must be identical in order to decrypt.

The beautiful thing about this idea is not fuzzy IBE. It’s that once you have a threshold gate and a concept of “attributes”, you can more interesting things. The main observation is that a threshold gate can be used to implement the boolean AND and OR gates, like so:

gates

Even better, you can stack these gates on top of one another to assign a fairly complex boolean formula — which will itself determine what conditions your ciphertext can be decrypted under. For example, switching to a more realistic set of attributes, you could encrypt a medical record so that either a pediatrician in a hospital could read it, or an insurance adjuster could. All you’d need is to make sure people received keys that correctly described their attributes (which are just arbitrary strings, like identities).

ABEFormula — A simple “ciphertext policy”, in which the ciphertext can be decrypted if and only if a key matches an appropriate set of attributes. In this case, the key satisfies the formula and thus the ciphertext decrypts. The remaining key attributes are ignored.

The other direction can be implemented as well. It’s possible to encrypt a ciphertext under a long list of attributes, such as creation time, file name, and even GPS coordinates indicating where the file was created. You can then have the authority hand out keys that correspond to a very precise slice of your dataset — for example, “this key decrypts any radiology file encrypted in Chicago between November 3rd and December 12th that is tagged with ‘pediatrics’ or ‘oncology'”.

Functional Encryption

Once you have a related of primitives like IBE and ABE, the researchers’ instinct is to both extend and generalize. Why stop at simple boolean formulae? Can we make keys (or ciphertexts) that embed arbitrary computer programs? The answer, it turns out, is yes — though not terribly efficiently. A set of recent works show that it is possible to build ABE that works over arbitrary polynomial-size circuits, using various lattice-based assumptions. So there is certainly a lot of potential here.

This potential has inspired researchers to generalize all of the above ideas into a single class of encryption called “functional encryption“. Functional encryption is more conceptual than concrete — it’s just a way to look at all of these systems as instances of a specific class. The basic idea is to represent the decryption procedure as an algorithm that computes an arbitary function F over (1) the plaintext inside of a ciphertext, and (2) the data embedded in the key. This function has the following profile:

output = F(key data, plaintext data)

In this model IBE can be expressed by having the encryption algorithm encrypt (identity, plaintext) and defining the function F such that, if “key input == identity”, it outputs the plaintext, and outputs an empty string otherwise. Similarly, ABE can be expressed by a slightly more complex function. Following this paradigm, once can envision all sorts of interesting functions that might be computed by different functions and realized by future schemes.

But those will have to wait for another time. We’ve gone far enough for today.

So what’s the point of all this?

For me, the point is just to show that cryptography can do some pretty amazing things. We rarely see this on a day-to-day basis when it comes to industry and “applied” cryptography, but it’s all there waiting to be used.

Perhaps the perfect application is out there. Maybe you’ll find it.

Notes:

* An earlier version of this post said “mid-1990s”. In comments below, Tom Ristenpart takes issue with that and makes the excellent point that many important developments post-date that. So I’ve moved the date forward about five years, and I’m thinking about how to say this in a nicer way.

** There is also an intermediate form of encryption known as “certificateless encryption“. Proposed by Al-Riyami and Paterson, this idea uses a combination of standard public key encryption and IBE. The basic idea is to encrypt each message using both a traditional public key (generated by the recipient) and an IBE identity. The recipient must then obtain a copy of the secret key from the IBE authority to decrypt. The advantages here are twofold: (1) the IBE key authority can’t decrypt the message by itself, since it does not have the corresponding secret key, which solves the “escrow” problem. And (2) the sender does not need to verify that the public key really belongs to the sender (e.g., by checking a certificate), since the IBE portion prevents imposters from decrypting the resulting message. Unfortunately this system is more like traditional public key cryptography than IBE, and does not have the neat usability features of IBE.

*** A part of the challenge of developing IBE is the need to make a system that is secure against “collusion” between different key holders. For example, imagine a very simple system that has only 2-bit identities. This gives four possible identities: “00”, “01”, “10”, “11”. If I give you a key for the identity “01” and I give Bob a key for “10”, we need to ensure that you two cannot collude to produce a key for identities “00” and “11”. Many earlier proposed solutions have tried to solve this problem by gluing together standard public encryption keys in various ways (e.g., by having a separate public key for each bit of the identity and giving out the secret keys as a single “key”). However these systems tend to fail catastrophically when just a few users collude (or their keys are stolen). Solving the collusion problem is fundamentally what separates real IBE from its faux cousins.

**** A full description of Boneh and Franklin’s scheme can be found here, or in the original paper. Some code is here and here and here. I won’t spend more time on it, except to note that the scheme is very efficient. It was patented and implemented by Voltage Security, now part of HPE.

Zero Knowledge Proofs: An illustrated primer, Part 2

This post is the second in a two-part series on zero-knowledge proofs. Click here to read Part 1.

In this post I’m going to continue the short, (relatively) non-technical overview of zero knowledge proofs that I started a couple of years ago. Yes, that was a very long time! If you didn’t catch the first post, now would be an excellent time to go read it.

Before we go much further, a bit of a warning. While this series is still intended as a high-level overview, at a certain point it’s necessary to dig a bit deeper into some specific algorithms. So you should expect this post to get a bit wonkier than the last.

A quick recap, and a bit more on Zero Knowledge(ness)

First, a brief refresher.

In the last post we defined a zero knowledge proof as an interaction between two computer programs (or Turing machines) — respectively called a Prover and a Verifier — where the Prover works to convince the Verifier that some mathematical statement is true. We also covered a specific example: a clever protocol by Goldreich, Micali and Wigderson that allows us to prove, in zero knowledge, that a graph possesses a three-coloring.

In the course of that discussion, we described three critical properties that any zero knowledge proof must satisfy:

Completeness: If the Prover is honest, then she will eventually convince the Verifier.
Soundness: The Prover can only convince the Verifier if the statement is true.
Zero-knowledge(ness): The Verifier learns no information beyond the fact that the statement is true.

The real challenge turns out to be finding a way to formally define the last property. How do you state that a Verifier learns nothing beyond the truth of a statement?

In case you didn’t read the previous post — the answer to this question came from Goldwasser, Micali and Rackoff, and it’s very cool. What they argued is that a protocol can be proven zero knowledge if for every possible Verifier, you can demonstrate the existence of an algorithm called a ‘Simulator’, and show that this algorithm has some very special properties.

From a purely mechanical perspective, the Simulator is like a special kind of Prover. However, unlike a real Prover — which starts with some special knowledge that allows it to prove the truth of a statement — the Simulator gets no special knowledge at all.* Nonetheless, the Simulator (or Simulators) must be able to ‘fool’ every Verifier into believing that the statement is true, while producing a transcript that’s statistically identical top (or indistinguishable from) the output of a real Prover.

The logic here flows pretty cleanly: since Simulator has no ‘knowledge’ to extract in the first place, then clearly a Verifier can’t obtain any meaningful amount of information after interacting with it. Moreover, if the transcript of the interaction is distributed identically to a real protocol run with a normal Prover, then the Verifier can’t do better against the real prover than it can do against the Simulator. (If the Verifier could do better, then that would imply that the distributions were not statistically identical.) Ergo, the Verifier can’t extract useful information from the real protocol run.

This is incredibly wonky, and worse, it seems contradictory! We’re asking that a protocol be both sound — meaning that a bogus Prover can’t trick some Verifier into accepting a statement unless it has special knowledge allowing it to prove the statement — but we’re also asking for the existence of an algorithm (the simulator) that can literally cheat. Clearly both properties can’t hold at the same time.

The solution to this problem is that both properties don’t hold at the same time.

To build our simulator, we’re allowed to do things to the Verifier that would never happen in the real world. The example that I gave in the previous post was to use a ‘time machine’ — that is, our ‘Simulator’ can rewind the Verifier program’s execution in order to ‘fool’ it. Thus, in a world where we can wind the Verifier back in time, it’s easy to show that a Simulator exists. In the real world, of course it doesn’t. This ‘trick’ gets us around the contradiction.

As a last reminder, to illustrate all of these ideas, we covered one of the first general zero knowledge proofs, devised by Goldreich, Micali and Wigderson (GMW). That protocol allowed us to prove, in zero knowledge, that a graph supports a three-coloring. Of course, proving three colorings isn’t terribly interesting. The real significance of the GMW result is theoretical. Since graph three coloring is known to be in the complexity class NP-complete, the GMW protocol can be used to prove any statement in the class NP. And that’s quite powerful.

Let me elaborate slightly on what that means:

If there exists any decision problem (that is, a problem with a yes/no answer) whose witness (solution) can be verified in polynomial time, then:
We can prove that said solution exists by (1) translating the problem into an instance of the graph three-coloring problem, and (2) running the GMW protocol.*

This amazing result gives us interactive zero knowledge proofs for every statement in NP. The only problem is that it’s almost totally unusable.

From theory into practice

If you’re of a practical mindset, you’re probably shaking your head at all this talk of ZK proofs. That’s because actually using this approach would be an insanely expensive and stupid thing to do. Most likely you’d first represent your input problem as a boolean circuit where the circuit is satisfied if and only if you know the correct input. Then you’d have to translate your circuit into a graph, resulting in some further blowup. Finally you’d need to run the GMW protocol, which is damned expensive all by itself.

So in practice nobody does this. It’s really considered a ‘feasibility’ result. Once you show that something is possible, the next step is to make it efficient.

But we do use zero knowledge proofs, almost every day. In this post I’m going to spend some time talking about the more practical ZK proofs that we actually use. To do that I just need give just a tiny bit of extra background.

Proofs vs. Proofs of Knowledge

Before we go on, there’s one more concept we need to cover. Specifically, we need to discuss what precisely we’re proving when we conduct a zero knowledge proof.

Let me explain. At a high level, there are two kinds of statement you might want to prove in zero knowledge. Roughly speaking, these break up as follows.

Statements about “facts”. For example, I might wish to prove that “a specific graph has a three coloring” or “some number N is in the set of composite numbers“. Each of these is a statement about some intrinsic property of the universe.

Statements about my personal knowledge. Alternatively, I might wish to prove that I know some piece information. Examples of this kind of statement include: “I know a three coloring for this graph”, or “I know the factorization of N”. These go beyond merely proving that a fact is true, and actually rely on what the Prover knows.

It’s important to recognize that there’s a big difference between these two kinds of statements! For example, it may be possible to prove that a number N is composite even if you don’t know the full factorization. So merely proving the first statement is not equivalent to proving the second one.

The second class of proof is known as a “proof of knowledge”. It turns out to be extremely useful for proving a variety of statements that we use in real life. In this post, we’ll mostly be focusing on this kind of proof.

The Schnorr identification protocol

Now that we’ve covered some of the required background, it’s helpful to move on to a specific and very useful proof of knowledge that was invented by Claus-Peter Schnorr in the 1980s. At first glance, the Schnorr protocol may seem a bit odd, but in fact it’s the basis of many of our modern signature schemes today.

Schnorr wasn’t really concerned with digital signatures, however. His concern was with identification. Specifically, let’s imagine that Alice has published her public key to the world, and later on wants to prove that she knows the secret key corresponding to that public key. This is the exact problem that we encounter in real-world protocols such as public-key SSH, so it turns out to be well-motivated.

Schnorr began with the assumption that the public key would be of a very specific format. Specifically, let p be some prime number, and let g be a generator of a cyclic group of prime-order q. To generate a keypair, Alice would first pick a random integer a between 1 and q, and then compute the keypair as:

$PK_{A} = g^a~mod~p$ , $SK_{A} = a$

(If you’ve been around the block a time or two, you’ll probably notice that this is the same type of key used for Diffie-Hellman and the DSA signing algorithm. That’s not a coincidence, and it makes this protocol very useful.)

Alice keeps her secret key to herself, but she’s free to publish her public key to the world. Later on, when she wants to prove knowledge of her secret key, she conducts the following simple interactive protocol with Bob:

There’s a lot going on in here, so let’s take a minute to unpack things.

First off, we should ask ourselves if the protocol is complete. This is usually the easiest property to verify: if Alice performs the protocol honestly, should Bob be satisfied at the end of it? In this case, completeness is pretty easy to see just by doing a bit of substitution:

Proving soundness

The harder property is soundness. Mainly because we don’t yet have a good definition of what it means for a proof of knowledge to be sound. Remember that what we want to show is the following:

If Alice successfully convinces Bob, then she must know the secret key a.

It’s easy to look at the equations above and try to convince yourself that Alice’s only way to cheat the protocol is to know a. But that’s hardly a proof.

When it comes to demonstrating the soundness of a proof of knowledge, we have a really nice formal approach. Just as with the Simulator we discussed above, we need to demonstrate the existence of a special algorithm. This algorithm is called a knowledge extractor, and it does exactly what it claims to. A knowledge extractor (or just ‘Extractor’ for short) is a special type of Verifier that interacts with a Prover, and — if the Prover succeeds in completing the proof — the Extractor should be able to extract the Prover’s original secret.

And this answers our question above. To prove soundness for a proof of knowledge, we must show that an Extractor exists for every possible Prover.

Of course this again seems totally contradictory to the purpose of a zero knowledge protocol — where we’re not supposed to be able to learn secrets from a Prover. Fortunately we’ve already resolved this conundrum once for the case of the Simulator. Here again, we take the same approach. The Extractor is not required to exist during a normal run of the protocol. We simply show that it exists if we’re allowed to take special liberties with the Prover — in this case, we’ll use ‘rewinding’ to wind back the Prover’s execution and allow us to extract secrets.

The extractor for the Schnorr protocol is extremely clever — and it’s also pretty simple. Let’s illustrate it in terms of a protocol diagram. Alice (the Prover) is on the left, and the Extractor is on the right:

The key observation here is that by rewinding Alice’s execution, the Extractor can ‘trick’ Alice into making two different proof transcripts using the same k. This shouldn’t normally happen in a real protocol run, where Alice specifically picks a new k for each execution of the protocol.

If the Extractor can trick Alice into doing this, then he can solve the following simple equation to recover Alice’s secret:

It’s worth taking a moment right now to note that this also implies a serious vulnerability in bad implementations of the Schnorr protocol. If you ever accidentally use the same k for two different runs of the protocol, an attacker may be able to recover your secret key! This can happen if you use a bad random number generator.

Indeed, those with a bit more experience will notice that this is similar to a real attack on systems (with bad random number generators) that implement ECDSA or DSA signatures! This is also not a coincidence. The (EC)DSA signature family is based on Schnorr. Ironically, the developers of DSA managed to retain this vulnerability of the Schorr family of protocols while at the same time ditching the security proof that makes Schnorr so nice.

Proving zero-knowledge(ness) against an honest Verifier

Having demonstrated that Schnorr signatures are complete and sound, it remains only to prove that they’re ‘zero knowledge’. Remember that to do this, normally we require a Simulator that can interact with any possible Verifier and produce a ‘simulated’ transcript of the proof, even if the Simulator doesn’t know the secret it’s proving it knows.

The standard Schnorr protocol does not have such a Simulator, for reasons we’ll get into in a second. Instead, to make the proof work we need to make a special assumption. Specifically, the Verifier needs to be ‘honest’. That is, we need to make the special assumption that it will run its part of the protocol correctly — namely, that it will pick its challenge “c” using only its random number generator, and will not choose this value based on any input we provide it. As long as it does this, we can construct a Simulator.

Here’s how the Simulator works.

Let’s say we are trying to prove knowledge of a secret $a$ for some public key $g^a~mod~p$ — but we don’t actually know the value $a$ . Our Simulator assumes that the Verifier will choose some value $c$ as its challenge, and moreover, it knows that the honest Verifier will choose the value $c$ only based on its random number generator — and not based on any inputs the Prover has provided.

First, output some initial $g^{k_1}$ as the Prover’s first message, and find out what challenge $c$ the Verifier chooses.
Rewind the Verifier, and pick a random integer $z$ in the range $\{0,\dots,q-1\}$ .
Compute $g^{k_2} = g^z * g^{a (-c)}$ and output $g^{k_2}$ as the Prover’s new initial message.
When the Verifier challenges on $c$ again, output $z$ .

Notice that the transcript $g^k, c, z$ will verify correctly as a perfectly valid, well-distributed proof of knowledge of the value $a$ . The Verifier will accept this output as a valid proof of knowledge of $a$ , even though the Simulator does not know $a$ in the first place!

What this proves is that if we can rewind a Verifier, then (just as in the first post in this series) we can always trick the Verifier into believing we have knowledge of a value, even when we don’t. And since the statistical distribution of our protocol is identical to the real protocol, this means that our protocol must be zero knowledge — against an honest Verifier.

From interactive to non-interactive

So far we’ve shown how to use the Schnorr protocol to interactively prove knowledge of a secret key $a$ that corresponds to a public key $g^{a}$ . This is an incredibly useful protocol, but it only works if our Verifier is online and willing to interact with us.

An obvious question is whether we can make this protocol work without interaction. Specifically, can I make a proof that I can send you without you even being online. Such a proof is called a non-interactive zero knowledge proof (NIZK). Turning Schnorr into a non-interactive proof seems initially quite difficult — since the protocol fundamentally relies on the Verifier picking a random challenge. Fortunately there is a clever trick we can use.

This technique was developed by Fiat and Shamir in the 1980s. What they observed was that if you have a decent hash function lying around, you can convert an interactive protocol into a non-interactive one by simply using the hash function to pick the challenge.

Specifically, the revised protocol for proving knowledge of $a$ with respect to a public key $g^a$ looks like this:

The Prover picks $g^k$ (just as in the interactive protocol).
Now, the prover computes the challenge as $c = H(g^k || M)$ where $H()$ is a hash function, and M is an (optional) and arbitary message string.
Compute $ac + k~mod~q$ (just as in the interactive protocol).

The upshot here is that the hash function is picking the challenge $c$ without any interaction with the Verifier. In principle, if the hash function is “strong enough” (meaning, it’s a random oracle) then the result is a completely non-interactive proof of knowledge of the value $a$ that the Prover can send to the Verifier. The proof of this is relatively straightforward.

The particularly neat thing about this protocol is that it isn’t just a proof of knowledge, it’s also a signature scheme. That is, if you put a message into the (optional) value $M$ , you obtain a signature on $M$ , which can only be produced by someone who knows the secret key $a$ . The resulting protocol is called the Schnorr signature scheme, and it’s the basis of real-world protocols like EdDSA.

Phew.

Yes, this has been a long post and there’s probably a lot more to be said. Hopefully there will be more time for that in a third post — which should only take me another three years.

Notes:

* In this definition, it’s necessary that the statement be literally true.

How do we pay for privacy?

If you haven’t read Julia Angwin’s excellent profile of GnuPG’s lead developer Werner Koch, now would be a great time to check it out. Koch, who single-handedly wrote GnuPG in 1997, has been doggedly maintaining the codebase ever since — and not getting paid very well for it. Despite good intentions on all sides, Koch has been essentially going broke.

The news is not all bad. In response to Angwin’s piece, ‘the Internet’ rallied to GnuPG’s aid. So far individual and corporate donors have coughed up over EU 200,000 to pay Koch and even hire him some help. It looks like GnuPG is saved, and so are its users — for the moment.

But is this model really sustainable? I’m pretty skeptical.

Sooner or later this money will run out. And next time this happens, the Internet might not have a quarter million in spare change to fork over. In fact, that’s already the normal state of affairs for most privacy tools — software ranging from GPGTools to OTR — most of which subsist on meager donations and volunteer time. The scary part is that thousands of people depend on these tools for their privacy and even their physical safety.

This raises a question: how can we support the long-term development and maintenance of privacy tools? It turns out that nobody really knows the answer to this — but a few people are groping towards a solution. In this (entirely non-technical) post I’m going to talk a bit about what people are doing today — and how we might do better in the future.NB: I should mention that most of the smart ideas in this post come from Meredith Whittaker, who leads Google’s Open Source Research Group, and helped found Simply Secure. The dumb ideas are all mine.

How we support privacy tools today

If you’re developing, or are interested in developing privacy tools in 2015, there are a handful of funding sources for you to choose from. They include:

Self-funding. The vast majority of privacy tools come from engineers working in their spare time. This is a great way to develop prototypes and simple tools, but it tends not to be very sustainable — particularly when developers have to choose between family and code maintenance.
Donations, charges and crowd-funding. A few major projects pull in some cash through donations, but this seems to work well only during a catastrophe or when the spotlight is on a particular project. GnuPG, for example, made a ton of money following their recent publicity, but before that they averaged a minuscule $20k per year — and this is for one of the most widely used privacy tools on the Internet! Projects like GPG Tools recently began charging for their software, which may work a bit better. Unfortunately this is anathema for young projects that rely on network effect for their success.
Industry grants. From time to time major corporations give out modest chunks of money to tool developers, particularly when those companies use tools internally. Case in point: Google Stripe and Facebook just gave $50k each to GnuPG, and the Linux Foundation Core Infrastructure Initiative (essentially an industry funding group*) kicked in an additional $60k. Unfortunately this money is tragically difficult to come by — for the average developer it might as well not exist.
Government and NGOs. Perhaps the most promising source of money comes from the U.S. government, and a set of NGOs that have sprung up to disburse it. The State Dept. directly funds the Tor Project, and the government also provides block grants to groups such as the Open Technology Fund (via Radio Free Asia) and the — confusingly similar — Open Technology Institute (at New America Foundation). OTF in particular has done a phenomenal job at funding both development and audit of privacy tools.
Internal industry funding. Once in a blue moon a company proposes to internally develop a privacy tool like Google/Yahoo End-to-End. I’ll believe this works when I see it.
Academic research funding. A few academic tools have managed to slip into this space, most notably OTR and Tor. But this model is awfully hard to sustain, mostly because academics don’t get paid to do things. We get paid to teach and write papers. It’s hard to sustain software development this way.
Bitcoin wallet theft. This is mostly a joke.

Of these funding sources, the U.S. government is by far the heaviest hitter — responsible for funding many well-known projects such as Tor and TextSecure. While I tend to think any money is good money in the hands of right people, I should point out that this view is not universally shared. In part this is because we, admittedly, don’t have much of a process to validate code and convince non-experts that this process isn’t producing compromised code.

As Jillian York points out, US government funding also comes with some political baggage, and sadly, tends to attract more than its fair share of paranoia.

Developers need more than just money!

If you give a starving privacy tool a million bucks, you get a well-fed privacy tool. Unfortunately it may not actually be a better privacy tool. That’s not because people are trying to waste your cash. It turns out that software development, crypto, and security are just plain hard.

So yes, people need to eat — that’s a baseline. But beyond that what developers also need are things like expert guidance, security audits, analysis tools, and collaboration with other developers. They also really need help with the hardest problem in computer science, which is turning a pile of code into a product that people want to use.

A few groups like OTF (in case you’re not getting the hint, I really like them*) are trying to help with some of this. They fund professional code audits through groups like iSEC Partners. They fund usability resources and provide communications help. They host regular meetings where project members can get together and dork out about how to handle encrypted spam. A friend calls this a ‘dog park for developers who haven’t been outside for a while.’ This sounds silly, but it really, really helps.

Beyond those things, there’s still an awful problem of coordinating all of this technical stuff so that auditing results adhere to some consistent standard and produce knowledge that gets retained within the organization, as well as seeing that tools get proper academic analysis where necessary. And finally, there are useful services such as connecting developers with UI designers and helping them to turn their tools into real, usable products.

A few groups have done well at this all on their own. Moxie’s Open Whisper Systems not only launched a popular messaging app, but managed to get TextSecure (the protocol) into 600 million WhatsApp clients. Unfortunately this kind of success doesn’t come easy to people and requires a lot of assistance. Institutions can really help.

How can we do better?

There are a lot of answers to this question. But since this is a blog post, let’s swing for the fences. What’s really needed is a privacy incubator. A place that provides both funding (or at least, guides funding) as well as in-house technical staff and researchers, non-technical help such a communications, infrastructure, a great advisory board, and access to tools and engineers.

In essence this center would combine all the best parts of NGOs, academic institutions, and corporate research into one center. It would help with projects ranging from research to development, and would also provide infrastructure for developers — helping to keep them from re-inventing the wheel with each new idea, and perhaps even helping projects to merge when one has strengths. Connecting them with corporations who could conceivably deploy their tool.

This organization could also provide resources ranging from legal advice to marketing, two areas that software developers are notoriously bad at. It might even provide important, but miscellaneous resources like healthcare.

Please keep in mind I’m not advocating the creation of an entirely new organization — god forbid, we have enough of those already (the XKCD cartoon at right comes to mind). Instead, the goal should be to identify organizations that are already working and either connect that, or build up their capabilities with a large infusion of cash.

Anyway, we can all dream. But this is a dream that would actually make some difference.

So will this happen?

I guess it depends on the will and the money. It also depends on us: that is, on the willingness of the technically focused privacy/security community to accept that many of the elements we need to succeed are outside of our personal realm of expertise, and we need help with them.

Friends of mine also keep telling me that there are major philanthropic organizations out there looking to make a difference in this area. I’m still waiting to see it happen, but wheels turn slowly. One thing I can tell you: it wouldn’t take much to do better than what we have today.

* Full disclosure: I’m on the advisory board for Linux Foundation’s Core Infrastructure Initiative. I also once sat in on an advisory board meeting for OTF. Nobody paid me — but they did feed me lunch.

Noodling about IM protocols

The last couple of months have been a bit slow in the blogging department. It’s hard to blog when there are exciting things going on. But also: I’ve been a bit blocked. I have two or three posts half-written, none of which I can quite get out the door.

Instead of writing and re-writing the same posts again, I figured I might break the impasse by changing the subject. Usually the easiest way to do this is to pick some random protocol and poke at it for a while to see what we learn.

The protocols I’m going to look at today aren’t particularly ‘random’ — they’re both popular encrypted instant messaging protocols. The first is OTR (Off the Record Messaging). The second is Cryptocat’s group chat protocol. Each of these protocols has a similar end-goal, but they get there in slightly different ways.

I want to be clear from the start that this post has absolutely no destination. If you’re looking for exciting vulnerabilities in protocols, go check out someone else’s blog. This is pure noodling.

The OTR protocol

OTR is probably the most widely-used protocol for encrypting instant messages. If you use IM clients like Adium, Pidgin or ChatSecure, you already have OTR support. You can enable it in some other clients through plugins and overlays.

OTR was originally developed by Borisov, Goldberg and Brewer and has rapidly come to dominate its niche. Mostly this is because Borisov et al. are smart researchers who know what they’re doing. Also: they picked a cool name and released working code.

OTR works within the technical and usage constraints of your typical IM system. Roughly speaking, these are:

Messages must be ASCII-formatted and have some (short) maximum length.
Users won’t bother to exchange keys, so authentication should be “lazy” (i.e., you can authenticate your partners after the fact).
Your chat partners are all FBI informants so your chat transcripts must be plausibly deniable — so as to keep them from being used as evidence against you in a court of law.

Coming to this problem fresh, you might find goal (3) a bit odd. In fact, to the best of my knowledge no court in the history of law has ever used a cryptographic transcript as evidence that a conversation occurred. However it must be noted that this requirement makes the problem a bit more sexy. So let’s go with it!

“Dammit, they used a deniable key exchange protocol” said no Federal prosecutor ever.

The OTR (version 2/3) handshake is based on the SIGMA key exchange protocol. Briefly, it assumes that both parties generate long-term DSA public keys which we’ll denote by (pubA, pubB). Next the parties interact as follows:

The OTRv2/v3 AKE. Diagram by Bonneau and Morrison, all colorful stuff added. There’s also
an OTRv1 protocol that’s too horrible to talk about here.

There are four elements to this protocol:

Hash commitment. First, Bob commits to his share of a Diffie-Hellman key exchange (g^x) by encrypting it under a random AES key r and sending the ciphertext and a hash of g^x over to Alice.
Diffie-Hellman Key Exchange. Next, Alice sends her half of the key exchange protocol (g^y). Bob can now ‘open’ his share to Alice by sending the AES key r that he used to encrypt it in the previous step. Alice can decrypt this value and check that it matches the hash Bob sent in the first message.Now that both sides have the shares (g^x, g^y) they each use their secrets to compute a shared secret g^{xy} and hash the value several ways to establish shared encryption keys (c’, Km2, Km’2) for subsequent messages. In addition, each party hashes g^{xy} to obtain a short “session ID”.
The sole purpose of the commitment phase (step 1) is to prevent either Alice or Bob from controlling the value of the shared secret g^{xy}. Since the session ID value is derived by hashing the Diffie-Hellman shared secret, it’s possible to use a relatively short session ID value to authenticate the channel, since neither Alice nor Bob will be able to force this ID to a specific value.
Exchange of long-term keys and signatures. So far Alice and Bob have not actually authenticated that they’re talking to each other, hence their Diffie-Hellman exchange could have been intercepted by a man-in-the-middle attacker. Using the encrypted channel they’ve previously established, they now set about to fix this.Alice and Bob each send their long-term DSA public key (pubA, pubB) and key identifiers, as well as a signature on (a MAC of) the specific elements of the Diffie-Hellman message (g^x, g^y) and their view of which party they’re communicating with. They can each verify these signatures and abort the connection if something’s amiss.**
Revealing MAC keys. After sending a MAC, each party waits for an authenticated response from its partner. It then reveals the MAC keys for the previous messages.
Lazy authentication. Of course if Alice and Bob never exchange public keys, this whole protocol execution is still vulnerable to a man-in-the-middle (MITM) attack. To verify that nothing’s amiss, both Alice and Bob should eventually authenticate each other. OTR provides three mechanisms for doing this: parties may exchange fingerprints (essentially hashes) of (pubA, pubB) via a second channel. Alternatively, they can exchange the “session ID” calculated in the second phase of the protocol. A final approach is to use the Socialist Millionaires’ Problem to prove that both parties share the same secret.

The OTR key exchange provides the following properties:

Protecting user identities. No user-identifying information (e.g., long-term public keys) is sent until the parties have first established a secure channel using Diffie-Hellman. The upshot is that a purely passive attacker doesn’t learn the identity of the communicating partners — beyond what’s revealed by the higher-level IM transport protocol.*

Unfortunately this protection fails against an active attacker, who can easily smash an existing OTR connection to force a new key agreement and run an MITM on the Diffie-Hellman used during the next key agreement. This does not allow the attacker to intercept actual message content — she’ll get caught when the signatures don’t check out — but she can view the public keys being exchanged. From the client point of view the likely symptoms are a mysterious OTR error, followed immediately by a successful handshake.

One consequence of this is that an attacker could conceivably determine which of several clients you’re using to initiate a connection.

Weak deniability. The main goal of the OTR designers is plausible deniability. Roughly, this means that when you and I communicate there should be no binding evidence that we really had the conversation. This rules out obvious solutions like GPG-based chats, where individual messages would be digitally signed, making them non-repudiable.

Properly defining deniability is a bit complex. The standard approach is to show the existence of an efficient ‘simulator’ — in plain English, an algorithm for making fake transcripts. The theory is simple: if it’s trivial to make fake transcripts, then a transcript can hardly be viewed as evidence that a conversation really occurred.

OTR’s handshake doesn’t quite achieve ‘strong’ deniability — meaning that anyone can fake a transcript between any two parties — mainly because it uses signatures. As signatures are non-repudiable, there’s no way to fake one without actually knowing your public key. This reveals that we did, in fact, communicate at some point. Moreover, it’s possible to create an evidence trail that I communicated with you, e.g., by encoding my identity into my Diffie-Hellman share (g^x). At very least I can show that at some point you were online and we did have contact.

But proving contact is not the same thing as proving that a specific conversation occurred. And this is what OTR works to prevent. The guarantee OTR provides is that if the target was online at some pointand you could contact them, there’s an algorithm that can fake just about any conversation with the individual. Since OTR clients are, by design, willing to initiate a key exchange with just about anyone, merely putting your client online makes it easy for people to fake such transcripts.***

Towards strong deniability. The ‘weak’ deniability of OTR requires at least tacit participation of the user (Bob) for which we’re faking the transcript. This isn’t a bad property, but in practice it means that fake transcripts can only be produced by either Bob himself, or someone interacting online with Bob. This certainly cuts down on your degree of deniability.

A related concept is ‘strong deniability‘, which ensures that any party can fake a transcript using only public information (e.g., your public keys).

OTR doesn’t try achieve strong deniability — but it does try for something in between. The OTR version of deniability holds that an attacker who obtains the network traffic of a real conversation — even if they aren’t one of the participants — should be able alter the conversation to say anything he wants. Sort of.

The rough outline of the OTR deniability process is to generate a new message authentication key for each message (using Diffie-Hellman) and then reveal those keys once they’ve been used up. In theory, a third party can obtain this transcript and — if they know the original message content — they can ‘maul’ the AES-CTR encrypted messages into messages of their choice, then they can forge their own MACs on the new messages.

OTR message transport (source: Bonneau and Morrison, all colored stuff added).

Thus our hypothetical transcript forger can take an old transcript that says “would you like a Pizza“ and turn it into a valid transcript that says, for example, “would you like to hack STRATFOR“… Except that they probably can’t, since the first message is too short and… oh lord, this whole thing is a stupid idea — let’s stop talking about it.

The OTRv1 handshake. Oh yes, there’s also an OTRv1 protocol that has a few issues and isn’t really deniable. Even better, an MITM attacker can force two clients to downgrade to it, provided both support that version. Yuck.

So that’s the OTR protocol. While I’ve pointed out a few minor issues above, the truth is that the protocol is generally an excellent way to communicate. In fact it’s such a good idea that if you really care about secrecy it’s probably one of the best options out there.

Cryptocat

Since we’re looking at IM protocols I thought it might be nice to contrast with another fairly popular chat protocol: Cryptocat‘s group chat. Cryptocat is a web-based encrypted chat app that now runs on iOS (and also in Thomas Ptacek’s darkest nightmares).

Cryptocat implements OTR for ‘private’ two-party conversations. However OTR is not the default. If you use Cryptocat in its default configuration, you’ll be using its hand-rolled protocol for group chats.

The Cryptocat group chat specification can be found here, and it’s remarkably simple. There are no “long-term” keys in Cryptocat. Diffie-Hellman keys are generated at the beginning of each session and re-used for all conversations until the app quits. Here’s the handshake between two parties:

Cryptocat group chat handshake (current revision). Setting is Curve25519. Keys are generated when the application launches, and re-used through the session.

If multiple people join the room, every pair of users repeats this handshake to derive a shared secret between every pair of users. Individuals are expected to verify each others’ keys by checking fingerprints and/or running the Socialist Millionaire protocol.

Unlike OTR, the Cryptocat handshake includes no key confirmation messages, nor does it attempt to bind users to their identity or chat room. One implication of this is that I can transmit someone else’s public key as if it were my own — and the recipients of this transmission will believe that the person is actually part of the chat.

Moreover, since public keys aren’t bound to the user’s identity or the chat room, you could potentially route messages between a different user (even a user in a different chat room) while making it look like they’re talking to you. Since Cryptocat is a group chat protocol, there might be some interesting things you could do to manipulate the conversation in this setting.****

Does any of this matter? Probably not that much, but it would be relatively easy (and good) to fix these issues.

Message transmission and consistency. The next interesting aspect of Cryptocat is the way it transmits encrypted chat messages. One of the core goals of Cryptocat is to ensure that messages are consistent between individual users. This means that all users should be able to verify that the other user is receiving the same data as it is.

Cryptocat uses a slightly complex mechanism to achieve this. For each pair of users in the chat, Cryptocat derives an AES key and an MAC key from the Diffie-Hellman shared secret. To send a message, the client:

Pads the message by appending 64 bytes of random padding.
Generates a random 12-byte Initialization Vector for each of the Nusers in the chat.
Encrypts the message using AES-CTR under the shared encryption key for each user.
Concatenates all of the N resulting ciphertexts/IVs and computes an HMAC of the whole blob under each recipient’s key.
Calculates a ‘tag’ for the message by hashing the following data:
padded plaintext || HMAC-SHA512alice || HMAC-SHA512bob || HMAC-SHA512carol || …
Broadcasts the N ciphertexts, IVs, MACs and the single ‘tag’ value to all users in the conversation.

When a recipient receives a message from another user, it verifies that:

The message contains a valid HMAC under its shared key.
This IV has not been received before from this sender.
The decrypted plaintext is consistent with the ‘tag’.

Roughly speaking, the idea here is to make sure that every user receives the same message. The use of a hashed plaintext is a bit ugly, but the argument here is that the random padding protects the message from guessing attacks. Make what you will of this.

Anti-replay. Cryptocat also seeks to prevent replay attacks, e.g., where an attacker manipulates a conversation by simply replaying (or reflecting) messages between users, so that users appear to be repeating statements. For example, consider the following chat transcripts:

Replays and reflection attacks.

Replay attacks are prevented through the use of a global ‘IV array’ that stores all previously received and sent IVs to/from all users. If a duplicate IV arrives, Cryptocat will reject the message. This is unwieldy but it generally seems ok to prevent replays and reflection.

A limitation of this approach is that the IV array does not live forever. In fact, from time to time Cryptocat will reset the IV array without regenerating the client key. This means that if Alice and Bob both stay online, they can repeat the key exchange and wind up using the same key again — which makes them both vulnerable to subsequent replays and reflections. (Update: This issue has since been fixed).

In general the solution to these issues is threefold:

Keys shouldn’t be long-term, but should be regenerated using new random components for each key exchange.
Different keys should be derived for the Alice->Bob and Bob->Alice direction
It would be be more elegant to use a message counter than to use this big, unwieldy key array.

The good news is that the Cryptocat developers are working on a totally new version of the multi-party chat protocol that should be enormously better.

In conclusion

I said this would be a post that goes nowhere, and I delivered! But I have to admit, it helps to push it out of my system.

None of the issues I note above are the biggest deal in the world. They’re all subtle issues, which illustrates two things: first, that crypto is hard to get right. But also: that crypto rarely fails catastrophically. The exciting crypto bugs that cause you real pain are still few and far between.

Notes:

* In practice, you might argue that the higher-level IM protocol already leaks user identities (e.g., Jabber nicknames). However this is very much an implementation choice. Moreover, even when using Jabber with known nicknames, you might access the Jabber server using one of several different clients (your computer, phone, etc.). Assuming you use Tor, the only indication of this might be the public key you use during OTR. So there’s certainly useful information in this protocol.

** Notice that OTR signs a MAC instead of a hash of the user identity information. This happens to be a safe choice given that the MAC used is based on HMAC-SHA2, but it’s not generally a safe choice. Swapping the HMAC out for a different MAC function (e.g., CBC-MAC) would be catastrophic.

*** To get specific, imagine I wanted to produce a simulated transcript for some conversation with Bob. Provided that Bob’s client is online, I can send Bob any g^x value I want. It doesn’t matter if he really wants to talk to me — by default, his client will cheerfully send me back his own g^y and a signature on (g^x, g^y, pub_B, keyid_B) which, notably, does not include my identity. From this point on all future authentication is performed using MACs and encrypted under keys that are known to both of us. There’s nothing stopping me from faking the rest of the conversation.

**** Incidentally, a similar problem exists in the OTRv1 protocol.

Attack of the week: OpenSSL Heartbleed

Ouch. (Logo from heartbleed.com)

I start every lecture in my security class by asking the students to give us any interesting security or crypto news they’ve seen recently, preferably with a focus on vulnerabilities. The start of my last class was pretty lame, which meant either (1) we’d finally learned how to make our crypto software secure, or (2) something terrible was about to happen.

I’ll let you take a guess which one it was.

The news today is all about the Heartbleed bug in OpenSSL (nb: a truly awful name that makes me glad there are no other vital organs referenced in the TLS code). Unlike all the fancy crypto-related attacks we’ve seen recently — BEAST, CRIME, Lucky13, etc. — Heartbleed doesn’t require any interesting crypto. In fact it’s the result of a relatively mundane coding error. And predictably, this makes it more devastating than all of those fancy attacks put together.

I’m going to talk about Heartbleed using the ‘fun’ question/answer format I save for this kind of post.

What’s Heartbleed and why should I care about OpenSSL?

In case you haven’t read the Heartbleed website, go do that. Here I’ll just give a quick overview.

Heartbleed is a surprisingly small bug in a piece of logic that relates to OpenSSL’s implementation of the TLS ‘heartbeat’ mechanism. The bug is present in OpenSSL versions 1.0.1 through 1.0.1f (and not in other versions). Sadly, these versions have seen a great deal of adoption lately, because security professionals have been urging developers to implement more recent versions of TLS (1.1 and 1.2). Deployment of TLS 1.2 currently stands at about 30% of the SSL Pulse data set. Many of those servers are likely vulnerable.

The problem is fairly simple: there’s a tiny vulnerability — a simple missing bounds check — in the code that handles TLS ‘heartbeat’ messages. By abusing this mechanism, an attacker can request that a running TLS server hand over a relatively large slice (up to 64KB) of its private memory space. Since this is the same memory space where OpenSSL also stores the server’s private key material, an attacker can potentially obtain (a) long-term server private keys, (b) TLS session keys, (c) confidential data like passwords, (d) session ticket keys.

Alleged Yahoo user credentials visible due to Heartbleed (source: Mark Loman).

Any of the above may allow an attacker to decrypt ongoing TLS sessions or steal useful information. However item (a) above is by far the worst, since an attacker who obtains the server’s main private keys can potentially decrypt past sessions (if made using the non-PFS RSA handshake) or impersonate the server going forward. Worst of all, the exploit leaves no trace.

You should care about this because — whether you realize it or not — a hell of a lot of the security infrastructure you rely on is dependent in some way on OpenSSL. This includes many of the websites that store your personal information. And for better or for worse, industry’s reliance on OpenSSL is only increasing.

What’s the remedy?

Unfortunately it’s pretty nasty.

You can test if a given server is vulnerable using one of these tools (note: use at your own risk). Having identified a problem, the first step is to patch OpenSSL. Fortunately this is relatively easy. The 1.0.1g version is not vulnerable, and Debian has a patch. You can also recompile OpenSSL with the –DOPENSSL_NO_HEARTBEATS option.

Sadly, this is only the beginning. Since there’s no way to tell whether a server has been exploited (and exploit code is now in the wild) you need to assume that it is. This means the safe move is to revoke your certificate and get a new one. Have fun.

What’s the bug?

The TLS Heartbeat mechanism is designed to keep connections alive even when no data is being transmitted. Heartbeat messages sent by one peer contain random data and a payload length. The other peer is suppose to respond with a mirror of exactly the same data.

   When a HeartbeatRequest message is received and sending a
   HeartbeatResponse is not prohibited as described elsewhere in this
   document, the receiver MUST send a corresponding HeartbeatResponse
   message carrying an exact copy of the payload of the received
   HeartbeatRequest.

The data structure of the request looks like this. Note the two-byte payload length:

   struct {
      HeartbeatMessageType type;
      uint16 payload_length;
      opaque payload[HeartbeatMessage.payload_length];
      opaque padding[padding_length];

} HeartbeatMessage;

Which brings us to the bug. The original bug was introduced in this Git commit. The code appears in different files (for DTLS and TLS). Here we’ll look at the file t1_lib.c:

         /* Read type and payload length first */

         hbtype = *p++;

         n2s(p, payload);

         pl = p;

2416

         if (s->msg_callback)

                 s->msg_callback(0, s->version, TLS1_RT_HEARTBEAT,

                         &s->s3->rrec.data[0], s->s3->rrec.length,

                         s, s->msg_callback_arg);

2421

         if (hbtype == TLS1_HB_REQUEST)

{

                 unsigned char *buffer, *bp;

                 int r;

2426

                 /* Allocate memory for the response, size is 1 bytes

                  * message type, plus 2 bytes payload length, plus

                  * payload, plus padding

*/

                 buffer = OPENSSL_malloc(1 + 2 + payload + padding);

                 bp = buffer;

2433

                 /* Enter response type, length and copy payload */

                 *bp++ = TLS1_HB_RESPONSE;

                 s2n(payload, bp);

                 memcpy(bp, pl, payload);

2438

                 r = ssl3_write_bytes(s, TLS1_RT_HEARTBEAT, buffer, 3 + payload + padding);

As you can see, the incoming (adversarially-generated) data contains a payload length (“payload”) which is trusted without bounds checks. OpenSSL then allocates a buffer for its response, and copies “payload” data bytes from the pointer “pl” into it. Unfortunately, there’s no check to make sure that there are actually “payload” bytes in data, or that this is in bounds. Hence the attacker gets a slice of data from main memory — one that’s up to 64KB in length.

The fix is equally simple. Just add a bounds check:

+    /* Read type and payload length first */
+    if (1 + 2 + 16 > s->s3->rrec.length)
+        return 0; /* silently discard */
+    hbtype = *p++;
+    n2s(p, payload);
+    if (1 + 2 + payload + 16 > s->s3->rrec.length)
+        return 0; /* silently discard per RFC 6520 sec. 4 */
+    pl = p;
+

Wasn’t that dull?

Come on guys, next time please: use a cache timing weakness in AES-GCM encryption. This whole ‘effectively breaking OpenSSL by finding simple C coding bugs’ isn’t even challenging.

Should we rail against the OpenSSL developers for this?

Don’t even think about it. The OpenSSL team, which is surprisingly small, has been given the task of maintaining the world’s most popular TLS library. It’s a hard job with essentially no pay. It involves taking other folks’ code (as in the case of Heartbeat) and doing a best-possible job of reviewing it. Then you hope others will notice it and disclose it responsibly before disasters happen.

The OpenSSL developers have a pretty amazing record considering the amount of use this library gets and the quantity of legacy cruft and the number of platforms (over eighty!) they have to support. Maybe in the midst of patching their servers, some of the big companies that use OpenSSL will think of tossing them some real no-strings-attached funding so they can keep doing their job.

Dear Apple: Please set iMessage free

Normally I avoid complaining about Apple because (a) there are plenty of other people carrying that flag, and (b) I honestly like Apple and own numerous lovely iProducts. I’m even using one to write this post.

Moroever, from a security point of view, there isn’t that much to complain about. Sure, Apple has a few irritating habits — shipping old, broken versions of libraries in its software, for example. But on the continuum of security crimes this stuff is at best a misdemeanor, maybe a half-step above ‘improper baby naming‘. Everyone’s software sucks, news at 11.

There is, however, one thing that drives me absolutely nuts about Apple’s security posture. You see, starting about a year ago Apple began operating one of the most widely deployed encrypted text message services in the history of mankind. So far so good. The problem is that they still won’t properly explain how it works.

And nobody seems to care.

I am, of course, referring to iMessage, which was deployed last year in iOS Version 5. It allows — nay, encourages — users to avoid normal carrier SMS text messages and to route their texts through Apple instead.

Now, this is not a particularly new idea. But iMessage is special for two reasons. First it’s built into the normal iPhone texting application and turned on by default. When my Mom texts another Apple user, iMessage will automatically route her message over the Internet. She doesn’t have to approve this, and honestly, probably won’t even know the difference.

Secondly, iMessage claims to bring ‘secure end-to-end encryption‘ (and authentication) to text messaging. In principle this is huge! True end-to-end encryption should protect you from eavesdropping even by Apple, who carries your message. Authentication should protect you from spoofing attacks. This stands in contrast to normal SMS which is often not encrypted at all.

So why am I looking a gift horse in the mouth? iMessage will clearly save you a ton in texting charges and it will secure your messages for free. Some encryption is better than none, right?

Well maybe.

To me, the disconcerting thing about iMessage is how rapidly it’s gone from no deployment to securing billions of text messages for millions of users. And this despite the fact that the full protocol has never been published by Apple or (to my knowledge) vetted by security experts. (Note: if I’m wrong about this, let me know and I’ll eat my words.)

What’s worse is that Apple has been hyping iMessage as a secure protocol; they even propose it as a solution to some serious SMS spoofing bugs. For example:

Apple takes security very seriously. When using iMessage instead of SMS, addresses are verified which protects against these kinds of spoofing attacks. One of the limitations of SMS is that it allows messages to be sent with spoofed addresses to any phone, so we urge customers to be extremely careful if they’re directed to an unknown website or address over SMS.

And this makes me nervous. While iMessage may very well be as secure as Apple makes it out to be, there are plenty of reasons to give the protocol a second look.

For one thing, it’s surprisingly complicated.

iMessage is not just two phones talking to each other with TLS. If this partial reverse-engineering of the protocol (based on the MacOS Mountain Lion Messages client) is for real, then there are lots of moving parts. TLS. Client certificates. Certificate signing requests. New certificates delivered via XML. Oh my.

As a general rule, lots of moving parts means lots of places for things to go wrong. Things that could seriously reduce the security of the protocol. And as far as I know, nobody’s given this much of a look. It’s surprising.

Moreover, there are some very real questions about what powers Apple has when it comes to iMessage. In principle ‘end-to-end’ encryption should mean that only the end devices can read the connection. In practice this is almost certainly not the case with iMessage. A quick glance at the protocol linked above is enough to tell me that Apple operates as a Certificate Authority for iMessage devices. And as a Certificate Authority, it may be able to substantially undercut the security of the protocol. When would Apple do this? How would it do this? Are we allowed to know?

Finally, there have been several reports of iMessages going astray and even being delivered to the wrong (or stolen) devices. This stuff may all have a reasonable explanation, but it’s yet another set of reasons why we it would be nice to understand iMessage better than we do now if we’re going to go around relying on it.

So what’s my point with all of this?

This is obviously not a technical post. I’m not here to present answers, which is disappointing. If I knew the protocol maybe I’d have some. Maybe I’d even be saying good things about it.

Rather, consider this post as a plea for help. iMessage is important. People use it. We ought to know how secure it is and what risks those people are taking by using it. The best solution would be for Apple to simply release a detailed specification for the protocol — even if they need to hold back a few key details. But if that’s not possible, maybe we in the community should be doing more to find out.

Remember, it’s not just our security at stake. People we know are using these products. It would be awfully nice to know what that means.

A missing post (updated)

Update (8/27): I’ve put the original post back up.

Update (8/9): I’ve re-written this post to include a vague, non-specific explanation of the bug. I’ve now confirmed the problem with one vendor, who has asked for a week to issue a patch. So far I haven’t had a response from the DCP’s technical people. And yes, I do realize someone PasteBinned the original post while it was up.

A few people have asked what happened to the post that was in this space just a few hours ago. No, you’re not going crazy. It was here.

The post contained a long, detailed evaluation of the HDCP v2 protocol. My idea was to do real-time evaluation of an industry protocol that hasn’t been deployed yet — a kind of ‘liveblogging’ cryptanalysis. What I expected to find was some bad practices, which I would gently poke fun at. I didn’t expect to find anything serious.

I was wrong in that initial judgement, with some caveats. I’m going to give a vague and non-specific summary here, and I hope to re-post the detailed technical post in a few days when I’ve heard (something, anything!) from DCP, the organization that maintains HDCP.

In case you’ve never heard of it, HDCP is a security protocol used to ‘protect’ video traveling over wired and wireless networks. There are two versions. Version 1 is in your TV today, and was seriously compromised in 2010. Version 2 is much better, but has only been deployed in a few products — including those that implement MiraCast (formerly Wi-Fi Display).

Version 2 contains a key agreement protocol that’s designed to establish a session encryption key between a transmitter (your phone, for example) and a receiver (a TV). Once this key is established, the transmitter can encrypt all video data going over the wire.

What I discovered in my brief analysis is a flaw in the key agreement protocol that may allow a man-in-the-middle to recover the session key (actually the ‘master secret’ used to derive the session key). This could potentially allow them to decrypt content. More on that in a minute, though.

I also discovered some slightly less serious flaws elsewhere in the protocol. It turns out that the DCP already knows about those, thanks to some enterprising work by a smart guy at an unnamed vendor (who deserves credit, and will get it once I put the original post back up).

Now for a few big caveats about the session key bug.

The bug I found does not get you all the way to decrypting HDCPv2 streams in practice, thanks to a tiny additional protection I missed while writing the original post. I don’t think much of this protection, since it involves a secret that’s stored in every single HDCPv2-compliant device. That’s a pretty lousy way to keep a secret.

And of course I haven’t personally verified this in any real HDCP devices, since I don’t own any. Although if I did, I could use this nifty HDCP plugin for WireShark to do some of the work.

The issue has been confirmed by one vendor, who is working on a patch for their product. Their products are used in real things that you’ve heard of, so I’m trusting that they’d know.

The last thing I want to address is why I published this, and why I subsequently pulled it.

When I wrote the original post I thought HDCP v2 was just a ‘paper spec’ — that there were no devices actually using it. Shortly after posting, I came across one commercial product that does claim to support HDCPv2. Later I discovered a few others. To be on the safe side, I decided to pull the post until I could notify the vendors. Then through sheer ineptitude I briefly re-posted it. Now I’m doing my best to put the toothpaste back in the tube.

As soon as I get some feedback I intend to put the post back up. A post which, incidentally, was not intended to break anything, but rather to serve as a lesson in just how complicated it is to design your own protocol. I suppose it’s achieved that goal.

Anyway, I’m putting this up as a placeholder in case you’re curious about what happened or why the heck I’m not blogging. Writing a long technical post and then having to can it is a drag. But hopefully we’ll be back to our regularly-scheduled programming in no time at all.

Indifferentiability

After umpteen weeks writing about broken stuff, I’m thrilled to tell you that for once, nothing awful is happening in the crypto world. It won’t last. But while it does, let’s grab the opportunity to talk about something constructive.

Now a word of warning: what I’m going to talk about today is fairly wonky and (worse), involves hash functions. If you’re not feeling up for this, this is your cue to bail and go read something nice about buffer overflows.

For those still with me, the subject of this post is the design of hash functions, and more specifically: the indifferentiability proofs that designers write to argue for their security. I was surprised to find that most people have never heard of these proofs, and thus have no idea why they’re useful. That’s too bad, since they’re extremely important to the way we analyze hash functions today.

Merkle-Damgård

This is not Ivan Damgård. (Seriously Google?)

The best way to begin any discussion of hash function design is to take a quick peek inside of the hash functions we actually use. Since the most popular hashes today are MD5 (ugh) and SHA, the right place to start is with the ‘Merkle-Damgård’ paradigm.

To understand Merkle-Damgård, you need to understand that cryptographers love to build complicated things out of simpler components. Under the hood of most block ciphers you’ll find S-boxes. Similarly, if you take the lid off a Merkle-Damgård hash function — surprise! — you find block ciphers. Or at least, something very much like them.

This approach dates to a 1979 proposal by a young cryptographer named Ralph Merkle. What Merkle showed is a way to build hash functions with a variable-length input, using any fixed one-way compression function (a one-way function that spits out fewer bits than it takes in). While Merkle wasn’t specific about the function, he suggested that DES might be a good candidate.

Expressed as a colorful diagram, the Merkle construction looks something like this:

Merkle-Damgård Construction (source: Wikipedia because I’m too lazy to
draw my own diagrams). IV is a fixed value. f is a one-way compression function.

The beauty of Merkle’s proposal is that it’s relatively simple to understand. You simply chop your message into blocks, then feed each block into the function f along with the output of the previous function evaluation. Throw in a finalization stage and you’re done.

Of course there’s a difference between proposing a technique and showing that it actually works. It would take ten more years, but at CRYPTO 1989, Merkle and another cryptographer named Ivan Damgård independently submitted formal analyses of Merkle’s proposal. What they showed is that as long as the function f has certain ideal properties, the resulting hash function is guaranteed to be collision-resistant. The rest, as they say, is history.

The popularity of Merkle-Damgård can be attributed in part to its security proof. But it also owes something to some major practical advantages:

You can use any secure block cipher as the function f, with just a few tweaks.
M-D hash functions can be pretty darn fast, again depending on f and how you use it.
M-D hashes allow you to digest ‘live’ data streams, where you don’t know in advance how much data you’re going to be hashing.

Of course, Merkle-Damgård hashes also have serious weaknesses. The most famous is the ‘length extension attack‘ in which an attacker, given only H(M) for some unknown message M, can ‘tack on’ additional blocks of her own choosing. This issue spells big trouble for people who think that H(key || message) is a good Message Authentication Code. What’s interesting about the length-extension issue is not that it leads to broken MACs. I mean, that is interesting, and it’s why you should use HMAC. But what’s really interesting is that this flaw doesn’t represent a violation of the collision-resistance guarantee. The two issues are in fact completely orthogonal. And this tells us something fundamental. Namely: collision-resistance is not enough.Today’s implementers do all kinds of crazy things with hash functions, and many of those applications require much more than collision-resistance. To achieve the necessary properties, we first need to figure out what they are. And that requires us to think hard about the following question:

What the heck is a secure hash function?

If you crack a typical security textbook (or visit the Wikipedia page on hash functions), you’ll see a long list of things of things a hash function ‘must’ accomplish. The list usually starts with these:

Collision resistance. It should be hard to find any pair of messages M1, M2 such that H(M1) == H(M2).
Pre-image resistance. Given only h it should be hard to find a ‘pre-image’ M2 such that H(M2) == h.

Now leave aside the technical fact that none of the unkeyed hash functions we use today are ‘truly’ collision-resistant. Or that the above definition of pre-image resistance implies that I can hash my cat’s name (‘fluffy’) and nobody can invert the hash (note: not true. Go ask LinkedIn if you don’t believe me.) The real problem is that these definitions don’t cover the things that people actually do with hash functions.

For example, take the construction of PRNGs. A common PRNG design hashes together large pools of gathered entropy in the hope that the result will be sufficiently uniform for cryptographic work. This is so common that it’s probably happening somewhere on your computer right now. And yet, absolutely nothing in the definitions above implies that this technique is safe!* Similar problems exist for key derivation functions, and even for signature schemes like ECDSA which clearly require hash functions that are more than just collision-resistant.

The more you look into the way that people use hash functions, the more you realize that they really need something that produces ‘random-looking’ output. Unfortunately, this notion is surprisingly hard to formalize. Hash functions are unkeyed, so they’re not pseudo-random functions. What in the world are people asking for?

Random oracles and indifferentiability

The answer, if you dig hard enough, is that people want hash functions to be random oracles.

Random oracles are cryptographers’ conception of what an ‘ideal’ hash function should be. Put succinctly, a random oracle is a perfectly random function that you can evaluate quickly. Random functions are beautiful not just because the output is random-looking (of course), but also because they’re automatically collision-resistant and pre-image resistant. It’s the only requirement you ever need.

The problem with random functions is that you just can’t evaluate them quickly: you need exponential storage space to keep them, and exponential time to evaluate one. Moreover, we know of nothing in the ‘real’ world that can approximate them. When cryptographers try to analyze their schemes with random functions, they have to go off into an imaginary fantasy world that we call the ‘random oracle model‘.

But ok, this post is not to judge. For the moment, let’s imagine that we are willing to visit this fantasy world. An obvious question is: what would it take to build a random oracle? If we had a compression function that was good enough — itself a random function — could we use a technique like Merkle-Damgård to get the rest of the way?

In 2004, Maurer, Renner and Holenstein gave us a powerful tool for answering this question. What they showed is that it’s always possible to replace functionality A (e.g., a random oracle) with another functionality B (e.g., an ideal compression function) provided that the following rules are satisfied:

There exists a way to ‘construct’ something ‘like’ A out of B.
There exists a way to ‘simulate’ something ‘like’ B using A.
An attacker who interacts with {constructed A-like thing, B} cannot tell the difference (i.e., can’t differentiate it) from {A, simulated B-like thing}

The definition of simulation gets a bit wonky. but expressed in simpler language all this means is: if you can show that your hash function, instantiated with an ‘ideal’ compression function, looks indistinguishable from a random oracle. And you can show that a manufactured compression function, built using a random oracle as an ingredient, looks indistinguishable from an ideal compression function, then you can always replace one with the other. That is, your hash function is ‘good enough’ to be a random oracle.

The following year, Coron, Dodis, Malinaud and Puniya applied this framework to Merkle-Damgård-hash functions. Their first result was immediate: such a proof does not work for Merkle-Damgård. Of course this shouldn’t actually surprise us. We already know that Merkle-Damgård doesn’t behave like a random oracle, since random oracles don’t exhibit length-extension attacks. Still it’s one thing to know this, and another to see a known problem actually turn up and screw up a proof. So far, no problem.

What Coron et al. showed next is much more interesting:

They proved formally that Merkle-Damgård can be made indifferentiable from a random oracle, as long as you apply a prefix-free encoding to the input before hashing it. Prefix-free encodings prevent length-extensions by ensuring that no message can ever be a prefix of another.
Next, they proved the security of HMAC applied to a Merkle-Damgård hash.
Finally, and best of all, they showed that if you simply drop some bits from the last output block — something called a ‘chop’ construction — you can make Merkle-Damgård hashes secure with much less work.

The best part of Coron et al.‘s findings is that the chop construction is already (inadvertently) in place on SHA384, which is constructed by dropping some output bits from its big-brother hash SHA512. The modern hash variants SHA512/224 and SHA512/256 also have this property.** So this theoretical work already has one big payoff: we know that (under certain assumptions) these hashes may be better than some of the others.

And these results have bigger implications. Now that we know how to do this, we can repeat the process for just about every candidate hash function anyone proposes. This lets us immediately weed out obvious bugs, and avoid standardizing another hash with problems like the length extension attack. This process has become so common that all of the SHA3 candidates now sport exactly such an indifferentiability proof.

Of course, in the real world, indifferentiability only takes you so far. It does tell us something, but it doesn’t tell us everything. Sure, if the compression function is perfect, you obtain a strong result about the hash function. But compression functions are never perfect. Real compression functions have glitches and oddities that can make these theoretical results irrelevant. This is why we’ll always need smart people to arm wrestle over which hash we get to use next.

In conclusion

If I had it in me, I’d go on to talk about the SHA3 candidates, and the techniques that each uses to achieve security in this model. But this has already been a long post, so that will have to wait for another time.

I want to say only one final thing.

This is a practical blog, and I admit that I try to avoid theory. What fascinates me about this area is that it’s a great example of a place where theory has directly come to the aid of practice. You may think of hash functions as whizzing little black boxes of ad-hoc machinery, and to some extent they are. But without theoretical analysis like this, they’d be a whole lot more ad-hoc. They might not even work.

Remember this when NIST finally gets around to picking Keccak BLAKE.

Notes:

* For a ridiculous example, imagine that you have a secure (collision-resistant, pre-image resistant) hash function H. Now construct a new hash function H’ such that H'(M) = {“long string of 0s” || H(M)}. This function is as collision-resistant as the original, but won’t be very useful if you’re generating keys with it.

** Thanks to Paulo Barreto for fixing numerous typos and pointing out that SHA512/256 and /224 make excellent candidates for chop hashes!

Flame, certificates, collisions. Oh my.

Update 6/6: Microsoft has given us more details on what’s going on here. If these collision attacks are truly novel, this tells us a lot about who worked on Flame and how important it is.

Update 6/7: Yes, it’s officially a novel MD5 collision attack, with the implication that top cryptographers were involved in Flame’s creation. We truly are living in the future.

See detailed updates (and a timeline) at the bottom of this post.

If you pay attention to these things, you’ve heard that there’s a new piece of government-issued malware making the rounds. It’s called Flame (or sometimes Skywiper), and it’s basically the most sophisticated — or most bloated — piece of malicious software ever devised. Kaspersky gets the credit for identifying Flame, but the task of tearing it to pieces has fallen to a whole bunch of different people.

Normally I don’t get too excited about malware, cause, well, that kind of thing is somebody else’s problem. But Flame is special: if reports are correct, this is the first piece of malware ever to take advantage of an MD5 certificate collision to enable code signing. Actually, it may be the first real anything to use MD5 certificate collisions. Neat stuff.

What we know is pretty sketchy, and is based entirely on the information-sparse updates coming from Microsoft’s security team. Here’s the story so far:

Sometime yesterday (June 3), Microsoft released an urgent security advisory warning administrators to revoke two Intermediate certificates hanging off of the Microsoft root cert. These belong to the Terminal Services Licensing service. Note that while these are real certificates — with keys and everything — they actually have nothing to do with security: their sole purpose was to authorize, say, 20 seats on your Terminal Services machine.

Despite the fact that these certs don’t serve any real security purpose, and shouldn’t be confused with anything that matters, Microsoft hung them off of the same root as the real certificates it uses for, well, important stuff. Stuff like signing Windows Update packages. You see where this is going?

Actually, this shouldn’t have been a real problem, because these licensing certs (and any new certificates beneath them) should have been recognizably different from code-signing certificates. Unfortunately, someone in Product Licensing appears to have really screwed the pooch. These certificates were used to generate Licensing certificates for customers. And each of those certificates had the code signing bit set.

The result? Anyone who paid for a Terminal Services license could (apparently) sign code as if they were Microsoft. If you’re wondering what that looks like, it looks like this.

This is obviously a bad thing. For many reasons. Mikko Hypponen points out that many organizations whitelist software signing by Microsoft, so even if institutions were being paranoid, this malware basically had carte blanche.

Ok, so far this is really bad, but just a garden-variety screwup. I mean, a huge one, to be sure. But nothing really interesting.

Here’s where that changes. Just today, Microsoft released a new update. This is the new piece:

The Flame malware used a cryptographic collision attack in combination with the terminal server licensing service certificates to sign code as if it came from Microsoft. However, code-signing without performing a collision is also possible.

Now, I’m not sure why the ‘collision attack’ is necessary if code-signing is possible without a collision. But who cares! A collision attack! In the wild! On top-secret government-issued Malware! Dan Brown couldn’t hold a candle to this.

If we look at the Licensing certificates in question, we do indeed see that at least one — created in 2009 — uses the MD5 hash function, something everyone knows is a bad idea:

Certificate: Data: Version: 3 (0x2) Serial Number: 3a:ab:11:de:e5:2f:1b:19:d0:56 Signature Algorithm: md5WithRSAEncryption Issuer: OU=Copyright (c) 1997 Microsoft Corp., OU=Microsoft Corporation, CN=Microsoft Root Authority Validity Not Before: Dec 10 01:55:35 2009 GMT Not After : Oct 23 08:00:00 2016 GMT Subject: C=US, ST=Washington, L=Redmond, O=Microsoft Corporation, OU=Copyright (c) 1999 Microsoft Corp., CN=Microsoft Enforced Licensing Intermediate PCA Subject Public Key Info: Public Key Algorithm: rsaEncryption Public-Key: (2048 bit)

And so… Well, I don’t really know. That’s where we run out of information. And end up with a lot of questions.

For one thing, why did the Flame authors need a collision? What extra capabilities did it get them? (Update: the best theory I’ve heard comes from Nate Lawson, who theorizes that the collision might have allowed the attackers to hide their identity. Update 6/6: The real reason has to do with certificate extensions and compatibility with more recent Windows versions. See the end of this post.)

More importantly, what kind of resources would it take to find the necessary MD5 collision? That would tell us a lot about the capabilities of the people government contractors who write malware like this. In 2008 it took one day on a supercomputer, i.e., several hundred PS3s linked together.

And just out of curiosity, what in the world was Microsoft doing issuing MD5-based certificates in 2009? I mean, the code signing bit is disastrous, but at least that’s a relatively subtle issue — I can understand how someone might have missed it. But making MD5 certificates one year after they were definitively shown to be insecure, that’s hard to excuse. Lastly, why do licensing certificates even need to involve a Certificate Signing Request at all, which is the vector you’d use for a collision attack?

I hope that we’ll learn the answers to these questions over the next couple of days. In the mean time, it’s a fascinating time to be in security. If not, unfortunately, a very good time for anyone else.

***

Update 6/5: For those who aren’t familiar with certificate collision attacks, I should clear up the basic premise. Most Certificate Authorities sign a Certificate Signing Request (CSR) issued by a possibly untrustworthy customer. The idea of an MD5 collision-finding attack is to come up with two different CSRs that hash to the same thing. One would be legitimate and might contain your Terminal Services licensing number. The other would contain… something else. By signing the first CSR, Microsoft would be implicitly creating a certificate on the second batch of information.

Finding collisions is a tricky process, since it requires you to muck with the bits of the public key embedded in the certificate (see this paper for more details). Also, some CAs embed a random serial number into the certificate, which really messes with the attack. Microsoft did not.

Finally, some have noticed that Microsoft is still using a root certificate based on MD5, one that doesn’t expire until 2020, and hasn’t been revoked. What makes this ok? The simple answer is that it’s not ok. However, Microsoft probably does not sign arbitrary CSRs with that root certificate, meaning that collision attacks are not viable against it. It’s only the Intermediate certs, the ones that actually sign untrusted CSRs you need to worry about — today.

***

Update 6/6: Microsoft has finally given us some red meat. Short summary: the Terminal Services Licensing certs do work to sign code on versions of Windows prior to Vista, with no collision attack needed. All in all, that’s a heck of a bad thing. But it seems that more recent versions of Windows objected to the particular X.509 extensions that were in the TS licensing certs. The government (or their contractors) therefore used a collision attack to make a certificate that did not have these extensions. From what I’m reading on mailing lists, the collision appears to be “new, and of scientific interest”.

If this pans out, it’s a big deal. Normally the government doesn’t blow a truly sophisticated cryptanalytic attack on some minor spying effort. Getting MD5 collisions up and running is a big effort; it’s not something you can do from a cookbook. But developing novel collision techniques is a step beyond that. I take this to mean that Flame’s authors were breaking out the big guns, which tells us something about its importance to our government. It also tells us that the creation of Flame may have involved some of the top mathematicians and cryptographers in (our?) intelligence services. This is an important datapoint.

***

Update 6/7: It looks like it does pan out. Marc Stevens — one of the authors of the earlier MD5 certificate collision papers — confirms that Flame’s certificate uses a novel collision-finding technique.

“Flame uses a completely new variant of a ‘chosen prefix collision attack’ to impersonate a legitimate security update from Microsoft. The design of this new variant required world-class cryptanalysis”

We can only speculate about why this would be necessary, but a good guess is that Flame is old (or, at very least, the crypto techniques are). If the collision technique was in the works prior to 2009 (and why wouldn’t it be?), Stevens et al. wouldn’t have had much relevance. For those who like details, a (very incomplete) timeline looks something like this:

Mid-1990s: MD5 is effectively obsolete. Switch to a better hash function!
August 2004: Wang and Yu present first (‘random’), manually-found collision on MD5.
2004-2007: Collisions extended to ‘meaningful’ documents. Lots of stuff omitted here.
May 2007: (CWI) Stevens et al. publish a first impractical technique for colliding certificates.
Late 2008: (CWI) Stevens et al. develop a practical certificate collision.
December 2008: Vendors notified.
Feburary 2009: Paper submitted to CRYPTO.
Summer 2009: Source code released.

Even assuming that CWI team had perfect security with their result, conference program committees are hardly built for secrecy — at least, not from goverments. So it’s reasonable to assume that the Stevens et al. technique was known by February 2009, possibly earlier. Which means that the Flame developers may have had their own approach under development sometime before that date.

The really interesting question is: when did secret collision research get ahead of the public, academic stuff? Post-2007? Post-2004? Pre-2004? I doubt we’ll ever know the answer to this question. I sure would love to.

Update 6/12: Alex Sotirov has a great Summercon presentation with many of the details. The word from those who know is: don’t take his $20,000 figure too literally. We know nothing about the techniques used to find this collision.

A Few Thoughts on Cryptographic Engineering

Some random thoughts about crypto. Notes from a course I teach. Pictures of my dachshunds.

Category: Uncategorized

Falling through the KRACKs

A quick TL;DR on KRACK

How did this get missed for so long?

What about the security proof?

Beyond public key encryption

Identity Based Cryptography

Putting the “E” in IBE

Functional Encryption

So what’s the point of all this?

Zero Knowledge Proofs: An illustrated primer, Part 2

A quick recap, and a bit more on Zero Knowledge(ness)

From theory into practice

Proofs vs. Proofs of Knowledge

The Schnorr identification protocol

Proving soundness

Proving zero-knowledge(ness) against an honest Verifier

From interactive to non-interactive

Phew.

How do we pay for privacy?

How we support privacy tools today

Developers need more than just money!

How can we do better?

So will this happen?

Noodling about IM protocols

The OTR protocol

Cryptocat

In conclusion

Attack of the week: OpenSSL Heartbleed

Dear Apple: Please set iMessage free

A missing post (updated)

Indifferentiability

Merkle-Damgård

What the heck is a secure hash function?

Random oracles and indifferentiability

In conclusion

Flame, certificates, collisions. Oh my.

Menu