Falling through the KRACKs

The big news in crypto today is the KRACK attack on WPA2 protected WiFi networks. Discovered by Mathy Vanhoef and Frank Piessens at KU Leuven, KRACK (Key Reinstallation Attack) leverages a vulnerability in the 802.11i four-way handshake in order to facilitate decryption and forgery attacks on encrypted WiFi traffic.

The paper is here. It’s pretty easy to read, and you should.

I don’t want to spend much time talking about KRACK itself, because the vulnerability is pretty straightforward. Instead, I want to talk about why this vulnerability continues to exist so many years after WPA was standardized. And separately, to answer a question: how did this attack slip through, despite the fact that the 802.11i handshake was formally proven secure?

A quick TL;DR on KRACK

For a detailed description of the attack, see the KRACK website or the paper itself. Here I’ll just give a brief, high level description.

The 802.11i protocol (also known as WPA2) includes two separate mechanisms to ensure the confidentiality and integrity of your data. The first is a record layer that encrypts WiFi frames, to ensure that they can’t be read or tampered with. This encryption is (generally) implemented using AES in CCM mode, although there are newer implementations that use GCM mode, and older ones that use RC4-TKIP (we’ll skip these for the moment.)

The key thing to know is that AES-CCM (and GCM, and TKIP) is effectively a stream cipher, which means it’s vulnerable to attacks that re-use the same key and “nonce”, also known as an initialization vector. 802.11i deals with this by constructing the initialization vector from a “packet number” counter, which is initialized to zero when a session starts and always increments (up to 2^48, at which point rekeying must occur). This should prevent any nonce re-use, provided that the packet number counter can never be reset.
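For the concretely minded, here’s a minimal sketch (in Python, using the pyca/cryptography AEAD interface) of how a CCMP-style nonce is built from the packet number. The field layout is simplified and this is obviously not the real 802.11i code path; the point is just that the nonce is deterministic, so its uniqueness depends entirely on the counter never repeating under the same key.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESCCM

def ccmp_nonce(priority: int, transmitter_mac: bytes, packet_number: int) -> bytes:
    # 13-byte CCMP-style nonce: 1 priority byte || 6-byte MAC || 48-bit packet number
    assert packet_number < 2**48, "rekey before the counter wraps"
    return bytes([priority]) + transmitter_mac + packet_number.to_bytes(6, "big")

key = os.urandom(16)                      # the pairwise temporal key (simplified)
mac = bytes.fromhex("aabbccddeeff")       # transmitter address
aead = AESCCM(key, tag_length=8)          # CCMP uses an 8-byte MIC

pn = 1                                    # packet number after key installation
frame_body = b"GET / HTTP/1.1"
header = b"simplified-frame-header"       # real CCMP authenticates header fields, too
ciphertext = aead.encrypt(ccmp_nonce(0, mac, pn), frame_body, header)

# Security rests entirely on (key, packet number) pairs never repeating.
```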

The second mechanism you should know about is the “four way handshake” between the AP and a client (supplicant) that’s responsible for deriving the key to be used for encryption. The particular message KRACK cares about is message #3, which causes the new key to be “installed” (and used) by the client.

I’m a four-way handshake. Client is on the left, AP is on the right. (courtesy Wikipedia, used under CC).

The key vulnerability in KRACK (no pun intended) is that the acknowledgement to message #3 can be blocked by adversarial nasty people.* When this happens, the AP re-transmits this message, which causes (the same) key to be reinstalled into the client (note: see update below*). This doesn’t seem so bad. But as a side effect of installing the key, the packet number counters all get reset to zero. (And on some implementations like Android 6, the key gets set to zero — but that’s another discussion.)

The implication is that by forcing the AP to replay this message, an adversary can cause a connection to reset nonces and thus cause keystream re-use in the stream cipher. With a little cleverness, this can lead to full decryption of traffic streams. And that can lead to TCP hijacking attacks. (There are also direct traffic forgery attacks on GCM and TKIP, but this is as far as we’ll go for now.)
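Here’s a toy illustration of why this matters. The sketch below (again using the pyca/cryptography library, not real WiFi code) encrypts two frames under the same key and nonce; XORing the two ciphertext bodies yields the XOR of the plaintexts, and any known plaintext then reveals the other message.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESCCM

key, nonce = os.urandom(16), os.urandom(13)
aead = AESCCM(key, tag_length=8)

p1 = b"attack at dawn!!"
p2 = b"0000000000000000"                       # suppose this plaintext is known
c1 = aead.encrypt(nonce, p1, None)[:len(p1)]   # drop the 8-byte tag
c2 = aead.encrypt(nonce, p2, None)[:len(p2)]   # same key AND same nonce: bad

keystream_xor = bytes(a ^ b for a, b in zip(c1, c2))       # equals p1 XOR p2
recovered = bytes(a ^ b for a, b in zip(keystream_xor, p2))
assert recovered == p1                         # the "secret" frame falls right out
```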

How did this get missed for so long?

If you’re looking for someone to blame, a good place to start is the IEEE. To be clear, I’m not referring to the (talented) engineers who designed 802.11i — they did a pretty good job under the circumstances. Instead, blame IEEE as an institution.

One of the problems with IEEE is that the standards are highly complex and get made via a closed-door process of private meetings. More importantly, even after the fact, they’re hard for ordinary security researchers to access. Go ahead and google for the IETF TLS or IPSec specifications — you’ll find detailed protocol documentation at the top of your Google results. Now go try to Google for the 802.11i standards. I wish you luck.

The IEEE has been making a few small steps to ease this problem, but they’re hyper-timid incrementalist bullshit. There’s an IEEE program called GET that allows researchers to access certain standards (including 802.11) for free, but only after they’ve been public for six months — coincidentally, about the same time it takes for vendors to bake them irrevocably into their hardware and software.

This whole process is dumb and — in this specific case — probably just cost industry tens of millions of dollars. It should stop.

The second problem is that the IEEE standards are poorly specified. As the KRACK paper points out, there is no formal description of the 802.11i handshake state machine. This means that implementers have to implement their code using scraps of pseudocode scattered around the standards document. It happens that this pseudocode leads to the broken implementation that enables KRACK. So that’s bad too.

And of course, the final problem is implementers. One of the truly terrible things about KRACK is that implementers of the WPA supplicant (particularly on Linux) managed to somehow make Lemon Pledge out of lemons. On Android 6 in particular, replaying message #3 actually sets an all-zero key. There’s an internal logic behind why this happens, but Oy Vey. Someone actually needs to look at this stuff.

What about the security proof?

The fascinating thing about the 802.11i handshake is that despite all of the roadblocks IEEE has thrown in people’s way, it (the handshake, at least) has been formally analyzed. At least, for some definition of the term.

(This isn’t me throwing shade — it’s a factual statement. In formal analysis, definitions really, really matter!)

A paper by He, Sundararajan, Datta, Derek and Mitchell (from 2005!) looked at the 802.11i handshake and tried to determine its security properties. What they determined is that yes, indeed, it did produce a secret and strong key, even when an attacker could tamper with and replay messages (under various assumptions). This is good, important work. The proof is hard to understand, but this is par for the course. It seems to be correct.

Representation of the 4-way handshake from the paper by He et al. Yes, I know you’re like “what?“. But that’s why people who do formal verification of protocols don’t have many friends.

Even better, there are other security proofs showing that — provided the nonces are never repeated — encryption modes like CCM and GCM are highly secure. This means that given a secure key, it should be possible to encrypt safely.

So what went wrong?

The critical problem is that while people looked closely at the two components — handshake and encryption protocol — in isolation, apparently nobody looked closely at the two components as they were connected together. I’m pretty sure there’s an entire geek meme about this.

Two unit tests, 0 integration tests, thanks Twitter.

Of course, the reason nobody looked closely at this stuff is that doing so is just plain hard. Protocols have an exponential number of possible cases to analyze, and we’re just about at the limit of the complexity of protocols that human beings can truly reason about, or that peer-reviewers can verify. The more pieces you add to the mix, the worse this problem gets.

In the end we all know that the answer is for humans to stop doing this work. We need machine-assisted verification of protocols, preferably tied to the actual source code that implements them. This would ensure that the protocol actually does what it says, and that implementers don’t further screw it up, thus invalidating the security proof.

This needs to be done urgently, but we’re so early in the process of figuring out how to do it that it’s not clear what it will take to make this stuff go live. All in all, this is an area that could use a lot more work. I hope I live to see it.

===

* Update: An early version of this post suggested that the attacker would replay the third message. This can indeed happen, and it does happen in some of the more sophisticated attacks. But primarily, the paper describes forcing the AP to resend it by blocking the acknowledgement from being received at the AP. Thanks to Nikita Borisov and Kyle Birkeland for the fix!

Patching is hard; so what?

It’s now been about a week since Equifax announced the record-breaking breach that affected 143 million Americans. We still don’t know enough — but a few details have begun to come out about the causes of the attack. It’s now being reported that Equifax’s woes stem from an unpatched vulnerability in Apache Struts that dates from March 2017, nearly two months before the breach began. This flaw, which allows remote command execution on affected servers, somehow allowed an attacker to gain access to a whopping amount of Equifax’s customer data.

While many people have criticized Equifax for its failure, I’ve noticed a number of tweets from information security professionals making the opposite case. Specifically, these folks point out that patching is hard. The gist of these points is that you can’t expect a major corporation to rapidly deploy something as complex as a major framework patch across their production systems. The stronger version of this point is that the people who expect fast patch turnaround have obviously never patched a production server.

I don’t dispute this point. It’s absolutely valid. My very simple point in this post is that it doesn’t matter. Excusing Equifax for their slow patching is both irrelevant and wrong. Worse: whatever the context, statements like this will almost certainly be used by Equifax to excuse their actions. This actively makes the world a worse place.

I don’t operate production systems, but I have helped to design a couple of them. So I understand something about the assumptions you make when building them.

If you’re designing a critical security system you have choices to make. You can build a system that provides defense-in-depth — i.e., that makes the assumption that individual components will fail and occasionally become insecure. Alternatively, you can choose to build systems that are fragile — that depend fundamentally on the correct operation of all components at all times. Both options are available to system designers, and making the decision is up to those designers; or just as accurately, the managers that approve their design.

The key point is that once you’ve baked this cake, you’d better be willing to eat it. If your system design assumes that application servers will not contain critical vulnerabilities — and you don’t have resilient systems in place to handle the possibility that they do — then you’ve implicitly made the decision that you’re never ever going to allow those vulnerabilities to fester. Once an in-the-wild vulnerability is detected in your system, you’d damn well better have a plan to patch, and patch quickly. That may involve automated testing. It may involve taking your systems down, or devoting enormous resources to monitoring activity. If you can’t do that, you’d better have an alternative. Running insecure is not an option.

So what would those systems look like? Among more advanced system designs I’ve begun to see a move towards encrypting back-end data. By itself this doesn’t do squat to protect systems like Equifax’s, because those systems are essentially “hot” databases that have to provide cleartext data to application servers — precisely the systems that Equifax’s attackers breached.

The common approach to dealing with this problem is twofold. First, you harden the cryptographic access control components that handle decryption and key management for the data — so that a breach in an application server doesn’t lead to the compromise of the access control gates. Second, you monitor, monitor, monitor. The sole advantage that encryption gives you here is that your gates for access control are now reduced to only the systems that manage encryption. Not your database. Not your web framework. Just a — hopefully — small and well-designed subsystem that monitors and grants access to each record. Everything else is monitoring.
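To make that concrete, here’s a minimal sketch of what such a gate might look like. Everything below is hypothetical (the names, the rate limit, the Fernet wrapping), and a real deployment would keep the key in an HSM or KMS rather than in process memory; the point is simply that decryption happens one record at a time, behind a choke point that logs and rate-limits every request.

```python
import logging
import time
from collections import defaultdict
from cryptography.fernet import Fernet

log = logging.getLogger("record-gateway")
RECORD_KEY = Fernet.generate_key()        # hypothetical; really lives in an HSM/KMS
fernet = Fernet(RECORD_KEY)
recent_requests = defaultdict(list)
RATE_LIMIT = 100                          # per caller per minute; tune to real load

def decrypt_record(caller_id: str, record_id: str, ciphertext: bytes) -> bytes:
    """Decrypt exactly one record, and leave an audit trail while doing it."""
    now = time.time()
    window = [t for t in recent_requests[caller_id] if now - t < 60]
    if len(window) >= RATE_LIMIT:
        log.warning("rate limit exceeded: caller=%s", caller_id)
        raise PermissionError("too many decryption requests")
    window.append(now)
    recent_requests[caller_id] = window
    log.info("decrypt record=%s caller=%s", record_id, caller_id)  # feed to monitoring
    return fernet.decrypt(ciphertext)
```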

Equifax claims to have resilient systems in place. Only time will tell if they looked like this. What seems certain is that whatever those systems are, they didn’t work. And given both the scope and scale of this breach, that’s a cake I’d prefer not to have to eat.

Beyond public key encryption

One of the saddest and most fascinating things about applied cryptography is how little cryptography we actually use. This is not to say that cryptography isn’t widely used in industry — it is. Rather, what I mean is that cryptographic researchers have developed so many useful technologies, and yet industry on a day to day basis barely uses any of them. In fact, with a few minor exceptions, the vast majority of the cryptography we use was settled by the early 2000s.*

Most people don’t sweat this, but as a cryptographer who works on the boundary of research and deployed cryptography it makes me unhappy. So while I can’t solve the problem entirely, what I can do is talk about some of these newer technologies. And over the course of this summer that’s what I intend to do: talk. Specifically, in the next few weeks I’m going to write a series of posts that describe some of the advanced cryptography that we don’t generally see used.

Today I’m going to start with a very simple question: what lies beyond public key cryptography? Specifically, I’m going to talk about a handful of technologies that were developed in the past 20 years, each of which allows us to go beyond the traditional notion of public keys.

This is a wonky post, but it won’t be mathematically-intense. For actual definitions of the schemes, I’ll provide links to the original papers, and references to cover some of the background. The point here is to explain what these new schemes do — and how they can be useful in practice.

Identity Based Cryptography

In the mid-1980s, a cryptographer named Adi Shamir proposed a radical new idea. The idea, put simply, was to get rid of public keys.

To understand where Shamir was coming from, it helps to understand a bit about public key encryption. You see, prior to the invention of public key crypto, all cryptography involved secret keys. Dealing with such keys was a huge drag. Before you could communicate securely, you needed to exchange a secret with your partner. This process was fraught with difficulty and didn’t scale well.

Public key encryption (beginning with Diffie-Hellman and Shamir’s RSA cryptosystem) hugely revolutionized cryptography by dramatically simplifying this key distribution process. Rather than sharing secret keys, users could now transmit their public key to other parties. This public key allowed the recipient to encrypt to you (or verify your signature) but it could not be used to perform the corresponding decryption (or signature generation) operations. That part would be done with a secret key you kept to yourself.

While the use of public keys improved many aspects of using cryptography, it also gave rise to a set of new challenges. In practice, it turns out that having public keys is only half the battle — people still need to distribute them securely.

For example, imagine that I want to send you a PGP-encrypted email. Before I can do this, I need to obtain a copy of your public key. How do I get this? Obviously we could meet in person and exchange that key on physical media — but nobody wants to do this. It would be much more desirable to obtain your public key electronically. In practice this means either (1) we have to exchange public keys by email, or (2) I have to obtain your key from a third piece of infrastructure, such as a website or key server. And now we come to the problem: if that email or key server is untrustworthy (or simply allows anyone to upload a key in your name), I might end up downloading a malicious party’s key by accident. When I send a message to “you”, I’d actually be encrypting it to Mallory.

Mallory.

Solving this problem — of exchanging public keys and verifying their provenance — has motivated a huge amount of practical cryptographic engineering, including the entire web PKI. In most cases, these systems work well. But Shamir wasn’t satisfied. What if, he asked, we could do it better? More specifically, he asked: could we replace those pesky public keys with something better?

Shamir’s idea was exciting. What he proposed was a new form of public key cryptography in which the user’s “public key” could simply be their identity. This identity could be a name (e.g., “Matt Green”) or something more precise like an email address. Actually, it didn’t really matter. What did matter was that the public key would be some arbitrary string — and not a big meaningless jumble of characters like “7cN5K4pspQy3ExZV43F6pQ6nEKiQVg6sBkYPg1FG56Not”.

Of course, using an arbitrary string as a public key raises a big problem. Meaningful identities sound great — but I don’t own them. If my public key is “Matt Green”, how do I get the corresponding private key? And if I can obtain that private key, what stops some other Matt Green from doing the same, and thus reading my messages? And ok, now that I think about this, what stops some random person who isn’t named Matt Green from obtaining it? Yikes. We’re headed straight into Zooko’s triangle.

Shamir’s idea thus requires a bit more finesse. Rather than expecting identities to be global, he proposed a special server called a “key generation authority” that would be responsible for generating the private keys. At setup time, this authority would generate a single master public key (MPK), which it would publish to the world. If you wanted to encrypt a message to “Matt Green” (or verify my signature), then you could do so using my identity and the single MPK of an authority we’d both agree to use. To decrypt that message (or sign one), I would have to visit the same key authority and ask for a copy of my secret key. The key authority would compute my key based on a master secret key (MSK), which it would keep very secret.

With all algorithms and players specified, the whole system looks like this:

Overview of an Identity-Based Encryption (IBE) system. The Setup algorithm of the Key Generation Authority generates the master public key (MPK) and master secret key (MSK). The authority can use the Extract algorithm to derive the secret key corresponding to a specific ID. The encryptor (left) encrypts using only the identity and MPK. The recipient requests the secret key for her identity, and then uses it to decrypt. (Icons by Eugen Belyakoff)
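In code, the interface (and only the interface) looks something like the sketch below. The actual mathematics, e.g. the pairing-based construction of Boneh and Franklin discussed later, is deliberately left abstract; nothing here is a secure scheme, it just pins down who runs which algorithm with which inputs.

```python
from abc import ABC, abstractmethod

class IBEScheme(ABC):
    @abstractmethod
    def setup(self):
        """Run once by the key generation authority; returns (MPK, MSK)."""

    @abstractmethod
    def extract(self, msk, identity: str):
        """Authority-only: derive the secret key for a given identity."""

    @abstractmethod
    def encrypt(self, mpk, identity: str, message: bytes) -> bytes:
        """Anyone can encrypt knowing only the MPK and the recipient's identity."""

    @abstractmethod
    def decrypt(self, identity_key, ciphertext: bytes) -> bytes:
        """The recipient decrypts with the key she fetched from the authority."""

# Intended usage, for some concrete instantiation `ibe`:
#   mpk, msk = ibe.setup()
#   ct = ibe.encrypt(mpk, "matt@example.com", b"hi Matt")   # no key exchange needed
#   sk = ibe.extract(msk, "matt@example.com")               # can happen afterwards
#   assert ibe.decrypt(sk, ct) == b"hi Matt"
```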

This design has some important advantages — and more than a few obvious drawbacks. On the plus side, it removes the need for any key exchange at all with the person you’re sending the message to. Once you’ve chosen a master key authority (and downloaded its MPK), you can encrypt to anyone in the entire world. Even cooler: at the time you encrypt, your recipient doesn’t even need to have contacted the key authority yet. She can obtain her secret key after you send her a message.

Of course, this “feature” is also a bug. Because the key authority generates all the secret keys, it has an awful lot of power. A dishonest authority could easily generate your secret key and decrypt your messages. The polite way to say this is that standard IBE systems effectively “bake in” key escrow.**

Putting the “E” in IBE

All of these ideas and more were laid out by Shamir in his 1984 paper. There was just one small problem: Shamir could only figure out half the problem.

Specifically, Shamir proposed a scheme for identity-based signature (IBS) — a signature scheme where the public verification key is an identity, but the signing key is generated by the key authority. Try as he might, he could not find a solution to the problem of building identity-based encryption (IBE). This was left as an open problem.***

It would take more than 16 years before someone answered Shamir’s challenge. Surprisingly, when the answer came it came not once but three times.

The first, and probably most famous realization of IBE was developed by Dan Boneh and Matthew Franklin much later — in 2001. The timing of Boneh and Franklin’s discovery makes a great deal of sense. The Boneh-Franklin scheme relies fundamentally on elliptic curves that support an efficient “bilinear map” (or “pairing”).**** The algorithms needed to compute such pairings were not known when Shamir wrote his paper, and weren’t employed constructively — that is, as a useful thing rather than an attack — until about 2000. The same can be said about a second scheme called Sakai-Kasahara, which would be independently discovered around the same time.

(For a brief tutorial on the Boneh-Franklin IBE scheme, see this page.)

The third realization of IBE was not as efficient as the others, but was much more surprising. This scheme was developed by Clifford Cocks, a senior cryptologist at Britain’s GCHQ. It’s noteworthy for two reasons. First, Cocks’ IBE scheme does not require bilinear pairings at all — it is based in the much older RSA setting, which means in principle it spent all those years just waiting to be found. Second, Cocks himself had recently become known for something even more surprising: discovering the RSA cryptosystem, nearly five years before RSA did. To bookend that accomplishment with a second major advance in public key cryptography was a pretty impressive accomplishment.

In the years since 2001, a number of additional IBE constructions have been developed, using all sorts of cryptographic settings. Nonetheless, Boneh and Franklin’s early construction remains among the simplest and most efficient.

Even if you’re not interested in IBE for its own sake, it turns out that this primitive is really useful to cryptographers for many things beyond simple encryption. In fact, it’s more helpful to think of IBE as a way of “pluralizing” a single public/secret master keypair into billions of related keypairs. This makes it useful for applications as diverse as blocking chosen ciphertext attacks, forward-secure public key encryption, and short signature schemes.

Attribute Based Encryption

Of course, if you leave cryptographers alone with a tool like IBE, the first thing they’re going to do is find a way to improve on it (read: make things more complicated).

One of the biggest such improvements is due to Sahai and Waters. It’s called Attribute-Based Encryption, or ABE.

The origin of this idea was not actually to encrypt with attributes. Instead, Sahai and Waters were attempting to develop an identity-based encryption scheme that could encrypt using biometrics. To understand the problem, imagine I decide to use a biometric like your iris scan as the “identity” with which to encrypt a ciphertext to you. Later on you’ll ask the authority for a decryption key that corresponds to your own iris scan — and if everything matches up, you’ll be able to decrypt.

The problem is that this will almost never work.

The issue here is that biometric readings (like iris scans or fingerprint templates) are inherently error-prone. Every scan will typically be very close to the last one, but a few bits will usually disagree. With standard IBE this is fatal: if the encryption identity differs from your key identity by even a single bit, decryption will not work. You’re out of luck.

Tell me this isn’t giving you nightmares.

Sahai and Waters decided that the solution to this problem was to develop a form of IBE with a “threshold gate”. In this setting, each bit of the identity is represented as a different “attribute”. Think of each of these as components you’d encrypt under — something like “bit 5 of your iris scan is a 1” and “bit 23 of your iris scan is a 0”. The encrypting party lists all of these bits and encrypts under each one. The decryption key generated by the authority embeds a similar list of bit values. The scheme is defined so that decryption will work if and only if the number of matching attributes (between your key and the ciphertext) exceeds some pre-defined threshold: e.g., any 2024 out of 2048 bits must be identical in order to decrypt.
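The decryption condition itself is easy to state in code. The sketch below shows only the matching rule, not any cryptography: a real fuzzy IBE scheme enforces this threshold inside the algebra of the keys and ciphertexts rather than with an `if` statement.

```python
def fuzzy_ibe_can_decrypt(key_bits, ciphertext_bits, threshold):
    """Decryption succeeds iff enough attribute bits agree (e.g. 2024 of 2048)."""
    matches = sum(1 for k, c in zip(key_bits, ciphertext_bits) if k == c)
    return matches >= threshold

# Two scans of the same iris that differ in a handful of bits still decrypt;
# a different iris, where roughly half the bits differ, does not.
```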

The beautiful thing about this idea is not fuzzy IBE itself. It’s that once you have a threshold gate and a concept of “attributes”, you can do more interesting things. The main observation is that a threshold gate can be used to implement the boolean AND and OR gates, like so:

An AND gate over n inputs is just an n-out-of-n threshold gate; an OR gate is a 1-out-of-n threshold gate.

Even better, you can stack these gates on top of one another to express a fairly complex boolean formula — which will itself determine the conditions under which your ciphertext can be decrypted. For example, switching to a more realistic set of attributes, you could encrypt a medical record so that either a pediatrician in a hospital could read it, or an insurance adjuster could. All you’d need is to make sure people received keys that correctly described their attributes (which are just arbitrary strings, like identities).

A simple “ciphertext policy”, in which the ciphertext can be decrypted if and only if a key matches an appropriate set of attributes. In this case, the key satisfies the formula and thus the ciphertext decrypts. The remaining key attributes are ignored.
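Here’s a small sketch of how such a policy evaluates against a key’s attributes, with AND rendered as an n-out-of-n threshold and OR as a 1-out-of-n threshold. Again, this only shows which keys should be able to decrypt; a real CP-ABE scheme enforces the formula cryptographically.

```python
def evaluate(policy, attributes):
    """Return True if the key's attribute set satisfies the policy tree."""
    if isinstance(policy, str):               # leaf: a single named attribute
        return policy in attributes
    threshold, children = policy              # internal node: (k, [sub-policies])
    satisfied = sum(evaluate(child, attributes) for child in children)
    return satisfied >= threshold

AND = lambda *kids: (len(kids), list(kids))   # n-out-of-n threshold gate
OR  = lambda *kids: (1, list(kids))           # 1-out-of-n threshold gate

policy = OR(AND("pediatrician", "hospital staff"), "insurance adjuster")
print(evaluate(policy, {"pediatrician", "hospital staff"}))   # True
print(evaluate(policy, {"radiologist"}))                      # False
```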

The other direction can be implemented as well. It’s possible to encrypt a ciphertext under a long list of attributes, such as creation time, file name, and even GPS coordinates indicating where the file was created. You can then have the authority hand out keys that correspond to a very precise slice of your dataset — for example, “this key decrypts any radiology file encrypted in Chicago between November 3rd and December 12th that is tagged with ‘pediatrics’ or ‘oncology'”.

Functional Encryption

Once you have a related set of primitives like IBE and ABE, the researchers’ instinct is to both extend and generalize. Why stop at simple boolean formulae? Can we make keys (or ciphertexts) that embed arbitrary computer programs? The answer, it turns out, is yes — though not terribly efficiently. A set of recent works show that it is possible to build ABE that works over arbitrary polynomial-size circuits, using various lattice-based assumptions. So there is certainly a lot of potential here.

This potential has inspired researchers to generalize all of the above ideas into a single class of encryption called “functional encryption”. Functional encryption is more conceptual than concrete — it’s just a way to look at all of these systems as instances of a specific class. The basic idea is to represent the decryption procedure as an algorithm that computes an arbitrary function F over (1) the plaintext inside of a ciphertext, and (2) the data embedded in the key. This function has the following profile:

output = F(key data, plaintext data)

In this model IBE can be expressed by having the encryption algorithm encrypt (identity, plaintext) and defining the function F such that, if “key input == identity”, it outputs the plaintext, and outputs an empty string otherwise. Similarly, ABE can be expressed by a slightly more complex function. Following this paradigm, one can envision all sorts of interesting functions that might be computed by future schemes.
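Written out, the IBE functionality is just a few lines. To be clear, this is the function F only, not an encryption scheme:

```python
def ibe_functionality(key_data, plaintext_data):
    """IBE viewed as an instance of F(key data, plaintext data)."""
    identity, message = plaintext_data        # the encryptor bound these together
    return message if key_data == identity else b""

assert ibe_functionality("matt@example.com", ("matt@example.com", b"hi")) == b"hi"
assert ibe_functionality("mallory@evil.net", ("matt@example.com", b"hi")) == b""
```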

But those will have to wait for another time. We’ve gone far enough for today.

So what’s the point of all this?

For me, the point is just to show that cryptography can do some pretty amazing things. We rarely see this on a day-to-day basis when it comes to industry and “applied” cryptography, but it’s all there waiting to be used.

Perhaps the perfect application is out there. Maybe you’ll find it.

Notes:

* An earlier version of this post said “mid-1990s”. In comments below, Tom Ristenpart takes issue with that and makes the excellent point that many important developments post-date that. So I’ve moved the date forward about five years, and I’m thinking about how to say this in a nicer way.

** There is also an intermediate form of encryption known as “certificateless encryption”. Proposed by Al-Riyami and Paterson, this idea uses a combination of standard public key encryption and IBE. The basic idea is to encrypt each message using both a traditional public key (generated by the recipient) and an IBE identity. The recipient must then obtain a copy of the secret key from the IBE authority to decrypt. The advantages here are twofold: (1) the IBE key authority can’t decrypt the message by itself, since it does not have the corresponding secret key, which solves the “escrow” problem. And (2) the sender does not need to verify that the public key really belongs to the recipient (e.g., by checking a certificate), since the IBE portion prevents imposters from decrypting the resulting message. Unfortunately this system is more like traditional public key cryptography than IBE, and does not have the neat usability features of IBE.

*** A part of the challenge of developing IBE is the need to make a system that is secure against “collusion” between different key holders. For example, imagine a very simple system that has only 2-bit identities. This gives four possible identities: “00”, “01”, “10”, “11”. If I give you a key for the identity “01” and I give Bob a key for “10”, we need to ensure that you two cannot collude to produce a key for identities “00” and “11”. Many earlier proposed solutions have tried to solve this problem by gluing together standard public encryption keys in various ways (e.g., by having a separate public key for each bit of the identity and giving out the secret keys as a single “key”). However these systems tend to fail catastrophically when just a few users collude (or their keys are stolen). Solving the collusion problem is fundamentally what separates real IBE from its faux cousins.

**** A full description of Boneh and Franklin’s scheme can be found here, or in the original paper. Some code is here and here and here. I won’t spend more time on it, except to note that the scheme is very efficient. It was patented and implemented by Voltage Security, now part of HPE.

Secure computing for journalists

This morning on Twitter, Buzzfeed editor Miriam Elder asks the following question:

No, this is not a stupid question. Actually it’s an extremely important question, and judging by some of the responses to this Tweet there are a lot of other people who are confused about the answer.

Since I couldn’t find a perfect layperson’s reference anywhere else, I’m going to devote this post to providing the world’s simplest explanation of why, in the threat model of your typical journalist, your desktop machine isn’t very safe. And specifically, why you’re safer using a modern mobile device — and particularly, an iOS device — than just about any other platform.

A brief caveat: I’m a cryptographer, not a software security researcher. However, I’ve spent the past several years interacting with folks like Charlie and Dan and Thomas. I’m pretty confident that they agree with this advice.

What’s wrong with my laptop/desktop machine?

Sadly, most of the problem is you.

If you’re like most journalists — and really, most professionals — you spend less than 100% of your time thinking about security. You need to get work done. When you’re procrastinating from work, you visit funny sites your friends link you to on Facebook. Then you check your email. If you’re a normal and productive user, you probably do a combination of all these things every few minutes, all of which culminates in your downloading some email attachment and (shudder) opening it in Word.

Now I’m not trying to shame you for this. It’s perfectly normal, and indeed it’s necessary if you want to get things done.  But in the parlance of security professionals, it also means you have a huge attack surface.

In English, this means that from the perspective of an attacker there are many different avenues to compromise your machine. Many of these aren’t even that sophisticated. Often it’s just a matter of catching you during an unguarded moment and convincing you to download an executable file or an infected Office document. A compromised machine means that every piece of software on that machine is also vulnerable.

If you don’t believe this works, head over to Google and search for “Remote Access Trojans”. There’s an entire commercial market for these products, each of which allows you to remotely control someone else’s computer. These off-the-shelf products aren’t very sophisticated: indeed, most require you to trick your victim into downloading and running some executable attachment. Sadly, this works on most people just fine. And this is just the retail stuff. Imagine what a modestly sophisticated attacker can do.

I do some of those things on my phone as well. Why is a phone better?

Classical (desktop and laptop) operating systems were designed primarily to support application developers. This means they offer a lot of power to your applications. An application like Microsoft Word can typically read and write all the files available to your account. If Word becomes compromised, this is usually enough to pwn you in practice. And in many cases, these applications have components with root (or Administrator) access, which makes them even more dangerous.

Modern phone operating systems like Android and iOS were built on a different principle. Rather than trusting apps with much power, each app runs in a “sandbox” that (mainly) limits it to accessing its own files. If the sandbox works, even a malicious application shouldn’t be able to reach out to touch other apps’ files or permanently modify your system. This approach — combined with other protections such as in-memory code signing, hardware secret storage and routine use of anti-exploitation measures — makes your system vastly harder to compromise.

Of course, sandboxing isn’t perfect. A compromised or malicious app can always access its own files. More sophisticated exploits can “break out” of the sandbox, typically by exploiting a vulnerability in the operating system. Such vulnerabilities are routinely discovered and occasionally exploited.

The defense to this is twofold: (1) first, run a modern, up-to-date OS that receives security patches quickly. And (2) avoid downloading malicious apps. Which brings me to the main point of this post.

Why use iOS?

The fact of the matter is that when it comes to addressing these remaining issues, Apple phone operating systems (on iPhones and iPads) simply have a better track record.

Since Apple is the only manufacturer of iOS devices, there is no “middleman” when it comes to monitoring for iOS issues and deploying iOS security updates. This means that the buck stops at Apple — rather than with some third-party equipment manufacturer. Indeed, Apple routinely patches its operating systems and pushes the patches to all supported users — sometimes within hours of learning of a vulnerability (something that is relatively rare at this point in any case).

Of course, to be fair: Google has also become fairly decent at supporting its own Android devices. However, to get assurance from this process you need to be running a relatively brand new device and it needs to be manufactured by Google. Otherwise you’re liable to be several days or weeks behind the time when a security issue is discovered and patched — if you ever get it. And Google still does not support all of the features Apple does, including in-memory code signing and strong file encryption.

Apple also seems to do a relatively decent job at curating its App Store, at least as compared to Google. And because those apps support a more modern base of phones, they tend to have access to better security features, whereas Android apps more routinely get caught doing dumb stuff for backwards compatibility reasons.

A password manager using the SEP.

Finally, every recent Apple device (starting with the iPhone 5S and up) also includes a specialized chip known as a “Secure Enclave Processor“. This hardened processor assists in securing the boot chain — ensuring that nobody can tamper with your operating system. It can also protect sensitive values like your passwords, ensuring that only a password or fingerprint can access them.

A few Android phones offer similar features as well. However, it’s unclear how well these are implemented in contrast to Apple’s SEP. It’s not a bet I would choose to take.

So does using iOS mean I’m perfectly safe?

Of course not. Unfortunately, computer security today is about resisting attacks. We still don’t quite know how to prevent them altogether.

Indeed, well-funded attackers like governments are still capable of compromising your iOS device (and your Android, and your PC or Mac). Literally the only question is how much they’ll have to spend doing it.

Here’s one data point. Last year a human rights activist in the UAE was targeted via a powerful zero day exploit, likely by his government. However, he was careful. Instead of clicking the link he was sent, the activist sent it to the engineers at Citizenlab who reverse-engineered the exploit. The resulting 35-page technical report by Lookout Security and Citizenlab is a thing of terrifying beauty: it describes a chain of no less than three previously unpublished software exploits, which together would have led to the complete compromise of the victim’s iPhone.

But such compromises don’t come cheap. It’s easy to see this kind of attack costing a million dollars or more. This is probably orders of magnitude more than it would cost to compromise the typical desktop user. That’s important. Not perfect, but important.

You’re telling me I have to give up my desktop machine?

Not at all. Or rather, while I’d love to tell you that, I understand this may not be realistic for most users.

All I am telling you to do is to be thoughtful. If you’re working on something sensitive, consider moving the majority of that work (and communications) to a secure device until you’re ready to share it. This may be a bit of a hassle, but it doesn’t have to be your whole life. And since most of us already carry some sort of phone or tablet in addition to our regular work computer, hopefully this won’t require too much of a change in your life.

You can still use your normal computer just fine, as long as you’re aware of the relative risks. That’s all I’m trying to accomplish with this post.

In conclusion

I expect that many technical people will find this post objectionable, largely because they assume that with their expertise and care they can make a desktop operating system work perfectly safely. And maybe they can! But that’s not who this post is addressed to.

And of course, this post still only scratches the surface of the problem. There’s still the problem of selecting the right applications for secure messaging (e.g., Signal and WhatsApp) and finding a good secure application for notetaking and document collaboration and so on.

But hopefully this post at least starts the discussion.

The future of Ransomware

This is kind of a funny post for me to write, since it involves speculating about a very destructive type of software — and possibly offering some (very impractical) suggestions on how it might be improved in the future. It goes without saying that there are some real downsides to this kind of speculation. Nonetheless, I’m going ahead on the theory that it’s usually better to talk and think about the bad things that might happen to you — before you meet them on the street and they steal your lunch money.

On the other hand, just as there’s a part of every karate master that secretly wants to go out and beat up a bar full of people, there’s a part of every security professional that looks at our current generation of attackers and thinks: why can’t you people just be a bit more imaginative?! And wonders whether, if our attackers were just a little more creative, people would actually pay attention to securing their system before the bad stuff happens.

And ransomware is definitely a bad thing. According to the FBI it sucks up $1 billion/year in payments alone, and some unimaginably larger amount in remediation costs. This despite the fact that many ransomware packages truly suck, and individual ransomware developers get routinely pwned due to making stupid cryptographic errors. If this strategy is working so well today, the question  we should be asking ourselves is: how much worse could it get?

So that’s what I’m going to muse about now. A few (cryptographic) ways that it might.

Some of these ideas are the result of collaboration with my students Ian Miers, Gabe Kaptchuk and Christina Garman. They range from the obvious to the foolish to the whimsical, and I would be utterly amazed if any of them really do happen. So please don’t take this post too seriously. It’s all just fun.

Quick background: ransomware today

The amazing thing about ransomware is that something so simple could turn out to be such a problem. Modern ransomware consists of malware that infects your computer and then goes about doing something nasty: it encrypts every file it can get its hands on. This typically includes local files as well as network shares that can be reached from the infected machine.


Once your data has been encrypted, your options aren’t great. If you’re lucky enough to have a recent backup, you can purge the infected machine and restore. Otherwise you’re faced with a devil’s bargain: learn to live without that data, or pay the bastards.

If you choose to pay up, there are all sorts of different procedures. However most break down into the following three steps:

  1. When the ransomware encrypts your files, it generates a secret key file and stores it on your computer.
  2. You upload that file (or data string) to your attackers along with a Bitcoin payment.
  3. They process the result with their secrets and send you a decryption key.

If you’re lucky, and your attackers are still paying attention (or haven’t screwed up the crypto beyond recognition) you get back a decryption key or a tool you can use to undo the encryption on your files. The whole thing is very businesslike. Indeed, recent platforms will allegedly offer you a discount if you recommend it to (that is, infect) your friends — just like Lyft!

The problem of course, is that nothing in this process guarantees that your attacker will give you that decryption key. They might be scammers. They might not have the secret anymore. They might get tracked down and arrested. Or they might get nervous and bail, taking your precious data and your payment with them. This uncertainty makes ransomware payments inherently risky — and worse, it’s the victims who mostly suffer for it.

Perhaps it would be nice if we could make that work better.

Verifiable key delivery using smart contracts

Most modern ransomware employs a cryptocurrency like Bitcoin to enable the payments that make the ransom possible. This is perhaps not the strongest argument for systems like Bitcoin — and yet it seems unlikely that Bitcoin is going away anytime soon. If we can’t solve the problem of Bitcoin, maybe it’s possible to use Bitcoin to make “more reliable” ransomware.

Recall that following a ransomware infection, there’s a possibility that you’ll pay the ransom and get nothing in return. Fundamentally there’s very little you can do about this. A conscientious ransomware developer might in theory offer a “proof of life” — that is, offer to decrypt a few files at random in order to prove their bonafides. But even if they bother with all the risk and interaction of doing this, there’s still no guarantee that they’ll bother to deliver the hostage alive.

An obvious approach to this problem is to make ransomware payments conditional. Rather than sending off your payment and hoping for the best, victims could use cryptocurrency features to ensure that ransomware operators can’t get paid unless they deliver a key. Specifically, a ransomware developer could easily perform payment via a smart contract script (in a system like Ethereum) that guarantees the following property:

This payment will be delivered to the ransomware operator if and only if the ransomware author unlocks it — by posting the ransomware decryption key to the same blockchain.

The basic primitive needed for this is called a Zero Knowledge Contingent Payment. This idea was proposed by Greg Maxwell and demonstrated by Sean Bowe of the ZCash team.**** The rough idea is to set the decryption key to be some pre-image k for some public hash value K that the ransomware generates and leaves on your system. It’s relatively easy to imagine a smart contract that allows payment if and only if the payee can post the input k such that K=SHA256(k). This could easily be written in Ethereum, and almost certainly has an analog for Bitcoin script.
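Stripped of blockchain details, the hash-lock logic is tiny. The sketch below is plain Python rather than Ethereum or Bitcoin script, and it deliberately ignores the hard part, which comes next.

```python
import hashlib

class HashLockedPayment:
    """Toy stand-in for an on-chain contract holding the victim's payment."""

    def __init__(self, K: bytes, amount: int):
        self.K, self.amount, self.claimed = K, amount, False

    def claim(self, k: bytes) -> int:
        # Funds are released only to someone who publicly posts k with SHA256(k) == K.
        if self.claimed or hashlib.sha256(k).digest() != self.K:
            raise ValueError("wrong preimage or already claimed")
        self.claimed = True
        return self.amount

k = b"the-ransomware-decryption-key"          # generated by the malware; K left on disk
contract = HashLockedPayment(hashlib.sha256(k).digest(), amount=1)
assert contract.claim(k) == 1                 # claiming necessarily reveals k to the victim
```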

The challenge here, of course, is to prove that k is actually a decryption key for your files, and that the files contain valid data. There are a handful of different ways to tackle this problem. One is to use complex zero-knowledge proof techniques (like zkSNARKs or ZKBoo) to make the necessary proofs non-interactively. But this is painful, and frankly above the level of most ransomware developers — who are still struggling with basic RSA.

An alternative approach is to use several such K challenges in combination with the “proof of life” idea. The ransomware operator would prove her bona fides by decrypting a small, randomly selected subset of files before the victim issued payment. The operator could still “fake” the encryption — or lose the decryption key — but she would be exposed with reasonable probability before money changed hands.

“Autonomous” ransomware

Of course, the problem with “verifiable” ransomware is: what ransomware developer would bother with this nonsense?

While the ability to verify decryption might conceivably improve customer satisfaction, it’s not clear that it would really offer that much value to ransomware developers. At the same time, it would definitely add a lot of nasty complexity to their software.

Instead of pursuing ideas that offer developers no obvious upside, ransomware designers presumably will pursue ideas that offer them some real benefits. And that brings us to an idea whose time has (hopefully) not quite come yet. The idea itself is simple:

Make ransomware that doesn’t require operators.

Recall that in the final step of the ransom process, the ransomware operator must deliver a decryption key to the victim. This step is the most fraught for operators, since it requires them to manage keys and respond to queries on the Internet. Wouldn’t it be better for operators if they could eliminate this step altogether?

Of course, accomplishing this seems to require a trustworthy third party — or better, a form of ransomware that can decrypt itself when the victim makes a Bitcoin payment. But this last idea seems fundamentally contradictory. The decryption keys would have to live on the victim’s device, and the victim owns that device. If you tried that, then the victim could presumably just hack the secrets out and decrypt the ransomware without paying.

But what if the victim couldn’t hack their own machine?

This isn’t a crazy idea. In fact, it’s exactly the premise that’s envisioned by a new class of trusted execution environments, including Intel’s SGX and ARM TrustZone. These systems — which are built into the latest generation of many processors — allow users to instantiate “secure enclaves”: software environments that can’t be accessed by outside parties. SGX also isolates enclaves from other enclaves, which means the secrets they hold are hard to pry out.

Hypothetically, after infecting your computer a piece of ransomware could generate and store its decryption key inside of a secure enclave. This enclave could be programmed to release the key only on presentation of a valid Bitcoin payment to a designated address.

The beauty of this approach is that no third party even needs to verify the payment. Bitcoin payments themselves consist of a publicly-verifiable transaction embedded in a series of “blocks”, each containing an expensive computational “proof of work“. In principle, after paying the ransom the victim could present the SGX enclave with a fragment of a blockchain all by itself — freeing the ransomware of the need to interact with third parties. If the blockchain fragment exhibited sufficient hashpower along with a valid payment to a specific address, the enclave would release the decryption key.*
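For flavor, here’s a heavily simplified sketch of the kind of check such an enclave might run. It assumes a fixed target, ignores Bitcoin’s difficulty (“bits”) encoding and retargeting, and omits the Merkle proof that the payment transaction actually appears in one of the blocks; it’s meant only to show that “verify a chain of headers” is a small amount of code.

```python
import hashlib

def block_hash(header: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()

def chain_has_enough_work(headers, target: int, min_blocks: int) -> bool:
    """Accept a list of 80-byte headers iff they link up and each meets `target`."""
    prev = None
    for header in headers:
        if len(header) != 80:
            return False
        if prev is not None and header[4:36] != prev:   # prev-block-hash field
            return False
        if int.from_bytes(block_hash(header), "little") > target:
            return False                                # insufficient proof of work
        prev = block_hash(header)
    # `min_blocks` and `target` would be tuned so that forging such a chain
    # costs significantly more than simply paying the ransom.
    return len(headers) >= min_blocks
```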

The good news is that Intel and ARM have devoted serious resources to preventing this sort of unauthorized access. SGX developers must obtain a code signing certificate from Intel before they can make production-ready SGX enclaves, and it seems unlikely that Intel would partner up with a ransomware operation. Thus a ransomware operator would likely have to (1) steal a signing key from a legitimate Intel-certified developer, or (2) find an exploitable vulnerability in another developer’s enclave.**, ***

This all seems sort of unlikely, and that appears to block most of the threat — for now. Assuming companies like Intel and Qualcomm don’t screw things up, and have a good plan for revoking enclaves (uh oh), this is not very likely to be a big threat.

Of course, in the long run developers might not need Intel SGX at all. An even more speculative concern is that developments in the field of cryptographic obfuscation will provide a software-only alternative means to implement this type of ransomware. This would eliminate the need for a dependency like SGX altogether, allowing the ransomware to do its work with no hardware at all.

At present such techniques are far north of practical, keep getting broken, and might not work at all. But cryptographic researchers keep trying! I guess the lesson is that it’s not all roses if they succeed.

Ransomware Skynet

Since I’m already this far into what reads like a Peyote-fueled rant, let’s see if we can stretch the bounds of credibility just a little bit farther. If ransomware can become partially autonomous — i.e., do part of its job without the need for human masters — what would it mean for it to become fully autonomous? In other words, what if we got rid of the rest of the human equation?

I come from the future to encrypt C:\Documents

Ransomware with the ability to enforce payments would provide a potent funding source for another type of autonomous agent: a Decentralized Autonomous Organization, or (DAO). These systems are “corporations” that consist entirely of code that runs on a consensus network like Ethereum. They’re driven by rules, and are capable of both receiving and transmitting funds without (direct) instruction from human beings.

At least in theory it might be possible to develop a DAO that’s funded entirely by ransomware payments — and in turn mindlessly contracts real human beings to develop better ransomware, deploy it against human targets, and… rinse repeat. It’s unlikely that such a system would be stable in the long run — humans are clever and good at destroying dumb things — but it might get a good run. Who knows? Maybe this is how the Rampant Orphan Botnet Ecologies get started.

(I hope it goes without saying that I’m mostly not being serious about this part. Even though it would be totally awesome in a horrible sort of way.)

In conclusion

This hasn’t been a terribly serious post, although it was fun to write. The truth is that as a defender, watching your attackers fiddle around is pretty much the most depressing thing ever. Sometimes you have to break the monotony a bit.

But insofar as there is a serious core to this post, it’s that ransomware currently is using only a tiny fraction of the capabilities available to it. Secure execution technologies in particular represent a giant footgun just waiting to go off if manufacturers get things only a little bit wrong.

Hopefully they won’t, no matter how entertaining it might be.

Notes:

* This technique is similar to SPV verification. Of course, it would also be possible for a victim to “forge” a blockchain fragment without paying the ransom. However, the cost of this could easily be tuned to significantly exceed the cost of paying the ransom. There are also many issues I’m glossing over here like difficulty adjustments and the possibility of amortizing the forgery over many different victims. But thinking about that stuff is a drag, and this is all for fun, right?

** Of course, if malware can exploit such a vulnerability in another developer’s enclave to achieve code execution for “ransomware”, then the victim could presumably exploit the same vulnerability to make the ransomware spit out its key without a payment. So this strategy seems self-limiting — unless the ransomware developers find a bug that can be “repaired” by changing some immutable state held by the enclave. That seems like a long shot. And no, SGX does not allow you to “seal” data to the current state of the enclave’s RAM image.

*** In theory, Intel or an ARM manufacturer could also revoke the enclave’s signing certificate. However, the current SGX specification doesn’t explain how such a revocation strategy should work. I assume this will be more prominent in future specifications.

**** The original version of this post didn’t credit Greg and Sean properly, because I honestly didn’t make the connection that I was describing the right primitive. Neat!

Zero Knowledge Proofs: An illustrated primer, Part 2

This post is the second in a two-part series on zero-knowledge proofs. Click here to read Part 1.

In this post I’m going to continue the short, (relatively) non-technical overview of zero knowledge proofs that I started a couple of years ago. Yes, that was a very long time! If you didn’t catch the first post, now would be an excellent time to go read it.

Before we go much further, a bit of a warning. While this series is still intended as a high-level overview, at a certain point it’s necessary to dig a bit deeper into some specific algorithms. So you should expect this post to get a bit wonkier than the last.

A quick recap, and a bit more on Zero Knowledge(ness)

First, a brief refresher.

In the last post we defined a zero knowledge proof as an interaction between two computer programs (or Turing machines) — respectively called a Prover and a Verifier — where the Prover works to convince the Verifier that some mathematical statement is true. We also covered a specific example: a clever protocol by Goldreich, Micali and Wigderson that allows us to prove, in zero knowledge, that a graph possesses a three-coloring.

In the course of that discussion, we described three critical properties that any zero knowledge proof must satisfy:

  • Completeness: If the Prover is honest, then she will eventually convince the Verifier.
  • Soundness: The Prover can only convince the Verifier if the statement is true.
  • Zero-knowledge(ness): The Verifier learns no information beyond the fact that the statement is true.

The real challenge turns out to be finding a way to formally define the last property. How do you state that a Verifier learns nothing beyond the truth of a statement?

In case you didn’t read the previous post — the answer to this question came from Goldwasser, Micali and Rackoff, and it’s very cool. What they argued is that a protocol can be proven zero knowledge if for every possible Verifier, you can demonstrate the existence of an algorithm called a ‘Simulator’, and show that this algorithm has some very special properties.

From a purely mechanical perspective, the Simulator is like a special kind of Prover. However, unlike a real Prover — which starts with some special knowledge that allows it to prove the truth of a statement — the Simulator gets no special knowledge at all.* Nonetheless, the Simulator (or Simulators) must be able to ‘fool’ every Verifier into believing that the statement is true, while producing a transcript that’s statistically identical to (or indistinguishable from) the output of a real Prover.

The logic here flows pretty cleanly: since the Simulator has no ‘knowledge’ to extract in the first place, then clearly a Verifier can’t obtain any meaningful amount of information after interacting with it. Moreover, if the transcript of the interaction is distributed identically to a real protocol run with a normal Prover, then the Verifier can’t do better against the real prover than it can do against the Simulator. (If the Verifier could do better, then that would imply that the distributions were not statistically identical.) Ergo, the Verifier can’t extract useful information from the real protocol run.

This is incredibly wonky, and worse, it seems contradictory! We’re asking that a protocol be both sound — meaning that a bogus Prover can’t trick some Verifier into accepting a statement unless it has special knowledge allowing it to prove the statement — but we’re also asking for the existence of an algorithm (the simulator) that can literally cheat. Clearly both properties can’t hold at the same time.

The solution to this problem is that both properties don’t hold at the same time.

To build our simulator, we’re allowed to do things to the Verifier that would never happen in the real world. The example that I gave in the previous post was to use a ‘time machine’ — that is, our ‘Simulator’ can rewind the Verifier program’s execution in order to ‘fool’ it. Thus, in a world where we can wind the Verifier back in time, it’s easy to show that a Simulator exists. In the real world, where no such rewinding is possible, it doesn’t. This ‘trick’ gets us around the contradiction.

As a last reminder, to illustrate all of these ideas, we covered one of the first general zero knowledge proofs, devised by Goldreich, Micali and Wigderson (GMW). That protocol allowed us to prove, in zero knowledge, that a graph supports a three-coloring. Of course, proving three colorings isn’t terribly interesting. The real significance of the GMW result is theoretical. Since graph three-coloring is known to be NP-complete, the GMW protocol can be used to prove any statement in the class NP. And that’s quite powerful.

Let me elaborate slightly on what that means:

  1. If there exists any decision problem (that is, a problem with a yes/no answer) whose witness (solution) can be verified in polynomial time, then:
  2. We can prove that said solution exists by (1) translating the problem into an instance of the graph three-coloring problem, and (2) running the GMW protocol.*

This amazing result gives us interactive zero knowledge proofs for every statement in NP. The only problem is that it’s almost totally unusable.

From theory into practice

If you’re of a practical mindset, you’re probably shaking your head at all this talk of ZK proofs. That’s because actually using this approach would be an insanely expensive and stupid thing to do. Most likely you’d first represent your input problem as a boolean circuit where the circuit is satisfied if and only if you know the correct input. Then you’d have to translate your circuit into a graph, resulting in some further blowup. Finally you’d need to run the GMW protocol, which is damned expensive all by itself.

So in practice nobody does this. It’s really considered a ‘feasibility’ result. Once you show that something is possible, the next step is to make it efficient.

But we do use zero knowledge proofs, almost every day. In this post I’m going to spend some time talking about the more practical ZK proofs that we actually use. To do that I just need to give a tiny bit of extra background.

Proofs vs. Proofs of Knowledge

Before we go on, there’s one more concept we need to cover. Specifically, we need to discuss what precisely we’re proving when we conduct a zero knowledge proof.

Let me explain. At a high level, there are two kinds of statement you might want to prove in zero knowledge. Roughly speaking, these break up as follows.

Statements about “facts”. For example, I might wish to prove that “a specific graph has a three coloring” or “some number N is in the set of composite numbers”. Each of these is a statement about some intrinsic property of the universe.

Statements about my personal knowledge. Alternatively, I might wish to prove that I know some piece of information. Examples of this kind of statement include: “I know a three coloring for this graph”, or “I know the factorization of N”. These go beyond merely proving that a fact is true, and actually rely on what the Prover knows.

It’s important to recognize that there’s a big difference between these two kinds of statements! For example, it may be possible to prove that a number N is composite even if you don’t know the full factorization. So merely proving the first statement is not equivalent to proving the second one.

The second class of proof is known as a “proof of knowledge”. It turns out to be extremely useful for proving a variety of statements that we use in real life. In this post, we’ll mostly be focusing on this kind of proof.

The Schnorr identification protocol

Now that we’ve covered some of the required background, it’s helpful to move on to a specific and very useful proof of knowledge that was invented by Claus-Peter Schnorr in the 1980s. At first glance, the Schnorr protocol may seem a bit odd, but in fact it’s the basis of many of our modern signature schemes today.

Schnorr wasn’t really concerned with digital signatures, however. His concern was with identification. Specifically, let’s imagine that Alice has published her public key to the world, and later on wants to prove that she knows the secret key corresponding to that public key. This is the exact problem that we encounter in real-world protocols such as public-key SSH, so it turns out to be well-motivated.

Schnorr began with the assumption that the public key would be of a very specific format. Specifically, let p be some prime number, and let g be a generator of a cyclic group of prime order q (in practice, an order-q subgroup of the integers mod p). To generate a keypair, Alice would first pick a random integer a between 1 and q, and then compute the keypair as:

PK_{A} = g^a~mod~p, SK_{A} = a

(If you’ve been around the block a time or two, you’ll probably notice that this is the same type of key used for Diffie-Hellman and the DSA signing algorithm. That’s not a coincidence, and it makes this protocol very useful.)

Alice keeps her secret key to herself, but she’s free to publish her public key to the world. Later on, when she wants to prove knowledge of her secret key, she conducts the following simple interactive protocol with Bob:
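The protocol is easiest to see as code. Here’s a minimal Python sketch of the three moves, using toy parameters of my own choosing (a real implementation would use a standardized group with a large prime-order subgroup, and constant-time arithmetic):

```python
import secrets

# Toy group parameters -- purely illustrative. A real deployment would use
# a standardized group (e.g. a 2048-bit p with a 256-bit prime q).
p, q = 607, 101          # q divides p - 1 = 2 * 3 * 101
g = 64                   # 64 = 2**((p-1)//q) % p, a generator of the order-q subgroup

# Key generation (Alice)
a  = secrets.randbelow(q - 1) + 1    # secret key: random in [1, q-1]
PK = pow(g, a, p)                    # public key: g^a mod p

# Move 1 (Alice -> Bob): commit to a fresh random nonce k
k = secrets.randbelow(q)
commitment = pow(g, k, p)            # g^k mod p

# Move 2 (Bob -> Alice): a random challenge
c = secrets.randbelow(q)

# Move 3 (Alice -> Bob): a response that mixes the secret with the nonce
z = (a * c + k) % q

# Verification (Bob): g^z should equal PK^c * g^k, since both are g^(ac+k) mod p
assert pow(g, z, p) == pow(PK, c, p) * commitment % p
```

The structural point to notice is that the response z on its own reveals nothing about a, because the fresh random nonce k masks it.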

There’s a lot going on in here, so let’s take a minute to unpack things.

First off, we should ask ourselves if the protocol is complete. This is usually the easiest property to verify: if Alice performs the protocol honestly, should Bob be satisfied at the end of it? In this case, completeness is pretty easy to see just by doing a bit of substitution:
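Concretely, Bob accepts if g^z matches PK_{A}^{c} * g^{k}~mod~p. Substituting z = ac + k~mod~q:

g^{z} = g^{ac + k} = (g^{a})^{c} * g^{k} = PK_{A}^{c} * g^{k}~mod~p

so an honest Alice always convinces him.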

Proving soundness

The harder property is soundness. Mainly because we don’t yet have a good definition of what it means for a proof of knowledge to be sound. Remember that what we want to show is the following:

If Alice successfully convinces Bob, then she must know the secret key a.

It’s easy to look at the equations above and try to convince yourself that Alice’s only way to cheat the protocol is to know a. But that’s hardly a proof.

When it comes to demonstrating the soundness of a proof of knowledge, we have a really nice formal approach. Just as with the Simulator we discussed above, we need to demonstrate the existence of a special algorithm. This algorithm is called a knowledge extractor, and it does exactly what it claims to. A knowledge extractor (or just ‘Extractor’ for short) is a special type of Verifier that interacts with a Prover, and — if the Prover succeeds in completing the proof — the Extractor should be able to extract the Prover’s original secret.

And this answers our question above. To prove soundness for a proof of knowledge, we must show that an Extractor exists for every possible Prover.

Of course this again seems totally contradictory to the purpose of a zero knowledge protocol — where we’re not supposed to be able to learn secrets from a Prover. Fortunately we’ve already resolved this conundrum once for the case of the Simulator. Here again, we take the same approach. The Extractor is not required to exist during a normal run of the protocol. We simply show that it exists if we’re allowed to take special liberties with the Prover — in this case, we’ll use ‘rewinding’ to wind back the Prover’s execution and allow us to extract secrets.

The extractor for the Schnorr protocol is extremely clever — and it’s also pretty simple. In words: the Extractor (playing the Verifier’s role) lets Alice send her first message g^k and issues a challenge c_1, recording her response z_1. It then rewinds Alice’s execution back to the point just after she sent g^k, issues a different challenge c_2, and records her second response z_2.

The key observation here is that by rewinding Alice’s execution, the Extractor can ‘trick’ Alice into making two different proof transcripts using the same k. This shouldn’t normally happen in a real protocol run, where Alice specifically picks a new k for each execution of the protocol.

If the Extractor can trick Alice into doing this, then he can solve the following simple equation to recover Alice’s secret:
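Concretely: the two transcripts give the Extractor z_1 = ac_1 + k~mod~q and z_2 = ac_2 + k~mod~q for the same unknown k. Subtracting eliminates k, and dividing by (c_1 - c_2) recovers the secret:

a = (z_1 - z_2) * (c_1 - c_2)^{-1}~mod~q

In Python (re-using the toy parameters from the sketch above; the variable names are mine):

```python
import secrets

q = 101                              # subgroup order from the earlier sketch
a = secrets.randbelow(q - 1) + 1     # Alice's secret -- the Extractor never sees this

# Two accepting transcripts that share the same nonce k
k  = secrets.randbelow(q)
c1, c2 = 17, 55                      # two distinct challenges
z1 = (a * c1 + k) % q
z2 = (a * c2 + k) % q

# The Extractor works only from (c1, z1) and (c2, z2)
recovered = (z1 - z2) * pow(c1 - c2, -1, q) % q   # modular inverse needs Python 3.8+
assert recovered == a
```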

It’s worth taking a moment right now to note that this also implies a serious vulnerability in bad implementations of the Schnorr protocol. If you ever accidentally use the same k for two different runs of the protocol, an attacker may be able to recover your secret key! This can happen if you use a bad random number generator.

Indeed, those with a bit more experience will notice that this is similar to a real attack on systems (with bad random number generators) that implement ECDSA or DSA signatures! This is also not a coincidence. The (EC)DSA signature family is based on Schnorr. Ironically, the developers of DSA managed to retain this vulnerability of the Schnorr family of protocols while at the same time ditching the security proof that makes Schnorr so nice.

Proving zero-knowledge(ness) against an honest Verifier

Having demonstrated that the Schnorr protocol is complete and sound, it remains only to prove that it’s ‘zero knowledge’. Remember that to do this, normally we require a Simulator that can interact with any possible Verifier and produce a ‘simulated’ transcript of the proof, even though the Simulator doesn’t know the secret whose knowledge it’s proving.

The standard Schnorr protocol does not have such a Simulator, for reasons we’ll get into in a second. Instead, to make the proof work we need to make a special assumption. Specifically, the Verifier needs to be ‘honest’. That is, we need to make the special assumption that it will run its part of the protocol correctly — namely, that it will pick its challenge “c” using only its random number generator, and will not choose this value based on any input we provide it. As long as it does this, we can construct a Simulator.

Here’s how the Simulator works.

Let’s say we are trying to prove knowledge of a secret a for some public key g^a~mod~p, but we don’t actually know the value a. Our Simulator assumes that the Verifier will choose some value c as its challenge, and moreover, it knows that the honest Verifier will choose the value c only based on its random number generator — and not based on any inputs the Prover has provided.

  1. First, output some initial g^{k_1} as the Prover’s first message, and find out what challenge c the Verifier chooses.
  2. Rewind the Verifier, and pick a random integer z in the range \{0,\dots,q-1\}.
  3. Compute g^{k_2} = g^z * (g^{a})^{-c} and output g^{k_2} as the Prover’s new initial message.
  4. When the Verifier challenges on c again, output z.

Notice that the transcript g^{k_2}, c, z will verify correctly as a perfectly valid, well-distributed proof of knowledge of the value a. The Verifier will accept this output as a valid proof of knowledge of a, even though the Simulator does not know a in the first place!

What this proves is that if we can rewind a Verifier, then (just as in the first post in this series) we can always trick the Verifier into believing we have knowledge of a value, even when we don’t. And since the statistical distribution of our protocol is identical to the real protocol, this means that our protocol must be zero knowledge — against an honest Verifier.
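Because the honest Verifier’s challenge doesn’t depend on the Prover’s first message, the rewinding argument above boils down to something very short in code: sample the challenge and the response first, then solve for a first message that makes the verification equation hold. A sketch (same toy parameters as before; note that simulate_transcript never touches a):

```python
import secrets

p, q, g = 607, 101, 64               # toy parameters from the earlier sketch
a  = secrets.randbelow(q - 1) + 1    # used only to create a public key for the demo
PK = pow(g, a, p)

def simulate_transcript(PK):
    """Produce an accepting (commitment, challenge, response) without knowing a."""
    c = secrets.randbelow(q)         # what an honest Verifier would have sent
    z = secrets.randbelow(q)         # pick the 'response' first
    # Now solve for a first message satisfying g^z == PK^c * commitment mod p
    commitment = pow(g, z, p) * pow(PK, -c, p) % p
    return commitment, c, z

commitment, c, z = simulate_transcript(PK)
assert pow(g, z, p) == pow(PK, c, p) * commitment % p    # Bob's check passes
```

The simulated triple has exactly the same distribution as a real transcript, which is the heart of the honest-Verifier zero-knowledge argument.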

From interactive to non-interactive

So far we’ve shown how to use the Schnorr protocol to interactively prove knowledge of a secret key a that corresponds to a public key g^{a}. This is an incredibly useful protocol, but it only works if our Verifier is online and willing to interact with us.

An obvious question is whether we can make this protocol work without interaction. Specifically, can I construct a proof that I can send you without you even being online? Such a proof is called a non-interactive zero knowledge proof (NIZK). Turning Schnorr into a non-interactive proof seems initially quite difficult — since the protocol fundamentally relies on the Verifier picking a random challenge. Fortunately there is a clever trick we can use.

This technique was developed by Fiat and Shamir in the 1980s. What they observed was that if you have a decent hash function lying around, you can convert an interactive protocol into a non-interactive one by simply using the hash function to pick the challenge.

Specifically, the revised protocol for proving knowledge of a with respect to a public key g^a looks like this:

  1. The Prover picks g^k (just as in the interactive protocol).
  2. Now, the Prover computes the challenge as c = H(g^k || M) where H() is a hash function, and M is an (optional) arbitrary message string.
  3. Compute the response z = ac + k~mod~q (just as in the interactive protocol).

The upshot here is that the hash function is picking the challenge c without any interaction with the Verifier. In principle, if the hash function is “strong enough” (meaning, it’s a random oracle) then the result is a completely non-interactive proof of knowledge of the value a that the Prover can send to the Verifier. The proof of this is relatively straightforward.

The particularly neat thing about this protocol is that it isn’t just a proof of knowledge, it’s also a signature scheme. That is, if you put a message into the (optional) value M, you obtain a signature on M, which can only be produced by someone who knows the secret key a. The resulting protocol is called the Schnorr signature scheme, and it’s the basis of real-world protocols like EdDSA.
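To make that concrete, here’s a toy sketch of Fiat-Shamir-style signing and verification in Python (the hash-to-challenge encoding here is my own simplification; real Schnorr and EdDSA implementations use large standardized groups and typically hash the public key into the challenge as well):

```python
import hashlib
import secrets

p, q, g = 607, 101, 64                      # toy parameters from the earlier sketches

def keygen():
    a = secrets.randbelow(q - 1) + 1
    return a, pow(g, a, p)                  # (secret key a, public key g^a mod p)

def challenge(R, M):
    """Fiat-Shamir: derive the challenge c = H(g^k || M), reduced into Z_q."""
    h = hashlib.sha256(f"{R}||{M}".encode()).digest()
    return int.from_bytes(h, "big") % q

def sign(a, M):
    k = secrets.randbelow(q)
    R = pow(g, k, p)                        # the commitment g^k
    c = challenge(R, M)                     # no Verifier needed -- the hash picks c
    z = (a * c + k) % q
    return R, z                             # the signature

def verify(PK, M, signature):
    R, z = signature
    c = challenge(R, M)
    return pow(g, z, p) == pow(PK, c, p) * R % p

sk, pk = keygen()
sig = sign(sk, "attack at dawn")
assert verify(pk, "attack at dawn", sig)
```

If you drop the message M (or fix it to an empty string), the very same object is a non-interactive proof of knowledge of a — the NIZK described above.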

Phew.

Yes, this has been a long post and there’s probably a lot more to be said. Hopefully there will be more time for that in a third post — which should only take me another three years.
Notes:


* In this definition, it’s necessary that the statement be literally true.

The limitations of Android N Encryption

Over the past few years we’ve heard more about smartphone encryption than, quite frankly, most of us expected to hear in a lifetime. We’ve learned that encryption, implemented correctly, can slow down even sophisticated decryption attempts. We’ve also learned that incorrect implementations can undo most of that security.

In other words, phone encryption is an area where details matter. For the past few weeks I’ve been looking a bit at Android Nougat’s new file-based encryption to see how well they’ve addressed some of those details in their latest release. The answer, unfortunately, is that there’s still lots of work to do. In this post I’m going to talk about a bit of that.

(As an aside: the inspiration for this post comes from Grugq, who has been loudly and angrily trying to work through these kinks to develop a secure Android phone. So credit where credit is due.)

Background: file and disk encryption 

Disk encryption is much older than smartphones. Indeed, early encrypting filesystems date back at least to the early 1990s, and proprietary implementations may go back before that. Even in the relatively new world of PC operating systems, disk encryption has been a built-in feature since the early 2000s.

The typical PC disk encryption system operates as follows. At boot time you enter a password. This is fed through a key derivation function to derive a cryptographic key. If a hardware co-processor is available (e.g., a TPM), your key is further strengthened by “tangling” it with some secrets stored in the hardware. This helps to lock encryption to a particular device.
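As a rough sketch of that flow in Python (everything here is illustrative: real products like BitLocker, FileVault and LUKS each have their own KDF parameters and key hierarchies, and the “hardware secret” would come from a TPM or secure element rather than a byte string):

```python
import hashlib
import hmac

def derive_disk_key(password, salt, hw_secret=None):
    # A slow, salted KDF makes offline password guessing expensive
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 1_000_000)
    if hw_secret is not None:
        # 'Tangle' the key with a device-bound secret so the result
        # can only be re-derived on this particular device
        key = hmac.new(hw_secret, key, hashlib.sha256).digest()
    return key

disk_key = derive_disk_key("correct horse battery staple",
                           salt=b"per-device-salt",
                           hw_secret=b"secret-only-the-TPM-knows")
```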

The actual encryption can be done in one of two different ways:

  1. Full Disk Encryption (FDE) systems (like Truecrypt, BitLocker and FileVault) encrypt disks at the level of disk sectors. This is an all-or-nothing approach, since the encryption drivers won’t necessarily have any idea what files those sectors represent. At the same time, FDE is popular — mainly because it’s extremely easy to implement.
  2. File-based Encryption (FBE) systems (like EncFS and eCryptFS) encrypt individual files. This approach requires changes to the filesystem itself, but has the benefit of allowing fine grained access controls where individual files are encrypted using different keys.

Most commercial PC disk encryption software has historically opted to use the full-disk encryption (FDE) approach. Mostly this is just a matter of expediency: FDE is just significantly easier to implement. But philosophically, it also reflects a particular view of what disk encryption was meant to accomplish.

In this view, encryption is an all-or-nothing proposition. Your machine is either on or off; accessible or inaccessible. As long as you make sure to have your laptop stolen only when it’s off, disk encryption will keep you perfectly safe.

So what does this have to do with Android?

Android’s early attempts at adding encryption to their phones followed the standard PC full-disk encryption paradigm. From Android 4.4 (KitKat) through Android 6.0 (Marshmallow), Android systems shipped with full-disk encryption based on dm-crypt, a kernel device-mapper target that encrypts disks at the sector level. This represented a quick and dirty way to bring encryption to Android phones, and it made sense — if you believe that phones are just very tiny PCs.

The problem is that smartphones are not PCs.

The major difference is that smartphone users are never encouraged to shut down their device. In practice this means that — after you enter a passcode once after boot — normal users spend their whole day walking around with all their cryptographic keys in RAM. Since phone batteries live for a day or more (a long time compared to laptops), encryption doesn’t really offer much to protect you against an attacker who gets their hands on your phone during this time.

Of course, users do lock their smartphones. In principle, a clever implementation could evict sensitive cryptographic keys from RAM when the device locks, then re-derive them the next time the user logs in. Unfortunately,  Android doesn’t do this — for the very simple reason that Android users want their phones to actually work. Without cryptographic keys in RAM, an FDE system loses access to everything on the storage drive. In practice this turns it into a brick.

For this very excellent reason, once you boot an Android FDE phone it will never evict its cryptographic keys from RAM. And this is not good.

So what’s the alternative?

Android is not the only game in town when it comes to phone encryption. Apple, for its part, also gave this problem a lot of thought and came to a subtly different solution.

Starting with iOS 4, Apple included a “data protection” feature to encrypt all data stored on a device. But unlike Android, Apple doesn’t use the full-disk encryption paradigm. Instead, they employ a file-based encryption approach that individually encrypts each file on the device.

In the Apple system, the contents of each file are encrypted under a unique per-file key (metadata is encrypted separately). The file key is in turn encrypted with one of several “class keys” that are derived from the user passcode and some hardware secrets embedded in the processor.

iOS data encryption. Source: iOS Security Guide.
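To make the hierarchy concrete, here’s a toy Python sketch of per-file keys wrapped under class keys (it uses the pyca/cryptography package; the record format and names are mine, and Apple’s real design additionally binds keys to hardware and stores the wrapped key in the file’s metadata):

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# One class key per protection class. On iOS these are derived from the
# passcode plus hardware secrets; here they're just random bytes.
class_keys = {
    "complete": AESGCM.generate_key(bit_length=256),
    "after_first_unlock": AESGCM.generate_key(bit_length=256),
}

def encrypt_file(plaintext, protection_class):
    file_key = AESGCM.generate_key(bit_length=256)      # unique per-file key
    nonce = os.urandom(12)
    ciphertext = AESGCM(file_key).encrypt(nonce, plaintext, None)
    # Wrap the file key under the chosen class key; only the wrapped key
    # is stored alongside the ciphertext, never the raw file key.
    wrap_nonce = os.urandom(12)
    wrapped_key = AESGCM(class_keys[protection_class]).encrypt(wrap_nonce, file_key, None)
    return (protection_class, wrap_nonce, wrapped_key, nonce, ciphertext)

def decrypt_file(record):
    protection_class, wrap_nonce, wrapped_key, nonce, ciphertext = record
    # If the class key has been evicted (say, because the device locked),
    # this step fails and the file contents stay out of reach.
    file_key = AESGCM(class_keys[protection_class]).decrypt(wrap_nonce, wrapped_key, None)
    return AESGCM(file_key).decrypt(nonce, ciphertext, None)

record = encrypt_file(b"sensitive note", "complete")
assert decrypt_file(record) == b"sensitive note"
```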

The main advantage of the Apple approach is that instead of a single FDE key to rule them all, Apple can implement fine-grained access control for individual files. To enable this, iOS provides an API developers can use to specify which class key to use in encrypting any given file. The available “protection classes” include:

  • Complete protection. Files encrypted with this class key can only be accessed when the device is powered up and unlocked. To ensure this, the class key is evicted from RAM a few seconds after the device locks.
  • Protected Until First User Authentication. Files encrypted with this class key are protected until the user first logs in (after a reboot), and the key remains in memory.
  • No protection. These files are accessible even when the device has been rebooted, and the user has not yet logged in.

By giving developers the option to individually protect different files, Apple made it possible to build applications that can work while the device is locked, while providing strong protection for files containing sensitive data.

Apple even created a fourth option for apps that simply need to create new encrypted files when the class key has been evicted from RAM. This class uses public key encryption to write new files. This is why you can safely take pictures even when your device is locked.

Apple’s approach isn’t perfect. What it is, however, is the obvious result of a long and careful thought process. All of which raises the following question…

Why the hell didn’t Android do this as well?

The short answer is Android is trying to. Sort of. Let me explain.

As of Android 7.0 (Nougat), Google has moved away from full-disk encryption as the primary mechanism for protecting data at rest. If you set a passcode on your device, Android N systems can be configured to support a more Apple-like approach that uses file encryption. So far so good.

The new system is called Direct Boot, so named because it addresses what Google obviously saw as a fatal problem with Android FDE — namely, that FDE-protected phones are useless bricks following a reboot. The main advantage of the new model is that it allows phones to access some data even before you enter the passcode. This is enabled by providing developers with two separate “encryption contexts”:

  • Credential encrypted storage. Files in this area are encrypted under the user’s passcode, and won’t be available until the user enters their passcode (once).
  • Device encrypted storage. These files are not encrypted under the user’s passcode (though they may be encrypted using hardware secrets). Thus they are available after boot, even before the user enters a passcode.

Direct Boot even provides separate encryption contexts for different users on the phone — something I’m not quite sure what to do with. But sure, why not?

If Android is making all these changes, what’s the problem?

One thing you might have noticed is that where Apple had four categories of protection, Android N only has two. And it’s the two missing categories that cause the problems. These are the “complete protection” categories that allow the user to lock their device following first user authentication — and evict the keys from memory.

Of course, you might argue that Android could provide this by forcing application developers to switch back to “device encrypted storage” following a device lock. The problem with this idea is twofold. First, the Android documentation and sample code are explicit that this isn’t how things work:

(screenshot: Android developer documentation describing credential encrypted storage)

Moreover, a quick read of the documentation shows that even if you wanted to, there is no unambiguous way for Android to tell applications when the system has been re-locked. If keys are evicted when the device is locked, applications will unexpectedly find their file accesses returning errors. Even system applications tend to do badly when this happens.

And of course, this assumes that Android N will even try to evict keys when you lock the device. Here’s how the current filesystem encryption code handles locks:

(code excerpt: how Android’s filesystem encryption code handles a device lock)

While the above is bad, it’s important to stress that the real problem here is not really in the cryptography. The problem is that since Google is not giving developers proper guidance, the company may be locking Android into years of insecurity. Without even a half-baked way to define a “complete” protection class, Android app developers can’t build their apps correctly to support the idea that devices can lock. Even if Android O gets around to implementing key eviction, the existing legacy app base won’t be able to handle it — since this will break a million apps that have implemented their security according to Android’s current recommendations.

In short: this is a thing you get right from the start, or you don’t do at all. It looks like — for the moment — Android isn’t getting it right.

Are keys that easy to steal?

Of course, it’s reasonable to ask whether having keys in RAM is that big a concern in the first place. Can these keys actually be accessed?

The answer to that question is a bit complicated. First, if you’re up against somebody with a hardware lab and forensic expertise, the answer is almost certainly “yes”. Once you’ve entered your passcode and derived the keys, they aren’t stored in some magically secure part of the phone. People with the ability to access RAM or the bus lines of the device can potentially nick them.

But that’s a lot of work. From a software perspective, it’s even worse. A software attack would require a way to get past the phone’s lockscreen in order to get running code on the device. In older (pre-N) versions of Android the attacker might then need to escalate privileges to get access to kernel memory. Remarkably, Android N doesn’t even store its disk keys in the kernel — instead they’re held by the “vold” daemon, which runs as user “root” in userspace. This doesn’t make exploits trivial, but it certainly isn’t the best way to handle things.

Of course, all of this is mostly irrelevant. The main point is that if the keys are loaded you don’t need to steal them. If you have a way to get past the lockscreen, you can just access files on the disk.

What about hardware?

Although a bit of a tangent, it’s worth noting that many high-end Android phones use some sort of trusted hardware to enable encryption. The most common approach is to use a trusted execution environment (TEE) running with ARM TrustZone.

This definitely solves a problem. Unfortunately it’s not quite the same problem as discussed above. ARM TrustZone — when it works correctly, which is not guaranteed — forces attackers to derive their encryption keys on the device itself, which should make offline dictionary attacks on the password much harder. In some cases, this hardware can be used to cache the keys and reveal them only when you input a biometric such as a fingerprint.

The problem here is that in Android N, this only helps you at the time the keys are being initially derived. Once that happens (i.e., following your first login), the hardware doesn’t appear to do much. The resulting derived keys seem to live forever in normal userspace RAM. While it’s possible that specific phones (e.g., Google’s Pixel, or Samsung devices) implement additional countermeasures, on stock Android N phones hardware doesn’t save you.

So what does it all mean?

How you feel about this depends on whether you’re a “glass half full” or “glass half empty” kind of person.

If you’re an optimistic type, you’ll point out that Android is clearly moving in the right direction. And while there’s a lot of work still to be done, even a half-baked implementation of file-based encryption is better than the last generation of dumb FDE Android encryption. Also: you probably think clowns are nice.

On the other hand, you might notice that this is a pretty goddamn low standard. In other words, in 2016 Android is still struggling to deploy encryption that achieves (lock screen) security that Apple figured out six years ago. And they’re not even getting it right. That doesn’t bode well for the long term security of Android users.

And that’s a shame, because as many have pointed out, the users who rely on Android phones are disproportionately poorer and more at-risk. By treating encryption as a relatively low priority, Google is basically telling these people that they shouldn’t get the same protections as other users. This may keep the FBI off Google’s back, but in the long term it’s bad judgement on Google’s part.