Beyond public key encryption

One of the saddest and most fascinating things about applied cryptography is how 6689264031_4c7516b3e1_zlittle cryptography we actually use. This is not to say that cryptography isn’t widely used in industry — it is. Rather, what I mean is that cryptographic researchers have developed so many useful technologies, and yet industry on a day to day basis barely uses any of them. In fact, with a few minor exceptions, the vast majority of the cryptography we use was settled by the early-2000s.*

Most people don’t sweat this, but as a cryptographer who works on the boundary of research and deployed cryptography it makes me unhappy. So while I can’t solve the problem entirely, what I can do is talk about some of these newer technologies. And over the course of this summer that’s what I intend to do: talk. Specifically, in the next few weeks I’m going to write a series of posts that describe some of the advanced cryptography that we don’t generally see used.

Today I’m going to start with a very simple question: what lies beyond public key cryptography? Specifically, I’m going to talk about a handful of technologies that were developed in the past 20 years, each of which allows us to go beyond the traditional notion of public keys.

This is a wonky post, but it won’t be mathematically-intense. For actual definitions of the schemes, I’ll provide links to the original papers, and references to cover some of the background. The point here is to explain what these new schemes do — and how they can be useful in practice.

Identity Based Cryptography

In the mid-1980s, a cryptographer named Adi Shamir proposed a radical new idea. The idea, put simply, was to get rid of public keys.

To understand where Shamir was coming from, it helps to understand a bit about public key encryption. You see, prior to the invention of public key crypto, all cryptography involved secret keys. Dealing with such keys was a huge drag. Before you could communicate securely, you needed to exchange a secret with your partner. This process was fraught with difficulty and didn’t scale well.

Public key encryption (beginning with Diffie-Hellman and Shamir’s RSA cryptosystem) hugely revolutionized cryptography by dramatically simplifying this key distribution process. Rather than sharing secret keys, users could now transmit their public key to other parties. This public key allowed the recipient to encrypt to you (or verify your signature) but it could not be used to perform the corresponding decryption (or signature generation) operations. That part would be done with a secret key you kept to yourself.

While the use of public keys improved many aspects of using cryptography, it also gave rise to a set of new challenges. In practice, it turns out that having public keys is only half the battle — people still need to use distribute them securely.

For example, imagine that I want to send you a PGP-encrypted email. Before I can do this, I need to obtain a copy of your public key. How do I get this? Obviously we could meet in person and exchange that key on physical media — but nobody wants to do this. It would much more desirable to obtain your public key electronically. In practice this means either (1) we have to exchange public keys by email, or (2) I have to obtain your key from a third piece of infrastructure, such as a website or key server. And now we come to the  problem: if that email or key server is untrustworthy (or simply allows anyone to upload a key in your name), I might end up downloading a malicious party’s key by accident. When I send a message to “you”, I’d actually be encrypting it to Mallory.

f64f315ec47f0b041e3d881177039414
Mallory.

Solving this problem — of exchanging public keys and verifying their provenance — has motivated a huge amount of practical cryptographic engineering, including the entire web PKI. In most cases, these systems work well. But Shamir wasn’t satisfied. What if, he asked, we could do it better? More specifically, he asked: could we replace those pesky public keys with something better?

Shamir’s idea was exciting. What he proposed was a new form of public key cryptography in which the user’s “public key” could simply be their identity. This identity could be a name (e.g., “Matt Green”) or something more precise like an email address. Actually, it didn’t realy matter. What did matter was that the public key would be some arbitrary string — and not a big meaningless jumble of characters like “7cN5K4pspQy3ExZV43F6pQ6nEKiQVg6sBkYPg1FG56Not”.

Of course, using an arbitrary string as a public key raises a big problem. Meaningful identities sound great — but I don’t own them. If my public key is “Matt Green”, how do I get the corresponding private key? And if I can get out that private key, what stops some other Matt Green from doing the same, and thus reading my messages? And ok, now that I think about this, what stops some random person who isn’t named Matt Green from obtaining it? Yikes. We’re headed straight into Zooko’s triangle.

Shamir’s idea thus requires a bit more finesse. Rather than expecting identities to be global, he proposed a special server called a “key generation authority” that would be responsible for generating the private keys. At setup time, this authority would generate a single master public key (MPK), which it would publish to the world. If you wanted to encrypt a message to “Matt Green” (or verify my signature), then you could do so using my identity and the single MPK of an authority we’d both agree to use. To decrypt that message (or sign one), I would have to visit the same key authority and ask for a copy of my secret key. The key authority would compute my key based on a master secret key (MSK), which it would keep very secret.

With all algorithms and players specified, whole system looks like this:

IBE
Overview of an Identity-Based Encryption (IBE) system. The Setup algorithm of the Key Generation Authority generates the master public key (MPK) and master secret key (MSK). The authority can use the Extract algorithm to derive the secret key corresponding to a specific ID. The encryptor (left) encrypts using only the identity and MPK. The recipient requests the secret key for her identity, and then uses it to decrypt. (Icons by Eugen Belyakoff)

This design has some important advantages — and more than a few obvious drawbacks. On the plus side, it removes the need for any key exchange at all with the person you’re sending the message to. Once you’ve chosen a master key authority (and downloaded its MPK), you can encrypt to anyone in the entire world. Even cooler: at the time you encrypt, your recipient doesn’t even need to have contacted the key authority yet. She can obtain her secret key after I send her a message.

Of course, this “feature” is also a bug. Because the key authority generates all the secret keys, it has an awful lot of power. A dishonest authority could easily generate your secret key and decrypt your messages. The polite way to say this is that standard IBE systems effectively “bake in” key escrow.**

Putting the “E” in IBE

All of these ideas and more were laid out by Shamir in his 1984 paper. There was just one small problem: Shamir could only figure out half the problem.

Specifically, Shamir’s proposed a scheme for identity-based signature (IBS) — a signature scheme where the public verification key is an identity, but the signing key is generated by the key authority. Try as he might, he could not find a solution to the problem of building identity-based encryption (IBE). This was left as an open problem.***

It would take more than 16 years before someone answered Shamir’s challenge. Surprisingly, when the answer came it came not once but three times.

The first, and probably most famous realization of IBE was developed by Dan Boneh and Matthew Franklin much later — in 2001. The timing of Boneh and Franklin’s discovery makes a great deal of sense. The Boneh-Franklin scheme relies fundamentally on elliptic curves that support an efficient “bilinear map” (or “pairing”).**** The algorithms needed to compute such pairings were not known when Shamir wrote his paper, and weren’t employed constructively — that is, as a useful thing rather than an attack — until about 2000. The same can be said about a second scheme called Sakai-Kasahara, which would be independently discovered around the same time.

(For a brief tutorial on the Boneh-Franklin IBE scheme, see this page.)

The third realization of IBE was not as efficient as the others, but was much more surprising. This scheme was developed by Clifford Cocks, a senior cryptologist at Britain’s GCHQ. It’s noteworthy for two reasons. First, Cocks’ IBE scheme does not require bilinear pairings at all — it is based in the much older RSA setting, which means in principle it spent all those years just waiting to be found. Second, Cocks himself had recently become known for something even more surprising: discovering the RSA cryptosystem, nearly five years before RSA did. To bookend that accomplishment with a second major advance in public key cryptography was a pretty impressive accomplishment.

In the years since 2001, a number of additional IBE constructions have been developed, using all sorts of cryptographic settings. Nonetheless, Boneh and Franklin’s early construction remains among the simplest and most efficient.

Even if you’re not interested in IBE for its own sake, it turns out that this primitive is really useful to cryptographers for many things beyond simple encryption. In fact, it’s more helpful to think of IBE as a way of “pluralizing” a single public/secret master keypair into billions of related keypairs. This makes it useful for applications as diverse as blocking chosen ciphertext attacks, forward-secure public key encryption, and short signature schemes.

Attribute Based Encryption

Of course, if you leave cryptographers alone with a tool like IBE, the first thing they’re going to do is find a way to make things more complicated improve on it.

One of the biggest such improvements is due to Sahai and Waters. It’s called Attribute-Based Encryption, or ABE.

The origin of this idea was not actually to encrypt with attributes. Instead Sahai and Waters were attempting to develop an Identity-Based encryption scheme that could encrypt using biometrics. To understand the problem, imagine I decide to use a biometric like your iris scan as the “identity” to encrypt you a ciphertext. Later on you’ll ask the authority for a decryption key that corresponds to your own iris scan — and if everything matches up and you’ll be able to decrypt.

The problem is that this will almost never work.

The issue here is that biometric readings (like iris scans or fingerprint templates) are inherently error-prone. This means every scan will typically be very close, but often there will be a few bits that disagree. With standard IBE

iris
Tell me this isn’t giving you nightmares.

this is fatal: if the encryption identity differs from your key identity by even a single bit, decryption will not work. You’re out of luck.

Sahai and Waters decided that the solution to this problem was to develop a form of IBE with a “threshold gate”. In this setting, each bit of the identity is represented as a different “attribute”. Think of each of these as components you’d encrypt under — something like “bit 5 of your iris scan is a 1” and “bit 23 of your iris scan is a 0”. The encrypting party lists all of these bits and encrypts under each one. The decryption key generated by the authority embeds a similar list of bit values. The scheme is defined so that decryption will work if and only if the number of matching attributes (between your key and the ciphertext) exceeds some pre-defined threshold: e.g., any 2024 out of 2048 bits must be identical in order to decrypt.

The beautiful thing about this idea is not fuzzy IBE. It’s that once you have a threshold gate and a concept of “attributes”, you can more interesting things. The main observation is that a threshold gate can be used to implement the boolean AND and OR gates, like so:

gates

Even better, you can stack these gates on top of one another to assign a fairly complex boolean formula — which will itself determine what conditions your ciphertext can be decrypted under. For example, switching to a more realistic set of attributes, you could encrypt a medical record so that either a pediatrician in a hospital could read it, or an insurance adjuster could. All you’d need is to make sure people received keys that correctly described their attributes (which are just arbitrary strings, like identities).

ABEFormula
A simple “ciphertext policy”, in which the ciphertext can be decrypted if and only if a key matches an appropriate set of attributes. In this case, the key satisfies the formula and thus the ciphertext decrypts. The remaining key attributes are ignored.

The other direction can be implemented as well. It’s possible to encrypt a ciphertext under a long list of attributes, such as creation time, file name, and even GPS coordinates indicating where the file was created. You can then have the authority hand out keys that correspond to a very precise slice of your dataset — for example, “this key decrypts any radiology file encrypted in Chicago between November 3rd and December 12th that is tagged with ‘pediatrics’ or ‘oncology'”.

Functional Encryption

Once you have a related of primitives like IBE and ABE, the researchers’ instinct is to both extend and generalize. Why stop at simple boolean formulae? Can we make keys (or ciphertexts) that embed arbitrary computer programs? The answer, it turns out, is yes — though not terribly efficiently. A set of recent works show that it is possible to build ABE that works over arbitrary polynomial-size circuits, using various lattice-based assumptions. So there is certainly a lot of potential here.

This potential has inspired researchers to generalize all of the above ideas into a single class of encryption called “functional encryption“. Functional encryption is more conceptual than concrete — it’s just a way to look at all of these systems as instances of a specific class. The basic idea is to represent the decryption procedure as an algorithm that computes an arbitary function F over (1) the plaintext inside of a ciphertext, and (2) the data embedded in the key. This function has the following profile:

output = F(key data, plaintext data)

In this model IBE can be expressed by having the encryption algorithm encrypt (identity, plaintext) and defining the function F such that, if “key input == identity”, it outputs the plaintext, and outputs an empty string otherwise. Similarly, ABE can be expressed by a slightly more complex function. Following this paradigm, once can envision all sorts of interesting functions that might be computed by different functions and realized by future schemes.

But those will have to wait for another time. We’ve gone far enough for today.

So what’s the point of all this?

For me, the point is just to show that cryptography can do some pretty amazing things. We rarely see this on a day-to-day basis when it comes to industry and “applied” cryptography, but it’s all there waiting to be used.

Perhaps the perfect application is out there. Maybe you’ll find it.

Notes:

* An earlier version of this post said “mid-1990s”. In comments below, Tom Ristenpart takes issue with that and makes the excellent point that many important developments post-date that. So I’ve moved the date forward about five years, and I’m thinking about how to say this in a nicer way.

** There is also an intermediate form of encryption known as “certificateless encryption“. Proposed by Al-Riyami and Paterson, this idea uses a combination of standard public key encryption and IBE. The basic idea is to encrypt each message using both a traditional public key (generated by the recipient) and an IBE identity. The recipient must then obtain a copy of the secret key from the IBE authority to decrypt. The advantages here are twofold: (1) the IBE key authority can’t decrypt the message by itself, since it does not have the corresponding secret key, which solves the “escrow” problem. And (2) the sender does not need to verify that the public key really belongs to the sender (e.g., by checking a certificate), since the IBE portion prevents imposters from decrypting the resulting message. Unfortunately this system is more like traditional public key cryptography than IBE, and does not have the neat usability features of IBE.

*** A part of the challenge of developing IBE is the need to make a system that is secure against “collusion” between different key holders. For example, imagine a very simple system that has only 2-bit identities. This gives four possible identities: “00”, “01”, “10”, “11”. If I give you a key for the identity “01” and I give Bob a key for “10”, we need to ensure that you two cannot collude to produce a key for identities “00” and “11”. Many earlier proposed solutions have tried to solve this problem by gluing together standard public encryption keys in various ways (e.g., by having a separate public key for each bit of the identity and giving out the secret keys as a single “key”). However these systems tend to fail catastrophically when just a few users collude (or their keys are stolen). Solving the collusion problem is fundamentally what separates real IBE from its faux cousins.

**** A full description of Boneh and Franklin’s scheme can be found here, or in the original paper. Some code is here and here and here. I won’t spend more time on it, except to note that the scheme is very efficient. It was patented and implemented by Voltage Security, now part of HPE.

Secure computing for journalists

This morning on Twitter, Buzzfeed editor Miriam Elder asks the following question:

No, this is not a stupid question. Actually it’s an extremely important question, and judging by some of the responses to this Tweet there are a lot of other people who are confused about the answer.

Since I couldn’t find a perfect layperson’s reference anywhere else, I’m going to devote this post to providing the world’s simplest explanation of why, in the threat model of your typical journalistyour desktop machine isn’t very safe. And specifically, why you’re safer using a modern mobile device — and particularly, an iOS device — than just about any other platform.

A brief caveat: I’m a cryptographer, not a software security researcher. However, I’ve spent the past several years interacting with folks like Charlie and Dan and Thomas. I’m pretty confident that they agree with this advice.

What’s wrong with my laptop/desktop machine?

Sadly, most of the problem is you.

If you’re like most journalists — and really, most professionals — you spend less than 100% of your time thinking about security. You need to get work done. When you’re procrastinating from work, you visit funny sites your friends link you to on Facebook. Then you check your email. If you’re a normal and productive user, you probably do a combination of all these things every few minutes, all of which culminates in your downloading some email attachment and (shudder) opening it in Word.

Now I’m not trying to shame you for this. It’s perfectly normal, and indeed it’s necessary if you want to get things done.  But in the parlance of security professionals, it also means you have a huge attack surface.

In English, this means that from the perspective of an attacker there are many different avenues to compromise your machine. Many of these aren’t even that sophisticated. Often it’s just a matter of catching you during an unguarded moment and convincing you to download an executable file or an infected Office document. A compromised machine means that every piece of software on that machine is also vulnerable.

If you don’t believe this works, head over to Google and search for “Remote Access Trojans”. There’s an entire commercial market for these products, each of which allows you to remotely control someone else’s computer. These off-the-shelf products aren’t very sophisticated: indeed, most require you to trick your victim into downloading and running some executable attachment. Sadly, this works on most people just fine. And this is just the retail stuff. Imagine what a modestly sophisticated attacker can do.

I do some of those things on my phone as well. Why is a phone better?

Classical (desktop and laptop) operating systems were designed primarily to support application developers. This means they offer a lot of power to your applications. An application like Microsoft Word can typically read and write all the files available to your account. If Word becomes compromised, this is usually enough to pwn you in practice. And in many cases, these applications have components with root (or Administrator) access, which makes them even more dangerous.

Modern phone operating systems like Android and iOS were built on a different principle. Rather than trusting apps with much power, each app runs in a “sandbox” that (mainly) limits it to accessing its own files. If the sandbox works, even a malicious application shouldn’t be able to reach out to touch other apps’ files or permanently modify your system. This approach — combined with other protections such as in-memory code signing, hardware secret storage and routine use of anti-exploitation measures — makes your system vastly harder to compromise.

Of course, sandboxing isn’t perfect. A compromised or malicious app can always access its own files. More sophisticated exploits can “break out” of the sandbox, typically by exploiting a vulnerability in the operating system. Such vulnerabilities are routinely discovered and occasionally exploited.

The defense to this is twofold: (1) first, run a modern, up-to-date OS that receives security patches quickly. And (2) avoid downloading malicious apps. Which brings me to the main point of this post.

Why use iOS?

The fact of the matter is that when it comes to addressing these remaining issues, Apple phone operating systems (on iPhones and iPads) simply have a better track record.

Since Apple is the only manufacturer of iOS devices, there is no “middleman” when it comes to monitoring for iOS issues and deploying iOS security updates. This means that the buck stops at Apple — rather than with some third-party equipment manufacturer. Indeed, Apple routinely patches its operating systems and pushes the patches to all supported users — sometimes within hours of learning of a vulnerability (something that is relatively rare at this point in any case).

Of course, to be fair: Google has also become fairly decent at supporting its own Android devices. However, to get assurance from this process you need to be running a relatively brand new device and it needs to be manufactured by Google. Otherwise you’re liable to be several days or weeks behind the time when a security issue is discovered and patched — if you ever get it. And Google still does not support all of the features Apple does, including in-memory code signing and strong file encryption.

Apple also seems to do a relatively decent job at curating its App Store, at least as compared to Google. And because those apps support a more modern base of phones, they tend to have access to better security features, whereas Android apps more routinely get caught doing dumb stuff for backwards compatibility reasons.

640x960
A password manager using the SEP.

Finally, every recent Apple device (starting with the iPhone 5S and up) also includes a specialized chip known as a “Secure Enclave Processor“. This hardened processor assists in securing the boot chain — ensuring that nobody can tamper with your operating system. It can also protect sensitive values like your passwords, ensuring that only a password or fingerprint can access them.

A few Android phones also offer similar features as well. However, it’s unclear how well these are implemented in contrast to Apple’s SEP. It’s not a bet I would choose to take.

So does using iOS mean I’m perfectly safe?

Of course not. Unfortunately, computer security today is about resisting attacks. We still don’t quite know how to prevent them altogether.

Indeed, well-funded attackers like governments are still capable of compromising your iOS device (and your Android, and your PC or Mac). Literally the only question is how much they’ll have to spend doing it.

Here’s one data point. Last year a human rights activist in the UAE was targeted via a powerful zero day exploit, likely by his government. However, he was careful. Instead of clicking the link he was sent, the activist sent it to the engineers at Citizenlab who reverse-engineered the exploit. The resulting 35-page technical report by Lookout Security and Citizenlab is a thing of terrifying beauty: it describes a chain of no less than three previously unpublished software exploits, which together would have led to the complete compromise of the victim’s iPhone.

But such compromises don’t come cheap. It’s easy to see this kind of attack costing a million dollars or more. This is probably orders of magnitude more than it would cost to compromise the typical desktop user. That’s important. Not perfect, but important.

You’re telling me I have to give up my desktop machine?

Not at all. Or rather, while I’d love to tell you that, I understand this may not be realistic for most users.

All I am telling you to do is to be thoughtful. If you’re working on something sensitive, consider moving the majority of that work (and communications) to a secure device until you’re ready to share it. This may be a bit of a hassle, but it doesn’t have to be your whole life. And since most of us already carry some sort of phone or tablet in addition to our regular work computer, hopefully this won’t require too much of a change in your life.

You can still use your normal computer just fine, as long as you’re aware of the relative risks. That’s all I’m trying to accomplish with this post.

In conclusion

I expect that many technical people will find this post objectionable, largely because they assume that with their expertise and care they can make a desktop operating system work perfectly safely. And maybe they can! But that’s not who this post is addressed to.

And of course, this post still only scratches the surface of the problem. There’s still the problem of selecting the right applications for secure messaging (e.g., Signal and WhatsApp) and finding a good secure application for notetaking and document collaboration and so on.

But hopefully this post at least starts the discussion.

The future of Ransomware

This is kind of a funny post for me to write, since it ransomwareinvolves speculating about a very destructive type of software — and possibly offering some (very impractical) suggestions on how it might be improved in the future. It goes without saying that there are some real downsides to this kind of speculation. Nonetheless, I’m going ahead on the theory that it’s usually better to talk and think about the bad things that might happen to you — before you meet them on the street and they steal your lunch money.

On the other hand, just as there’s a part of every karate master that secretly wants to go out and beat up a bar full of people, there’s a part of every security professional that looks at our current generation of attackers and thinks: why can’t you people just be a bit more imaginative?! And wonders whether, if our attackers were just a little more creative, people would actually pay attention to securing their system before the bad stuff happens.

And ransomware is definitely a bad thing. According to the FBI it sucks up $1 billion/year in payments alone, and some unimaginably larger amount in remediation costs. This despite the fact that many ransomware packages truly suck, and individual ransomware developers get routinely pwned due to making stupid cryptographic errors. If this strategy is working so well today, the question  we should be asking ourselves is: how much worse could it get?

So that’s what I’m going to muse about now. A few (cryptographic) ways that it might.

Some of these ideas are the result of collaboration with my students Ian Miers, Gabe Kaptchuk and Christina Garman. They range from the obvious to the foolish to the whimsical, and I would be utterly amazed if any of them really do happen. So please don’t take this post too seriously. It’s all just fun.

Quick background: ransomware today

The amazing thing about ransomware is that something so simple could turn out to be such a problem. Modern ransomware consists of malware that infects your computer and then goes about doing something nasty: it encrypts every file it can get its hands on. This typically includes local files as well as network shares that can be reached from the infected machine.

cryptob1

Once your data has been encrypted, your options aren’t great. If you’re lucky enough to have a recent backup, you can purge the infected machine and restore. Otherwise you’re faced with a devil’s bargain: learn top live without that data, or pay the bastards.

If you choose to pay up, there are all sorts of different procedures. However most break down into the following three steps:

  1. When the ransomware encrypts your files, it generates a secret key file and stores it on your computer.
  2. You upload that file (or data string) to your attackers along with a Bitcoin payment.
  3. They process the result with their secrets and send you a decryption key.

If you’re lucky, and your attackers are still paying attention (or haven’t screwed up the crypto beyond recognition) you get back a decryption key or a tool you can use to undo the encryption on your files. The whole thing is very businesslike. Indeed, recent platforms will allegedly offer you a discount if you infect recommend it to your friends — just like Lyft!

The problem of course, is that nothing in this process guarantees that your attacker will give you that decryption key. They might be scammers. They might not have the secret anymore. They might get tracked down and arrested. Or they might get nervous and bail, taking your precious data and your payment with them. This uncertainty makes ransomware payments inherently risky — and worse, it’s the victims who mostly suffer for it.

Perhaps it would be nice if we could make that work better.

Verifiable key delivery using smart contracts

Most modern ransomware employs a cryptocurrency like Bitcoin to enable the payments that make the ransom possible. This is perhaps not the strongest argument for systems like Bitcoin — and yet it seems unlikely that Bitcoin is going away anytime soon. If we can’t solve the problem of Bitcoin, maybe it’s possible to use Bitcoin to make “more reliable” ransomware.

Recall that following a ransomware infection, there’s a possibility that you’ll pay the ransom and get nothing in return. Fundamentally there’s very little you can do about this. A conscientious ransomware developer might in theory offer a “proof of life” — that is, offer to decrypt a few files at random in order to prove their bonafides. But even if they bother with all the risk and interaction of doing this, there’s still no guarantee that they’ll bother to deliver the hostage alive.

An obvious approach to this problem is to make ransomware payments conditional. Rather than sending off your payment and hoping for the best, victims could use cryptocurrency features to ensure that ransomware operators can’t get paid unless they deliver a key. Specifically, a ransomware developer could easily perform payment via a smart contract script (in a system like Ethereum) that guarantees the following property:

This payment will be delivered to the ransomware operator if and only if the ransomware author unlocks it — by posting the ransomware decryption key to the same blockchain.

The basic primitive needed for this is called a Zero Knowledge Contingent Payment. This idea was proposed by Greg Maxwell and demonstrated by Sean Bowe of the ZCash team.**** The rough idea is to set the decryption key to be some pre-image k for some public hash value K that the ransomware generates and leaves on your system. It’s relatively easy to imagine a smart contract that allows payment if and only if the payee can post the input k such that K=SHA256(k). This could easily be written in Ethereum, and almost certainly has an analog for Bitcoin script.

The challenge here, of course, is to prove that k is actually a decryption key for your files, and that the files contain valid data. There are a handful of different ways to tackle this problem. One is to use complex zero-knowledge proof techniques (like zkSNARKs or ZKBoo) to make the necessary proofs non-interactively. But this is painful, and frankly above the level of most ransomware developers — who are still struggling with basic RSA.

An alternative approach is to use several such K challenges in combination with the “proof of life” idea. The ransomware operator would prove her bonafides by decrypting a small, randomly selected subset of files before the issuer issued payment. The operator could still “fake” the encryption — or lose the decryption key — but she would be exposed with reasonable probability before money changed hands.

“Autonomous” ransomware

Of course, the problem with “verifiable” ransomware is: what ransomware developer would bother with this nonsense?google-self-driving-car-624x326

While the ability to verify decryption might conceivably improve customer satisfaction, it’s not clear that it would really offer that much value to ransomware deverlopers. At the same time, it would definitely add a lot of nasty complexity to their software.

Instead of pursuing ideas that offer developers no obvious upside, ransomware designers presumably will pursue ideas that offer them some real benefits. And that brings us to an idea time whose time has (hopefully) not quite come yet. The idea itself is simple:

Make ransomware that doesn’t require operators.

Recall that in the final step of the ransom process, the ransomware operator must deliver a decryption key to the victim. This step is the most fraught for operators, since it requires them to manage keys and respond to queries on the Internet. Wouldn’t it be better for operators if they could eliminate this step altogether?

Of course, to accomplish this seems to require a trustworthy third party — or better, a form of ransomware that can decrypt itself when the victim makes a Bitcoin payment. Of course this last idea seems fundamentally contradictory. The decryption keys would have to live on the victim’s device, and the victim owns that device. If you tried that, then victim could presumably just hack the secrets out and decrypt the ransomware without paying.

But what if the victim couldn’t hack their own machine?

This isn’t a crazy idea. In fact, it’s exactly the premise that’s envisioned by a new class of trusted execution environments, including Intel’s SGX and ARM TrustZone. These systems — which are built into the latest generation of many processors — allow users to instantiate “secure enclaves”: software environments that can’t be accessed by outside parties. SGX also isolates enclaves from other enclaves, which means the secrets they hold are hard to pry out.

Hypothetically, after infecting your computer a piece of ransomware could generate and store its decryption key inside of a secure enclave. This enclave could be programmed to release the key only on presentation of a valid Bitcoin payment to a designated address.

The beauty of this approach is that no third party even needs to verify the payment. Bitcoin payments themselves consist of a publicly-verifiable transaction embedded in a series of “blocks”, each containing an expensive computational “proof of work“. In principle, after paying the ransom the victim could present the SGX enclave with a fragment of a blockchain all by itself — freeing the ransomware of the need to interact with third parties. If the blockchain fragment exhibited sufficient hashpower along with a valid payment to a specific address, the enclave would release the decryption key.*

The good news is that Intel and ARM have devoted serious resources to preventing this sort of unauthorized access. SGX developers must obtain a code signing certificate from Intel before they can make production-ready SGX enclaves, and it seems unlikely that Intel would partner up with a ransomware operation. Thus a ransomware operator would likely have to (1) steal a signing key from a legitimate Intel-certified developer, or (2) find an exploitable vulnerability in another developer’s enclave.**, ***

This all seems sort of unlikely, and that appears to block most of the threat — for now. Assuming companies like Intel and Qualcomm don’t screw things up, and have a good plan for revoking enclaves (uh oh), this is not very likely to be a big threat.

Of course, in the long run developers might not need Intel SGX at all. An even more speculative concern is that developments in the field of cryptographic obfuscation will provide a software-only alternative means to implement this type of ransomware. This would eliminate the need for a dependency like SGX altogether, allowing the ransomware to do its work with no hardware at all.

At present such techniques are far north of practical, keep getting broken, and might not work at all. But cryptographic researchers keep trying! I guess the lesson is that it’s not all roses if they succeed.

Ransomware Skynet

Since I’m already this far into what reads like a Peyote-fueled rant, let’s see if we can stretch the bounds of credibility just a little a bit farther. If ransomware can become partially autonomous — i.e., do part of its job without the need for human masters — what would it mean for it to become fully autonomous? In other words, what if we got rid of the rest of the human equation?

terminatorgenisys1-xlarge
I come from the future to encrypt C:\Documents

Ransomware with the ability to enforce payments would provide a potent funding source for another type of autonomous agent: a Decentralized Autonomous Organization, or (DAO). These systems are “corporations” that consist entirely of code that runs on a consensus network like Ethereum. They’re driven by rules, and are capable of both receiving and transmitting funds without (direct) instruction from human beings.

At least in theory it might be possible to develop a DAO that’s funded entirely by ransomware payments — and in turn mindlessly contracts real human beings to develop better ransomware, deploy it against human targets, and… rinse repeat. It’s unlikely that such a system would be stable in the long run — humans are clever and good at destroying dumb things — but it might get a good run. Who knows? Maybe this is how the Rampant Orphan Botnet Ecologies get started.

(I hope it goes without saying that I’m mostly not being serious about this part. Even though it would be totally awesome in a horrible sort of way.)

In conclusion

This hasn’t been a terribly serious post, although it was fun to write. The truth is that as a defender, watching your attackers fiddle around is pretty much the most depressing thing ever. Sometimes you have to break the monotony a bit.

But insofar as there is a serious core to this post, it’s that ransomware currently is using only a tiny fraction of the capabilities available to it. Secure execution technologies in particular represent a giant footgun just waiting to go off if manufacturers get things only a little bit wrong.

Hopefully they won’t, no matter how entertaining it might be.

Notes:

* This technique is similar to SPV verification. Of course, it would also be possible for a victim to “forge” a blockchain fragment without paying the ransom. However, the cost of this could easily be tuned to significantly exceed the cost of paying the ransom. There are also many issues I’m glossing over here like difficulty adjustments and the possibility of amortizing the forgery over many different victims. But thinking about that stuff is a drag, and this is all for fun, right?

** Of course, if malware can exploit such a vulnerability in another developer’s enclave to achieve code execution for “ransomware”, then the victim could presumably exploit the same vulnerability to make the ransomware spit out its key without a payment. So this strategy seems self-limiting — unless the ransomware developers find a bug that can be “repaired” by changing some immutable state held by the enclave. That seems like a long shot. And no, SGX does not allow you to “seal” data to the current state of the enclave’s RAM image.

*** In theory, Intel or an ARM manufacturer could also revoke the enclave’s signing certificate. However, the current SGX specification doesn’t explain how such a revocation strategy should work. I assume this will be more prominent in future specifications.

**** The original version of this post didn’t credit Greg and Sean properly, because I honestly didn’t make the connection that I was describing the right primitive. Neat!

Zero Knowledge Proofs: An illustrated primer, Part 2

This post is the second in a two-part series on zero-knowledge proofs. Click here t2380271980_b2a66bd47d_zo read Part 1.

In this post I’m going to continue the short, (relatively) non-technical overview of zero knowledge proofs that I started a couple of years ago. Yes, that was a very long time! If you didn’t catch the first post, now would be an excellent time to go read it.

Before we go much further, a bit of a warning. While this series is still intended as a high-level overview, at a certain point it’s necessary to dig a bit deeper into some specific algorithms. So you should expect this post to get a bit wonkier than the last.

A quick recap, and a bit more on Zero Knowledge(ness)

First, a brief refresher.

In the last post we defined a zero knowledge proof as an interaction between two computer programs (or Turing machines) — respectively called a Prover and a Verifier — where the Prover works to convince the Verifier that some mathematical statement is true. We also covered a specific example: a clever protocol by Goldreich, Micali and Wigderson that allows us to prove, in zero knowledge, that a graph possesses a three-coloring.

In the course of that discussion, we described three critical properties that any zero knowledge proof must satisfy:

  • Completeness: If the Prover is honest, then she will eventually convince the Verifier.
  • Soundness: The Prover can only convince the Verifier if the statement is true.
  • Zero-knowledge(ness): The Verifier learns no information beyond the fact that the statement is true.

The real challenge turns out to be finding a way to formally define the last property. How do you state that a Verifier learns nothing beyond the truth of a statement?

In case you didn’t read the previous post — the answer to this question came from Goldwasser, Micali and Rackoff, and it’s very cool. What they argued is that a protocol can be proven zero knowledge if for every possible Verifier, you can demonstrate the existence of an algorithm called a ‘Simulator’, and show that this algorithm has some very special properties.

From a purely mechanical perspective, the Simulator is like a special kind of Prover. However, unlike a real Prover — which starts with some special knowledge that allows it to prove the truth of a statement — the Simulator gets no special knowledge at all.* Nonetheless, the Simulator (or Simulators) must be able to ‘fool’ every Verifier into believing that the statement is true, while producing a transcript that’s statistically identical top (or indistinguishable from) the output of a real Prover.

The logic here flows pretty cleanly: since Simulator has no ‘knowledge’ to extract in the first place, then clearly a Verifier can’t obtain any meaningful amount of information after interacting with it. Moreover, if the transcript of the interaction is distributed identically to a real protocol run with a normal Prover, then the Verifier can’t do better against the real prover than it can do against the Simulator. (If the Verifier could do better, then that would imply that the distributions were not statistically identical.) Ergo, the Verifier can’t extract useful information from the real protocol run.

This is incredibly wonky, and worse, it seems contradictory! We’re asking that a protocol be both sound — meaning that a bogus Prover can’t trick some Verifier into accepting a statement unless it has special knowledge allowing it to prove the statement — but we’re also asking for the existence of an algorithm (the simulator) that can literally cheat. Clearly both properties can’t hold at the same time.

The solution to this problem is that both properties don’t hold at the same time.

To build our simulator, we’re allowed to do things to the Verifier that would never happen in the real world. The example that I gave in the previous post was to use a ‘time machine’ — that is, our ‘Simulator’ can rewind the Verifier program’s execution in order to ‘fool’ it. Thus, in a world where we can wind the Verifier back in time, it’s easy to show that a Simulator exists. In the real world, of course it doesn’t. This ‘trick’ gets us around the contradiction.

As a last reminder, to illustrate all of these ideas, we covered one of the first general zero knowledge proofs, devised by Goldreich, Micali and Wigderson (GMW). That protocol allowed us to prove, in zero knowledge, that a graph supports a three-coloring. Of course, proving three colorings isn’t terribly interesting. The real significance of the GMW result is theoretical. Since graph three coloring is known to be in the complexity class NP-complete, the GMW protocol can be used to prove any statement in the class NP. And that’s quite powerful.

Let me elaborate slightly on what that means:

  1. If there exists any decision problem (that is, a problem with a yes/no answer) whose witness (solution) can be verified in polynomial time, then:
  2. We can prove that said solution exists by (1) translating the problem into an instance of the graph three-coloring problem, and (2) running the GMW protocol.*

This amazing result gives us interactive zero knowledge proofs for every statement in NP. The only problem is that it’s almost totally unusable.

From theory into practice

If you’re of a practical mindset, you’re probably shaking your head at all this talk of ZK proofs. That’s because actually using this approach would be an insanely expensive and stupid thing to do. Most likely you’d first represent your input problem as a boolean circuit where the circuit is satisfied if and only if you know the correct input. Then you’d have to translate your circuit into a graph, resulting in some further blowup. Finally you’d need to run the GMW protocol, which is damned expensive all by itself.

So in practice nobody does this. It’s really considered a ‘feasibility’ result. Once you show that something is possible, the next step is to make it efficient.

But we do use zero knowledge proofs, almost every day. In this post I’m going to spend some time talking about the more practical ZK proofs that we actually use. To do that I just need give just a tiny bit of extra background.

Proofs vs. Proofs of Knowledge

Before we go on, there’s one more concept we need to cover. Specifically, we need to discuss what precisely we’re proving when we conduct a zero knowledge proof.

Let me explain. At a high level, there are two kinds of statement you might want to prove in zero knowledge. Roughly speaking, these break up as follows.

Statements about “facts”. For example, I might wish to prove that “a specific graph has a three coloring” or “some number N is in the set of composite numbers“. Each of these is a statement about some intrinsic property of the universe.

Statements about my personal knowledge. Alternatively, I might wish to prove that I know some piece information. Examples of this kind of statement include: “I know a three coloring for this graph”, or “I know the factorization of N”. These go beyond merely proving that a fact is true, and actually rely on what the Prover knows.

It’s important to recognize that there’s a big difference between these two kinds of statements! For example, it may be possible to prove that a number N is composite even if you don’t know the full factorization. So merely proving the first statement is not equivalent to proving the second one.

The second class of proof is known as a “proof of knowledge”. It turns out to be extremely useful for proving a variety of statements that we use in real life. In this post, we’ll mostly be focusing on this kind of proof.

The Schnorr identification protocol

Now that we’ve covered some of the required background, it’s helpful to move on to a specific and very useful proof of knowledge that was invented by Claus-Peter Schnorr in the 1980s. At first glance, the Schnorr protocol may seem a bit odd, but in fact it’s the basis of many of our modern signature schemes today.

Schnorr wasn’t really concerned with digital signatures, however. His concern was with identification. Specifically, let’s imagine that Alice has published her public key to the world, and later on wants to prove that she knows the secret key corresponding to that public key. This is the exact problem that we encounter in real-world protocols such as public-key SSH, so it turns out to be well-motivated.

Schnorr began with the assumption that the public key would be of a very specific format. Specifically, let p be some prime number, and let g be a generator of a cyclic group of prime-order q. To generate a keypair, Alice would first pick a random integer a between 1 and q, and then compute the keypair as:

PK_{A} = g^a~mod~p, SK_{A} = a

(If you’ve been around the block a time or two, you’ll probably notice that this is the same type of key used for Diffie-Hellman and the DSA signing algorithm. That’s not a coincidence, and it makes this protocol very useful.)

Alice keeps her secret key to herself, but she’s free to publish her public key to the world. Later on, when she wants to prove knowledge of her secret key, she conducts the following simple interactive protocol with Bob:

There’s a lot going on in here, so let’s take a minute to unpack things.

First off, we should ask ourselves if the protocol is complete. This is usually the easiest property to verify: if Alice performs the protocol honestly, should Bob be satisfied at the end of it? In this case, completeness is pretty easy to see just by doing a bit of substitution:

Proving soundness

The harder property is soundness. Mainly because we don’t yet have a good definition of what it means for a proof of knowledge to be sound. Remember that what we want to show is the following:

If Alice successfully convinces Bob, then she must know the secret key a.

It’s easy to look at the equations above and try to convince yourself that Alice’s only way to cheat the protocol is to know a. But that’s hardly a proof.

When it comes to demonstrating the soundness of a proof of knowledge, we have a really nice formal approach. Just as with the Simulator we discussed above, we need to demonstrate the existence of a special algorithm. This algorithm is called a knowledge extractor, and it does exactly what it claims to. A knowledge extractor (or just ‘Extractor’ for short) is a special type of Verifier that interacts with a Prover, and — if the Prover succeeds in completing the proof — the Extractor should be able to extract the Prover’s original secret.

And this answers our question above. To prove soundness for a proof of knowledge, we must show that an Extractor exists for every possible Prover.

Of course this again seems totally contradictory to the purpose of a zero knowledge protocol — where we’re not supposed to be able to learn secrets from a Prover. Fortunately we’ve already resolved this conundrum once for the case of the Simulator. Here again, we take the same approach. The Extractor is not required to exist during a normal run of the protocol. We simply show that it exists if we’re allowed to take special liberties with the Prover — in this case, we’ll use ‘rewinding’ to wind back the Prover’s execution and allow us to extract secrets.

The extractor for the Schnorr protocol is extremely clever — and it’s also pretty simple. Let’s illustrate it in terms of a protocol diagram. Alice (the Prover) is on the left, and the Extractor is on the right:

The key observation here is that by rewinding Alice’s execution, the Extractor can ‘trick’ Alice into making two different proof transcripts using the same k. This shouldn’t normally happen in a real protocol run, where Alice specifically picks a new k for each execution of the protocol.

If the Extractor can trick Alice into doing this, then he can solve the following simple equation to recover Alice’s secret:

It’s worth taking a moment right now to note that this also implies a serious vulnerability in bad implementations of the Schnorr protocol. If you ever accidentally use the same k for two different runs of the protocol, an attacker may be able to recover your secret key! This can happen if you use a bad random number generator.

Indeed, those with a bit more experience will notice that this is similar to a real attack on systems (with bad random number generators) that implement ECDSA or DSA signatures! This is also not a coincidence. The (EC)DSA signature family is based on Schnorr. Ironically, the developers of DSA managed to retain this vulnerability of the Schorr family of protocols while at the same time ditching the security proof that makes Schnorr so nice.

Proving zero-knowledge(ness) against an honest Verifier

Having demonstrated that Schnorr signatures are complete and sound, it remains only to prove that they’re ‘zero knowledge’. Remember that to do this, normally we require a Simulator that can interact with any possible Verifier and produce a ‘simulated’ transcript of the proof, even if the Simulator doesn’t know the secret it’s proving it knows.

The standard Schnorr protocol does not have such a Simulator, for reasons we’ll get into in a second. Instead, to make the proof work we need to make a special assumption. Specifically, the Verifier needs to be ‘honest’. That is, we need to make the special assumption that it will run its part of the protocol correctly — namely, that it will pick its challenge “c” using only its random number generator, and will not choose this value based on any input we provide it. As long as it does this, we can construct a Simulator.

Here’s how the Simulator works.

Let’s say we are trying to prove knowledge of a secret a for some public key g^a~mod~pbut we don’t actually know the value aOur Simulator assumes that the Verifier will choose some value c as its challenge, and moreover, it knows that the honest Verifier will choose the value c only based on its random number generator — and not based on any inputs the Prover has provided.

  1. First, output some initial g^{k_1} as the Prover’s first message, and find out what challenge c the Verifier chooses.
  2. Rewind the Verifier, and pick a random integer z in the range \{0,\dots,q-1\}.
  3. Compute g^{k_2} = g^z * g^{a (-c)} and output g^{k_2} as the Prover’s new initial message.
  4. When the Verifier challenges on c again, output z.
Notice that the transcript g^k, c, z will verify correctly as a perfectly valid, well-distributed proof of knowledge of the value a. The Verifier will accept this output as a valid proof of knowledge of a, even though the Simulator does not know a in the first place!
What this proves is that if we can rewind a Verifier, then (just as in the first post in this series) we can always trick the Verifier into believing we have knowledge of a value, even when we don’t. And since the statistical distribution of our protocol is identical to the real protocol, this means that our protocol must be zero knowledge — against an honest Verifier.

From interactive to non-interactive

So far we’ve shown how to use the Schnorr protocol to interactively prove knowledge of a secret key a that corresponds to a public key g^{a}. This is an incredibly useful protocol, but it only works if our Verifier is online and willing to interact with us.

An obvious question is whether we can make this protocol work without interaction. Specifically, can I make a proof that I can send you without you even being online. Such a proof is called a non-interactive zero knowledge proof (NIZK). Turning Schnorr into a non-interactive proof seems initially quite difficult — since the protocol fundamentally relies on the Verifier picking a random challenge. Fortunately there is a clever trick we can use.

This technique was developed by Fiat and Shamir in the 1980s. What they observed was that if you have a decent hash function lying around, you can convert an interactive protocol into a non-interactive one by simply using the hash function to pick the challenge.

Specifically, the revised protocol for proving knowledge of a with respect to a public key g^a looks like this:

  1. The Prover picks g^k (just as in the interactive protocol).
  2. Now, the prover computes the challenge as c = H(g^k || M) where H() is a hash function, and M is an (optional) and arbitary message string.
  3. Compute ac + k~mod~q (just as in the interactive protocol).

The upshot here is that the hash function is picking the challenge c without any interaction with the Verifier. In principle, if the hash function is “strong enough” (meaning, it’s a random oracle) then the result is a completely non-interactive proof of knowledge of the value a that the Prover can send to the Verifier. The proof of this is relatively straightforward.

The particularly neat thing about this protocol is that it isn’t just a proof of knowledge, it’s also a signature scheme. That is, if you put a message into the (optional) value M, you obtain a signature on M, which can only be produced by someone who knows the secret key a. The resulting protocol is called the Schnorr signature scheme, and it’s the basis of real-world protocols like EdDSA.

Phew.

Yes, this has been a long post and there’s probably a lot more to be said. Hopefully there will be more time for that in a third post — which should only take me another three years.
Notes:


* In this definition, it’s necessary that the statement be literally true.

The limitations of Android N Encryption

Over the past few years pixelphonewe’ve heard more about smartphone encryption than, quite frankly, most of us expected to hear in a lifetime. We learned that proper encryption can slow down even sophisticated decryption attempts if done correctly. We’ve also learned that incorrect implementations can undo most of that security.

In other words, phone encryption is an area where details matter. For the past few weeks I’ve been looking a bit at Android Nougat’s new file-based encryption to see how well they’ve addressed some of those details in their latest release. The answer, unfortunately, is that there’s still lots of work to do. In this post I’m going to talk about a bit of that.

(As an aside: the inspiration for this post comes from Grugq, who has been loudly and angrily trying to work through these kinks to develop a secure Android phone. So credit where credit is due.)

Background: file and disk encryption 

Disk encryption is much older than smartphones. Indeed, early encrypting filesystems date back at least to the early 1990s and proprietary implementations may go back before that. Even in the relatively new area of PCs operating systems, disk encryption has been a built-in feature since the early 2000s.

The typical PC disk encryption system operates as follows. At boot time you enter a password. This is fed through a key derivation function to derive a cryptographic key. If a hardware co-processor is available (e.g., a TPM), your key is further strengthened by “tangling” it with some secrets stored in the hardware. This helps to lock encryption to a particular device.

The actual encryption can be done in one of two different ways:

  1. Full Disk Encryption (FDE) systems (like TruecryptBitLocker and FileVault) encrypt disks at the level of disk sectors. This is an all-or-nothing approach, since the encryption drivers won’t necessarily have any idea what files those sectors represent. At the same time, FDE is popular — mainly because it’s extremely easy to implement.
  2. File-based Encryption (FBE) systems (like EncFS and eCryptFS) encrypt individual files. This approach requires changes to the filesystem itself, but has the benefit of allowing fine grained access controls where individual files are encrypted using different keys.

Most commercial PC disk encryption software has historically opted to use the full-disk encryption (FDE) approach. Mostly this is just a matter of expediency: FDE is just significantly easier to implement. But philosophically, it also reflects a particular view of what disk encryption was meant to accomplish.

In this view, encryption is an all-or-nothing proposition. Your machine is either on or off; accessible or inaccessible. As long as you make sure to have your laptop stolen only when it’s off, disk encryption will keep you perfectly safe.

So what does this have to do with Android?

Android’s early attempts at adding encryption to their phones followed the standard PC full-disk encryption paradigm. Beginning in Android 4.4 (Kitkat) through Android 6.0 (Marshmallow), Android systems shipped with a kernel device mapper called dm-crypt designed to encrypt disks at the sector level. This represented a quick and dirty way to bring encryption to Android phones, and it made sense — if you believe that phones are just very tiny PCs.

The problem is that smartphones are not PCs.

The major difference is that smartphone users are never encouraged to shut down their device. In practice this means that — after you enter a passcode once after boot — normal users spend their whole day walking around with all their cryptographic keys in RAM. Since phone batteries live for a day or more (a long time compared to laptops) encryption doesn’t really offer much to protect you against an attacker who gets their hands on your phone during this time.

Of course, users do lock their smartphones. In principle, a clever implementation could evict sensitive cryptographic keys from RAM when the device locks, then re-derive them the next time the user logs in. Unfortunately,  Android doesn’t do this — for the very simple reason that Android users want their phones to actually work. Without cryptographic keys in RAM, an FDE system loses access to everything on the storage drive. In practice this turns it into a brick.

For this very excellent reason, once you boot an Android FDE phone it will never evict its cryptographic keys from RAM. And this is not good.

So what’s the alternative?

Android is not the only game in town when it comes to phone encryption. Apple, for its part, also gave this problem a lot of thought and came to a subtly different solution.

Starting with iOS 4, Apple included a “data protection” feature to encrypt all data stored a device. But unlike Android, Apple doesn’t use the full-disk encryption paradigm. Instead, they employ a file-based encryption approach that individually encrypts each file on the device.

In the Apple system, the contents of each file is encrypted under a unique per-file key (metadata is encrypted separately). The file key is in turn encrypted with one of several “class keys” that are derived from the user passcode and some hardware secrets embedded in the processor.

apple
iOS data encryption. Source: iOS Security Guide.

The main advantage of the Apple approach is that instead of a single FDE key to rule them all, Apple can implement fine-grained access control for individual files. To enable this, iOS provides an API developers can use to specify which class key to use in encrypting any given file. The available “protection classes” include:

  • Complete protection. Files encrypted with this class key can only be accessed when the device is powered up and unlocked. To ensure this, the class key is evicted from RAM a few seconds after the device locks.
  • Protected Until First User Authentication. Files encrypted with this class key are protected until the user first logs in (after a reboot), and the key remains in memory.
  • No protection. These files are accessible even when the device has been rebooted, and the user has not yet logged in.

By giving developers the option to individually protect different files, Apple made it possible to build applications that can work while the device is locked, while providing strong protection for files containing sensitive data.

Apple even created a fourth option for apps that simply need to create new encrypted files when the class key has been evicted from RAM. This class uses public key encryption to write new files. This is why you can safely take pictures even when your device is locked.

Apple’s approach isn’t perfect. What it is, however, is the obvious result of a long and careful thought process. All of which raises the following question…

Why the hell didn’t Android do this as well?

The short answer is Android is trying to. Sort of. Let me explain.

As of Android 7.0 (Nougat), Google has moved away from full-disk encryption as the primary mechanism for protecting data at rest. If you set a passcode on your device, Android N systems can be configured to support a more Apple-like approach that uses file encryption. So far so good.

The new system is called Direct Boot, so named because it addresses what Google obviously saw as fatal problem with Android FDE — namely, that FDE-protected phones are useless bricks following a reboot. The main advantage of the new model is that it allows phones to access some data even before you enter the passcode. This is enabled by providing developers with two separate “encryption contexts”:

  • Credential encrypted storage. Files in this area are encrypted under the user’s passcode, and won’t be available until the user enters their passcode (once).
  • Device encrypted storage. These files are not encrypted under the user’s passcode (though they may be encrypted using hardware secrets). Thus they are available after boot, even before the user enters a passcode.

Direct Boot even provides separate encryption contexts for different users on the phone — something I’m not quite sure what to do with. But sure, why not?

If Android is making all these changes, what’s the problem?

One thing you might have noticed is that where Apple had four categories of protection, Android N only has two. And it’s the two missing categories that cause the problems. These are the “complete protection” categories that allow the user to lock their device following first user authentication — and evict the keys from memory.

Of course, you might argue that Android could provide this by forcing application developers to switch back to “device encrypted storage” following a device lock. The problem with this idea is twofold. First, Android documentation and sample code is explicit that this isn’t how things work:

credentialenc

Moreover, a quick read of the documentation shows that even if you wanted to, there is no unambiguous way for Android to tell applications when the system has been re-locked. If keys are evicted when the device is locked, applications will unexpectedly find their file accesses returning errors. Even system applications tend to do badly when this happens.

And of course, this assumes that Android N will even try to evict keys when you lock the device. Here’s how the current filesystem encryption code handles locks:

lockuserkey

While the above is bad, it’s important to stress that the real problem here is not really in the cryptography. The problem is that since Google is not giving developers proper guidance, the company may be locking Android into years of insecurity. Without (even a half-baked) solution to define a “complete” protection class, Android app developers can’t build their apps correctly to support the idea that devices can lock. Even if Android O gets around to implementing key eviction, the existing legacy app base won’t be able to handle it — since this will break a million apps that have implemented their security according to Android’s current recommendations.

In short: this is a thing you get right from the start, or you don’t do at all. It looks like — for the moment — Android isn’t getting it right.

Are keys that easy to steal?

Of course it’s reasonable to ask whether it’s having keys in RAM is that big of concern in the first place. Can these keys actually be accessed?

The answer to that question is a bit complicated. First, if you’re up against somebody with a hardware lab and forensic expertise, the answer is almost certainly “yes”. Once you’ve entered your passcode and derived the keys, they aren’t stored in some magically secure part of the phone. People with the ability to access RAM or the bus lines of the device can potentially nick them.

But that’s a lot of work. From a software perspective, it’s even worse. A software attack would require a way to get past the phone’s lockscreen in order to get running code on the device. In older (pre-N) versions of Android the attacker might need to then escalate privileges to get access to Kernel memory. Remarkably, Android N doesn’t even store its disk keys in the Kernel — instead they’re held by the “vold” daemon, which runs as user “root” in userspace. This doesn’t make exploits trivial, but it certainly isn’t the best way to handle things.

Of course, all of this is mostly irrelevant. The main point is that if the keys are loaded you don’t need to steal them. If you have a way to get past the lockscreen, you can just access files on the disk.

What about hardware?

Although a bit of a tangent, it’s worth noting that many high-end Android phones use some sort of trusted hardware to enable encryption. The most common approach is to use a trusted execution environment (TEE) running with ARM TrustZone.

This definitely solves a problem. Unfortunately it’s not quite the same problem as discussed above. ARM TrustZone — when it works correctly, which is not guaranteed — forces attackers to derive their encryption keys on the device itself, which should make offline dictionary attacks on the password much harder. In some cases, this hardware can be used to cache the keys and reveal them only when you input a biometric such as a fingerprint.

The problem here is that in Android N, this only helps you at the time the keys are being initially derived. Once that happens (i.e., following your first login), the hardware doesn’t appear to do much. The resulting derived keys seem to live forever in normal userspace RAM. While it’s possible that specific phones (e.g., Google’s Pixel, or Samsung devices) implement additional countermeasures, on stock Android N phones hardware doesn’t save you.

So what does it all mean?

How you feel about this depends on whether you’re a “glass half full” or “glass half empty” kind of person.

If you’re an optimistic type, you’ll point out that Android is clearly moving in the right direction. And while there’s a lot of work still to be done, even a half-baked implementation of file-based implementation is better than the last generation of dumb FDE Android encryption. Also: you probably also think clowns are nice.

On the other hand, you might notice that this is a pretty goddamn low standard. In other words, in 2016 Android is still struggling to deploy encryption that achieves (lock screen) security that Apple figured out six years ago. And they’re not even getting it right. That doesn’t bode well for the long term security of Android users.

And that’s a shame, because as many have pointed out, the users who rely on Android phones are disproportionately poorer and more at-risk. By treating encryption as a relatively low priority, Google is basically telling these people that they shouldn’t get the same protections as other users. This may keep the FBI off Google’s backs, but in the long term it’s bad judgement on Google’s part.

Attack of the week: 64-bit ciphers in TLS

A few months ago it was starting to seem like you couldn’t go a week without a new attack on TLS. In that context, this summer has been a blessed relief. Sadly, it looks like our vacation is over, and it’s time to go back to school.

Today brings the news that Karthikeyan Bhargavan and Gaëtan Leurent out of INRIA have a new paper that demonstrates a practical attack on legacy ciphersuites in TLS (it’s called “Sweet32”, website here). What they show is that ciphersuites that use 64-bit blocklength ciphers — notably 3DES — are vulnerable to plaintext recovery attacks that work even if the attacker cannot recover the encryption key.

While the principles behind this attack are well known, there’s always a difference between attacks in principle and attacks in practice. What this paper shows is that we really need to start paying attention to the practice.

So what’s the matter with 64-bit block ciphers?

Block ciphers are one of the most widely-used cryptographic primitives. As the nameimplies, these are schemes designed to encipher data in blocks, rather than a single bit at a time.

The two main parameters that define a block cipher are its block size (the number of bits it processes in one go), and its key size. The two parameters need not be related. So for example, DES has a 56-bit key and a 64-bit block. Whereas 3DES (which is built from DES) can use up to a 168-bit key and yet still has the same 64-bit block. More recent ciphers have opted for both larger blocks and larger keys.

When it comes to the security provided by a block cipher, the most important parameter is generally the key size. A cipher like DES, with its tiny 56-bit key, is trivially vulnerable to brute force attacks that attempt decryption with every possible key (often using specialized hardware). A cipher like AES or 3DES is generally not vulnerable to this sort of attack, since the keys are much longer.

However, as they say: key size is not everything. Sometimes the block size matters too.

You see, in practice, we often need to encrypt messages that are longer than a single block. We also tend to want our encryption to be randomized. To accomplish this, most protocols use a block cipher in a scheme called a mode of operation. The most popular mode used in TLS is CBC mode. Encryption in CBC looks like this:

Source: Wikipedia

The nice thing about CBC is that (leaving aside authentication issues) it can be proven (semantically) secure if we make various assumptions about the security of the underlying block cipher. Yet these security proofs have one important requirement. Namely, the attacker must not receive too much data encrypted with a single key.

The reason for this can be illustrated via the following simple attack.

Imagine that an honest encryptor is encrypting a bunch of messages using CBC mode. Following the diagram above, this involves selecting a random Initialization Vector (IV) of size equal to the block size of the cipher, then XORing IV with the first plaintext block (P), and enciphering the result (P \oplus IV). The IV is sent (in the clear) along with the ciphertext.

Most of the time, the resulting ciphertext block will be unique — that is, it won’t match any previous ciphertext block that an attacker may have seen. However, if the encryptor processes enough messages, sooner or later the attacker will see a collision. That is, it will see a ciphertext block that is the same as some previous ciphertext block. Since the cipher is deterministic, this means the cipher’s input (P \oplus IV) must be identical to the cipher’s previous input (P' \oplus IV') that created the previous block.

In other words, we have (P \oplus IV) = (P' \oplus IV'), which can be rearranged as (P \oplus P') = (IV \oplus IV'). Since the IVs are random and known to the attacker, the attacker has (with high probability) learned the XOR of two (unknown) plaintexts!

What can you do with the XOR of two unknown plaintexts? Well, if you happen to know one of those two plaintext blocks — as you might if you were able to choose some of the plaintexts the encryptor was processing — then you can easily recover the other plaintext. Alternatively, there are known techniques that can sometimes recover useful data even when you don’t know both blocks.

The main lesson here is that this entire mess only occurs if the attacker sees a collision. And the probability of such a collision is entirely dependent on the size of the cipher block. Worse, thanks to the (non-intuitive) nature of the birthday bound, this happens much more quickly than you might think it would. Roughly speaking, if the cipher block is b bits long, then we should expect a collision after roughly 2^{b/2} encrypted blocks.

In the case of a 64-bit blocksize cipher like 3DES, this is somewhere in the vicinity of 2^{32}, or around 4 billion enciphered blocks.

(As a note, the collision does not really need to occur in the first block. Since all blocks in CBC are calculated in the same way, it could be a collision anywhere within the messages.)

Whew. I thought this was a practical attack. 4 billion is a big number!

It’s true that 4 billion blocks seems like an awfully large number. In a practical attack, the requirements would be even larger — since the most efficient attack is for the attacker to know a lot of the plaintexts, in the hope that she will be able to recover one unknown plaintext when she learns the value (P ⊕ P’).

However, it’s worth keeping in mind that these traffic numbers aren’t absurd for TLS. In practice, 4 billion 3DES blocks works out to 32GB of raw ciphertext. A lot to be sure, but not impossible. If, as the Sweet32 authors do, we assume that half of the plaintext blocks are known to the attacker, we’d need to increase the amount of ciphertext to about 64GB. This is a lot, but not impossible.

The Sweet32 authors take this one step further. They imagine that the ciphertext consists of many HTTPS connections, consisting of 512 bytes of plaintext, in each of which is embedded the same secret 8-byte cookie — and the rest of the session plaintext is known. Calculating from these values, they obtain a requirement of approximately 256GB of ciphertext needed to recover the cookie with high probability.

That is really a lot.

But keep in mind that TLS connections are being used to encipher increasingly more data. Moreover, a single open browser frame running attacker-controlled Javascript can produce many gigabytes of ciphertext in a single hour. So these attacks are not outside of the realm of what we can run today, and presumably will be very feasible in the future.

How does the TLS attack work?

While the cryptographic community has been largely pushing TLS away from ciphersuites like CBC, in favor of modern authenticated modes of operation, these modes still exist in TLS. And they exist not only for use not only with modern ciphers like AES, but they are often available for older ciphersuites like 3DES. For example, here’s a connection I just made to Google:


Of course, just because a server supports 3DES does not mean that it’s vulnerable to this attack. In order for a particular connection to be vulnerable, both the client and server must satisfy three main requirements:

    1. The client and server must negotiate a 64-bit cipher. This is a relatively rare occurrence, but can happen in cases where one of the two sides is using an out-of-date client. For example, stock Windows XP does not support any of the AES-based ciphersuites. Similarly, SSL3 connections may negotiate 3DES ciphersuites.
    2. The server and client must support long-lived TLS sessions, i.e., encrypting a great deal of data with the same key. Unfortunately, most web browsers place no limit on the length of an HTTPS session if Keep-Alive is used, provided that the server allows the session. The Sweet32 authors scanned and discovered that many servers (including IIS) will allow sessions long enough to run their attack. Across the Internet, the percentage of vulnerable servers is small (less than 1%), but includes some important sites.
    3. The client must encipher a great deal of known data, including a secret session cookie. This is generally achieved by running adversarial Javascript code in the browser, although it could be done using standard HTML as well.

      Sites vulnerable to Sweet32. (source)

These caveats aside, the authors were able to run their attack using Firefox, sending at a rate of about 1500 connections per second. With a few optimizations, they were able to recover a 16-byte secret cookie in about 30 hours (a lucky result, given an expected 38 hour run time).The client must encipher a great deal of known data, including a secret session cookie. This is generally achieved by running adversarial Javascript code in the browser, although it could be done using standard HTML as well.

So what do we do now?

While this is not an earthshaking result, it’s roughly comparable to previous results we’ve seen with legacy ciphers like RC4.

In short, while these are not the easiest attacks to run, it’s a big problem that there even exist semi-practical attacks that undo the encryption used in standard encryption protocols. This is a problem that we should address, and these attack papers help to make those problems more clear.

Is Apple’s Cloud Key Vault a crypto backdoor?

TL;DR: No, it isn’t. If that’s all you wanted to know, you can stop reading.

Still, as you can see there’s been some talk on Twitter about the subject, and I’m afraid it could lead to a misunderstanding. That would be too bad, since Apple’s new technology is kind of a neat experiment.

So while I promise that this blog is not going to become all-Apple-all-the-time, I figured I’d take a minute to explain what I’m talking about. This post is loosely based on an explanation of Apple’s new escrow technology that Ivan Krstic gave at BlackHat. You should read the original for the fascinating details.

What is Cloud Key Vault (and what is iCloud Keychain)?

A few years ago Apple quietly introduced a new service called iCloud Keychain. This service is designed to allow you to back up your passwords and secret keys to the cloud. Now, if backing up your sensitive passwords gives you the willies, you aren’t crazy. Since these probably include things like bank and email passwords, you really want these to be kept extremely secure.

And — at least going by past experience — security is not where iCloud shines:

The problem here is that passwords need to be secured at a much higher assurance level than most types of data backup. But how can Apple ensure this? We can’t simply upload our secret passwords the way we upload photos of our kids. That would create a number of risks, including:

  1. The risk that someone will guess, reset or brute-force your iCloud password. Password resets are a particular problem. Unfortunately these seem necessary for normal iCloud usage, since people do forget their passwords. But that’s a huge risk when you’re talking about someone’s entire password collection.
  2. The risk that someone will break into Apple’s infrastructure. Even if Apple gets their front-end brute-forcing protections right (and removes password resets), the password vaults themselves are a huge target. You want to make sure that even someone who hacks Apple can’t get them out of the system.
  3. The risk that a government will compel Apple to produce data. Maybe you’re thinking of the U.S. government here. But that’s myopic: Apple stores iCloud data all over the world.

So clearly Apple needs a better way to protect these passwords. How do to it?

Why not just encrypt the passwords?

It is certainly possible for an Apple device to encrypt your password vault before sending it to iCloud. The problem here is that Apple doesn’t necessarily have a strong encryption key to do this with. Remember that the point of a backup is to survive the loss of your device, and thus we can’t assume the existence of a strong recovery key stored on your phone.

This leaves us with basically one option: a user password. This could be either the user’s iCloud password or their device passcode. Unfortunately for the typical user, these tend to be lousy. They may be strong enough to use as a login password — in a system that allows only a very limited number of login attempts. But the kinds of passwords typical users choose to enter on mobile devices are rarely strong enough to stand up to an offline dictionary attack, which is the real threat when using passwords as encryption keys.

(Even using a strong memory-hard password hash like scrypt — with crazy huge parameters — probably won’t save a user who chooses a crappy password. Blame phone manufacturers for making it painful to type in complicated passwords by forcing you to type them so often.)

So what’s Apple to do?

So Apple finds itself in a situation where they can’t trust the user to pick a strong password. They can’t trust their own infrastructure. And they can’t trust themselves. That’s a problem. Fundamentally, computer security requires some degree of trust — someone has to be reliable somewhere.

Apple’s solution is clever: they decided to make something more trustworthy than themselves. To create a new trust anchor, Apple purchased a bunch of fancy devices called Hardware Security Modules, or HSMs. These are sophisticated, tamper-resistant specialized computers that store and operate with cryptographic keys, while preventing even malicious users from extracting them. The high-end HSMs Apple uses also allow the owner to include custom programming.

Rather than trusting Apple, your phone encrypts its secrets under a hardcoded 2048-bit RSA public key that belongs to Apple’s HSM. It also encrypts a function of your device passcode, and sends the resulting encrypted blob to iCloud. Critically, only the HSM has a copy of the corresponding RSA decryption key, thus only the HSM can actually view any of this information. Apple’s network sees only an encrypted blob of data, which is essentially useless.

When a user wishes to recover their secrets, they authenticate themselves directly to the HSM. This is done using a user’s “iCloud Security Code” (iCSC), which is almost always your device passcode — something most people remember after typing it every day. This authentication is done using the Secure Remote Password protocol, ensuring that Apple (outside of the HSM) never sees any function of your password.

Now, I said that device passcodes are lousy secrets. That’s true when we’re talking about using them as encryption keys — since offline decryption attacks allow the attacker to make an unlimited number of attempts. However, with the assistance of an HSM, Apple can implement a common-sense countermeasure to such attacks: they limit you to a fixed number of login attempts. This is roughly the same protection that Apple implements on the devices themselves.

The encrypted contents of the data sent to the HSM (source).

The upshot of all these ideas is that — provided that the HSM works as designed, and that it can’t be reprogrammed — even Apple can’t access your stored data except by logging in with a correct passcode. And they only get a limited number of attempts to guess correctly, after which the account locks.

This rules out both malicious insiders and government access, with one big caveat.

What stops Apple from just reprogramming its HSM?

This is probably the biggest weakness of the system, and the part that’s driving the “backdoor’ concerns above. You see, the HSMs Apple uses are programmable. This means that — as long as Apple still has the code signing keys — the company can potentially update the custom code it includes onto the HSM to do all sort sorts of things.

These things might include: programming the HSM to output decrypted escrow keys. Or disabling the maximum login attempt counting mechanism. Or even inserting a program that runs a brute-force dictionary attack on the HSM itself. This would allow Apple to brute-force your passcode and/or recover your passwords.

Fortunately Apple has thought about this problem and taken steps to deal with it. Note that on HSMs like the one Apple is using, the code signing keys live on a special set of admin smartcards. To remove these keys as a concern, once Apple is done programming the HSM, they run these cards through a process that they call a “physical one-way hash function”.

If that sounds complicated, here’s Ivan’s slightly simpler explanation.

So, with the code signing keys destroyed, updating the HSM to allow nefarious actions should not be possible. Pretty much the only action Apple can take is to  wipe the HSM, which would destroy the HSM’s RSA secret keys and thus all of the encrypted records it’s responsible for. To make sure all admin cards are destroyed, the company has developed a complex ceremony for controlling the cards prior to their destruction. This mostly involves people making assertions that they haven’t made copies of the code signing key — which isn’t quite foolproof. But overall it’s pretty impressive.

The downside for Apple, of course, is that there had better not be a bug in any of their programming. Because right now there’s nothing they can do to fix it — except to wipe all of their HSMs and start over.

Couldn’t we use this idea to implement real crypto backdoors?

A key assertion I’ve heard is that if Apple can do this, then surely they can do something similar to escrow your keys for law enforcement. But looking at the system shows isn’t true at all.

To be sure, Apple’s reliance on a Hardware Security Module indicates a great deal of faith in a single hardware/software solution for storing many keys. Only time will tell if that faith is really justified. To be honest, I think it’s an overly-strong assumption. But iCloud Keychain is opt-in, so individuals can decide for themselves whether or not to take the risk. That wouldn’t be true of a mandatory law enforcement backdoor.

But the argument that Apple has enabled a law enforcement backdoor seems to miss what Apple has actually done. Instead of building a system that allows the company to recover your secret information, Apple has devoted enormous resources to locking themselves out. Only customers can access their own information. In other words, Apple has decided that the only way they can hold this information is if they don’t even trust themselves with it.

That’s radically different from what would be required to build a mandatory key escrow system for law enforcement. In fact, one of the big objections to such a backdoor — which my co-authors and I recently outlined in a report — is the danger that any of the numerous actors in such a system could misuse it. By eliminating themselves from the equation, Apple has effectively neutralized that concern.

If Apple can secure your passwords this way, then why don’t they do the same for your backed up photos, videos, and documents?

That’s a good question. Maybe you should ask them?