# On Ghost Users and Messaging Backdoors

The past few years have been an amazing time for the deployment of encryption. In ten years, encrypted web connections have gone from a novelty into a requirement for running a modern website. Smartphone manufacturers deployed default storage encryption to billions of phones. End-to-end encrypted messaging and phone calls are now deployed to billions of users.

While this progress is exciting to cryptographers and privacy advocates, not everyone sees it this way. A few countries, like the U.K. and Australia, have passed laws in an attempt to gain access to this data, and at least one U.S. proposal has made it to Congress. The Department of Justice recently added its own branding to the mix, asking tech companies to deploy “responsible encryption“.

What, exactly, is “responsible encryption”? Well, that’s a bit of a problem. Nobody on the government’s side of the debate has really been willing to get very specific about that. In fact, a recent speech by U.S. Deputy Attorney General Rod Rosenstein implored cryptographers to go figure it out.

With this as background, a recent article by GCHQ’s Ian Levy and Crispin Robinson reads like a breath of fresh air. Unlike their American colleagues, the British folks at GCHQ — essentially, the U.K.’s equivalent of NSA — seem eager to engage with the technical community and to put forward serious ideas. Indeed, Levy and Robinson make a concrete proposal in the article above: they offer a new solution designed to surveil both encrypted messaging and phone calls.

In this post I’m going to talk about that proposal as fairly as I can — given that I only have a high-level understanding of the idea. Then I’ll discuss what I think could go wrong.

### A brief, illustrated primer on E2E

The GCHQ proposal deals with law-enforcement interception on messaging systems and phone calls. To give some intuition about the proposal, I first need to give a very brief (and ultra-simplified) explanation of how those systems actually work.

The basic idea in any E2E communication systems is that each participant encrypts messages (or audio/video data) directly from one device to the other. This layer of encryption reduces the need to trust your provider’s infrastructure — ranging from telephone lines to servers to undersea cables — which gives added assurance against malicious service providers and hackers.

If you’ll forgive a few silly illustrations, the intuitive result is a picture that looks something like this:

If we consider the group chat/call setting, the picture changes slightly, but only slightly. Each participant still encrypts data to the other participants directly, bypassing the provider. The actual details (specific algorithms, key choices) vary between different systems. But the concept remains the same.

The problem with the simplified pictures above is that there’s actually a lot more going on in an E2E system than just encryption.

In practice, one of the most challenging problems in encrypted messaging stems is getting the key you need to actually perform the encryption. This problem, which is generally known as key distribution, is an age-old concern in the field of computer security. There are many ways for it to go wrong.

In the olden days, we used to ask users to manage and exchange their own keys, and then select which users they wanted to encrypt to. This was terrible and everyone hated it. Modern E2E systems have become popular largely because they hide all of this detail from their users. This comes at the cost of some extra provider-operated infrastructure.

In practice, systems like Apple iMessage, WhatsApp and Facebook Messenger actually look more like this:

The Apple at the top of the picture above stands in for Apple’s “identity service”, which is a cluster of servers running in Apple’s various data centers. These servers perform many tasks, but most notably: they act as a directory for looking up the encryption key of the person you’re talking to. If that service misfires and gives you the wrong key, the best ciphers in the world won’t help you. You’ll just be encrypting to the wrong person.

These identity services do more than look up keys. In at least some group messaging systems like WhatsApp and iMessage, they also control the membership of group conversations. In poorly-designed systems, the server can add and remove users from a group conversation at will, even if none of the participants have requested this. It’s as though you’re having a conversation in a very private room — but the door is unlocked and the building manager controls who can come enter and join you.

(A technical note: while these two aspects of the identity system serve different purposes, in practice they’re often closely related. For example, in many systems there is little distinction between “group” and “two-participant” messaging. For example, in systems that support multiple devices connected to a single account, like Apple’s iMessage, every single device attached to your user account is treated as a separate party to the conversation. Provided either party has more than one device on their account [say, an iPhone and an iPad] , you can think of every iMessage conversation as being a group conversation.)

Most E2E systems have basic countermeasures against bad behavior by the identity service. For example, client applications will typically alert you when a new user joins your group chat, or when someone adds a new device to your iMessage account. Similarly, both WhatsApp and Signal expose “safety numbers” that allow participants to verify that they received the right cryptographic keys, which offers a check against dishonest providers.

But these countermeasures are not perfect, and not every service offers them. Which brings me to the GCHQ proposal.

### What GCHQ wants

The Lawfare article by Levy and Robinson does not present GCHQ’s proposal in great detail. Fortunately, both authors have spent most of the touring the U.S., giving several public talks about their ideas. I had the privilege of speaking to both of them earlier this summer when they visited Johns Hopkins, so I think I have a rough handle on what they’re thinking.

In its outlines, the idea they propose is extremely simple. The goal is to take advantage of existing the weaknesses in the identity management systems of group chat and calling systems. This would allow law enforcement — with the participation of the service provider — to add a “ghost user” (or in some cases, a “ghost device”) to an existing group chat or calling session. In systems where group membership can be modified by the provider infrastructure, this could mostly be done via changes to the server-side components of the provider’s system.

I say that it could mostly be done server-side, because there’s a wrinkle. Even if you modify the provider infrastructure to add unauthorized users to a conversation, most existing E2E systems do notify users when a new participant (or device) joins a conversation. Generally speaking, having a stranger wander into your conversation is a great way to notify criminals that the game’s afoot or what have you, so you’ll absolutely want to block this warning.

While the GCHQ proposal doesn’t go into great detail, it seems to follow that any workable proposal will require providers to suppress those warning messages at the target’s device. This means the proposal will also require changes to the client application as well as the server-side infrastructure.

(Certain apps like Signal are already somewhat hardened against these changes, because group chat setup is handled in an end-to-end encrypted/authenticated fashion by clients. This prevents the server from inserting new users without the collaboration of at least one group participant. At the moment, however, both WhatsApp and iMessage seem vulnerable to GCHQ’s proposed approach.)

Due to this need for extensive server and client modifications, the GCHQ proposal actually represents a very significant change to the design of messaging systems. It seems likely that the client-side code changes would need to be deployed to all users, since you can’t do targeted software updates just against criminals. (Or rather, if you could rely on such targeted software updates, you would just use that capability instead of the thing that GCHQ is proposing.)

Which brings us to the last piece: how do get providers to go along with all of this?

While optimism and cooperation are nice in principle, it seems unlikely that communication providers are going to to voluntarily insert a powerful eavesdropping capability into their encrypted services, if only because it represents a huge and risky modification. Presumably this means that the UK government will have to compel cooperation. One potential avenue for this is to use Technical Capability Notices from the UK’s Investigatory Powers Act. Those notices mandate that a provider offer real-time decryption for sets of users ranging from 1-10,000 users, and moreover, that providers must design their systems to ensure this such a capability remains available.

And herein lies the problem.

### Providers are already closing this loophole

The real problem with the GCHQ proposal is that it targets a weakness in messaging/calling systems that’s already well-known to providers, and moreover, a weakness that providers have been working to close — perhaps because they’re worried that someone just like GCHQ (or probably, much worse) will try to exploit it. By making this proposal, the folks at GCHQ have virtually guaranteed that those providers will move much, much faster on this.

And they have quite a few options at their disposal. Over the past several years researchers have proposed several designs that offer transparency to users regarding which keys they’re obtaining from a provider’s identity service. These systems operate by having the identity service commit to the keys that are associated with individual users, such that it’s very hard for the provider to change a user’s keys (or to add a device) without everyone in the world noticing.

As mentioned above, advanced messengers like Signal have “submerged” the group chat management into the encrypted communications flow, so that the server cannot add new users without the digitally authenticated approval of one of the existing participants. This design, if ported to in more popular services like WhatsApp, would seem to kill the GCHQ proposal dead.

Of course, these solutions highlight the tricky nature of GCHQ’s proposal. Note that in order to take advantage of existing vulnerabilities, GCHQ is going to have to require that providers change their system. And of course, once you’ve opened the door to forcing providers to change their system, why stop with small changes? What stops the UK government from, say, taking things a step farther, and using the force of law to compel providers not to harden their systems against this type of attack?

Which brings us to the real problem with the GCHQ proposal. As far as I can see, there are two likely outcomes. In the first, providers rapidly harden their system — which is good! — and in the process kill off the vulnerabilities that make GCHQ’s proposal viable (which is bad, at least for GCHQ). The more interest that governments express towards the proposal, the more likely this first outcome is. In the second outcome, the UK government, perhaps along with other governments, solve this problem by forcing the providers to keep their systems vulnerable. This second outcome is what I worry about.

More concretely, it’s true that today’s systems include existing flaws that are easy to exploit. But that does not mean we should entomb those flaws in concrete. And once law enforcement begins to rely on them, we will effectively have done so. Over time what seems like a “modest proposal” using current flaws will rapidly become an ossifying influence that holds ancient flaws in place. In the worst-case outcome, we’ll be appointing agencies like GCHQ as the ultimate architect of Apple and Facebook’s communication systems.

That is not a good outcome. In fact, it’s one that will likely slow down progress for years to come.

The first rule of PAKE is: nobody ever wants to talk about PAKE. The second rule of PAKE is that this is a shame, because PAKE — which stands for Password Authenticated Key Exchange — is actually one of the most useful technologies that (almost) never gets used. It should be deployed everywhere, and yet it isn’t.

To understand why this is such a damn shame, let’s start by describing a very real problem.

Imagine I’m operating a server that has to store user passwords. The traditional way to do this is to hash each user password and store the result in a password database. There are many schools of thought on how to handle the hashing process; the most common recommendation these days is to use a memory-hard password hashing function like scrypt or argon2 (with a unique per-password salt), and then store only the hashed result. There are various arguments about which hash function to use, and whether it could help to also use some secret value (called “pepper“), but we’ll ignore these for the moment.

Regardless of the approach you take, all of these solutions have a single achilles heel:

When the user comes back to log into your website, they will still need to send over their (cleartext) password, since this is required in order for the server to do the check.

This requirement can lead to disaster if your server is ever persistently compromised, or if your developers make a simple mistake. For example, earlier this year Twitter asked all of its (330 million!) users to change their passwords — because it turned out that company had been logging cleartext (unhashed) passwords.

Now, the login problem doesn’t negate the advantage of password hashing in any way. But it does demand a better solution: one where the user’s password never has to go to the server in cleartext. The cryptographic tool that can give this to us is PAKE, and in particular a new protocol called OPAQUE, which I’ll get to at the end of this post.

### What’s a PAKE?

A PAKE protocol, first introduced by Bellovin and Merritt, is a special form of cryptographic key exchange protocol. Key exchange (or “key agreement”) protocols are designed to help two parties (call them a client and server) agree on a shared key, using public-key cryptography. The earliest key exchange protocols — like classical Diffie-Hellman — were unauthenticated, which made them vulnerable to man-in-the-middle attacks. The distinguishing feature of PAKE protocols is the client will authenticate herself to the server using a password. For obvious reasons, the password, or a hash of it, is assumed to be already known to the server, which is what allows for checking.

If this was all we required, PAKE protocols would be easy to build. What makes a PAKE truly useful is that it should also provide protection for the client’s password. A stronger version of this guarantee can be stated as follows: after a login attempt (valid, or invalid) both the client and server should learn only whether the client’s password matched the server’s expected value, and no additional information. This is a powerful guarantee. In fact, it’s not dissimilar to what we ask for from a zero knowledge proof.

Of course, the obvious problem with PAKE is that many people don’t want to run a “key exchange” protocol in the first place! They just want to verify that a user knows a password.

The great thing about PAKE is that the simpler “login only” use-case is easy to achieve. If I have a standard PAKE protocol that allows a client and server to agree on a shared key K if (and only if) the client knows the right password, then all we need add is a simple check that both parties have arrived at the same key. (This can be done, for example, by having the parties compute some cryptographic function with it and check the results.) So PAKE is useful even if all you’ve got in mind is password checking.

### SRP: The PAKE that Time Forgot

The PAKE concept seems like it provides an obvious security benefit when compared to the naive approach we use to log into servers today. And the techniques are old, in the sense that PAKEs have been known since way back in 1992! Despite this, they’ve seen from almost no adoption. What’s going on?

There are a few obvious reasons for this. The most obvious has to do with the limitations of the web: it’s much easier to put a password form onto a web page than it is to do fancy crypto in the browser. But this explanation isn’t sufficient. Even native applications rarely implement PAKE for their logins. Another potential explanation has to do with patents, though most of these are expired now. To me there are two likely reasons for the ongoing absence of PAKE: (1) there’s a lack of good PAKE implementations in useful languages, which makes it a hassle to use, and (2) cryptographers are bad at communicating the value of their work, so most people don’t know PAKE is even an option.

Even though I said PAKE isn’t deployed, there are some exceptions to the rule.

One of the remarkable ones is a 1998 protocol designed by Tom Wu [correction: not Tim Wu] and called “SRP”. Short for “Secure Remote Password“, this is a simple three-round PAKE with a few elegant features that were not found in the earliest works. Moreover, SRP has the distinction of being (as far as I know) the most widely-deployed PAKE protocol in the world. I cite two pieces of evidence for this claim:

1. SRP has been standardized as a TLS ciphersuite, and is actually implemented in libraries like OpenSSL, even though nobody seems to use it much.
2. Apple uses SRP extensively in their iCloud Key Vault.

This second fact by itself could make SRP one of the most widely used cryptographic protocols in the world, so vast is the number of devices that Apple ships. So this is nothing to sneer at.

Industry adoption of SRP is nice, but also kind of a bummer: mainly because while any PAKE adoption is cool, SRP itself isn’t the best PAKE we can deploy. I was planning to go into the weeds about why I feel so strongly about SRP, but it got longwinded and it distracted from the really nice protocol I actually want to talk about further below. If you’re still interested, I moved the discussion onto this page.

In lieu of those details, let me give a quick and dirty TL;DR on SRP:

1. SRP does some stuff “right”. For one thing, unlike early PAKEs it does not require you to store a raw password on the server (or, equivalently, a hash that could be used by a malicious client in place of the password). Instead, the server stores a “verifier” which is a one-way function of the password hash. This means a leak of the password database does not (immediately) allow the attacker to impersonate the user — unless they conduct further expensive dictionary attacks. (The technical name for this is “asymmetric” PAKE.)
2. Even better, the current version of SRP (v4 v6a) isn’t obviously broken!
3. However (and with no offense to the designers) the SRP protocol design is completely bonkers, and earlier versions have been broken several times — which is why we’re now at revision 6a. Plus the “security proof” in the original research paper doesn’t really prove anything meaningful.
4. SRP currently relies on integer (finite field) arithmetic, and for various reasons (see point 3 above) the construction is not obviously transferable to the elliptic curve setting. This requires more bandwidth and computation, and thus SRP can’t take advantage of the many efficiency improvements we’ve developed in settings like Curve25519.
5. SRP is vulnerable to pre-computation attacks, due to the fact that it hands over the user’s “salt” to any attacker who can start an SRP session. This means I can ask a server for your salt, and build a dictionary of potential password hashes even before the server is compromised.
6. Despite all these drawbacks, SRP is simple — and actually ships with working code. Plus there’s working code in OpenSSL that even integrates with TLS, which makes it relatively easy to adopt.

Out of all these points, the final one is almost certainly responsible for the (relatively) high degree of commercial success that SRP has seen when compared to other PAKE protocols. It’s not ideal, but it’s real. This is something for cryptographers to keep in mind.

### OPAQUE: The PAKE of a new generation

When I started thinking about PAKEs a few months ago, I couldn’t help but notice that most of the existing work was kind of crummy. It either had weird problems like SRP, or it required the user to store the password (or an effective password) on the server, or it revealed the salt to an attacker — allowing pre-computation attacks.

Then earlier this year, Jarecki, Krawczyk and Xu proposed a new protocol called OPAQUE. Opaque has a number of extremely nice advantages:

1. It can be implemented in any setting where Diffie-Hellman and discrete log (type) problems are hard. This means that, unlike SRP, it can be easily instantiated using efficient elliptic curves.
2. Even better: OPAQUE does not reveal the salt to the attacker. It solves this problem by using an efficient “oblivious PRF” to combine the salt with the password, in a way that ensures the client does not learn the salt and the server does not learn the password.
3. OPAQUE works with any password hashing function. Even better, since all the hashing work is done on the client, OPAQUE can actually take load off the server, freeing an online service up to use much strong security settings — for example, configuring scrypt with large RAM parameters.
4. In terms of number of messages and exponentiations, OPAQUE is not much different from SRP. But since it can be implemented in more efficient settings, it’s likely to be a lot more efficient.
5. Unlike SRP, OPAQUE has a reasonable security proof (in a very strong model).

There’s even an Internet Draft proposal for OPAQUE, which you can read here. Unfortunately, at this point I’m not aware of any production quality implementations of the code (if you know of one, please link to it in the comments and I’ll update). (Update: There are several potential implementations listed in the comments — I haven’t looked closely enough to endorse any, but this is great!) But that should soon change.

The full OPAQUE protocol is given a little bit further below. In the rest of this section I’m going to go into the weeds on how OPAQUE works.

Problem 1: Keeping the salt secret. As I mentioned above, the main problem with earlier PAKEs is the need to transmit the salt from a server to a (so far unauthenticated) client. This enables an attacker to run pre-computation attacks, where they can build an offline dictionary based on this salt.

The challenge here is that the salt is typically fed into a hash function (like scrypt) along with the password. Intuitively someone has to compute that function. If it’s the server, then the server needs to see the password — which defeats the whole purpose. If it’s the client, then the client needs the salt.

In theory one could get around this problem by computing the password hashing function using secure two-party computation (2PC). In practice, solutions like this are almost certainly not going to be efficient — most notably because password hashing functions are designed to be complex and time consuming, which will basically explode the complexity of any 2PC system.

OPAQUE gets around this with the following clever trick. They leave the password hash on the client’s side, but they don’t feed it the stored salt. Instead, they use a special two-party protocol called an oblivious PRF to calculate a second salt (call it salt2) so that the client can use salt2 in the hash function — but does not learn the original salt.

The basic idea of such a function is that the server and client can jointly compute a function PRF(salt, password), where the server knows “salt” and the client knows “password”. Only the client learns the output of this function. Neither party learns anything about the other party’s input.

The gory details:

The actual implementation of the oblivious PRF relies on the idea that the client has the password $P$ and the server has the salt, which is expressed as a scalar value $s$. The output of the PRF function should be of the form $H(P)^s$, where $H:\{0,1\}^* \rightarrow {\mathcal G}$ is a special hash function that hashes passwords into elements of a cyclic (prime-order) group.

To compute this PRF requires a protocol between the client and server. In this protocol, the client first computes $H(P)$ and then “blinds” this password by selecting a random scalar value r, and blinding the result to obtain $C = H(P)^r$. At this point, the client can send the blinded value C over to the server, secure in the understanding that (in a prime-order group), the blinding by r hides all partial information about the underlying password.

The server, which has a salt value s, now further exponentiates this calue to obtain $R = C^s$ and sends the result R back to the client. If we write this out in detail, the result can be expressed as $R = H(P)^{rs}$. The client now computes the inverse of its own blinding value r and exponentiates one more time as follows: $R' = R^{r^{-1}} = H(P)^s$. This element $R'$, which consists of the hash of the password exponentiated by the salt, is the output of the desired PRF function.

A nice feature of this protocol is that, if the client enters the wrong password into the protocol, she should obtain a value that is very different from the actual value she wants. This guarantee comes from the fact that the hash function is likely to produce wildly different outputs for distinct passwords.

Problem 2: Proving that the client got the right key K. Of course, at this point, the client has derived a key K, but the server has no idea what it is. Nor does the server know whether it’s the right key.

The solution OPAQUE uses based an old idea due to Gentry, Mackenzie and Ramzan. When the user first registers with the server, she generates a strong public and private key for a secure agreement protocol (like HMQV), and encrypts the resulting private key under K, along with the server’s public key. The resulting authenticated ciphertext (and the public key) is stored in the password database.

C = Encrypt(K, client secret key | server’s public key)

When the client wishes to authenticate using the OPAQUE protocol, the server sends it the stored ciphertext C. If the client entered the right password into the first phase, she can derive K, and now decrypt this ciphertext. Otherwise it’s useless. Using the embedded secret key, she can now run a standard authenticated key agreement protocol to complete the handshake. (The server verifies the clients’ inputs against its copy of the client’s public key, and the client does similarly.)

Putting it all together. All of these different steps can be merged together into a single protocol that has the same number of rounds as SRP. Leaving aside the key verification steps, it looks like the protocol above. Basically, just two messages: one from the client and one returned to the server.

The final aspect of the OPAQUE work is that it includes a strong security proof that shows the resulting protocol can be proven secure under the 1-more discrete logarithm assumption in the random oracle model, which is a (well, relatively) standard assumption that appears to hold in the settings we work with.

### In conclusion

So in summary, we have this neat technology that could make the process of using passwords much easier, and could allow us to do it in a much more efficient way — with larger hashing parameters, and more work done by the client? Why isn’t this everywhere?

Maybe in the next few years it will be.

# Why I’m done with Chrome

This blog is mainly reserved for cryptography, and I try to avoid filling it with random “someone is wrong on the Internet” posts. After all, that’s what Twitter is for! But from time to time something bothers me enough that I have to make an exception. Today I wanted to write specifically about Google Chrome, how much I’ve loved it in the past, and why — due to Chrome’s new user-unfriendly forced login policy — I won’t be using it going forward.

### A brief history of Chrome

When Google launched Chrome ten years ago, it seemed like one of those rare cases where everyone wins. In 2008, the browser market was dominated by Microsoft, a company with an ugly history of using browser dominance to crush their competitors. Worse, Microsoft was making noises about getting into the search business. This posed an existential threat to Google’s internet properties.

In this setting, Chrome was a beautiful solution. Even if the browser never produced a scrap of revenue for Google, it served its purpose just by keeping the Internet open to Google’s other products. As a benefit, the Internet community would receive a terrific open source browser with the best development team money could buy. This might be kind of sad for Mozilla (who have paid a high price due to Chrome) but overall it would be a good thing for Internet standards.

### What changed?

A few weeks ago Google shipped an update to Chrome that fundamentally changes the sign-in experience. From now on, every time you log into a Google property (for example, Gmail), Chrome will automatically sign the browser into your Google account for you. It’ll do this without asking, or even explicitly notifying you. (However, and this is important: Google developers claim this will not actually start synchronizing your data to Google — yet. See further below.)

Your sole warning — in the event that you’re looking for it — is that your Google profile picture will appear in the upper-right hand corner of the browser window. I noticed mine the other day:

The change hasn’t gone entirely unnoticed: it received some vigorous discussion on sites like Hacker News. But the mainstream tech press seems to have ignored it completely. This is unfortunate — and I hope it changes — because this update has huge implications for Google and the future of Chrome.

In the rest of this post, I’m going to talk about why this matters. From my perspective, this comes down to basically four points:

1. Nobody on the Chrome development team can provide a clear rationale for why this change was necessary, and the explanations they’ve given don’t make any sense.
2. This change has enormous implications for user privacy and trust, and Google seems unable to grapple with this.
3. The change makes a hash out of Google’s own privacy policies for Chrome.
4. Google needs to stop treating customer trust like it’s a renewable resource, because they’re screwing up badly.

I warn you that this will get a bit ranty. Please read on anyway.

### Google’s stated rationale makes no sense

The new feature that triggers this auto-login behavior is called “Identity consistency between browser and cookie jar” (HN). After conversations with two separate Chrome developers on Twitter (who will remain nameless — mostly because I don’t want them to hate me), I was given the following rationale for the change:

But note something critical about this scenario. In order for this problem to apply to you, you already have to be signed into Chrome. There is absolutely nothing in this problem description that seems to affect users who chose not to sign into the browser in the first place.

So if signed-in users are your problem, why would you make a change that forces unsignedin users to become signed-in? I could waste a lot more ink wondering about the mismatch between the stated “problem” and the “fix”, but I won’t bother: because nobody on the public-facing side of the Chrome team has been able to offer an explanation that squares this circle.

And this matters, because “sync” or not…

### The change has serious implications for privacy and trust

The Chrome team has offered a single defense of the change. They point out that just because your browser is “signed in” does not mean it’s uploading your data to Google’s servers. Specifically:

This is my paraphrase. But I think it’s fair to characterize the general stance of the Chrome developers I spoke with as: without this “sync” feature, there’s nothing wrong with the change they’ve made, and everything is just fine.

This is nuts, for several reasons.

User consent matters. For ten years I’ve been asked a single question by the Chrome browser: “Do you want to log in with your Google account?” And for ten years I’ve said no thanks. Chrome still asks me that question — it’s just that now it doesn’t honor my decision.

The Chrome developers want me to believe that this is fine, since (phew!) I’m still protected by one additional consent guardrail. The problem here is obvious:

If you didn’t respect my lack of consent on the biggest user-facing privacy option in Chrome (and  didn’t even notify me that you had stopped respecting it!) why should I trust any other consent option you give me? What stops you from changing your mind on that option in a few months, when we’ve all stopped paying attention?

The fact of the matter is that I’d never even heard of Chrome’s “sync” option — for the simple reason that up until September 2018, I had never logged into Chrome. Now I’m forced to learn these new terms, and hope that the Chrome team keeps promises to keep all of my data local as the barriers between “signed in” and “not signed in” are gradually eroded away.

The Chrome sync UI is a dark pattern. Now that I’m forced to log into Chrome, I’m faced with a brand new menu I’ve never seen before. It looks like this:

Does that big blue button indicate that I’m already synchronizing my data to Google? That’s scary! Wait, maybe it’s an invitation to synchronize! If so, what happens to my data if I click it by accident? (I won’t give it the answer away, you should go find out. Just make sure you don’t accidentally upload all your data in the process. It can happen quickly.)

In short, Google has transformed the question of consenting to data upload from something affirmative that I actually had to put effort into — entering my Google credentials and signing into Chrome — into something I can now do with a single accidental click. This is a dark pattern. Whether intentional or not, it has the effect of making it easy for people to activate sync without knowing it, or to think they’re already syncing and thus there’s no additional cost to increasing Google’s access to their data.

Don’t take my word for it. It even gives (former) Google people the creeps.

Big brother doesn’t need to actually watch you. We tell things to our web browsers that we wouldn’t tell our best friends. We do this with some vague understanding that yes, the Internet spies on us. But we also believe that this spying is weak and probabilistic. It’s not like someone’s standing over our shoulder checking our driver’s license with each click.

What happens if you take that belief away? There are numerous studies indicating that even the perception of surveillance can significantly greatly magnify the degree of self-censorship users force on themselves. Will user feel comfortable browsing for information on sensitive mental health conditions — if their real name and picture are always loaded into the corner of their browser? The Chrome development team says “yes”. I think they’re wrong.

For all we know, the new approach has privacy implications even if sync is off. The Chrome developers claim that with “sync” off, a Chrome has no privacy implications. This might be true. But when pressed on the actual details, nobody seems quite sure.

For example, if I have my browser logged out, then I log in and turn on “sync”, does all my past (logged-out) data get pushed to Google? What happens if I’m forced to be logged in, and then subsequently turn on “sync”? Nobody can quite tell me if the data uploaded in these conditions is the same. These differences could really matter.

The Chrome privacy policy is a remarkably simple document. Unlike most privacy policies, it was clearly written as a promise to Chrome’s users — rather than as the usual lawyer CYA. Functionally, it describes two browsing modes: “Basic browser mode” and “signed-in mode”. These modes have very different properties. Read for yourself:

In “basic browser mode”, your data is stored locally. In “signed-in” mode, your data gets shipped to Google’s servers. This is easy to understand. If you want privacy, don’t sign in. But what happens if your browser decides to switch you from one mode to the other, all on its own?

Technically, the privacy policy is still accurate. If you’re in basic browsing mode, your data is still stored locally. The problem is that you no longer get to decide which mode you’re in. This makes a mockery out of whatever intentions the original drafters had. Maybe Google will update the document to reflect the new “sync” distinction that the Chrome developers have shared with me. We’ll see.

Update: After I tweeted about my concerns, I received a DM on Sunday from two different Chrome developers, each telling me the good news: Google is updating their privacy policy to reflect the new operation of Chrome. I think that’s, um, good news. But I also can’t help but note that updating a privacy policy on a weekend is an awful lot of trouble to go to for a change that… apparently doesn’t even solve a problem for signed-out users.

### Trust is not a renewable resource

For a company that sustains itself by collecting massive amounts of user data, Google has  managed to avoid the negative privacy connotations we associate with, say, Facebook. This isn’t because Google collects less data, it’s just that Google has consistently been more circumspect and responsible with it.

Where Facebook will routinely change privacy settings and apologize later, Google has upheld clear privacy policies that it doesn’t routinely change. Sure, when it collects, it collects gobs of data, but in the cases where Google explicitly makes user security and privacy promises — it tends to keep them. This seems to be changing.

Google’s reputation is hard-earned, and it can be easily lost. Changes like this burn a lot of trust with users. If the change is solving an absolutely critical problem for users , then maybe a loss of trust is worth it. I wish Google could convince me that was the case.

### Conclusion

This post has gone on more than long enough, but before I finish I want to address two common counterarguments I’ve heard from people I generally respect in this area.

One argument is that Google already spies on you via cookies and its pervasive advertising network and partnerships, so what’s the big deal if they force your browser into a logged-in state? One individual I respect described the Chrome change as “making you wear two name tags instead of one”. I think this objection is silly both on moral grounds — just because you’re violating my privacy doesn’t make it ok to add a massive new violation — but also because it’s objectively silly. Google has spent millions of dollars adding additional tracking features to both Chrome and Android. They aren’t doing this for fun; they’re doing this because it clearly produces data they want.

The other counterargument (if you want to call it that) goes like this: I’m a n00b for using Google products at all, and of course they were always going to do this. The extreme version holds that I ought to be using lynx+Tor and DJB’s custom search engine, and if I’m not I pretty much deserve what’s coming to me.

I reject this argument. I think It’s entirely possible for a company like Google to make good, usable open source software that doesn’t massively violate user privacy. For ten years I believe Google Chrome did just this.

Why they’ve decided to change, I don’t know. It makes me sad.

# Friday Dachshund Blogging

For over a year this blog has failed to deliver on an essential promise — that there would someday be pictures of dachshunds. Today we deliver.

This is Callie (short for Calliope) working her way through a bit of summer crypto reading:

But sometimes that’s exhausting and you’ve gotta take a break.

A visit from a strange metallic dachshund:

Summer:

And in memoriam, Zoe and Sophie, who helped me start this blog.

# Wonk post: chosen ciphertext security in public-key encryption (Part 2)

This continues the post from Part 1. Note that this is a work in progress, and may have some bugs in it 🙂 I’ll try to patch them up as I go along.

In the previous post I discussed the problem of building CCA-secure public key encryption. Here’s a quick summary of what we discussed in the first part:

• We covered the definition of CCA2 security.
• We described how you can easily achieve this notion in the symmetric encryption setting using a CPA-secure encryption scheme, plus a secure MAC.
• We talked about why this same approach doesn’t work for standard public-key encryption.

In this post I’m going to discuss a few different techniques that actually do provide CCA security for public key encryption. We’ll be covering these in no particular order.

A quick note on security proofs. There are obviously a lot of different ways you could try to hack together a CCA2 secure scheme out of different components. Some of those might be secure, or they might not be. In general, the key difference between a “secure” and “maybe secure” scheme is the fact that we can construct some kind of security proof for it.

The phrase “some kind” will turn out to be an important caveat, because these proofs might require a modest amount of cheating.

### The bad and the ugly

Before we get to the constructive details, it’s worth talking a bit about some ideas that don’t work to achieve CCA security. The most obvious place to start is with some of the early RSA padding schemes, particularly the PKCS#1v1.5 padding standard.

PKCS#1 padding was developed in the 1980s, when it was obvious that public key encryption was going to become widely deployed. It was intended as a pre-processing stage for messages that were going to be encrypted using an RSA public key.

This padding scheme had two features. First, it added randomness to the message prior to encrypting it. This was designed to defeat the simple ciphertext guessing attacks that come from deterministic encryption schemes like RSA. It can be easily shown that randomized encryption is absolutely necessary for any IND-CPA (and implicitly, IND-CCA) secure public key encryption scheme. Second, the padding added some “check” bytes that were intended to help detect mangled ciphertexts after decryption; this was designed (presumably) to shore the scheme up against invalid decryption attempts.

PKCS#1v1.5 is still widely used in protocols, including all versions of TLS prior to TLS 1.3. The diagram below shows what the padding scheme looks like when used in TLS with a 2048-bit RSA key. The section labeled “48 bytes PMS” (pre-master secret) in this example represents the plaintext being encrypted. The 205 “non-zero padding” consists of purely random bytes that exclude the byte “0”, because that value is reserved to indicate the end of the padding section and the beginning of the plaintext.

After using the RSA secret key to recover the padded message, the decryptor is supposed to parse the message and verify that the first two bytes (“00 02”) and the boundary “00” byte are all correct and in not violating any rules. The decryptor may optionally conduct other checks like verifying the length and structure of the plaintext, in case that’s known in advance.

One of the most immediate observations about PKCS#1v1.5 is that the designers kind of intuitively understood that chosen ciphertext attacks were a thing. They clearly added some small checks to make sure that it would be hard for an attacker to modify a given ciphertext (e.g., by multiplying it by a chosen value). It’s also obvious that these checks aren’t very strong. In the standardized version of the padding scheme, there are essentially three bytes to check — and one of them (the “00” byte after the padding) can “float” at a large number of different positions, depending on how much padding and plaintext there is in the message.

The use of a weak integrity check leads to a powerful CCA2 attack on the encryption scheme that was first discovered by Daniel Bleichenbacher. The attack is powerful due to the fact that it actually leverages the padding check as a way to learn information about the plaintext. That is: the attacker “mauls” a ciphertext and sends it to be decrypted, and relies on the fact that the decryptor will simply perform the decryption checks they’re supposed to perform — and output a noticable error if they fail. Given only this one bit of information per decryption, the attack can gradually recover the full plaintext of a specific ciphertext by (a) multiplying it with some value, (b) sending the result to be decrypted, (c) recording the success/failure result, (d) adaptively picking a new value and repeating step (a) many thousands or millions of times.

The PKCS#1v1.5 padding scheme is mainly valuable to us today because it provides an excellent warning to cryptographic engineers, who would otherwise continue to follow the “you can just hack together something that looks safe” school of building protocols. Bleichenbacher-style attacks have largely scared the crypto community straight. Rather than continuing to use this approach, the crypto community has (mostly) moved towards techniques that at least offer some semblance of provable security.

That’s what we’ll cover in just a moment.

### A few quick notes on achieving CCA2-secure public key encryption

Before we get to a laundry list of specific techniques and schemes, it’s worth asking what types of design features we might be looking for in a CCA2 public key encryption scheme. Historically there have been two common requirements:

• It would be super convenient if we could start with an existing encryption scheme, like RSA or Elgamal encryption, and generically tweak (or “compile”) that scheme into a CCA2-secure scheme. (Re-usable generic techniques are particularly useful in the event that someone comes up with new underlying encryption schemes, like post-quantum secure ones.)
• The resulting scheme should be pretty efficient. That rules out most of the early theoretical techniques that use huge zero knowledge proofs (as cool as they are).

Before we get to the details, I also want to repeat the intuitive description of the CCA2 security game, which I gave in the previous post. The game (or “experiment”) works like this:

1. I generate an encryption keypair for a public-key scheme and give you the public key.
2. You can send me (sequentially and adaptively) many ciphertexts, which I will decrypt with my secret key. I’ll give you the result of each decryption.
3. Eventually you’ll send me a pair of messages (of equal length) $M_0, M_1$ and I’ll pick a bit $b$ at random, and return to you the encryption of $M_b$, which I will denote as $C^* \leftarrow {\sf Encrypt}(pk, M_b)$.
4. You’ll repeat step (2), sending me ciphertexts to decrypt. If you send me $C^*$ I’ll reject your attempt. But I’ll decrypt any other ciphertext you send me, even if it’s only slightly different from $C^*$.
5. You (the attacker) will output your guess $b'$. They “win” the game if $b'=b$.
6. We say a scheme is IND-CCA2 secure if the attacker wins with probability “not much greater” than 1/2 (which is the best an attacker can do if they just guess randomly.)

A quick review of this definition shows that we need a CCA2-encryption scheme to provide at least two major features.

First off, it should be obvious that the scheme must not leak information about the secret key, even when I’m using it to decrypt arbitrary chosen ciphertexts of your choice. There are obvious examples of schemes that fail to meet this requirement: the most famous is the (textbook) Rabin cryptosystem — where the attacker’s ability to obtain the decryption of a single chosen ciphertext can leak the entire secret key.

More subtly, it seems obvious that CCA2 security is related to non-malleabilityHere’s why: suppose I receive a challenge ciphertext $C^*$ at step (3). It must be the case that I cannot easily “maul” that ciphertext into a new ciphertext $C'$ that contains a closely related plaintext (and that the challenger will be able and willing to meaingfully decrypt). It’s easy to see that if I could get away with this, by the rules of the game I could probably win at step (4), simply by sending $C'$ in to be decrypted, getting the result, and seeing whether it’s more closely related to $M_0$ or $M_1$. (This is, in fact, a very weak explanation of what the Bleichenbacher attack does.)

It turns out that an even stronger property that helps achieve both of these conditions is something called plaintext awareness. There are various subtly-different mathematical formulations of this idea, but here I’ll try to give only the English-language intuition:

If the attacker is able to submit a (valid) ciphertext to be decrypted, it must be the case that she already knows the plaintext of that message.

This guarantee is very powerful, because it helps us to be sure that the decryption process doesn’t give the attacker any new information that she doesn’t already have. She can submit any messages she wants (including mauling the challenge ciphertext $C^*$) but if this plaintext-awareness property holds in the strongest sense, those decryptions won’t tell her anything she doesn’t already know.

Of course, just because your scheme appears to satisfy the above conditions does not mean it’s secure. Both rules above are heuristics: that is, they’re necessary conditions to prevent attacks, but they may or may not be sufficient. To really trust a scheme (in the cryptographic sense) we should be able to offer a proof (under some assumptions) that these guarantees hold. We’ll address that a bit as we go forward.

### Technique 1: Optimal Asymmetric Encryption Padding

One of the earlier practical CCA2 transforms was developed by Bellare and Rogaway as a direct replacement for the PKCS#1v1.5 padding scheme in RSA encryption. The scheme they developed — called Optimal Asymmetric Encryption Padding — represents a “drop-in” replacement for the earlier, broken v1.5 padding scheme. It also features a security proof. (Mostly. We’ll come back to this last point.)

(Confusingly, OAEP was adopted into the PKCS#1 standards as of version 2.0, so sometimes you’ll see it referred to as PKCS#1v2.0-OAEP.)

OAEP’s most obvious advance over the previous state of the art is the addition of not one, but two separate hash functions $G()$ and $H()$ that don’t exist in the v1.5 predecessor. (These are sometimes referred to as “mask generation functions”, which is just a fancy way of saying they’re hash functions with outputs of a custom, chosen size. Such functions can be easily built from existing hash functions like SHA256.)

Expressed graphically, this is what OAEP it looks like:

If you’ve ever seen the DES cipher, this structure should look familiar to you. Basically OAEP is a two-round (unkeyed) Feistel network that uses a pair of hash functions to implement the round functions. There are a few key observations you can make right off the bat:

• Just looking at the diagram above, you can see that it’s very easy to compute this padding function forward (going from a plaintext $m$ and some random padding $r$ to a padded message) and backwards — that is, it’s an easily-invertible permutation. The key to this feature is the Feistel network structure.
• Upon decryption, a decryptor can invert the padding of a given message and verify that the “check string” (the string of k1 “0” bits) is correctly structured. If this string is not structured properly, the decryptor can simply output an error. This comprises the primary decryption check.
• Assuming some (strong) properties of the hash functions, it intuitively seems that the OAEP transform is designed to create a kind of “avalanche effect” where even a small modification of a padded message will result in a very different unpadded result when the transform is inverted. In practice any such modification should “trash” the check string with overwhelming probability.

From an intuitive point of view, these last two properties are what makes OAEP secure against chosen-ciphertext attacks. The idea here is that, due to the random properties of the hash function, it should be hard for the attacker to construct a valid ciphertext (one that has a correct check string) if she does not already know the plaintext that goes into the transform. This should hold even if the attacker already has some known valid ciphertext (like $C^*$) that she wishes to maul.

More specifically related to mauling: if I send an RSA-OAEP ciphertext $C^*$ that encrypts a specific message $m$, the attacker should not be able to easily maul that ciphertext into a different ciphertext $C'$ that will still pass the decryption checks. This is due to two facts: (1) because RSA is a (trapdoor) permutation, any change to $C^*$ will implicitly change the padded message your recover after inverting the RSA function. And (2) sending this altered padded message backwards through the OAEP transform should, with overwhelming probability, trash the check string (and the message $m$). The result is that the adversary can’t maul someone else’s ciphertext.

This all assumes some very strong assumptions about the hash functions, which we’ll discuss below.

The OAEP proof details (at the most ridiculously superficial level)

Proving OAEP secure requires two basic techniques. Both fundamentally rely on the notion that the functions $G()$ and $H()$ are random oraclesThis is important for two very different reasons.

First: assuming a function is a “random oracle” means that we’re assuming it to have the same behavior as a random function. This is an awesome property for a hash function to have! (Note: real hash functions don’t have it. This means that hypothetically they could have very ‘non-random’ behavior that would make RSA-OAEP insecure. In practice this has not yet been a practical concern for real OAEP implementations, but it’s worth keeping in mind.

It’s easy to see that if the hash functions $G()$ and $H()$ were random functions, it would give OAEP some very powerful properties. Remember, one of the main intuitive goals of the OAEP scheme is to prevent attackers from successfully getting you to decrypt an improperly-constructed (e.g., mauled) ciphertext. If both hash functions are truly random, then this implies that any invalid ciphertext will almost certainly fail decryption, because the padding check will fail.

At a much deeper level, the use of random oracles in RSA’s security proof gives the security reduction a great deal of “extra power” to handle things like decrypt chosen ciphertexts. This is due to the fact that, in a random oracle proof, the proof reduction is allowed to both “see” every value hashed through those hash functions, and also to “program” the functions so that they will produce specific outputs. This would not be possible if $G()$ and $H()$ were implemented using real hash functions, and so the entire security proof would break down.

These properties provide a tool in the security proof to enable decryption even when the secret key is unknown. In a traditional proof of the RSA-OAEP scheme, the idea is to show that an attacker who breaks the encryption (in the IND-CCA2 sense) can be used to construct a second attacker who solves the RSA problem. This is done by taking some random values $(N, e, C)$ where $N, e$ is an RSA public key of unknown factorization and “programming” the random oracles such that $C^* = C$. The intuitive idea is that an attacker who is able to learn something about the underlying message must query the functions $G()$ and $H()$ on correct inputs that, ultimately will allow the security reduction to obtain the RSA inverse of $C^*$ even when the reduction does not know the RSA secret key, That is, such an attacker will allow us to find an integer $M'$ such that $M'^e = C$.

(There turned out to be some issues in the original OAEP proof that make it not quite work for arbitrary trapdoor permutations. Shoup fixed these by providing a new padding padding scheme called OAEP+, but the original OAEP had since gone into heavy usage within standards! It turns out that RSA-OAEP does work, however, for RSA with public exponents 3 and other exponents, though proving this required some ugly band-aids. This whole story is part of a cautionary tail about provably security, which Koblitz discusses here.)

### Technique 2: The Fujisaki-Okamoto Transform

One limitation of OAEP (and OAEP+) padding is that it requires a trapdoor permutation in order to work. This applies nicely to RSA encryption, but does not necessarily work with every existing public-key encryption scheme. This motivates the need for other CCA transforms that work with arbitrary existing (non-CCA) encryption schemes.

One of the nicest generic techniques for building CCA2-secure public-key encryption is due to Eiichiro Fujisaki and Tatsuaki Okamoto. The idea of this transform is to begin with a scheme that already meets the definition of IND-CPA security — that is, it is semantically secure, but not against chosen ciphertext attacks. (For this description, we’ll also require that this scheme has a large [exponentially-sized] message space and some specific properties related to randomness.) The beauty of the “Fujisaki-Okamoto transform” (henceforth: F-O) is that, like OAEP before it, given a working public-key encryption scheme, it requires only the addition of some hash functions, and can be proven secure in the random oracle model.

Let’s imagine that we have an IND-CPA encryption public-key encryption algorithm that consists of the algorithms ${\sf KeyGen}, {\sf Encrypt}, {\sf Decrypt}$. We’ll also make use of two independent hash functions $H_1, H_2$.

A key observation here is that in every IND-CPA (semantically secure) public key encryption scheme, the ${\sf Encrypt}$ algorithm is randomized. This actually has to be the case, due to the definition of IND-CPA. (See here for a discussion of why that is.) Put more explicitly, what this means is that the encryption algorithm must have acccess to some set of random bits that will be used to produce the ciphertext.

The main trick that the F-O transform uses is to de-randomize this public-key encryption algorithm. Instead of using real random bits to encrypt, it will instead use the output of the hash function $H_1$ to produce the random bits that will be used for encryption. This turns a randomized encryption into a deterministic one. (This, of course, requires that both the input and the internals of $H_1$ are capable of producing bits that “look” random.)

Let’s get to the nuts and bolts. The F-O transform does not change the key generation algorithm of the original encryption scheme at all, except to specify the hash functions $H_1, H_2$. The main changes come in the new encryption and decryption algorithms. I’m going to present one variant of the transform, though there are others. This one works as follows.

To encrypt a message $M$, which we’ll express as some fixed-length string of bits:

1. Instead of encrypting the actual message $M$, we instead sample a random message $R$ from the message space of the original CPA-secure scheme.
2. We hash the random message $R$ together with the original message $M$ using that first hash function $H_1$. The output of this function will give us a ‘random’ bitstring. Let’s denote this as: $r \leftarrow H_1(R \| M)$.
3. Next, we’ll encrypt the new random message $R$ using the original (CPA-secure) encryption scheme’s ${\sf Encrypt}$ algorithm, but critically: we will use the bits $r$ as the randomness for that encryption. The result of this process will give the first part of the ciphertext: $C_1 \leftarrow {\sf Encrypt}(pk, R; r)$. Note that here $r$ just refers to the randomness for the encryption algorithm, not an actual message being encrypted.
4. Finally, we derive a “key” for encrypting the real message we want to send. We can compute this as $K \leftarrow H_2(R)$.
5. We now encrypt the original message $M$ we want to send using some secure encryption scheme, for example the simple one-time pad: $C_2 \leftarrow M \oplus K$.
6. We output the “ciphertext” $C = (C_1, C_2)$.

To decrypt $C = (C_1, C_2)$, we would perform the following steps:

1. First, use the original public-key encryption scheme’s secret key to decrypt the ciphertext $C_1$, which (if all is well) should give us $R' \leftarrow {\sf Decrypt}(sk, C_1)$.
2. Now use knowledge of $R'$ to recover the key $K' \leftarrow H_2(R')$ and thus the message $M'$ which we can obtain as $M' \leftarrow C_2 \oplus K'$.
3. Now check that both $R', M'$ are valid by re-computing the randomness $r' \leftarrow H_1(R' \| M')$ and verifying the condition $C_1 = {\sf Encrypt}(pk, R'; r')$. If this final check fails, simply output a decryption error.

Phew. So what the heck is going on here?

Let’s tackle this scheme from a practical perspective. Earlier in this post, we said that to achieve IND-CCA2 security, a scheme must have two features. First, it must be plaintext aware, which means that in order to construct a valid ciphertext (that passes all decryption checks) the attacker should already know the plaintext.

Does F-O have this property? Well, intuitively we would hope that the answer is “yes”. Note for some valid F-O ciphertext $C = (C_1, C_2)$ the decrypted plaintext is implicitly defined as $M' \leftarrow C_2 \oplus H_2(R')$. So what we really want to prove is that in order to construct a valid ciphertext the attacker must already know $R'$ and $M'$ prior to sending the message for decryption.

This guarantee (with high probability) comes from the structure of $C_1$. In order for the ciphertext to be considered valid by the decryptor, it must be the case that $C_1$ satisfies the check $C_1 = {\sf Encrypt}(pk, M'; r' = H_1(R' \| M'))$. The idea of this proof is that it should be hard for an attacker to construct such a $C_1$ unless she has previously called the hash function $H_1$ on input $(R', M')$. If she has called the hash function to produce this portion of the ciphertext, then she already knows those values and the decryption oracle provides her with no additional information she didn’t already have. (Alternatively, if she did not call the hash function, then her probability of making a valid $C_1$ should be extremely low.)

Of course, this is only one strategy available to the attacker. She could also maul an existing ciphertext like $C^* = (C_1^*, C_2^*)$. In this case her strategy is twofold: she can tamper with the first portion of the ciphertext and/or she can tamper with the second. But it’s easy to see that this will tend to break some portion of the decryption checks:

1. If she tampers with any bit of $C_2^*$, she will change the recovered message into a new value that we can call $M''$. However this will in turn (with overwhelming probability) cause the decryptor to recover very different random coins $r'' \leftarrow H_1(R' \| M'')$ than were used in the original construction of $C_1^*$, and thus decryption check on that piece will probably fail.
2. If she tampers with any bit of $C_1^*$, the decryption check $latex C_1^* = {\sf Encrypt}(pk, M’; r’) ought not to pass, and decryption will just produce an error. 3. She might try to tamper with both parts of the ciphertext, of course. But this would seem even more challenging. The problem with the exercise above is that none of this constitutes a proof that the approach works. There is an awful lot of should and probably in this argument, and none of this ought to make you very happy. A rough sketch of the proof for an F-O scheme can be found here. (I warn you that it’s probably got some bugs in it, and I’m offering it mainly as an intuition.) The F-O scheme has many variants. A slightly different and much more formal treatment by Hofheinz and Kiltz can be found here, and deals with some other requirements on the underlying CPA-secure scheme. ### To be continued… So far in this discussion we’ve covered two basic techniques — both at a very superficial level — that achieve CCA2 security under the ridiculously strong assumption that random oracles exist. Unfortunately, they don’t. This motivates the need for better approaches that don’t require random oracles at all. There are a couple of those that, sadly, nobody uses. Those will have to wait until the next post. # Was the Efail disclosure horribly screwed up? TL;DR. No. Or keep reading if you want. On Monday a team of researchers from Münster, RUB and NXP disclosed serious cryptographic vulnerabilities in a number of encrypted email clients. The flaws, which go by the cute vulnerability name of “Efail”, potentially allow an attacker to decrypt S/MIME or PGP-encrypted email with only minimal user interaction. By the standards of cryptographic vulnerabilities, this is about as bad as things get. In short: if an attacker can intercept and alter an encrypted email — say, by sending you a new (altered) copy, or modifying a copy stored on your mail server — they can cause many GUI-based email clients to send the full plaintext of the email to an attacker controlled-server. Even worse, most of the basic problems that cause this flaw have been known for years, and yet remain in clients. The big (and largely under-reported) story of EFail is the way it affects S/MIME. That “corporate” email protocol is simultaneously (1) hated by the general crypto community because it’s awful and has a slash in its name, and yet (2) is probably the most widely-used email encryption protocol in the corporate world. The table at the right — excerpted from the paper — gives you a flavor of how Efail affects S/MIME clients. TL;DR it affects them very badly. Efail also happens to affect a smaller, but non-trivial number of OpenPGP-compatible clients. As one might expect (if one has spent time around PGP-loving folks) the disclosure of these vulnerabilities has created something of a backlash on HN, and among people who make and love OpenPGP clients. Mostly for reasons that aren’t very defensible. So rather than write about fun things — like the creation of CFB and CBC gadgets — today, I’m going to write about something much less exciting: the problem of vulnerability disclosure in ecosystems like PGP. And how bad reactions to disclosure can hurt us all. ### How Efail was disclosed to the PGP community Putting together a comprehensive timeline of the Efail disclosure process would probably be a boring, time-intensive project. Fortunately Thomas Ptacek loves boring and time-intensive projects, and has already done this for us. Briefly, the first Efail disclosures to vendors began last October, more than 200 days prior to the agreed publication date. The authors notified a large number of vulnerable PGP GUI clients, and also notified the GnuPG project (on which many of these projects depend) by February at the latest. From what I can tell every major vendor agreed to make some kind of patch. GnuPG decided that it wasn’t their fault, and basically stopped corresponding. All parties agreed not to publicly discuss the vulnerability until an agreed date in April, which was later pushed back to May 15. The researchers also notified the EFF and some journalists under embargo, but none of them leaked anything. On May 14 someone dumped the bug onto a mailing list. So the EFF posted a notice about the vulnerability (which we’ll discuss a bit more below), and the researchers put up a website. That’s pretty much the whole story. There are three basic accusations going around about the Efail disclosure. They can be summarized as (1) maintaining embargoes in coordinated disclosures is really hard, (2) the EFF disclosure “unfairly” made this sound like a serious vulnerability “when it isn’t”, and (3) everything was already patched anyway so what’s the big deal. ### Disclosures are hard; particularly coordinated ones I’ve been involved in two disclosures of flaws in open encryption protocols. (Both were TLS issues.) Each one poses an impossible dilemma. You need to simultaneously (a) make sure every vendor has as much advance notice as possible, so they can patch their software. But at the same time (b) you need to avoid telling literally anyone, because nothing on the Internet stays secret. At some point you’ll notify some FOSS project that uses an open development mailing list or ticket server, and the whole problem will leak out into the open. Disclosing bugs that affect PGP is particularly fraught. That’s because there’s no such thing as “PGP”. What we have instead is a large and distributed community that revolves around the OpenPGP protocol. The pillar of this community is the GnuPG project, which maintains the core GnuPG tool and libraries that many clients rely on. Then there are a variety of niche GUI-based clients and email plugin projects. Finally, there are commercial vendors like Apple and Microsoft. (Who are mostly involved in the S/MIME side of things, and may reluctantly allow PGP plugins.) Then, of course there are thousands of end-users, who will generally fail to update their software unless something really bad and newsworthy happens. The obvious solution to the disclosure problem to use a staged disclosure. You notify the big commercial vendors first, since that’s where most of the affected users are. Then you work your way down the “long tail” of open source projects, knowing that inevitably the embargo could break and everyone will have to patch in a hurry. And you keep in mind that no matter what happens, everyone will blame you for screwing up the disclosure. For the PGP issues in Efail, the big client vendors are Mozilla (Thunderbird), Microsoft (Outlook) and maybe Apple (Mail). The very next obvious choice would be to patch the GnuPG tool so that it no longer spits out unauthenticated plaintext, which is the root of many of the problems in Efail. The Efail team appears to have pursued exactly this approach for the client-side vulnerabilities. Sadly, the GnuPG team made the decision that it’s not their job to pre-emptively address problems that they view as ‘clients misusing the GnuPG API’ (my paraphrase), even when that misuse appears to be rampant across many of the clients that use their tool. And so the most obvious fix for one part of the problem was not available. This is probably the most unfortunate part of the Efail story, because in this case GnuPG is very much at fault. Their API does something that directly violates cryptographic best practices — namely, releasing unauthenticated plaintext prior to producing an error message. And while this could be understood as a reasonable API design at design time, continuing to support this API even as clients routinely misuse it has now led to flaws across the ecosystem. The refusal of GnuPG to take a leadership role in preemptively safeguarding these vulnerabilities both increases the difficulty of disclosing these flaws, and increases the probability of future issues. ### So what went wrong with the Efail disclosure? Despite what you may have heard, given the complexity of this disclosure, very little went wrong. The main issues people have raised seem to have to do with the contents of an EFF post. And with some really bad communications from Robert J. Hansen at the Enigmail (and GnuPG) project. The EFF post. The Efail researchers chose to use the Electronic Frontier Foundation as their main source for announcing the existence of the vulnerability to the privacy community. This hardly seems unreasonable, because the EFF is generally considered a trusted broker, and speaks to the right community (at least here in the US). The EFF post doesn’t give many details, nor does it give a list of affected (or patched) clients. It does give two pretty mild recommendations: 1. Temporarily disable or uninstall your existing clients until you’ve checked that they’re patched. 2. Maybe consider using a more modern cryptosystem like Signal, at least until you know that your PGP client is safe again. This naturally led to a huge freakout by many in the PGP community. Some folks, including vendors, have misrepresented the EFF post as essentially pushing people to “permanently” uninstall PGP, which will “put lives at risk” because presumably these users (whose lives are at risk, remember) will immediately fall back to sending incriminating information via plaintext emails — rather than temporarily switching their communications to one of several modern, well-studied secure messengers, or just not emailing for a few hours. In case you think I’m exaggerating about this, here’s one reaction from ProtonMail: The most reasonable criticism I’ve heard of the EFF post is that it doesn’t give many details about which clients are patched, and which are vulnerable. This could presumably give someone the impression that this vulnerability is still present in their email client, and thus would cause them to feel less than secure in using it. I have to be honest that to me that sounds like a really good outcome. The problem with Efail is that it doesn’t matter if your client is secure. The Efail vulnerability could affect you if even a single one of your communication partners is using an insecure client. So needless to say I’m not very sympathetic to the reaction around the EFF post. If you can’t be sure whether your client is secure, you probably should feel insecure. Bad communications from GnuPG and Enigmail. On the date of the disclosure, anyone looking for accurate information about security from two major projects — GnuPG and Enigmail — would not have been able to find it. They wouldn’t have found it because developers from both Enigmail and GnuPG were on mailing lists and Twitter claiming that they had never heard of Efail, and hadn’t been notified by the researchers. Needless to say, these allegations took off around the Internet, sometimes in place of real information that could have helped users (like, whether either project had patched.) It goes without saying that neither allegation was actually true. In fact, both project members soon checked with their fellow developers (and their memories) and found out that they’d both been given months of notice by the researchers, and that Enigmail had even developed a patch. (However, it turned out that even this patch may not perfectly address the issue, and the community is still working to figure out exactly what still needs to be done.) This is an understandable mistake, perhaps. But it sure is a bad one. ### PGP is bad technology and it’s making a bad community Now that I’ve made it clear that neither the researchers nor the EFF is out to get the PGP community, let me put on my mask and horns and tell you why someone should be. I’ve written extensively about PGP on this blog, but in the past I’ve written mostly from a technical point of view about the problems with PGP. But what’s really problematic about PGP is not just the cryptography; it’s the story it tells about path dependence and how software communities work. The fact of the matter is that OpenPGP is not really a cryptography project. That is, it’s not held together by cryptography. It’s held together by backwards-compatibility and (increasingly) a kind of an obsession with the idea of PGP as an end in and of itself, rather than as a means to actually make end-users more secure. Let’s face it, as a protocol, PGP/OpenPGP is just not what we’d develop if we started over today. It was formed over the years out of mostly experimental parts, which were in turn replaced, bandaged and repaired — and then worked into numerous implementations, which all had to be insanely flexible and yet compatible with one another. The result is bad, and most of the software implementing it is worse. It’s the equivalent of a beloved antique sports car, where the electrical system is totally shot, but it still drives. You know, the kind of car where the owner has to install a hand-switch so he can turn the reverse lights on manually whenever he wants to pull out of a parking space. If PGP went away, I estimate it would take the security community less than a year to entirely replace (the key bits of) the standard with something much better and modern. It would have modern crypto and authentication, and maybe even extensions for future post-quantum future security. It would be simple. Many bright new people would get involved to help write the inevitable Rust, Go and Javascript clients and libraries. Unfortunately for us all, (Open)PGP does exist. And that means that even fancy greenfield email projects feel like they need to support OpenPGP, or at least some subset of it. This in turn perpetuates the PGP myth, and causes other clients to use it. And as a direct result, even if some clients re-implement OpenPGP from scratch, other clients will end up using tools like GnuPG which will support unauthenticated encryption with bad APIs. And the cycle will go round and around, like a spaceship stuck near the event horizon of a black hole. And as the standard perpetuates itself, largely for the sake of being a standard, it will fail to attract new security people. It will turn away exactly the type of people who should be working on these tools. Those people will go off and build encryption systems in a totally different area, or they’ll get into cryptocurrency. And — with some exceptions — the people who work in the community will increasingly work in that community because they’re supporting PGP, and not because they’re trying to seek out the best security technologies for their users. And the serious (email) users of PGP will be using it because they like the idea of using PGP better than they like using an actual, secure email standard. And as things get worse, and fail to develop, people who work on it will become more dogmatic about its importance, because it’s something threatened and not a real security protocol that anyone’s using. To me that’s where PGP is going today, and that is why the community has such a hard time motivating itself to take these vulnerabilities seriously, and instead reacts defensively. Maybe that’s a random, depressing way to end a post. But that’s the story I see in OpenPGP. And it makes me really sad. # A few thoughts on Ray Ozzie’s “Clear” Proposal Yesterday I happened upon a Wired piece by Steven Levy that covers Ray Ozzie’s proposal for “CLEAR”. I’m quoted at the end of the piece (saying nothing much), so I knew the piece was coming. But since many of the things I said to Levy were fairly skeptical — and most didn’t make it into the piece — I figured it might be worthwhile to say a few of them here. Ozzie’s proposal is effectively a key escrow system for encrypted phones. It’s receiving attention now due to the fact that Ozzie has a stellar reputation in the industry, and due to the fact that it’s been lauded by law enforcement (and some famous people like Bill Gates). Ozzie’s idea is the just the latest bit of news in this second edition of the “Crypto Wars”, in which the FBI and various law enforcement agencies have been arguing for access to end-to-end encryption technologies — like phone storage and messaging — in the face of pretty strenuous opposition by (most of) the tech community. In this post I’m going to sketch a few thoughts about Ozzie’s proposal, and about the debate in general. Since this is a cryptography blog, I’m mainly going to stick to the technical, and avoid the policy details (which are substantial). Also, since the full details of Ozzie’s proposal aren’t yet public — some are explained in the Levy piece and some in this patent — please forgive me if I get a few details wrong. I’ll gladly correct. [Note: I’ve updated this post in several places in response to some feedback from Ray Ozzie. For the updated parts, look for the *. Also, Ozzie has posted some slides about his proposal.] ### How to Encrypt a Phone The Ozzie proposal doesn’t try tackle every form of encrypted data. Instead it focuses like a laser on the simple issue of encrypted phone storage. This is something that law enforcement has been extremely concerned about. It also represents the (relatively) low-hanging fruit of the crypto debate, for essentially two reasons: (1) there are only a few phone hardware manufacturers, and (2) access to an encrypted phone generally only takes place after law enforcement has gained physical access to it. I’ve written about the details of encrypted phone storage in a couple of previous posts. A quick recap: most phone operating systems encrypt a large fraction of the data stored on your device. They do this using an encryption key that is (typically) derived from the user’s passcode. Many recent phones also strengthen this key by “tangling” it with secrets that are stored within the phone itself — typically with the assistance of a secure processor included in the phone. This further strengthens the device against simple password guessing attacks. The upshot is that the FBI and local law enforcement have not — until very recently (more on that further below) — been able to obtain access to many of the phones they’ve obtained during investigation. This is due the fact that, by making the encryption key a function of the user’s passcode, manufacturers like Apple have effectively rendered themselves unable to assist law enforcement. ### The Ozzie Escrow Proposal Ozzie’s proposal is called “Clear”, and it’s fairly straightforward. Effectively, it calls for manufacturers (e.g., Apple) to deliberately put themselves back in the loop. To do this, Ozzie proposes a simple form of key escrow (or “passcode escrow”). I’m going to use Apple as our example in this discussion, but obviously the proposal will apply to other manufacturers as well. Ozzie’s proposal works like this: 1. Prior to manufacturing a phone, Apple will generate a public and secret “keypair” for some public key encryption scheme. They’ll install the public key into the phone, and keep the secret key in a “vault” where hopefully it will never be needed. 2. When a user sets a new passcode onto their phone, the phone will encrypt a passcode under the Apple-provided public key. This won’t necessarily be the user’s passcode, but it will be an equivalent passcode that can unlock the phone.* It will store the encrypted result in the phone’s storage. 3. In the unlikely event that the FBI (or police) obtain the phone and need to access its files, they’ll place the phone into some form of law enforcement recovery mode. Ozzie describes doing this with some special gesture, or “twist”. Alternatively, Ozzie says that Apple itself could do something more complicated, such as performing an interactive challenge/response with the phone in order to verify that it’s in the FBI’s possession. 4. The phone will now hand the encrypted passcode to law enforcement. (In his patent, Ozzie suggests it might be displayed as a barcode on a screen.) 5. The law enforcement agency will send this data to Apple, who will do a bunch of checks (to make sure this is a real phone and isn’t in the hands of criminals). Apple will access their secret key vault, and decrypt the passcode. They can then send this back to the FBI. 6. Once the FBI enters this code, the phone will be “bricked”. Let me be more specific: Ozzie proposes that once activated, a secure chip inside the phone will now permanently “blow” several JTAG fuses monitored by the OS, placing the phone into a locked mode. By reading the value of those fuses as having been blown, the OS will never again overwrite its own storage, will never again talk to any network, and will become effectively unable to operate as a normal phone again. When put into its essential form, this all seems pretty simple. That’s because it is. In fact, with the exception of the fancy “phone bricking” stuff in step (6), Ozzie’s proposal is a straightforward example of key escrow — a proposal that people have been making in various guises for many years. The devil is always in the details. ### A vault of secrets If we picture how the Ozzie proposal will change things for phone manufacturers, the most obvious new element is the key vault. This is not a metaphor. It literally refers to a giant, ultra-secure vault that will have to be maintained individually by different phone manufacturers. The security of this vault is no laughing matter, because it will ultimately store the master encryption key(s) for every single device that manufacturer ever makes. For Apple alone, that’s about a billion active devices. Does this vault sound like it might become a target for organized criminals and well-funded foreign intelligence agencies? If it sounds that way to you, then you’ve hit on one of the most challenging problems with deploying key escrow systems at this scale. Centralized key repositories — that can decrypt every phone in the world — are basically a magnet for the sort of attackers you absolutely don’t want to be forced to defend yourself against. So let’s be clear. Ozzie’s proposal relies fundamentally on the ability of manufacturers to secure extremely valuable key material for a massive number of devices against the strongest and most resourceful attackers on the planet. And not just rich companies like Apple. We’re also talking about the companies that make inexpensive phones and have a thinner profit margin. We’re also talking about many foreign-owned companies like ZTE and Samsung. This is key material that will be subject to near-constant access by the manufacturer’s employees, who will have to access these keys regularly in order to satisfy what may be thousands of law enforcement access requests every month. If ever a single attacker gains access to that vault and is able to extract, a few “master” secret keys (Ozzie says that these master keys will be relatively small in size*) then the attackers will gain unencrypted access to every device in the world. Even better: if the attackers can do this surreptitiously, you’ll never know they did it. Now in fairness, this element of Ozzie’s proposal isn’t really new. In fact, this key storage issue an inherent aspect of all massive-scale key escrow proposals. In the general case, the people who argue in favor of such proposals typically make two arguments: 1. We already store lots of secret keys — for example, software signing keys — and things works out fine. So this isn’t really a new thing. 2. Hardware Security Modules. Let’s take these one at a time. It is certainly true that software manufacturers do store secret keys, with varying degrees of success. For example, many software manufacturers (including Apple) store secret keys that they use to sign software updates. These keys are generally locked up in various ways, and are accessed periodically in order to sign new software. In theory they can be stored in hardened vaults, with biometric access controls (as the vaults Ozzie describes would have to be.) But this is pretty much where the similarity ends. You don’t have to be a technical genius to recognize that there’s a world of difference between a key that gets accessed once every month — and can be revoked if it’s discovered in the wild — and a key that may be accessed dozens of times per day and will be effectively undetectable if it’s captured by a sophisticated adversary. Moreover, signing keys leak all the time. The phenomenon is so common that journalists have given it a name: it’s called “Stuxnet-style code signing”. The name derives from the fact that the Stuxnet malware — the nation-state malware used to sabotage Iran’s nuclear program — was authenticated with valid code signing keys, many of which were (presumably) stolen from various software vendors. This practice hasn’t remained with nation states, unfortunately, and has now become common in retail malware. The folks who argue in favor of key escrow proposals generally propose that these keys can be stored securely in special devices called Hardware Security Modules (HSMs). Many HSMs are quite solid. They are not magic, however, and they are certainly not up to the threat model that a massive-scale key escrow system would expose them to. Rather than being invulnerable, they continue to cough up vulnerabilities like this one. A single such vulnerability could be game-over for any key escrow system that used it. In some follow up emails, Ozzie suggests that keys could be “rotated” periodically, ensuring that even after a key compromise the system could renew security eventually. He also emphasizes the security mechanisms (such as biometric access controls) that would be present in such a vault. I think that these are certainly valuable and necessary protections, but I’m not convinced that they would be sufficient. ### Assume a secure processor Let’s suppose for a second that an attacker does get access to the Apple (or Samsung, or ZTE) key vault. In the section above I addressed the likelihood of such an attack. Now let’s talk about the impact. Ozzie’s proposal has one significant countermeasure against an attacker who wants to use these stolen keys to illegally spy on (access) your phone. Specifically, should an attacker attempt to illegally access your phone, the phone will be effectively destroyed. This doesn’t protect you from having your files read — that horse has fled the stable — but it should alert you to the fact that something fishy is going on. This is better than nothing. This measure is pretty important, not only because it protects you against evil maid attacks. As far as I can tell, this protection is pretty much the only measure by which theft of the master decryption keys might ever be detected. So it had better work well. The details on how this might work aren’t very clear in Ozzie’s patent, but the Wired article describes it as follows. This quote to repeat Ozzie’s presentation at Columbia University: What Ozzie appears to describe here is a secure processor contained within every phone. This processor would be capable if securely and irreversibly enforcing that once law enforcement has accessed a phone, that phone could no longer be placed into an operational state. My concern with this part of Ozzie’s proposal is fairly simple: this processor does not currently exist. To explain why this, let me tell a story. Back in 2013, Apple began installing a secure processor in each of their phones. While this secure processor (called the Secure Enclave Processor, or SEP) is not exactly the same as the one Ozzie proposes, the overall security architecture seems very similar. One main goal of Apple’s SEP was to limit the number of passcode guessing attempts that a user could make against a locked iPhone. In short, it was designed to keep track of each (failed) login attempt and keep a counter. If the number of attempts got too high, the SEP would make the user wait a while — in the best case — or actively destroy the phone’s keys. This last protection is effectively identical to Ozzie’s proposal. (With some modest differences: Ozzie proposes to “blow fuses” in the phone, rather than erasing a key; and he suggests that this event would triggered by entry of a recovery passcode.*) For several years, the SEP appeared to do its job fairly effectively. Then in 2017, everything went wrong. Two firms, Cellebrite and Grayshift, announced that they had products that effectively unlocked every single Apple phone, without any need to dismantle the phone. Digging into the details of this exploit, it seems very clear that both firms — working independently — have found software exploits that somehow disable the protections that are supposed to be offered by the SEP. The cost of this exploit (to police and other law enforcement)? About$3,000-$5,000 per phone. Or (if you like to buy rather than rent) about$15,000. Aso, just to add an element of comedy to the situation, the GrayKey source code appears to have recently been stolen. The attackers are extorting the company for two Bitcoin. Because 2018. (🤡👞)

Let me sum this up my point in case I’m not beating you about the head quite enough:

The richest and most sophisticated phone manufacturer in the entire world tried to build a processor that achieved goals similar to those Ozzie requires. And as of April 2018, after five years of trying, they have been unable to achieve this goala goal that is critical to the security of the Ozzie proposal as I understand it.

Now obviously the lack of a secure processor today doesn’t mean such a processor will never exist. However, let me propose a general rule: if your proposal fundamentally relies on a secure lock that nobody can ever break, then it’s on you to show me how to build that lock.

### Conclusion

While this mainly concludes my notes about on Ozzie’s proposal, I want to conclude this post with a side note, a response to something I routinely hear from folks in the law enforcement community. This is the criticism that cryptographers are a bunch of naysayers who aren’t trying to solve “one of the most fundamental problems of our time”, and are instead just rejecting the problem with lazy claims that it “can’t work”.

As a researcher, my response to this is: phooey.

Cryptographers — myself most definitely included — love to solve crazy problems. We do this all the time. You want us to deploy a new cryptocurrency? No problem! Want us to build a system that conducts a sugar-beet auction using advanced multiparty computation techniques? Awesome. We’re there. No problem at all.

But there’s crazy and there’s crazy.

The reason so few of us are willing to bet on massive-scale key escrow systems is that we’ve thought about it and we don’t think it will work. We’ve looked at the threat model, the usage model, and the quality of hardware and software that exists today. Our informed opinion is that there’s no detection system for key theft, there’s no renewability system, HSMs are terrifically vulnerable (and the companies largely staffed with ex-intelligence employees), and insiders can be suborned. We’re not going to put the data of a few billion people on the line an environment where we believe with high probability that the system will fail.

Maybe that’s unreasonable. If so, I can live with that.

# Wonk post: chosen ciphertext security in public-key encryption (Part 1)

In general I try to limit this blog to posts that focus on generally-applicable techniques in cryptography. That is, I don’t focus on the deeply wonky. But this post is going to be an exception. Today, I’m going to talk about a topic that most “typical” implementers don’t — and shouldn’t — think about.

Specifically: I’m going to talk about various techniques for making public key encryption schemes chosen ciphertext secure. I see this as the kind of post that would have saved me ages of reading when I was a grad student, so I figured it wouldn’t hurt to write it all down (even though this absolutely shouldn’t serve as a replacement for just reading the original papers!)

### Background: CCA(1/2) security

Early (classical) ciphers used a relatively weak model of security, if they used one at all. That is, the typical security model for an encryption scheme was something like the following:

1. I generate an encryption key (or keypair for public-key encryption)
2. I give you the encryption of some message of my choice
3. You “win” if you can decrypt it

This is obviously not a great model in the real world, for several reasons. First off, in some cases the attacker knows a lot about the message to be decrypted. For example: it may come from a small space (like a set of playing cards). For this reason we require a stronger definition like “semantic security” that assumes the attacker can choose the plaintext distribution, and can also obtain the encryption of messages of his/her own choice. I’ve written more about this here.

More relevant to this post, another limitation of the above game is that — in some real-world examples — the attacker has even more power. That is: in addition to obtaining the encryption of chosen plaintexts, they may be able to convince the secret keyholder to decrypt chosen ciphertexts of their choice.

The latter attack is called a chosen-ciphertext (CCA) attack.

At first blush this seems like a really stupid model. If you can ask the keyholder to decrypt chosen ciphertexts, then isn’t the scheme just obviously broken? Can’t you just decrypt anything you want?

The answer, it turns out, is that there are many real-life examples where the attacker has decryption capability, but the scheme isn’t obviously broken. For example:

1. Sometimes an attacker can decrypt a limited set of ciphertexts (for example, because someone leaves the decryption machine unattended at lunchtime.) The question then is whether they can learn enough from this access to decrypt other ciphertexts that are generated after she loses access to the decryption machine — for example, messages that are encrypted after the operator comes back from lunch.
2. Sometimes an attacker can submit any ciphertext she wants — but will only obtain a partial decryption of the ciphertext. For example, she might learn only a single bit of information such as “did this ciphertext decrypt correctly”. The question, then, is whether she can leverage this tiny amount of data to fully decrypt some ciphertext of her choosing.

The first example is generally called a “non-adaptive” chosen ciphertext attack, or a CCA1 attack (and sometimes, historically, a “lunchtime” attack). There are a few encryption schemes that totally fall apart under this attack — the most famous textbook example is Rabin’s public key encryption scheme, which allows you to recover the full secret key from just a single chosen-ciphertext decryption.

The more powerful second example is generally referred to as an “adaptive” chosen ciphertext attack, or a CCA2 attack. The term refers to the idea that the attacker can select the ciphertexts they try to decrypt based on seeing a specific ciphertext that they want to attack, and by seeing the answers to specific decryption queries.

In this article we’re going to use the more powerful “adaptive” (CCA2) definition, because that subsumes the CCA1 definition. We’re also going to focus primarily on public-key encryption.

With this in mind, here is the intuitive definition of the experiment we want a CCA2 public-key encryption scheme to be able to survive:

1. I generate an encryption keypair for a public-key scheme and give you the public key.
2. You can send me (sequentially and adaptively) many ciphertexts, which I will decrypt with my secret key. I’ll give you the result of each decryption.
3. Eventually you’ll send me a pair of messages (of equal length) $M_0, M_1$ and I’ll pick a bit $b$ at random, and return to you the encryption of $M_b$, which I will denote as $C^* \leftarrow {\sf Encrypt}(pk, M_b)$.
4. You’ll repeat step (2), sending me ciphertexts to decrypt. If you send me $C^*$ I’ll reject your attempt. But I’ll decrypt any other ciphertext you send me, even if it’s only slightly different from $C^*$.
5. The attacker outputs their guess $b'$. They “win” the game if $b'=b$.

We say that our scheme is secure if the attacker wins only with a significantly greater probability than they would win with if they simply guessed $b'$ at random. Since they can win this game with probability 1/2 just by guessing randomly, that means we want (Probability attacker wins the game) – 1/2 to be “very small” (typically a negligible function of the security parameter).

You should notice two things about this definition. First, it gives the attacker the full decryption of any ciphertext they send me. This is obviously much more powerful than just giving the attacker a single bit of information, as we mentioned in the example further above. But note that powerful is good. If our scheme can remain secure in this powerful experiment, then clearly it will be secure in a setting where the attacker gets strictly less information from each decryption query.

The second thing you should notice is that we impose a single extra condition in step (4), namely that the attacker cannot ask us to decrypt $C^*$. We do this only to prevent the game from being “trivial” — if we did not impose this requirement, the attacker could always just hand us back $C^*$ to decrypt, and they would always learn the value of $b$.

(Notice as well that we do not give the attacker the ability to request encryptions of chosen plaintexts. We don’t need to do that in the public key encryption version of this game, because we’re focusing exclusively on public-key encryption here — since the attacker has the public key, she can encrypt anything she wants without my help.)

With definitions out of the way, let’s talk a bit about how we achieve CCA2 security in real schemes.

### A quick detour: symmetric encryption

This post is mainly going to focus on public-key encryption, because that’s actually the problem that’s challenging and interesting to solve. It turns out that achieving CCA2 for symmetric-key encryption is really easy. Let me briefly explain why this is, and why the same ideas don’t work for public-key encryption.

(To explain this, we’ll need to slightly tweak the CCA2 definition above to make it work in the symmetric setting. The changes here are small: we won’t give the attacker a public key in step (1), and at steps (2) and (4) we will allow the attacker to request the encryption of chosen plaintexts as well as the decryption.)

The first observation is that many common encryption schemes — particularly, the widely-used cipher modes of operation like CBC and CTR — are semantically secure in a model where the attacker does not have the ability to decrypt chosen ciphertexts. However, these same schemes break completely in the CCA2 model.

The simple reason for this is ciphertext malleability. Take CTR mode, which is particularly easy to mess with. Let’s say we’ve obtained a ciphertext $C^*$ at step (4) (recall that $C^*$ is the encryption of $M_b$), it’s trivially easy to “maul” the ciphertext — simply by flipping, say, a bit of the message (i.e., XORing it with “1”). This gives us a new ciphertext $C' = C^* \oplus 1$ that we are now allowed to submit for decryption. We are now allowed (by the rules of the game) to submit this ciphertext, and obtain $M_b \oplus 1$, which we can use to figure out $b$.

(A related, but “real world” variant of this attack is Vaudenay’s Padding Oracle Attack, which breaks actual implementations of symmetric-key cryptosystems. Here’s one we did against Apple iMessage. Here’s an older one on XML encryption.)

So how do we fix this problem? The straightforward observation is that we need to prevent the attacker from mauling the ciphertext $C^*$. The generic approach to doing this is to modify the encryption scheme so that it includes a Message Authentication Code (MAC) tag computed over every CTR-mode ciphertext. The key for this MAC scheme is generated by the encrypting party (me) and kept with the encryption key. When asked to decrypt a ciphertext, the decryptor first checks whether the MAC is valid. If it’s not, the decryption routine will output “ERROR”. Assuming an appropriate MAC scheme, the attacker can’t modify the ciphertext (including the MAC) without causing the decryption to fail and produce a useless result.

So in short: in the symmetric encryption setting, the answer to CCA2 security is simply for the encrypting parties to authenticate each ciphertext using a secret authentication (MAC) key they generate. Since we’re talking about symmetric encryption, that extra (secret) authentication key can be generated and stored with the decryption key. (Some more efficient schemes make this all work with a single key, but that’s just an engineering convenience.) Everything works out fine.

So now we get to the big question.

### CCA security is easy in symmetric encryption. Why can’t we just do the same thing for public-key encryption?

As we saw above, it turns out that strong authenticated encryption is sufficient to get CCA(2) security in the world of symmetric encryption. Sadly, when you try this same idea generically in public key encryption, it doesn’t always work. There’s a short reason for this, and a long one. The short version is: it matters who is doing the encryption.

Let’s focus on the critical difference. In the symmetric CCA2 game above, there is exactly one person who is able to (legitimately) encrypt ciphertexts. That person is me. To put it more clearly: the person who performs the legitimate encryption operations (and has the secret key) is also the same person who is performing decryption.

Even if the encryptor and decryptor aren’t literally the same person, the encryptor still has to be honest. (To see why this has to be the case, remember that the encryptor has shared secret key! If that party was a bad guy, then the whole scheme would be broken, since they could just output the secret key to the bad guys.) And once you’ve made the stipulation that the encryptor is honest, then you’re almost all the way there. It suffices simply to add some kind of authentication (a MAC or a signature) to any ciphertext she encrypts. At that point the decryptor only needs to determine whether any given ciphertexts actually came from the (honest) encryptor, and avoid decrypting the bad ones. You’re done.

Public key encryption (PKE) fundamentally breaks all these assumptions.

In a public-key encryption scheme, the main idea is that anyone can encrypt a message to you, once they get a copy of your public key. The encryption algorithm may sometimes be run by good, honest people. But it can also be run by malicious people. It can be run by parties who are adversarial. The decryptor has to be able to deal with all of those cases. One can’t simply assume that the “real” encryptor is honest.

Let me give a concrete example of how this can hurt you. A couple of years ago I wrote a post about flaws in Apple iMessage, which (at the time) used simple authenticated (public key) encryption scheme. The basic iMessage encryption algorithm used public key encryption (actually a combination of RSA with some AES thrown in for efficiency) so that anyone could encrypt a message to my key. For authenticity, it required that every message be signed with an ECDSA signature by the sender.

When I received a message, I would look up the sender’s public key and first make sure the signature was valid. This would prevent bad guys from tampering with the message in flight — e.g., executing nasty stuff like adaptive chosen ciphertext attacks. If you squint a little, this is almost exactly a direct translation of the symmetric crypto approach we discussed above. We’re simply swapping the MAC for a digital signature.

The problems with this scheme start to become apparent when we consider that there might be multiple people sending me ciphertexts. Let’s say the adversary is on the communication path and intercepts a signed message from you to me. They want to change (i.e., maul) the message so that they can execute some kind of clever attack. Well, it turns out this is simple. They simply rip off the honest signature and replace it one they make themselves:

The new message is identical, but now appears to come from a different person (the attacker). Since the attacker has their own signing key, they can maul the encrypted message as much as they want, and sign new versions of that message. If you plug this attack into (a version) of the public-key CCA2 game up top, you see they’ll win quite easily. All they have to do is modify the challenge ciphertext $C^*$ at step (4) to be signed with their own signing key, then they can change it by munging with the CTR mode encryption, and request the decryption of that ciphertext.

Of course if I only accept messages from signed by some original (guaranteed-to-be-honest) sender, this scheme might work out fine. But that’s not the point of public key encryption. In a real public-key scheme — like the one Apple iMessage was trying to build — I should be able to (safely) decrypt messages from anyone, and in that setting this naive scheme breaks down pretty badly.

Whew.

Ok, this post has gotten a bit long, and so far I haven’t actually gotten to the various “tricks” for adding chosen ciphertext security to real public key encryption schemes. That will have to wait until the next post, to come shortly.

# Hash-based Signatures: An illustrated Primer

Over the past several years I’ve been privileged to observe two contradictory and fascinating trends. The first is that we’re finally starting to use the cryptography that researchers have spent the past forty years designing. We see this every day in examples ranging from encrypted messaging to phone security to cryptocurrencies.

The second trend is that cryptographers are getting ready for all these good times to end.

But before I get to all of that — much further below — let me stress that this is not a post about the quantum computing apocalypse, nor is it about the success of cryptography in the 21st century. Instead I’m going to talk about something much more wonky. This post will be about one of the simplest (and coolest!) cryptographic technologies ever developed: hash-based signatures.

Hash-based signature schemes were first invented in the late 1970s by Leslie Lamport, and significantly improved by Ralph Merkle and others. For many years they were largely viewed as an interesting cryptographic backwater, mostly because they produce relatively large signatures (among other complications). However in recent years these constructions have enjoyed something of a renaissance, largely because — unlike signatures based on RSA or the discrete logarithm assumption — they’re largely viewed as resistant to serious quantum attacks like Shor’s algorithm.

First some background.

### Background: Hash functions and signature schemes

In order to understand hash-based signatures, it’s important that you have some familiarity with cryptographic hash functions. These functions take some input string (typically or an arbitrary length) and produce a fixed-size “digest” as output. Common cryptographic hash functions like SHA2, SHA3 or Blake2 produce digests ranging from 256 bits to 512 bits.

In order for a function $H(\cdot)$ to be considered a ‘cryptographic’ hash, it must achieve some specific security requirements. There are a number of these, but here we’ll just focus on three common ones:

1. Pre-image resistance (sometimes known as “one-wayness”): given some output $Y = H(X)$, it should be time-consuming to find an input $X$ such that $H(X) = Y$. (There are many caveats to this, of course, but ideally the best such attack should require a time comparable to a brute-force search of whatever distribution $X$ is drawn from.)

2. Second-preimage resistance: This is subtly different than pre-image resistance. Given some input $X$, it should be hard for an attacker to find a different input $X'$ such that $H(X) = H(X')$.

3. Collision resistance: It should be hard to find any two values $X_1, X_2$ such that $H(X_1) = H(X_2)$. Note that this is a much stronger assumption than second-preimage resistance, since the attacker has complete freedom to find any two messages of its choice.

The example hash functions I mentioned above are believed to provide all of these properties. That is, nobody has articulated a meaningful (or even conceptual) attack that breaks any of them. That could always change, of course, in which case we’d almost certainly stop using them. (We’ll discuss the special case of quantum attacks a bit further below.)

Since our goal is to use hash functions to construct signature schemes, it’s also helpful to briefly review that primitive.

A digital signature scheme is a public key primitive in which a user (or “signer”) generates a pair of keys, called the public key and private key. The user retains the private key, and can use this to “sign” arbitrary messages — producing a resulting digital signature. Anyone who has possession of the public key can verify the correctness of a message and its associated signature.

From a security perspective, the main property we want from a signature scheme is unforgeability, or “existential unforgeability“. This requirement means that an attacker (someone who does not possess the private key) should not be able to forge a valid signature on a message that you did not sign. For more on the formal definitions of signature security, see this page.

### The Lamport One-Time Signature

The first hash-based signature schemes was invented in 1979 by a mathematician named Leslie Lamport. Lamport observed that given only simple hash function — or really, a one-way function — it was possible to build an extremely powerful signature scheme.

Powerful that is, provided that you only need to sign one message! More on this below.

For the purposes of this discussion, let’s suppose we have the following ingredient: a hash function that takes in, say, 256-bit inputs and produces 256-bit outputs. SHA256 would be a perfect example of such a function. We’ll also need some way to generate random bits.

Let’s imagine that our goal is to sign 256 bit messages. To generate our secret key, the first thing we need to do is generate a series of  512 separate random bitstrings, each of 256 bits in length. For convenience, we’ll arrange those strings into two separate lists and refer to each one by an index as follows:

${\bf sk_0} = sk^{0}_1, sk^{0}_2, \dots, sk^{0}_{256}$
${\bf sk_1} = sk^{1}_1, sk^{1}_2, \dots, sk^{1}_{256}$

The lists $({\bf sk_0}, {\bf sk_1})$ represent the secret key that we’ll use for signing. To generate the public key, we now simply hash every one of those random strings using our function $H(\cdot)$. This produces a second pair of lists:

${\bf pk_0} = H(sk^{0}_1), H(sk^{0}_2), \dots, H(sk^{0}_{256})$
${\bf pk_1} = H(sk^{1}_1), H(sk^{1}_2), \dots, H(sk^{1}_{256})$

We can now hand out our public key $({\bf pk_0}, {\bf pk_1})$ to the entire world. For example, we can send it to our friends, embed it into a certificate, or post it on Keybase.

Now let’s say we want to sign a 256-bit message $M$ using our secret key. The very first thing we do is break up and represent $M$ as a sequence of 256 individual bits:

$M_1, \dots, M_{256} \in \{0,1\}$

The rest of the signing algorithm is blindingly simple. We simply work through the message from the first bit to the last bit, and select a string from one of the two secret key list. The list we choose from depends on value of the message bit we’re trying to sign.

Concretely, for i=1 to 256: if the $i^{th}$ message bit $M_i =0$, we grab the $i^{th}$ secret key string $(sk^{0}_i)$ from the ${\bf sk_0}$ list, and output that string as part of our signature. If the message bit $M_i = 1$ we copy the appropriate string $(sk^{1}_i)$ from the ${\bf sk_1}$ list. Having done this for each of the message bits, we concatenate all of the strings we selected. This forms our signature.

Here’s a toy illustration of the process, where (for simplicity) the secret key and message are only eight bits long. Notice that each colored box below represents a different 256-bit random string:

When a user — who already has the public key $({\bf pk_0}, {\bf pk_1})$ — receives a message $M$ and a signature, she can verify the signature easily. Let $s_i$ represent the $i^{th}$ component of the signature: for each such string. She simply checks the corresponding message bit $M_i$ and computes hash $H(s_i)$. If $M_i = 0$ the result should match the corresponding element from ${\bf pk_0}$. If $M_i = 1$ the result should match the element in ${\bf pk_1}$.

The signature is valid if every single element of the signature, when hashed, matches the correct portion of the public key. Here’s an (admittedly) sketchy illustration of the verification process, for at least one signature component:

If your initial impression of Lamport’s scheme is that it’s kind of insane, you’re both a bit right and a bit wrong.

Let’s start with the negative. First, it’s easy to see that Lamport signatures and keys are be quite large: on the order of thousands of bits. Moreover — and much more critically — there is a serious security limitation on this scheme: each key can only be used to sign one message. This makes Lamport’s scheme an example of what’s called a “one time signature”.

To understand why this restriction exists, recall that every Lamport signature reveals exactly one of the two possible secret key values at each position. If I only sign one message, the signature scheme works well. However, if I ever sign two messages that differ at any bit position $i$, then I’m going to end up handing out both secret key values for that position. This can be a problem.

Imagine that an attacker sees two valid signatures on different messages. She may be able to perform a simple “mix and match” forgery attack that allows her to sign a third message that I never actually signed. Here’s how that might look in our toy example:

The degree to which this hurts you really depends on how different the messages are  and how many of them you’ve given the attacker to play with. But it’s rarely good news.

So to sum up our observations about the Lamport signature scheme. It’s simple. It’s fast. And yet for various practical reasons it kind of sucks. Maybe we can do a little better.

From one-time to many-time signatures: Merkle’s tree-based signature

While the Lamport scheme is a good start, our inability to sign many messages with a single key is a huge drawback. Nobody was more inspired by this than Martin Hellman’s student Ralph Merkle. He quickly came up with a clever way to address this problem.

While we can’t exactly retrace Merkle’s steps, let’s see if we can recover some of the obvious ideas.

Let’s say our goal is to use Lamport’s signature to sign many messages — say $N$ of them. The most obvious approach is to simply generate $N$ different keypairs for the original Lamport scheme, then concatenate all the public keys together into one mega-key.

(Mega-key is a technical term I just invented).

If the signer holds on to all $N$ secret key components, she can now sign $N$ different messages by using exactly one secret Lamport key per message. This seems to solve the problem without ever requiring her to re-use a secret key. The verifier has all the public keys, and can verify all the received messages. No Lamport keys are ever used to sign twice.

Obviously this approach sucks big time.

Specifically, in this naive approach, signing $N$ times requires the signer to distribute a public key that is $N$ times as large as a normal Lamport public key. (She’ll also need to hang on to a similar pile of secret keys.) At some point people will get fed up with this, and probably $N$ won’t every get to be very large. Enter Merkle.

What Merkle proposed was a way to retain the ability to sign $N$ different messages, but without the linear-cost blowup of public keys. Merkle’s idea worked like this:

1. First, generate $N$ separate Lamport keypairs. We can call those $(PK_1, SK_1), \dots, (PK_N, SK_N)$.
2. Next, place each public key at one leaf of a Merkle hash tree (see below), and compute the root of the tree. This root will become the “master” public key of the new Merkle signature scheme.
3. The signer retains all of the Lamport public and secret keys for use in signing.

Merkle trees are described here. Roughly speaking, what they provide is a way to collect many different values such that they can be represented by a single “root” hash (of length 256 bits, using the hash function in our example). Given this hash, it’s possible to produce a simple “proof” that an element is in a given hash tree. Moreover, this proof has size that is logarithmic in the number of leaves in the tree.

To sign the $i^{th}$ message, the signer simply selects the $i^{th}$ public key from the tree, and signs the message using the corresponding Lamport secret key. Next, she concatenates the resulting signature to the Lamport public key and tacks on a “Merkle proof” that shows that this specific Lamport public key is contained within the tree identified by the root (i.e., the public key of the entire scheme). She then transmits this whole collection as the signature of the message.

(To verify a signature of this form, the verifier simply unpacks this “signature” as a Lamport signature, Lamport public key, and Merkle Proof. She verifies the Lamport signature against the given Lamport public key, and uses the Merkle Proof to verify that the Lamport public key is really in the tree. With these three objectives achieved, she can trust the signature as valid.)

This approach has the disadvantage of increasing the “signature” size by more than a factor of two. However, the master public key for the scheme is now just a single hash value, which makes this approach scale much more cleanly than the naive solution above.

As a final optimization, the secret key data can itself be “compressed” by generating all of the various secret keys using the output of a cryptographic pseudorandom number generator, which allows for the generation of a huge number of (apparently random) bits from a single short ‘seed’.

Whew.

### Making signatures and keys (a little bit) more efficient

Merkle’s approach allows any one-time signature to be converted into an $N$-time signature. However, his construction still requires us to use some underlying one-time signature like Lamport’s scheme. Unfortunately the (bandwidth) costs of Lamport’s scheme are still relatively high.

There are two major optimizations that can help to bring down these costs. The first was also proposed by Merkle. We’ll cover this simple technique first, mainly because it helps to explain the more powerful approach.

If you recall Lamport’s scheme, in order sign a 256-bit message we required a vector consisting of 512 separate secret key (and public key) bitstrings. The signature itself was a collection of 256 of the secret bitstrings. (These numbers were motivated by the fact that each bit of the message to be signed could be either a “0” or a “1”, and thus the appropriate secret key element would need to be drawn from one of two different secret key lists.)

But here’s a thought: what if we don’t sign all of the message bits?

Let’s be a bit more clear. In Lamport’s scheme we sign every bit of the message — regardless of its value — by outputting one secret string. What if, instead of signing both zero values and one values in the message, we signed only the message bits were equal to one? This would cut the public and secret key sizes in half, since we could get rid of the ${\bf sk_0}$ list entirely.

We would now have only a single list of bitstrings $sk_1, \dots, sk_{256}$ in our secret key. For each bit position of the message where $M_i = 1$ we would output a string $sk_i$. For every position where $M_i = 0$ we would output… zilch. (This would also tend to reduce the size of signatures, since many messages contain a bunch of zero bits, and those would now ‘cost’ us nothing!)

An obvious problem with this approach is that it’s horrendously insecure. Please do not implement this scheme!

As an example, let’s say an attacker observes a (signed) message that begins with “1111…”, and she want to edit the message so it reads “0000…” — without breaking the signature. All she has to do to accomplish this is to delete several components of the signature! In short, while it’s very difficult to “flip” a zero bit into a one bit, it’s catastrophically easy to do the reverse.

But it turns out there’s a fix, and it’s quite elegant.

You see, while we can’t prevent an attacker from editing our message by turning one bits into zero bits, we can catch them. To do this, we tack on a simple “checksum” to the message, then sign the combination of the original message and the checksum. The signature verifier must verify the entire signature over both values, and also ensure that the received checksum is correct.

The checksum we use is trivial: it consists of a simple binary integer that represents the total number of zero bits in the original message.

If the attacker tries to modify the content of the message (excluding the checksum) in order to turn some one bit into a zero bit, the signature scheme won’t stop her. But this attack have the effect of increasing the number of zero bits in the message. This will immediately make the checksum invalid, and the verifier will reject the signature.

Of course, a clever attacker might also try to mess with the checksum (which is also signed along with the message) in order to “fix it up” by increasing the integer value of the checksum. However — and this is critical — since the checksum is a binary integer, in order to increase the value of the checksum, she would always need to turn some zero bit of the checksum into a one bit. But since the checksum is also signed, and the signature scheme prevents this kind of change, the attacker has nowhere to go.

(If you’re keeping track at home, this does somewhat increase the size of the ‘message’ to be signed. In our 256-bit message example, the checksum will require an additional eight bits and corresponding signature cost. However, if the message has many zero bits, the reduced signature size will typically still be a win.)

### Winternitz: Trading space for time

The trick above reduces the public key size by half, and reduces the size of (some) signatures by a similar amount. That’s nice, but not really revolutionary. It still gives keys and signatures that are thousands of bits long.

It would be nice if we could make a bigger dent in those numbers.

The final optimization we’ll talk about was proposed by Robert Winternitz as a further optimization of Merkle’s technique above. In practical use it gives a 4-8x reduction in the size of signatures and public keys — at a cost of increasing both signing and verification time.

Winternitz’s idea is an example of a technique called a “time-space tradeoff“. This term refers to a class of solutions in which space is reduced at the cost of adding more computation time (or vice versa). To explain Winternitz’s approach, it helps to ask the following question:

What if, instead of signing messages composed of bits (0 or 1), we treated our messages as though they were encoded using larger symbol alphabets? For example, what if we signed four-bit ‘nibbles’? Or eight-bit bytes?

In Lamport’s original scheme, we had two lists of bitstrings as part of the signing (and public) key. One was for signing zero message bits, and the other was for one bits.

Now let’s say we want to sign bytes rather than bits. An obvious idea would be to increase the number of secret key lists (and public key) from two such list to 256 such lists — one list for each possible value of a message byte. The signer could work through the message one byte at a time, and pick from the much larger menu of key values.

Unfortunately, this solution really stinks. It reduces the size of the signature by a factor of eight, at a cost of increasing the public and secret key size by a factor of 256. Even this might be fine if the public keys could be used for many signatures, but they can’t — when it comes to key re-use, this “byte signing version of Lamport” suffers from the same limitations as the original Lamport signature.

All of which brings us to Winternitz’s idea.

Since it’s too expensive to store and distribute 256 truly random lists, what if we generated those lists programatically only when we needed them?

Winternitz’s idea was to generate single list of random seeds ${\bf sk_0} = (sk^0_1, \dots, sk^0_{256})$ for our initial secret key. Rather than generating additional lists randomly, he proposed to use the hash function $H()$ on each element of that initial secret key, in order to derive the next such list for the secret key: ${\bf sk_1} = (sk^{1}_1, \dots, sk^{1}_{256}) = (H(sk^{0}_1), \dots, H(sk^{0}_{256}))$. And similarly, one can use the hash function again on that list to get the next list ${\bf sk_2}$. And so on for many possible lists.

This is helpful in that we now only need to store a single list of secret key values ${\bf sk_0}$, and we can derive all the others lists on-demand just by applying the hash function.

But what about the public key? This is where Winternitz gets clever.

Specifically, Winternitz proposed that the public key could be derived by applying the hash function one more time to the final secret key list. This would produce a single public key list ${\bf pk}$. (In practice we only need 255 secret key lists, since we can treat the final secret key list as the public key.) The elegance of this approach is that given any one of the possible secret key values it’s always possible to check it against the public key, simply by hashing forward multiple times and seeing if we reach a public key element.

The whole process of key generation is illustrated below:

To sign the first byte of a message, we would pick a value from the appropriate list. For example, if the message byte was “0”, we would output a value from ${\bf sk_0}$ in our signature. If the message byte was “20”, we would output a value from ${\bf sk_{20}}$. For bytes with the maximal value “255” we don’t have a secret key list. That’s ok: in this case we can output an empty string, or we can output the appropriate element of ${\bf pk}$.

Note as well that in practice we don’t really need to store each of these secret key lists. We can derive any secret key value on demand given only the original list ${\bf sk}_0$. The verifier only holds the public key vector and (as mentioned above) simply hashes forward an appropriate number of times — depending on the message byte — to see whether the result is equal to the appropriate component of the public key.

Like the Merkle optimization discussion in the previous section, the scheme as presented so far has a glaring vulnerability. Since the secret keys are related (i.e., $sk^{1}_{1} = H(sk^{0}_1)$), anyone who sees a message that signs the message “0” can easily change the corresponding byte of the message to a “1”, and update the signature to match. In fact, an attacker can increment the value of any byte(s) in the message. Without some check on this capability, this would allow very powerful forgery attacks.

The solution to this problem is similar to the we discussed just above. To prevent an attacker from modifying the signature, the signer calculates and also signs a checksum of the original message bytes. The structure of this checksum is designed to prevent the attacker from incrementing any of the bytes, without invalidating the checksum. I won’t go into the gory details right now, but you can find them here.

It goes without saying that getting this checksum right is critical. Screw it up, even a little bit, and some very bad things can happen to you. This would be particularly unpleasant if you deployed these signatures in a production system.

Illustrated in one terrible picture, a 4-byte toy example of the Winternitz scheme looks like this:

### What are hash-based signatures good for?

Throughout this entire discussion, we’ve mainly been talking about the how of hash-based signatures rather than the why of them. It’s time we addressed this. What’s the point of these strange constructions?

One early argument in favor of hash-based signatures is that they’re remarkably fast and simple. Since they only require only the evaluation of a hash function and some data copying, from a purely computational cost perspective they’re highly competitive with schemes like ECDSA and RSA. This could hypothetically be important for lightweight devices. Of course, this efficiency comes at a huge tradeoff in bandwidth efficiency.

However, there is more complicated reason for the (recent) uptick in attention to hash-based signature constructions. This stems from the fact that all of our public-key crypto is about to be broken.

More concretely: the imminent arrival of quantum computers is going to have a huge impact on the security of nearly all of our practical signature schemes, ranging from RSA to ECDSA and so on. This is due to the fact that Shor’s algorithm (and its many variants) provides us with a polynomial-time algorithm for solving the discrete logarithm and factoring problems, which is likely to render most of these schemes insecure.

Most implementations of hash-based signatures are not vulnerable to Shor’s algorithm. That doesn’t mean they’re completely immune to quantum computers, of course. The best general quantum attacks on hash functions are based on a search technique called Grover’s algorithm, which reduces the effective security of a hash function. However, the reduction in effective security is nowhere near as severe as Shor’s algorithm (it ranges between the square root and cube root), and so security can be retained by simply increasing the internal capacity and output size of the hash function. Hash functions like SHA3 were explicitly developed with large digest sizes to provide resilience against such attacks.

So at least in theory, hash-based signatures are interesting because they provide us with a line of defense against future quantum computers — for the moment, anyway.

Note that so far I’ve only discussed some of the “classical” hash-based signature schemes. All of the schemes I described above were developed in the 1970s or early 1980s. This hardly brings us up to present day.

After I wrote the initial draft of this article, a few people asked for pointers on more recent developments in the field. I can’t possibly give an exhaustive list here, but let me describe just a couple of the more recent ideas that others brought up (thanks to Zooko and Claudio Orlandi):

Signatures without state. A limitation of all the signature schemes above is that they require the signer to keep state between signatures. In the case of one-time signatures the reasoning is obvious: you have to avoid using any key more than once. But even in the multi-time Merkle signature, you have to remember which leaf public key you’re using, so you can avoid using any leaf twice. Even worse, the Merkle scheme requires the signer to construct all the keypairs up front, so the total number of signatures is bounded.

In the 1980s, Oded Goldreich pointed out that one can build signatures without these limitations. The idea is as follows: rather than generate all signatures up front, one can generate a short “certification tree” of one-time public keys. Each of these keys can be used to sign additional one-time public keys at a lower layer of the tree, and so on and so forth. Provided all of the private keys are generated deterministically using a single seed, this means that the full tree need not exist in full at key generation time, but can be built on-demand whenever a new key is generated. Each signature contains a “certificate chain” of signatures and public keys starting from the root and going down to a real signing keypair at the bottom of the tree.

This technique allows for the construction of extremely “deep” trees with a vast (exponential) number of possible signing keys. This allows us to construct so many one-time public keys that if we pick a signing key randomly (or pseudorandomly), then with high probability the same signing will never be used twice. This is intuition, of course. For a highly optimized and specific instantiation of this idea, see the SPHINCS proposal by Bernstein et alThe concrete SPHINCS-256 instantiation gives signatures that are approximately 41KB in size.

Picnic: post-quantum zero-knowledge based signatures. In a completely different direction lies Picnic. Picnic is based on a new non-interactive zero-knowledge proof system called ZKBoo. ZKBoo is a new ZK proof system that works on the basis of a technique called “MPC in the head”, where the prover uses multi-party computation to calculate a function with the prover himself. This is too complicated to explain in a lot of detail, but the end result is that one can then prove complicated statements using only hash functions.

The long and short of it is that Picnic and similar ZK proof systems provide a second direction for building signatures out of hash functions. The cost of these signatures is still quite large — hundreds of kilobytes. But future improvements in the technique could substantially reduce this size.

### Epilogue: the boring security details

If you recall a bit earlier in this article, I spent some time describing the security properties of hash functions. This wasn’t just for show. You see, the security of a hash-based signature depends strongly on which properties a hash function is able to provide.

(And by implication, the insecurity of a hash-based signature depends on which properties of a hash function an attacker has managed to defeat.)

Most original papers discussing hash-based signatures generally hang their security arguments on the preimage-resistance of the hash function. Intuitively, this seems pretty straightforward. Let’s take the Lamport signature as an example. Given a public key element $pk^{0}_1 = H(sk^{0}_1)$, an attacker who is able to compute hash preimages can easily recover a valid secret key for that component of the signature. This attack obviously renders the scheme insecure.

However, this argument considers only the case where an attacker has the public key but has not yet seen a valid signature. In this case the attacker has a bit more information. They now have (for example) both the public key and a portion of the secret key: $pk^{0}_1 = H(sk^{0}_1)$ and $sk^{0}_1$. If such an attacker can find a second pre-image for the public key $pk^{0}_1$ she can’t sign a different message. But she has produced a new signature. In the strong definition of signature security (SUF-CMA) this is actually considered a valid attack. So SUF-CMA requires the slightly stronger property of second-preimage resistance.

Of course, there’s a final issue that crops up in most practical uses of hash-based signature schemes. You’ll notice that the description above assumes that we’re signing 256-bit messages. The problem with this is that in real applications, many messages are longer than 256 bits. As a consequence, most people use the hash function $H()$ to first hash the message as $D = H(M)$ and then the sign the resulting value $D$ instead of the message.

This leads to a final attack on the resulting signature scheme, since the existential unforgeability of the scheme now depends on the collision-resistance of the hash function. An attacker who can find two different messages $M_1 \ne M_2$ such that $H(M_1) = H(M_2)$ has now found a valid signature on two different messages. This leads to a trivial break of EUF-CMA security.

# A few notes on Medsec and St. Jude Medical

In Fall 2016 I was invited to come to Miami as part of a team that independently validated some alleged flaws in implantable cardiac devices manufactured by St. Jude Medical (now part of Abbott Labs). These flaws were discovered by a company called MedSec. The story got a lot of traction in the press at the time, primarily due to the fact that a hedge fund called Muddy Waters took a large short position on SJM stock as a result of these findings. SJM subsequently sued both parties for defamation. The FDA later issued a recall for many of the devices.

Due in part to the legal dispute (still ongoing!), I never had the opportunity to write about what happened down in Miami, and I thought that was a shame: because it’s really interesting. So I’m belatedly putting up this post, which talks a bit MedSec’s findings, and implantable device security in general.

By the way: “we” in this case refers to a team of subject matter experts hired by Bishop Fox, and retained by legal counsel for Muddy Waters investments. I won’t name the other team members here because some might not want to be troubled by this now, but they did most of the work — and their names can be found in this public expert report (as can all the technical findings in this post.)

Quick disclaimers: this post is my own, and any mistakes or inaccuracies in it are mine and mine alone. I’m not a doctor so holy cow this isn’t medical advice. Many of the flaws in this post have since been patched by SJM/Abbot. I was paid for my time and travel by Bishop Fox for a few days in 2016, but I haven’t worked for them since. I didn’t ask anyone for permission to post this, because it’s all public information.

### A quick primer on implantable cardiac devices

Implantable cardiac devices are tiny computers that can be surgically installed inside a patient’s body. Each device contains a battery and a set of electrical leads that can be surgically attached to the patient’s heart muscle.

When people think about these devices, they’re probably most familiar with the cardiac pacemaker. Pacemakers issue small electrical shocks to ensure that the heart beats at an appropriate rate. However, the pacemaker is actually one of the least powerful implantable devices. A much more powerful type of device is the Implantable Cardioverter-Defibrillator (ICD). These devices are implanted in patients who have a serious risk of spontaneously entering a dangerous state in which their heart ceases to pump blood effectively. The ICD continuously monitors the patient’s heart rhythm to identify when the patient’s heart has entered this condition, and applies a series of increasingly powerful shocks to the heart muscle to restore effective heart function. Unlike pacemakers, ICDs can issue shocks of several hundred volts or more, and can both stop and restart a patient’s normal heart rhythm.

Like most computers, implantable devices can communicate with other computers. To avoid the need for external data ports – which would mean a break in the patient’s skin – these devices communicate via either a long-range radio frequency (“RF”) or a near-field inductive coupling (“EM”) communication channel, or both. Healthcare providers use a specialized hospital device called a Programmer to update therapeutic settings on the device (e.g., program the device, turn therapy off). Using the Programmer, providers can manually issue commands that cause an ICD to shock the patient’s heart. One command, called a “T-Wave shock” (or “Shock-on-T”) can be used by healthcare providers to deliberately induce ventrical fibrillation. This capability is used after a device is implanted, in order to test the device and verify it’s functioning properly.

Because the Programmer is a powerful tool – one that could cause harm if misused – it’s generally deployed in a physician office or hospital setting. Moreover, device manufacturers may employ special precautions to prevent spurious commands from being accepted by an implantable device. For example:

1. Some devices require that all Programmer commands be received over a short-range communication channel, such as the inductive (EM) channel. This limits the communication range to several centimeters.
2. Other devices require that a short-range inductive (EM) wand must be used to initiate a session between the Programmer and a particular implantable device. The device will only accept long-range RF commands sent by the Programmer after this interaction, and then only for a limited period of time.

From a computer security perspective, both of these approaches have a common feature: using either approach requires some form of close-proximity physical interaction with the patient before the implantable device will accept (potentially harmful) commands via the long-range RF channel. Even if a malicious party steals a Programmer from a hospital, she may still need to physically approach the patient – at a distance limited to perhaps centimeters – before she can use the Programmer to issue commands that might harm the patient.

In addition to the Programmer, most implantable manufacturers also produce some form of “telemedicine” device. These devices aren’t intended to deliver commands like cardiac shocks. Instead, they exist to provide remote patient monitoring from the patient’s home. Telematics devices use RF or inductive (EM) communications to interrogate the implantable device in order to obtain episode history, usually at night when the patient is asleep. The resulting data is uploaded to a server (via telephone or cellular modem) where it can be accessed by healthcare providers.

### What can go wrong?

Before we get into specific vulnerabilities in implantable devices, it’s worth asking a very basic question. From a security perspective, what should we even be worried about?

There are a number of answers to this question. For example, an attacker might abuse implantable device systems or infrastructure to recover confidential patient data (known as PHI). Obviously this would be bad, and manufacturers should design against it. But the loss of patient information is, quite frankly, kind of the least of your worries.

A much scarier possibility is that an attacker might attempt to harm patients. This could be as simple as turning off therapy, leaving the patient to deal with their underlying condition. On the much scarier end of the spectrum, an ICD attacker could find a way to deliberately issue dangerous shocks that could stop a patient’s heart from functioning properly.

Now let me be clear: this isn’t not what you’d call a high probability attack. Most people aren’t going to be targeted by sophisticated technical assassins. The concerning thing about this  the impact of such an attack is significantly terrifying that we should probably be concerned about it. Indeed, some high-profile individuals have already taken precautions against it.

The real nightmare scenario is a mass attack in which a single resourceful attacker targets thousands of individuals simultaneously — perhaps by compromising a manufacturer’s back-end infrastructure — and threatens to harm them all at the same time. While this might seem unlikely, we’ve already seen attackers systematically target hospitals with ransomware. So this isn’t entirely without precedent.

### Securing device interaction physically

The real challenge in securing an implantable device is that too much security could hurt you. As tempting as it might be to lard these devices up with security features like passwords and digital certificates, doctors need to be able to access them. Sometimes in a hurry.

This is a big deal. If you’re in a remote emergency room or hospital, the last thing you want is some complex security protocol making it hard to disable your device or issue a required shock. This means we can forget about complex PKI and revocation lists. Nobody is going to have time to remember a password. Even merely complicated procedures are out — you can’t afford to have them slow down treatment.

At the same time, these devices obviously must perform some sort of authentication: otherwise anyone with the right kind of RF transmitter could program them — via RF, from a distance. This is exactly what you want to prevent.

Many manufacturers have adopted an approach that cut through this knot. The basic idea is to require physical proximity before someone can issue commands to your device. Specifically, before anyone can issue a shock command (even via a long-range RF channel) they must — at least briefly — make close physical contact with the patient.

This proximity be enforced in a variety of ways. If you remember, I mentioned above that most devices have a short-range inductive coupling (“EM”) communications channel. These short-range channels seem ideal for establishing a “pairing” between a Programmer and an implantable device — via a specialized wand. Once the channel is established, of course, it’s possible to switch over to long-range RF communications.

This isn’t a perfect solution, but it has a lot going for it: someone could still harm you, but they would have to at least get a transmitter within a few inches of your chest before doing so. Moreover, you can potentially disable harmful commands from an entire class of device (like telemedecine monitoring devices) simply by leaving off the wand.

### St. Jude Medical and MedSec

So given this background, what did St. Jude Medical do? All of the details are discussed in a full expert report published by Bishop Fox. In this post we I’ll focus on the most serious of MedSec’s claims, which can be expressed as follows:

Using only the hardware contained within a “Merlin @Home” telematics device, it was possible to disable therapy and issue high-power “shock” commands to an ICD from a distance, and without first physically interacting with the implantable device at close range.

1. The existence of this vulnerability implies that – through a relatively simple process of “rooting” and installing software on a Merlin @Home device – a malicious attacker could create a device capable of issuing harmful shock commands to installed SJM ICD devices at a distance. This is particularly worrying given that Merlin @Home devices are widely deployed in patients’ homes and can be purchased on eBay for prices under \$30. While it might conceivably be possible to physically secure and track the location of all PCS Programmer devices, it seems challenging to physically track the much larger fleet of Merlin @Home devices.
2. More critically, it implies that St. Jude Medical implantable devices do not enforce a close physical interaction (e.g., via an EM wand or other mechanism) prior to accepting commands that have the potential to harm or even kill patients. This may be a deliberate design decision on St. Jude Medical’s part. Alternatively, it could be an oversight. In either case, this design flaw increases the risk to patients by allowing for the possibility that remote attackers might be able to cause patient harm solely via the long-range RF channel.
3. If it is possible – using software modifications only – to issue shock commands from the Merlin @Home device, then patients with an ICD may be vulnerable in the hypothetical event that their Merlin @Home device becomes remotely compromised by an attacker. Such a compromise might be accomplished remotely via a network attack on a single patient’s Merlin @Home device. Alternatively, a compromise might be accomplished at large scale through a compromise of St. Jude Medical’s server infrastructure.

We stress that the final scenario is strictly hypothetical. MedSec did not allege a specific vulnerability that allows for the remote compromise of Merlin @Home devices or SJM infrastructure. However, from the perspective of software and network security design, these attacks are one of the potential implications of a design that permits telematics devices to send such commands to an implantable device. It is important to stress that none of these attacks would be possible if St. Jude Medical’s design prohibited the implantable from accepting therapeutic commands from the Merlin @Home device (e.g., by requiring close physical interaction via the EM wand, or by somehow authenticating the provenance of commands and restricting critical commands to be sent by the Programmer only).

### Validating MedSec’s claim

To validate MedSec’s claim, we examined their methodology from start to finish. This methodology included extracting and decompiling Java-based software from a single PCS Programmer; accessing a Merlin @Home device to obtain a root shell via the JTAG port; and installing a new package of custom software written by MedSec onto a used Merlin @Home device.

We then observed MedSec issue a series of commands to an ICD device using a Merlin @Home device that had been customized (via software) as described above. We used the Programmer to verify that these commands were successfully received by the implantable device, and physically confirmed that MedSec had induced shocks by attaching a multimeter to the leads on the implantable device.

Finally, we reproduced MedSec’s claims by opening the case of a second Merlin @Home device (after verifying that the tape was intact over the screw holes), obtaining a shell by connecting a laptop computer to the JTAG port, and installing MedSec’s software on the device. We were then able to issue commands to the ICD from a distance of several feet. This process took us less than three hours in total, and required only inexpensive tools and a laptop computer.

### What are the technical details of the attack?

Simply reproducing a claim is only part of the validation process. To verify MedSec’s claims we also needed to understand why the attack described above was successful. Specifically, we were interested in identifying the security design issues that make it possible for a Merlin @Home device to successfully issue commands that are not intended to be issued from this type of device. The answer to this question is quite technical, and involves the specific way that SJM implantable devices verify commands before accepting them.

MedSec described to us the operation of SJM’s command protocol as part of their demonstration. They also provided us with Java JAR executable code files taken from the hard drive of the PCS Programmer. These files, which are not obfuscated and can easily be “decompiled” into clear source code, contain the software responsible for implementing the Programmer-to-Device communications protocol.

By examining the SJM Programmer code, we verified that Programmer commands are authenticated through the inclusion of a three-byte (24 bit) “authentication tag” that must be present and correct within each command message received by the implantable device. If this tag is not correct, the device will refuse to accept the command.

From a cryptographic perspective, 24 bits is a surprisingly short value for an important authentication field. However, we note that even this relatively short tag might be sufficient to prevent forgery of command messages – provided the tag ws calculated using a secure cryptographic function (e.g., a Message Authentication Code) using a fresh secret key that cannot be predicted by an the attacker.

Based on MedSec’s demonstration, and on our analysis of the Programmer code, it appears that SJM does not use the above approach to generate authentication tags. Instead, SJM authenticates the Programmer to the implantable with the assistance of a “key table” that is hard-coded within the Java code within the Programmer. At minimum, any party who obtains the (non-obfuscated) Java code from a legitimate SJM Programmer can gain the ability to calculate the correct authentication tags needed to produce viable commands – without any need to use the Programmer itself.

Moreover, MedSec determined – and successfully demonstrated – that there exists a “Universal Key”, i.e., a fixed three-byte authentication tag, that can be used in place of the calculated authentication tag. We identified this value in the Java code provided by MedSec, and verified that it was sufficient to issue shock commands from a Merlin @Home to an implantable device.

While these issues alone are sufficient to defeat the command authentication mechanism used by SJM implantable devices, we also analyzed the specific function that is used by SJM to generate the three-byte authentication tag.  To our surprise, SJM does not appear to use a standard cryptographic function to compute this tag. Instead, they use an unusual and apparently “homebrewed” cryptographic algorithm for the purpose.

Specifically, the PCS Programmer Java code contains a series of hard-coded 32-bit RSA public keys. To issue a command, the implantable device sends a value to the Programmer. This value is then “encrypted” by the Programmer using one of the RSA public keys, and the resulting output is truncated to produce a 24-bit output tag.

The above is not a standard cryptographic protocol, and quite frankly it is difficult to see what St. Jude Medical is trying to accomplish using this technique. From a cryptographic perspective it has several problems:

1. The RSA public keys used by the PCS Programmers are 32 bits long. Normal RSA keys are expected to be a minimum of 1024 bits in length. Some estimates predict that a 1024-bit RSA key can be factored (and thus rendered insecure) in approximately one year using a powerful network of supercomputers. Based on experimentation, we were able to factor the SJM public keys in less than one second on a laptop computer.
2. Even if the RSA keys were of an appropriate length, the SJM protocol does not make use of the corresponding RSA secret keys. Thus the authentication tag is not an RSA signature, nor does it use RSA in any way that we are familiar with.
3. As noted above, since there is no shared session key established between the specific implantable device and the Programmer, the only shared secret available to both parties is contained within the Programmer’s Java code. Thus any party who extracts the Java code from a PCS Programmer will be able to transmit valid commands to any SJM implantable device.

Our best interpretation of this design is that the calculation is intended as a form of “security by obscurity”, based on the assumption that an attacker will not be able to reverse engineer the protocol. Unfortunately, this approach is rarely successful when used in security systems. In this case, the system is fundamentally fragile – due to the fact that code for computing the correct authentication tag is likely available in easily-decompiled Java bytecode on each St. Jude Medical Programmer device. If this code is ever extracted and published, all St. Jude Medical devices become vulnerable to command forgery.

### How to remediate these attacks?

To reiterate, the fundamental security concerns with these St. Jude Medical devices (as of 2016) appeared to be problems of design. These were:

1. SJM implantable devices did not require close physical interaction prior to accepting commands (allegedly) sent by the Programmer.
2. SJM did not incorporate a strong cryptographic authentication mechanism in its RF protocol to verify that commands are truly sent by the Programmer.
3. Even if the previous issue was addressed, St. Jude did not appear to have an infrastructure for securely exchanging shared cryptographic keys between a legitimate Programmer and an implantable device.

There are various ways to remediate these issues. One approach is to require St. Jude implantable devices to exchange a secret key with the Programmer through a close-range interaction involving the Programmer’s EM wand. A second approach would be to use a magnetic sensor to verify the presence of a magnet on the device, prior to accepting Programmer commands. Other solutions are also possible. I haven’t reviewed the solution SJM ultimately adopted in their software patches, and I don’t know how many users patched.

### Conclusion

Implantable devices offer a number of unique security challenges. It’s naturally hard to get these things right. At the same time, it’s important that vendors take these issues seriously, and spend the time to get cryptographic authentication mechanisms right — because once deployed, these devices are very hard to repair, and the cost of a mistake is extremely high.