Attack of the Week: Group Messaging in WhatsApp and Signal

If you’ve read this blog before, you know that secure messaging is one of my favorite whatsapp-icontopics. However, recently I’ve been a bit disappointed. My sadness comes from the fact that lately these systems have been getting too damned good. That is, I was starting to believe that most of the interesting problems had finally been solved.

If nothing else, today’s post helped disabuse me of that notion.

This result comes from a new paper by Rösler, Mainka and Schwenk from Ruhr-Universität Bochum (affectionately known as “RUB”). The RUB paper paper takes a close look at the problem of group messaging, and finds that while messengers may be doing fine with normal (pairwise) messaging, group messaging is still kind of a hack.

If all you want is the TL;DR, here’s the headline finding: due to flaws in both Signal and WhatsApp (which I single out because I use them), it’s theoretically possible for strangers to add themselves to an encrypted group chat. However, the caveat is that these attacks are extremely difficult to pull off in practice, so nobody needs to panic. But both issues are very avoidable, and tend to undermine the logic of having an end-to-end encryption protocol in the first place. (Wired also has a good article.)

First, some background.

How do end-to-end encryption and group chats work?

In recent years we’ve seen plenty of evidence that centralized messaging servers aren’t a very good place to store confidential information. The good news is: we’re not stuck with them. One of the most promising advances in the area of secure communications has been the recent widespread deployment of end-to-end (e2e) encrypted messaging protocols. 

At a high level, e2e messaging protocols are simple: rather than sending plaintext to a server — where it can be stolen or read — the individual endpoints (typically smartphones) encrypt all of the data using keys that the server doesn’t possess. The server has a much more limited role, moving and storing only meaningless ciphertext. With plenty of caveats, this means a corrupt server shouldn’t be able to eavesdrop on the communications.

In pairwise communications (i.e., Alice communicates with only Bob) this encryption is conducted using a mix of public-key and symmetric key algorithms. One of the most popular mechanisms is the Signal protocol, which is used by Signal and WhatsApp (notable for having 1.3 billion users!) I won’t discuss the details of the Signal protocol here, except to say that it’s complicated, but it works pretty well.

A fly in the ointment is that the standard Signal protocol doesn’t work quite as well for group messaging, primarily because it’s not optimized for broadcasting messages to many users.

To handle that popular case, both WhatsApp and Signal use a small hack. It works like this: each group member generates a single “group key” that this member will use to encrypt all of her messages to everyone else in the group. When a new member joins, everyone who is already in the group needs to send a copy of their group key to the new member (using the normal Signal pairwise encryption protocol). This greatly simplifies the operation of group chats, while ensuring that they’re still end-to-end encrypted.

How do members know when to add a new user to their chat?

Here is where things get problematic.

From a UX perspective, the idea is that only one person actually initiates the adding of a new group member. This person is called the “administrator”. This administrator is the only human being who should actually do anything — yet, her one click must cause some automated action on the part of every other group members’ devices. That is, in response to the administrator’s trigger, all devices in the group chat must send their keys to this new group member.

IMG_1291
Notification messages in WhatsApp.

(In Signal, every group member is an administrator. In WhatsApp it’s just a subset of the members.)

The trigger is implemented using a special kind of message called (unimaginatively) a “group management message”. When I, as an administrator, add Tom to a group, my phone sends a group management message to all the existing group members. This instructs them to send their keys to Tom — and to notify the members visually so that they know Tom is now part of the group. Obviously this should only happen if I really did add Tom, and not if some outsider (like that sneaky bastard Tom himself!) tries to add Tom.

And this is where things get problematic.

Ok, what’s the problem?

According to the RUB paper, both Signal and WhatsApp fail to properly authenticate group management messages.

The upshot is that, at least in theory, this makes it possible for an unauthorized person — not a group administrator, possibly not even a member of the group — to add someone to your group chat.

The issues here are slightly different between Signal and WhatsApp. To paraphrase Tolstoy, every working implementation is alike, but every broken one is broken in its own way. And WhatsApp’s implementation is somewhat worse than Signal. Here I’ll break them down.

Signal. Signal takes a pragmatic (and reasonable) approach to group management. In Signal, every group member is considered an administrator — which means that any member can add a new member. Thus if I’m a member of a group, I can add a new member by sending a group management message to every other member. These messages are sent encrypted via the normal (pairwise) Signal protocol.

The group management message contains the “group ID” (a long, unpredictable number), along with the identity of the person I’m adding. Because messages are sent using the Signal (pairwise) protocol, they should be implicitly authenticated as coming from me — because authenticity is a property that the pairwise Signal protocol already offers. So far, this all sounds pretty good.

The problem that the RUB researchers discovered through testing, is that while the Signal protocol does authenticate that the group management comes from me, it doesn’t actually check that I am a member of the group — and thus authorized to add the new user!

In short, if this finding is correct, it turns out that any random Signal user in the world can you send a message of the form “Add Mallory to the Group 8374294372934722942947”, and (if you happen to belong to that group) your app will go ahead and try to do it.

The good news is that in Signal the attack is very difficult to execute. The reason is that in order to add someone to your group, I need to know the group ID. Since the group ID is a random 128-bit number (and is never revealed to non-group-members or even the server**) that pretty much blocks the attack. The main exception to this is former group members, who already know the group ID — and can now add themselves back to the group with impunity.

(And for the record, while the group ID may block the attack, it really seems like a lucky break — like falling out of a building and landing on a street awning. There’s no reason the app should process group management messages from random strangers.)

So that’s the good news. The bad news is that WhatsApp is a bit worse.

WhatsApp. WhatsApp uses a slightly different approach for its group chat. Unlike Signal, the WhatsApp server plays a significant role in group management, which means that it determines who is an administrator and thus authorized to send group management messages.

Additionally, group management messages are not end-to-end encrypted or signed. They’re sent to and from the WhatsApp server using transport encryption, but not the actual Signal protocol.

When an administrator wishes to add a member to a group, it sends a message to the server identifying the group and the member to add. The server then checks that the user is authorized to administer that group, and (if so), it sends a message to every member of the group indicating that they should add that user.

The flaw here is obvious: since the group management messages are not signed by the administrator, a malicious WhatsApp server can add any user it wants into the group. This means the privacy of your end-to-end encrypted group chat is only guaranteed if you actually trust the WhatsApp server.

This undermines the entire purpose of end-to-end encryption.

But this is silly. Don’t we trust the WhatsApp server? And what about visual notifications?

One perfectly reasonable response is that exploiting this vulnerability requires a compromise of the WhatsApp server (or legal compulsion, perhaps). This seems fairly unlikely.

And yet, the entire point of end-to-end encryption is to remove the server from the trusted computing base. We haven’t entirely achieved this yet, thanks to things like key servers. But we are making progress. This bug is a step back, and it’s one a sophisticated attacker potentially could exploit.

A second obvious objection to these issues is that adding a new group member results in a visual notification to each group member. However, it’s not entirely clear that these messages are very effective. In general they’re relatively easy to miss. So these are meaningful bugs, and things that should be fixed.

How do you fix this?

The great thing about these bugs is that they’re both eminently fixable.

The RUB paper points out some obvious countermeasures. In Signal, just make sure that the group management messages come from a legitimate member of the group. In WhatsApp, make sure that the group management messages are signed by an administrator.*

Obviously fixes like this are a bit complex to roll out, but none of these should be killers.

Is there anything else in the paper?

Oh yes, there’s quite a bit more. But none of it is quite as dramatic. For one thing, it’s possible for attackers to block message acknowledgements in group chats, which means that different group members could potentially see very different versions of the chat. There are also several cases where forward secrecy can be interrupted. There’s also some nice analysis of Threema, if you’re interested.

I need a lesson. What’s the moral of this story?

The biggest lesson is that protocol specifications are never enough. Both WhatsApp and Signal (to an extent) have detailed protocol specifications that talk quite a bit about the cryptography used in their systems. And yet the issues reported in the RUB paper not obvious from reading these summaries. I certainly didn’t know about them.

In practice, these problems were only found through testing.

mallory5
Mallory.

So the main lesson here is: test, test, test. This is a strong argument in favor of open-source applications and frameworks that can interact with private-garden services like Signal and WhatsApp. It lets us see what the systems are getting right and getting wrong.

The second lesson — and a very old one — is that cryptography is only half the battle. There’s no point in building the most secure encryption protocol in the world if someone can simply instruct your client to send your keys to Mallory. The greatest lesson of all time is that real cryptosystems are always broken this way — and almost never through the fancy cryptographic attacks we love to write about.

Notes:

* The challenge here is that since WhatsApp itself determines who the administrators are, this isn’t quite so simple. But at very least you can ensure that someone in the group was responsible for the addition.

** According to the paper, the Signal group IDs are always sent encrypted between group members and are never revealed to the Signal server. Indeed, group chat messages look exactly like pairwise chats, as far as the server is concerned. This means only current or former group members should know the group ID.

Attack of the Week: Apple iMessage

Today’s Washington Post has a story entitled “Johns Hopkins researchers poke a hole in Apple’s encryption“, which describes the results of some research my students and I have been working on over the past few months.

As you might have guessed from the headline, the work concerns Apple, and specifically Apple’s iMessage text messaging protocol. Over the past months my students Christina Garman, Ian Miers, Gabe Kaptchuk and Mike Rushanan and I have been looking closely at the encryption used by iMessage, in order to determine how the system fares against sophisticated attackers. The results of this analysis include some very neat new attacks that allow us to — under very specific circumstances — decrypt the contents of iMessage attachments, such as photos and videos.

The research team. From left: Gabe Kaptchuk, Mike Rushanan, Ian Miers, Christina Garman

Now before I go further, it’s worth noting that the security of a text messaging protocol may not seem like the most important problem in computer security. And under normal circumstances I might agree with you. But today the circumstances are anything but normal: encryption systems like iMessage are at the center of a critical national debate over the role of technology companies in assisting law enforcement.

A particularly unfortunate aspect of this controversy has been the repeated call for U.S. technology companies to add “backdoors” to end-to-end encryption systems such as iMessage. I’ve always felt that one of the most compelling arguments against this approach — an argument I’ve made along with other colleagues — is that we just don’t know how to construct such backdoors securely. But lately I’ve come to believe that this position doesn’t go far enough — in the sense that it is woefully optimistic. The fact of the matter is that forget backdoors: we barely know how to make encryption work at all. If anything, this work makes me much gloomier about the subject.

But enough with the generalities. The TL;DR of our work is this:

Apple iMessage, as implemented in versions of iOS prior to 9.3 and Mac OS X prior to 10.11.4, contains serious flaws in the encryption mechanism that could allow an attacker — who obtains iMessage ciphertexts — to decrypt the payload of certain attachment messages via a slow but remote and silent attack, provided that one sender or recipient device is online. While capturing encrypted messages is difficult in practice on recent iOS devices, thanks to certificate pinning, it could still be conducted by a nation state attacker or a hacker with access to Apple’s servers. You should probably patch now.

For those who want the gory details, I’ll proceed with the rest of this post using the “fun” question and answer format I save for this sort of post.

What is Apple iMessage and why should I care?

Those of you who read this blog will know that I have a particular obsession with Apple iMessage. This isn’t because I’m weirdly obsessed with Apple — although it is a little bit because of that. Mostly it’s because I think iMessage is an important protocol. The text messaging service, which was introduced in 2011, has the distinction of being the first widely-used end-to-end encrypted text messaging system in the world.

To understand the significance of this, it’s worth giving some background. Before iMessage, the vast majority of text messages were sent via SMS or MMS, meaning that they were handled by your cellular provider. Although these messages are technically encrypted, this encryption exists only on the link between your phone and the nearest cellular tower. Once an SMS reaches the tower, it’s decrypted, then stored and delivered without further protection. This means that your most personal messages are vulnerable to theft by telecom employees or sophisticated hackers. Worse, many U.S. carriers still use laughably weak encryption and protocols that are vulnerable to active interception.

So from a security point of view, iMessage was a pretty big deal. In a single stroke, Apple deployed encrypted messaging to millions of users, ensuring (in principle) that even Apple itself couldn’t decrypt their communications. The even greater accomplishment was that most people didn’t even notice this happened — the encryption was handled so transparently that few users are aware of it. And Apple did this at very large scale: today, iMessage handles peak throughput of more than 200,000 encrypted messages per second, with a supported base of nearly one billion devices.

So iMessage is important. But is it any good?

Answering this question has been kind of a hobby of mine for the past couple of years. In the past I’ve written about Apple’s failure to publish the iMessage protocol, and on iMessage’s dependence on a vulnerable centralized key server. Indeed, the use of a centralized key server is still one of iMessage’s biggest weaknesses, since an attacker who controls the keyserver can use it to inject keys and conduct man in the middle attacks on iMessage users.

But while key servers are a risk, attacks on a key server seem fundamentally challenging to implement — since they require the ability to actively manipulate Apple infrastructure without getting caught. Moreover, such attacks are only useful for prospective surveillance. If you fail to substitute a user’s key before they have an interesting conversation, you can’t recover their communications after the fact.A more interesting question is whether iMessage’s encryption is secure enough to stand up against retrospective decryption attacks — that is, attempts to decrypt messages after they have been sent. Conducting such attacks is much more interesting than the naive attacks on iMessage’s key server, since any such attack would require the existence of a fundamental vulnerability in iMessage’s encryption itself. And in 2016 encryption seems like one of those things that we’ve basically figured out how to get right.

Which means, of course, that we probably haven’t.

How does iMessage encryption work?

What we know about the iMessage encryption protocol comes from a previous reverse-engineering effort by a group from Quarkslab, as well as from Apple’s iOS Security Guide. Based on these sources, we arrive at the following (simplified) picture of the basic iMessage encryption scheme:

To encrypt an iMessage, your phone first obtains the RSA public key of the person you’re sending to. It then generates a random AES key k and encrypts the message with that key using CTR mode. Then it encrypts k using the recipient’s RSA key. Finally, it signs the whole mess using the sender’s ECDSA signing key. This prevents tampering along the way.

So what’s missing here?

Well, the most obviously missing element is that iMessage does not use a Message Authentication Code (MAC) or authenticated encryption scheme to prevent tampering with the message. To simulate this functionality, iMessage simply uses an ECDSA signature formulated by the sender. Naively, this would appear to be good enough. Critically, it’s not.

The attack works as follows. Imagine that a clever attacker intercepts the message above and is able to register her own iMessage account. First, the attacker strips off the original ECDSA signature made by the legitimate sender, and replaces it with a signature of her own. Next, she sends the newly signed message to the original recipient using her own account:

The outcome is that the user receives and decrypts a copy of the message, which has now apparently originated from the attacker rather than from the original sender. Ordinarily this would be a pretty mild attack — but there’s a useful wrinkle. In replacing the sender’s signature with one of her own, the attacker has gained a powerful capability. Now she can tamper with the AES ciphertext (red) at will.

Specifically, since in iMessage the AES ciphertext is not protected by a MAC, it is therefore malleable. As long as the attacker signs the resulting message with her key, she can flip any bits in the AES ciphertext she wants — and this will produce a corresponding set of changes when the recipient ultimately decrypts the message. This means that, for example, if the attacker guesses that the message contains the word “cat” at some position, she can flip bits in the ciphertext to change that part of the message to read “dog” — and she can make this change even though she can’t actually read the encrypted message.

Only one more big step to go.

Now further imagine that the recipient’s phone will decrypt the message correctly provided that the underlying plaintext that appears following decryption is correctly formatted. If the plaintext is improperly formatted — for a silly example, our tampering made it say “*7!” instead of “pig” — then on receiving the message, the recipient’s phone might return an error that the attacker can see.

It’s well known that such a configuration capability allows our attacker the ability to learn information about the original message, provided that she can send many “mauled” variants to be decrypted. By mauling the underlying message in specific ways — e.g., attempting to turn “dog” into “pig” and observing whether decryption succeeds — the attacker can gradually learn the contents of the original message. The technique is known as a format oracle, and it’s similar to the padding oracle attack discovered by Vaudenay.

So how exactly does this format oracle work?

The format oracle in iMessage is not a padding oracle. Instead it has to do with the compression that iMessage uses on every message it sends.

You see, prior to encrypting each message payload, iMessage applies a complex formatting that happens to conclude with gzip compression. Gzip is a modestly complex compression scheme that internally identifies repeated strings, applies Huffman coding, then tacks a CRC checksum computed over the original data at the end of the compressed message. It’s this gzip-compressed payload that’s encrypted within the AES portion of an iMessage ciphertext.

It turns out that given the ability to maul a gzip-compressed, encrypted ciphertext, there exists a fairly complicated attack that allows us to gradually recover the contents of the message by mauling the original message thousands of times and sending the modified versions to be decrypted by the target device. The attack turns on our ability to maul the compressed data by flipping bits, then “fix up” the CRC checksum correspondingly so that it reflects the change we hope to see in the uncompressed data. Depending on whether that test succeeds, we can gradually recover the contents of a message — one byte at a time.

While I’m making this sound sort of simple, the truth is it’s not. The message is encoded using Huffman coding, with a dynamic Huffman table we can’t see — since it’s encrypted. This means we need to make laser-specific changes to the ciphertext such that we can predict the effect of those changes on the decrypted message, and we need to do this blind. Worse, iMessage has various countermeasures that make the attack more complex.The complete details of the attack appear in the paper, and they’re pretty eye-glazing, so I won’t repeat them here. In a nutshell, we are able to decrypt a message under the following conditions:

  1. We can obtain a copy of the encrypted message
  2. We can send approximately 2^18 (invisible) encrypted messages to the target device
  3. We can determine whether or not those messages decrypted successfully or not
The first condition can be satisfied by obtaining ciphertexts from a compromise of Apple’s Push Notification Service servers (which are responsible for routing encrypted iMessages) or by intercepting TLS connections using a stolen certificate — something made more difficult due to the addition of certificate pinning in iOS 9. The third element is the one that initially seems the most challenging. After all, when I send an iMessage to your device, there’s no particular reason that your device should send me any sort of response when the message decrypts. And yet this information is fundamental to conducting the attack!

It turns out that there’s a big exception to this rule: attachment messages.

How do attachment messages differ from normal iMessages?

When I include a photo in an iMessage, I don’t actually send you the photograph through the normal iMessage channel. Instead, I first encrypt that photo using a random 256-bit AES key, then I compute a SHA1 hash and upload the encrypted photo to iCloud. What I send you via iMessage is actually just an iCloud.com URL to the encrypted photo, the SHA1 hash, and the decryption key.

Contents of an “attachment” message.

 

When you successfully receive and decrypt an iMessage from some recipient, your Messages client will automatically reach out and attempt to download that photo. It’s this download attempt, which happens only when the phone successfully decrypts an attachment message, that makes it possible for an attacker to know whether or not the decryption has succeeded.

One approach for the attacker to detect this download attempt is to gain access to and control your local network connections. But this seems impractical. A more sophisticated approach is to actually maul the URL within the ciphertext so that rather than pointing to iCloud.com, it points to a related URL such as i8loud.com. Then the attacker can simply register that domain, place a server there and allow the client to reach out to it. This requires no access to the victim’s local network.

By capturing an attachment message, repeatedly mauling it, and monitoring the download attempts made by the victim device, we can gradually recover all of the digits of the encryption key stored within the attachment. Then we simply reach out to iCloud and download the attachment ourselves. And that’s game over. The attack is currently quite slow — it takes more than 70 hours to run — but mostly because our code is slow and not optimized. We believe with more engineering it could be made to run in a fraction of a day.

Result of decrypting the AES key for an attachment. Note that the ? symbol represents a digit we could not recover for various reasons, typically due to string repetitions. We can brute-force the remaining digits.

The need for an online response is why our attack currently works against attachment messages only: those are simply the messages that make the phone do visible things. However, this does not mean the flaw in iMessage encryption is somehow limited to attachments — it could very likely be used against other iMessages, given an appropriate side-channel.

How is Apple fixing this?

Apple’s fixes are twofold. First, starting in iOS 9.0 (and before our work), Apple began deploying aggressive certificate pinning across iOS applications. This doesn’t fix the attack on iMessage crypto, but it does make it much harder for attackers to recover iMessage ciphertexts to decrypt in the first place.

Unfortunately even if this works perfectly, Apple still has access to iMessage ciphertexts. Worse, Apple’s servers will retain these messages for up to 30 days if they are not delivered to one of your devices. A vulnerability in Apple Push Network authentication, or a compromise of these servers could read them all out. This means that pinning is only a mitigation, not a true fix.

As of iOS 9.3, Apple has implemented a short-term mitigation that my student Ian Miers proposed. This relies on the fact that while the AES ciphertext is malleable, the RSA-OAEP portion of the ciphertext is not. The fix maintains a “cache” of recently received RSA ciphertexts and rejects any repeated ciphertexts. In practice, this shuts down our attack — provided the cache is large enough. We believe it probably is.

In the long term, Apple should drop iMessage like a hot rock and move to Signal/Axolotl.

So what does it all mean?

As much as I wish I had more to say, fundamentally, security is just plain hard. Over time we get better at this, but for the foreseeable future we’ll never be ahead. The only outcome I can hope for is that people realize how hard this process is — and stop asking technologists to add unacceptable complexity to systems that already have too much of it.

Let’s talk about iMessage (again)

Yesterday’s New York Times carried a story entitled “Apple and other tech companies tangle with U.S. over data access“. It’s a vague headline that manages to obscure the real thrust of the story, which is that according to reporters at the Times, Apple has not been forced to backdoor their popular encrypted iMessage system. This flies in the face of some rumors to the contrary.

While there’s not much new information in here, people on Twitter seem to have some renewed interest in how iMessage works; whether Apple could backdoor it if they wanted to; and whether the courts could force them to. The answers to those questions are respectively: “very well“, “absolutely“, and “do I look like a national security lawyer?”

So rather than tackle the last one, which nobody seems to know the answer to, I figure it would be informative to talk about the technical issues with iMessage (again). So here we go.

How does iMessage work?

Fundamentally the mantra of iMessage is “keep it simple, stupid”. It’s not really designed to be an encryption system as much as it is a text message system that happens to include encryption. As such, it’s designed to take away most of the painful bits you expect from modern encryption software, and in the process it makes the crypto essentially invisible to the user. Unfortunately, this simplicity comes at some cost to security.

Let’s start with the good: Apple’s marketing material makes it clear that iMessage encryption is “end-to-end” and that decryption keys never leave the device. This claim is bolstered by their public security documentation as well as outside efforts to reverse-engineer the system. In iMessage, messages are encrypted with a combination of 1280-bit RSA public key encryption and 128-bit AES, and signed with ECDSA under a 256-bit NIST curve. It’s honestly kind of ridiculous, but whatever. Let’s call it good enough.

iMessage encryption in a nutshell boils down to this: I get your public key, you get my public key, I can send you messages encrypted to you, and you can be sure that they’re authentic and really came from me. Everyone’s happy.

But here’s the wrinkle: where do those public keys come from?

Where do you get the keys?

Key request to Apple’s server.

It’s this detail that exposes the real weakness of iMessage. To make key distribution ‘simple’, Apple takes responsibility for handing out your friends’ public keys. It does this using a proprietary key server that Apple owns and operates. Your iPhone requests keys from Apple using a connection that’s TLS-encrypted, and employs some fancy cryptographic tokens. But fundamentally, it relies on the assumption that Apple is good, and is really going to give you you the right keys for the person you want to talk to.

But this honesty is just an assumption. Since the key lookup is completely invisible to the user, there’s nothing that forces Apple to be honest. They could, if inspired, give you a public key of their choosing, one that they hold the decryption key for. They could give you the FBI’s key. They could give you Dwayne “The Rock” Johnson’s key, though The Rock would presumably be very non-plussed by this.

Indeed it gets worse. Because iMessage is designed to support several devices attached to the same account, each query to the directory server can bring back many keys — one for each of your devices. An attacker can simply add a device (or a fake ‘ghost device’) to Apple’s key server, and senders will encrypt messages to that key along with the legitimate ones. This enables wiretapping, provided you can get Apple to help you out.

But why do you need Apple to help you out?

As described, this attack doesn’t really require direct collaboration from Apple. In principle, the FBI could just guess the target’s email password, or reset the password and add a new device all on their own. Even with a simple subpoena, Apple might be forced to hand over security questions and/or password hashes.

The real difficulty is caused by a final security feature in iMessage: when you add a new device, or modify the devices attached to your account, Apple’s key server sends a notification to each of the existing devices already to the account. It’s not obvious how this feature is implemented, but one thing is clear — it seems likely that, at least in theory, Apple could shut it off if they needed to.* After all, this all comes down to code in the key server.

Fixing this problem seems hard. You could lock the key server in a giant cage, then throw away the key. But as long as Apple retains the ability to update their key server software, solving this problem seems fundamentally challenging. (Though not impossible — I’ll come back to this in a moment.)

Can governments force Apple to modify their key server?

It’s not clear. While it seems pretty obvious that Apple could in theory substitute keys and thus enable eavesdropping, in practice it may require substantial changes to Apple’s code. And while there are a few well-known cases in which the government has forced companies to turn over keys, changing the operation of a working system is a whole different ball of wax.

And iMessage is not just any working system. According to Apple, it handles several billion messages every day, and is fundamental to the operation of millions of iPhones. When you have a deployed system at that scale, the last thing you want to do is mess with it — particularly if it involves crypto code that may not even be well understood by its creators. There’s no amount of money you could pay me to be ‘the guy who broke iMessage’, even for an hour.

Any way you slice it, it’s a risky operation. But for a real answer, you’ll have to talk to a lawyer.

Why isn’t key substitution a good solution to the ‘escrow’ debate?

Another perspective on iMessage — one I’ve heard from some attorney friends — is that key server tampering sounds like a pretty good compromise solution to the problem of creating a ‘secure golden key‘ (AKA giving governments access to plaintext).

This view holds that key substitution allows only proactive eavesdropping: the government has to show up with a warrant before they can eavesdrop on a customer. They can’t spy on everyone, and they can’t go back and read your emails from last month. At the same time, most customers still get true ‘end to end’ encryption.

I see two problems with this view. First, tampering with the key server fundamentally betrays user trust, and undermines most of the guarantees offered by iMessage. Apple claims that they offer true end-to-end encryption that they can’t read — and that’s reasonable in the threat model they’ve defined for themselves. The minute they start selectively substituting keys, that theory goes out the window. If you can substitute a few keys, why not all of them? In this world, Apple should expect requests from every Tom, Dick and Harry who wants access to plaintext, ranging from divorce lawyers to foreign governments.

A snapshot of my seven (!) currently enrolled iMessage
devices, courtesy Frederic Jacobs.

The second, more technical problem is that key substitution is relatively easy to detect. While Apple’s protocols are an obfuscated mess, it is at least in theory possible for users to reverse-engineer them to view the raw public keys being transmitted — and thus determine whether the key server is being honest. While most criminals are not this sophisticated, a few are. And if they aren’t sophisticated, then tools can be built to make this relatively easy. (Indeed, people have already built such tools — see my key registration profile at right.)

Thus key substitution represents at most a temporary solution to the ‘government access’ problem, and one that’s fraught with peril for law enforcement, and probably disastrous for the corporations involved. It might seem tempting to head down this rabbit hole, but it’s rabbits all the way down.

What can providers do to prevent key substitution attacks?

Signal’s “key fingerprint” screen.


From a technical point of view, there are a number of things that providers can do to harden their key servers. One is to expose ‘key fingerprints’ to users who care, which would allow them to manually compare the keys they receive with the keys actually registered by other users. This approach is used by OpenWhisperSystems’ Signal, as well as PGP. But even I acknowledge that this kind of stinks.

A more user-friendly approach is to deploy a variant of Certificate Transparency, which requires providers to publish a publicly verifiable proof that every public key they hand out is being transmitted to the whole world. This allows each client to check that the server is handing out the actual keys they registered — and by implication, that every other user is seeing the same thing.

The most complete published variant of this is called CONIKS, and it was proposed by a group at Princeton, Stanford and the EFF (one of the more notable authors is Ed Felten, now Deputy U.S. Chief Technology Officer). CONIKS combined key transparency with a ‘verification protocol’ that allows clients to ensure that they aren’t being sidelined and fed false information.

CONIKS isn’t necessarily the only game in town when it comes to preventing key substitution attacks, but it represents a powerful existence proof that real defenses can be mounted. Even though Apple hasn’t chosen to implement CONIKS, the fact that it’s out there should be a strong disincentive for law enforcement to rely heavily on this approach.

So what next?

That’s the real question. If we believe the New York Times, all is well — for the moment. But not for the future. In the long term, law enforcement continues to ask for an approach that allows them to access the plaintext of encrypted messages. And Silicon Valley continues to find new ways to protect the confidentiality of their user’s data, against a range of threats beginning in Washington and proceeding well beyond.

How this will pan out is anyone’s guess. All we can say is that it will be messy.

Notes:

* How they would do this is really a question for Apple. The feature may involve the key server sending an explicit push message to each of the devices, in which case it would be easy to turn this off. Alternatively, the devices may periodically retrieve their own keys to see what Apple’s server is sending out to the world, and alert the user when they see a new one. In the latter case, Apple could selectively transmit a doctored version of the key list to the device owner.

What’s the matter with PGP?

Last Thursday, Yahoo announced their plans to support end-to-end 6443976239_b20c3cbb28_mencryption using a fork of Google’s end-to-end email extension. This is a Big Deal. With providers like Google and Yahoo onboard, email encryption is bound to get a big kick in the ass. This is something email badly needs.

So great work by Google and Yahoo! Which is why following complaint is going to seem awfully ungrateful. I realize this and I couldn’t feel worse about it.

As transparent and user-friendly as the new email extensions are, they’re fundamentally just re-implementations of OpenPGP — and non-legacy-compatible ones, too. The problem with this is that, for all the good PGP has done in the past, it’s a model of email encryption that’s fundamentally broken.

It’s time for PGP to die.

In the remainder of this post I’m going to explain why this is so, what it means for the future of email encryption, and some of the things we should do about it. Nothing I’m going to say here will surprise anyone who’s familiar with the technology — in fact, this will barely be a technical post. That’s because, fundamentally, most of the problems with email encryption aren’t hyper-technical problems. They’re still baked into the cake.

Background: PGP

Back in the late 1980s a few visionaries realized that this new ‘e-mail’ thing was awfully convenient and would likely be the future — but that Internet mail protocols made virtually no effort to protect the content of transmitted messages. In those days (and still in these days) email transited the Internet in cleartext, often coming to rest in poorly-secured mailspools.

This inspired folks like Phil Zimmermann to create tools to deal with the problem. Zimmermann’s PGP was a revolution. It gave users access to efficient public-key cryptography and fast symmetric ciphers in package you could install on a standard PC. Even better, PGP was compatible with legacy email systems: it would convert your ciphertext into a convenient ASCII armored format that could be easily pasted into the sophisticated email clients of the day — things like “mail”, “pine” or “the Compuserve e-mail client”.

It’s hard to explain what a big deal PGP was. Sure, it sucked badly to use. But in those days, everything sucked badly to use. Possession of a PGP key was a badge of technical merit. Folks held key signing parties. If you were a geek and wanted to discreetly share this fact with other geeks, there was no better time to be alive.

We’ve come a long way since the 1990s, but PGP mostly hasn’t. While the protocol has evolved technically — IDEA replaced BassOMatic, and was in turn replaced by better ciphers — the fundamental concepts of PGP remain depressingly similar to what Zimmermann offered us in 1991. This has become a problem, and sadly one that’s difficult to change.

Let’s get specific.

PGP keys suck

Before we can communicate via PGP, we first need to exchange keys. PGP makes this downright unpleasant. In some cases, dangerously so.

Part of the problem lies in the nature of PGP public keys themselves. For historical reasons they tend to be large and contain lots of extraneous information, which it difficult to print them a business card or manually compare. You can write this off to a quirk of older technology, but even modern elliptic curve implementations still produce surprisingly large keys.

Three public keys offering roughly the same security level. From top-left: (1) Base58-encoded Curve25519 public key used in miniLock. (2) OpenPGP 256-bit elliptic curve public key format. (3a) GnuPG 3,072 bit RSA key and (3b) key fingerprint.

Since PGP keys aren’t designed for humans, you need to move them electronically. But of course humans still need to verify the authenticity of received keys, as accepting an attacker-provided public key can be catastrophic.

PGP addresses this with a hodgepodge of key servers and public key fingerprints. These components respectively provide (untrustworthy) data transfer and a short token that human beings can manually verify. While in theory this is sound, in practice it adds complexity, which is always the enemy of security.

Now you may think this is purely academic. It’s not. It can bite you in the ass.

Imagine, for example, you’re a source looking to send secure email to a reporter at the Washington Post. This reporter publishes his fingerprint via Twitter, which means most obvious (and recommended) approach is to ask your PGP client to retrieve the key by fingerprint from a PGP key server. On the GnuPG command line can be done as follows:

Now let’s ignore the fact that you’ve just leaked your key request to an untrusted server via HTTP. At the end of this process you should have the right key with high reliability. Right?

Except maybe not: if you happen to do this with GnuPG 2.0.18 — one version off from the very latest GnuPG — the client won’t actually bother to check the fingerprint of the received key. A malicious server (or HTTP attacker) can ship you back the wrong key and you’ll get no warning. This is fixed in the very latest versions of GPG but… Oy Vey.

PGP Key IDs are also pretty terrible,
due to the short length and continued
support for the broken V3 key format.

You can say that it’s unfair to pick on all of PGP over an implementation flaw in GnuPG, but I would argue it speaks to a fundamental issue with the PGP design. PGP assumes keys are too big and complicated to be managed by mortals, but then in practice it practically begs users to handle them anyway. This means we manage them through a layer of machinery, and it happens that our machinery is far from infallible.

Which raises the question: why are we bothering with all this crap infrastructure in the first place. If we must exchange things via Twitter, why not simply exchange keys? Modern EC public keys are tiny. You could easily fit three or four of them in the space of this paragraph. If we must use an infrastructure layer, let’s just use it to shunt all the key metadata around.

PGP key management sucks

Manual key management is a mug’s game. Transparent (or at least translucent) key management is the hallmark of every successful end-to-end secure encryption system.

If you can’t trust Phil, who can you trust?

Now often this does involve some tradeoffs — e.g.,, the need to trust a central authority to distribute keys — but even this level of security would be lightyears better than the current situation with webmail.

To their credit, both Google and Yahoo have the opportunity to build their own key management solutions (at least, for those who trust Google and Yahoo), and they may still do so in the future. But today’s solutions don’t offer any of this, and it’s not clear when they will. Key management, not pretty web interfaces, is the real weakness holding back widespread secure email.

signalstring
ZRTP authentication string, as used in Signal.

For the record, classic PGP does have a solution to the problem. It’s called the “web of trust“, and it involves individuals signing each others’ keys. I refuse to go into the problems with WoT because, frankly, life is too short. The TL;DR is that ‘trust’ means different things to you than it does to me. Most OpenPGP implementations do a lousy job of presenting any of this data to their users anyway.

The lack of transparent key management in PGP isn’t unfixable. For those who don’t trust Google or Yahoo, there are experimental systems like Keybase.io that attempt to tie keys to user identities. In theory we could even exchange our offline encryption keys through voice-authenticated channels using apps like OpenWhisperSystems’ Signal. So far, nobody’s bothered to do this — all of these modern encryption tools are islands with no connection to the mainland. Connecting them together represents one of the real challenges facing widespread encrypted communications.

No forward secrecy

Try something: go delete some mail from your Gmail account. You’ve hit the archive button. Presumably you’ve also permanently wiped your Deleted Items folder. Now make sure you wipe your browser cache and the mailbox files for any IMAP clients you might be running (e.g., on your phone). Do any of your devices use SSD drives? Probably a safe bet to securely wipe those devices entirely. And at the end of this Google may still have a copy which could be vulnerable to law enforcement request or civil subpoena.

(Let’s not get into the NSA’s collect-it-all policy for encrypted messages. If the NSA is your adversary just forget about PGP.)

Forward secrecy (usually misnamed “perfect forward secrecy”) ensures that if you can’t destroy the ciphertexts, you can at least dispose of keys when you’re done with them. Many online messaging systems like off-the-record messaging use PFS by default, essentially deriving a new key with each message volley sent. Newer ‘ratcheting’ systems like Trevor Perrin’s Axolotl (used by TextSecure) have also begun to address the offline case.

Adding forward secrecy to asynchronous offline email is a much bigger challenge, but fundamentally it’s at least possible to some degree. While securing the initial ‘introduction’ message between two participants may be challenging*, each subsequent reply can carry a new ephemeral key to be used in future communications. However this requires breaking changes to the PGP protocol and to clients — changes that aren’t likely to happen in a world where webmail providers have doubled down on the PGP model.

The OpenPGP format and defaults suck

Poking through a modern OpenPGP implementation is like visiting a museum of 1990s crypto. For legacy compatibility reasons, many clients use old ciphers like CAST5 (a cipher that predates the AES competition). RSA encryption uses padding that looks disturbingly like PKCS#1v1.5 — a format that’s been relentlessly exploited in the past. Key size defaults don’t reach the 128-bit security level. MACs are optional. Compression is often on by default. Elliptic curve crypto is (still!) barely supported.

If Will Smith looked like this when your cryptography was current, you need better cryptography.

Most of these issues are not exploitable unless you use PGP in a non-standard way, e.g., for instant messaging or online applications. And some people do use PGP this way.

But even if you’re just using PGP just to send one-off emails to your grandmother, these bad defaults are pointless and unnecessary. It’s one thing to provide optional backwards compatibility for that one friend who runs PGP on his Amiga. But few of my contacts do — and moreover, client versions are clearly indicated in public keys.** Even if these archaic ciphers and formats aren’t exploitable today, the current trajectory guarantees we’ll still be using them a decade from now. Then all bets are off.

On the bright side, both Google and Yahoo seem to be pushing towards modern implementations that break compatibility with the old. Which raises a different question. If you’re going to break compatibility with most PGP implementations, why bother with PGP at all?

Terrible mail client implementations

This is by far the worst aspect of the PGP ecosystem, and also the one I’d like to spend the least time on. In part this is because UX isn’t technically PGP’s problem; in part because the experience is inconsistent between implementations, and in part because it’s inconsistent between users: one person’s ‘usable’ is another person’s technical nightmare.

But for what it’s worth, many PGP-enabled mail clients make it ridiculously easy to send confidential messages with encryption turned off, to send unimportant messages with encryption turned on, to accidentally send to the wrong person’s key (or the wrong subkey within a given person’s key). They demand you encrypt your key with a passphrase, but routinely bug you to enter that passphrase in order to sign outgoing mail — exposing your decryption keys in memory even when you’re not reading secure email.

Most of these problems stem from the fact that PGP was designed to retain compatibility with standard (non-encrypted) email. If there’s one lesson from the past ten years, it’s that people are comfortable moving past email. We now use purposebuilt messaging systems on a day-to-day basis. The startup cost of a secure-by-default environment is, at this point, basically an app store download.

Incidentally, the new Google/Yahoo web-based end-to-end clients dodge this problem by providing essentially no user interface at all. You enter your message into a separate box, and then plop the resulting encrypted data into the Compose box. This avoids many of the nastier interface problems, but only by making encryption non-transparent. This may change; it’s too soon to know how.

So what should we be doing?

Quite a lot actually. The path to a proper encrypted email system isn’t that far off. At minimum, any real solution needs:

  • A proper approach to key management. This could be anything from centralized key management as in Apple’s iMessage — which would still be better than nothing — to a decentralized (but still usable) approach like the one offered by Signal or OTR. Whatever the solution, in order to achieve mass deployment, keys need to be made much more manageable or else submerged from the user altogether.
  • Forward secrecy baked into the protocol. This should be a pre-condition to any secure messaging system.
  • Cryptography that post-dates the Fresh Prince. Enough said.
  • Screw backwards compatibility. Securing both encrypted and unencrypted email is too hard. We need dedicated networks that handle this from the start.

A number of projects are already going in this direction. Aside above-mentioned projects like Axolotl and TextSecure — which pretend to be text messaging systems, but are really email in disguise — projects like Mailpile are trying to re-architect the client interface (though they’re sticking with the PGP paradigm). Projects like SMIMP are trying to attack this at the protocol level.*** At least in theory projects like DarkMail are also trying to adapt text messaging protocols to the email case, though details remain few and far between.

It also bears noting that many of the issues above could, in principle at least, be addressed within the confines of the OpenPGP format. Indeed, if you view ‘PGP’ to mean nothing more than the OpenPGP transport, a lot of the above seems easy to fix — with the exception of forward secrecy, which really does seem hard to add without some serious hacks. But in practice, this is rarely all that people mean when they implement ‘PGP’.

Conclusion

I realize I sound a bit cranky about this stuff. But as they say: a PGP critic is just a PGP user who’s actually used the software for a while. At this point so much potential in this area and so many opportunities to do better. It’s time for us to adopt those ideas and stop looking backwards.

Notes:

* Forward security even for introduction messages can be implemented, though it either require additional offline key distribution (e.g., TextSecure’s ‘pre-keys’) or else the use of advanced primitives. For the purposes of a better PGP, just handling the second message in a conversation would be sufficient.

** Most PGP keys indicate the precise version of the client that generated them (which seems like a dumb thing to do). However if you want to add metadata to your key that indicates which ciphers you prefer, you have to use an optional command.

*** Thanks to Taylor Hornby for reminding me of this.

Noodling about IM protocols

The last couple of months have been a bit slow in the blogging department. It’s hard to blog when there are exciting things going on. But also: I’ve been a bit blocked. I have two or three posts half-written, none of which I can quite get out the door.

Instead of writing and re-writing the same posts again, I figured I might break the impasse by changing the subject. Usually the easiest way to do this is to pick some random protocol and poke at it for a while to see what we learn.

The protocols I’m going to look at today aren’t particularly ‘random’ — they’re both popular encrypted instant messaging protocols. The first is OTR (Off the Record Messaging). The second is Cryptocat’s group chat protocol. Each of these protocols has a similar end-goal, but they get there in slightly different ways.
I want to be clear from the start that this post has absolutely no destination. If you’re looking for exciting vulnerabilities in protocols, go check out someone else’s blog. This is pure noodling.

The OTR protocol

OTR is probably the most widely-used protocol for encrypting instant messages. If you use IM clients like Adium, Pidgin or ChatSecure, you already have OTR support. You can enable it in some other clients through plugins and overlays.

OTR was originally developed by Borisov, Goldberg and Brewer and has rapidly come to dominate its niche. Mostly this is because Borisov et al. are smart researchers who know what they’re doing. Also: they picked a cool name and released working code.

OTR works within the technical and usage constraints of your typical IM system. Roughly speaking, these are:

  1. Messages must be ASCII-formatted and have some (short) maximum length.
  2. Users won’t bother to exchange keys, so authentication should be “lazy” (i.e., you can authenticate your partners after the fact).
  3. Your chat partners are all FBI informants so your chat transcripts must be plausibly deniable — so as to keep them from being used as evidence against you in a court of law.

Coming to this problem fresh, you might find goal (3) a bit odd. In fact, to the best of my knowledge no court in the history of law has ever used a cryptographic transcript as evidence that a conversation occurred. However it must be noted that this requirement makes the problem a bit more sexy. So let’s go with it!

“Dammit, they used a deniable key exchange protocol” said no Federal prosecutor ever.

The OTR (version 2/3) handshake is based on the SIGMA key exchange protocol. Briefly, it assumes that both parties generate long-term DSA public keys which we’ll denote by (pubA, pubB). Next the parties interact as follows:

The OTRv2/v3 AKE. Diagram by Bonneau and Morrison, all colorful stuff added. There’s also
an OTRv1 protocol that’s too horrible to talk about here.

There are four elements to this protocol:

  1. Hash commitment. First, Bob commits to his share of a Diffie-Hellman key exchange (g^x) by encrypting it under a random AES key r and sending the ciphertext and a hash of g^x over to Alice.
  2. Diffie-Hellman Key Exchange. Next, Alice sends her half of the key exchange protocol (g^y). Bob can now ‘open’ his share to Alice by sending the AES key r that he used to encrypt it in the previous step. Alice can decrypt this value and check that it matches the hash Bob sent in the first message.Now that both sides have the shares (g^x, g^y) they each use their secrets to compute a shared secret g^{xy} and hash the value several ways to establish shared encryption keys (c’, Km2, Km’2) for subsequent messages. In addition, each party hashes g^{xy} to obtain a short “session ID”.

    The sole purpose of the commitment phase (step 1) is to prevent either Alice or Bob from controlling the value of the shared secret g^{xy}. Since the session ID value is derived by hashing the Diffie-Hellman shared secret, it’s possible to use a relatively short session ID value to authenticate the channel, since neither Alice nor Bob will be able to force this ID to a specific value.

  3. Exchange of long-term keys and signatures. So far Alice and Bob have not actually authenticated that they’re talking to each other, hence their Diffie-Hellman exchange could have been intercepted by a man-in-the-middle attacker. Using the encrypted channel they’ve previously established, they now set about to fix this.Alice and Bob each send their long-term DSA public key (pubA, pubB) and key identifiers, as well as a signature on (a MAC of) the specific elements of the Diffie-Hellman message (g^x, g^y) and their view of which party they’re communicating with. They can each verify these signatures and abort the connection if something’s amiss.**
  4. Revealing MAC keys. After sending a MAC, each party waits for an authenticated response from its partner. It then reveals the MAC keys for the previous messages.
  5. Lazy authentication. Of course if Alice and Bob never exchange public keys, this whole protocol execution is still vulnerable to a man-in-the-middle (MITM) attack. To verify that nothing’s amiss, both Alice and Bob should eventually authenticate each other. OTR provides three mechanisms for doing this: parties may exchange fingerprints (essentially hashes) of (pubA, pubB) via a second channel. Alternatively, they can exchange the “session ID” calculated in the second phase of the protocol. A final approach is to use the Socialist Millionaires’ Problem to prove that both parties share the same secret.
The OTR key exchange provides the following properties:

Protecting user identities. No user-identifying information (e.g., long-term public keys) is sent until the parties have first established a secure channel using Diffie-Hellman. The upshot is that a purely passive attacker doesn’t learn the identity of the communicating partners — beyond what’s revealed by the higher-level IM transport protocol.*

Unfortunately this protection fails against an active attacker, who can easily smash an existing OTR connection to force a new key agreement and run an MITM on the Diffie-Hellman used during the next key agreement. This does not allow the attacker to intercept actual message content — she’ll get caught when the signatures don’t check out — but she can view the public keys being exchanged. From the client point of view the likely symptoms are a mysterious OTR error, followed immediately by a successful handshake.

One consequence of this is that an attacker could conceivably determine which of several clients you’re using to initiate a connection.

Weak deniability. The main goal of the OTR designers is plausible deniability. Roughly, this means that when you and I communicate there should be no binding evidence that we really had the conversation. This rules out obvious solutions like GPG-based chats, where individual messages would be digitally signed, making them non-repudiable.

Properly defining deniability is a bit complex. The standard approach is to show the existence of an efficient ‘simulator’ — in plain English, an algorithm for making fake transcripts. The theory is simple: if it’s trivial to make fake transcripts, then a transcript can hardly be viewed as evidence that a conversation really occurred.

OTR’s handshake doesn’t quite achieve ‘strong’ deniability — meaning that anyone can fake a transcript between any two parties — mainly because it uses signatures. As signatures are non-repudiable, there’s no way to fake one without actually knowing your public key. This reveals that we did, in fact, communicate at some point. Moreover, it’s possible to create an evidence trail that I communicated with you, e.g., by encoding my identity into my Diffie-Hellman share (g^x). At very least I can show that at some point you were online and we did have contact.

But proving contact is not the same thing as proving that a specific conversation occurred. And this is what OTR works to prevent. The guarantee OTR provides is that if the target was online at some pointand you could contact them, there’s an algorithm that can fake just about any conversation with the individual. Since OTR clients are, by design, willing to initiate a key exchange with just about anyone, merely putting your client online makes it easy for people to fake such transcripts.***

Towards strong deniability. The ‘weak’ deniability of OTR requires at least tacit participation of the user (Bob) for which we’re faking the transcript. This isn’t a bad property, but in practice it means that fake transcripts can only be produced by either Bob himself, or someone interacting online with Bob. This certainly cuts down on your degree of deniability.

A related concept is ‘strong deniability‘, which ensures that any party can fake a transcript using only public information (e.g., your public keys).

OTR doesn’t try achieve strong deniability — but it does try for something in between. The OTR version of deniability holds that an attacker who obtains the network traffic of a real conversation — even if they aren’t one of the participants — should be able alter the conversation to say anything he wants. Sort of.

The rough outline of the OTR deniability process is to generate a new message authentication key for each message (using Diffie-Hellman) and then reveal those keys once they’ve been used up. In theory, a third party can obtain this transcript and — if they know the original message content — they can ‘maul’ the AES-CTR encrypted messages into messages of their choice, then they can forge their own MACs on the new messages.

OTR message transport (source: Bonneau and Morrison, all colored stuff added).

Thus our hypothetical transcript forger can take an old transcript that says “would you like a Pizza and turn it into a valid transcript that says, for example, “would you like to hack STRATFOR“… Except that they probably can’t, since the first message is too short and… oh lord, this whole thing is a stupid idea — let’s stop talking about it.

The OTRv1 handshake. Oh yes, there’s also an OTRv1 protocol that has a few issues and isn’t really deniable. Even better, an MITM attacker can force two clients to downgrade to it, provided both support that version. Yuck.

So that’s the OTR protocol. While I’ve pointed out a few minor issues above, the truth is that the protocol is generally an excellent way to communicate. In fact it’s such a good idea that if you really care about secrecy it’s probably one of the best options out there.

Cryptocat

Since we’re looking at IM protocols I thought it might be nice to contrast with another fairly popular chat protocol: Cryptocat‘s group chat. Cryptocat is a web-based encrypted chat app that now runs on iOS (and also in Thomas Ptacek’s darkest nightmares).

Cryptocat implements OTR for ‘private’ two-party conversations. However OTR is not the default. If you use Cryptocat in its default configuration, you’ll be using its hand-rolled protocol for group chats.

The Cryptocat group chat specification can be found here, and it’s remarkably simple. There are no “long-term” keys in Cryptocat. Diffie-Hellman keys are generated at the beginning of each session and re-used for all conversations until the app quits. Here’s the handshake between two parties:

Cryptocat group chat handshake (current revision). Setting is Curve25519. Keys are generated when the application launches, and re-used through the session.

If multiple people join the room, every pair of users repeats this handshake to derive a shared secret between every pair of users. Individuals are expected to verify each others’ keys by checking fingerprints and/or running the Socialist Millionaire protocol.

Unlike OTR, the Cryptocat handshake includes no key confirmation messages, nor does it attempt to bind users to their identity or chat room. One implication of this is that I can transmit someone else’s public key as if it were my own — and the recipients of this transmission will believe that the person is actually part of the chat.

Moreover, since public keys aren’t bound to the user’s identity or the chat room, you could potentially route messages between a different user (even a user in a different chat room) while making it look like they’re talking to you. Since Cryptocat is a group chat protocol, there might be some interesting things you could do to manipulate the conversation in this setting.****

Does any of this matter? Probably not that much, but it would be relatively easy (and good) to fix these issues.

Message transmission and consistency. The next interesting aspect of Cryptocat is the way it transmits encrypted chat messages. One of the core goals of Cryptocat is to ensure that messages are consistent between individual users. This means that all users should be able to verify that the other user is receiving the same data as it is.

Cryptocat uses a slightly complex mechanism to achieve this. For each pair of users in the chat, Cryptocat derives an AES key and an MAC key from the Diffie-Hellman shared secret. To send a message, the client:

  1. Pads the message by appending 64 bytes of random padding.
  2. Generates a random 12-byte Initialization Vector for each of the Nusers in the chat.
  3. Encrypts the message using AES-CTR under the shared encryption key for each user.
  4. Concatenates all of the N resulting ciphertexts/IVs and computes an HMAC of the whole blob under each recipient’s key.
  5. Calculates a ‘tag’ for the message by hashing the following data:

    padded plaintext || HMAC-SHA512alice || HMAC-SHA512bob || HMAC-SHA512carol || …

  6. Broadcasts the ciphertexts, IVs, MACs and the single ‘tag’ value to all users in the conversation.

When a recipient receives a message from another user, it verifies that:

  1. The message contains a valid HMAC under its shared key.
  2. This IV has not been received before from this sender.
  3. The decrypted plaintext is consistent with the ‘tag’.

Roughly speaking, the idea here is to make sure that every user receives the same message. The use of a hashed plaintext is a bit ugly, but the argument here is that the random padding protects the message from guessing attacks. Make what you will of this.

Anti-replay. Cryptocat also seeks to prevent replay attacks, e.g., where an attacker manipulates a conversation by simply replaying (or reflecting) messages between users, so that users appear to be repeating statements. For example, consider the following chat transcripts:

Replays and reflection attacks.

Replay attacks are prevented through the use of a global ‘IV array’ that stores all previously received and sent IVs to/from all users. If a duplicate IV arrives, Cryptocat will reject the message. This is unwieldy but it generally seems ok to prevent replays and reflection.

A limitation of this approach is that the IV array does not live forever. In fact, from time to time Cryptocat will reset the IV array without regenerating the client key. This means that if Alice and Bob both stay online, they can repeat the key exchange and wind up using the same key again — which makes them both vulnerable to subsequent replays and reflections. (Update: This issue has since been fixed).

In general the solution to these issues is threefold:

  1. Keys shouldn’t be long-term, but should be regenerated using new random components for each key exchange.
  2. Different keys should be derived for the Alice->Bob and Bob->Alice direction
  3. It would be be more elegant to use a message counter than to use this big, unwieldy key array.

The good news is that the Cryptocat developers are working on a totally new version of the multi-party chat protocol that should be enormously better.

In conclusion

I said this would be a post that goes nowhere, and I delivered! But I have to admit, it helps to push it out of my system.

None of the issues I note above are the biggest deal in the world. They’re all subtle issues, which illustrates two things: first, that crypto is hard to get right. But also: that crypto rarely fails catastrophically. The exciting crypto bugs that cause you real pain are still few and far between.

Notes:

* In practice, you might argue that the higher-level IM protocol already leaks user identities (e.g., Jabber nicknames). However this is very much an implementation choice. Moreover, even when using Jabber with known nicknames, you might access the Jabber server using one of several different clients (your computer, phone, etc.). Assuming you use Tor, the only indication of this might be the public key you use during OTR. So there’s certainly useful information in this protocol.

** Notice that OTR signs a MAC instead of a hash of the user identity information. This happens to be a safe choice given that the MAC used is based on HMAC-SHA2, but it’s not generally a safe choice. Swapping the HMAC out for a different MAC function (e.g., CBC-MAC) would be catastrophic.

*** To get specific, imagine I wanted to produce a simulated transcript for some conversation with Bob. Provided that Bob’s client is online, I can send Bob any g^x value I want. It doesn’t matter if he really wants to talk to me — by default, his client will cheerfully send me back his own g^y and a signature on (g^x, g^y, pub_B, keyid_B) which, notably, does not include my identity. From this point on all future authentication is performed using MACs and encrypted under keys that are known to both of us. There’s nothing stopping me from faking the rest of the conversation.

**** Incidentally, a similar problem exists in the OTRv1 protocol.

Can Apple read your iMessages?

About a year ago I wrote a short post urging Apple to publish the technical details of iMessage encryption. I’d love tell you that Apple saw my influential crypto blogging and fell all over themselves to produce a spec, but, no. iMessage is the same black box it’s always been.

What’s changed is that suddenly people seem to care. Some of this interest is due to Apple’s (alleged) friendly relationship with the NSA. Some comes from their not-so-friendly relationship with the DEA. Whatever the reason, people want to know which of our data Apple has and who they’re sharing it with.

And that brings us back to iMessage encryption. Apple runs one of the most popular encrypted communications services on Earth, moving over two billion iMessage every day. Each one is loaded with personal information the NSA/DEA would just love to get their hands on. And yet Apple claims they can’t. In fact, even Apple can’t read them:

There are certain categories of information which we do not provide to law enforcement or any other group because we choose not to retain it.

For example, conversations which take place over iMessage and FaceTime are protected by end-to-end encryption so no one but the sender and receiver can see or read themApple cannot decrypt that data.

This seems almost too good to be true, which in my experience means it probably is. My view is inspired by something I like to call “Green’s law of applied cryptography”, which holds that applied cryptography mostly sucks. Crypto never offers the unconditional guarantees you want it to, and when it does your users suffer terribly.

And that’s the problem with iMessage: users don’t suffer enough. The service is almost magically easy to use, which means Apple has made tradeoffs — or more accurately, they’ve chosen a particular balance between usability and security. And while there’s nothing wrong with tradeoffs, the particulars of their choices make a big difference when it comes to your privacy. By witholding these details, Apple is preventing its users from taking steps to protect themselves.

The details of this tradeoff are what I’m going to talk about in this post. A post which I swear will be the last post I ever write on iMessage. From here on out it’ll be ciphers and zero knowledge proofs all the way.

Apple backs up iMessages to iCloud

That’s the super-secret
NSA spying chip.
The biggest problem with Apple’s position is that it just plain isn’t true. If you use the iCloud backup service to back up your iDevice, there’s a very good chance that Apple can access the last few days of your iMessage history.
For those who aren’t in the Apple ecosystem: iCloud is an optional backup service that Apple provides for free. Backups are great, but if iMessages are backed up we need to ask how they’re protected. Taking Apple at their word — that they really can’t get your iMessages — leaves us with two possibilities:
  1. iMessage backups are encrypted under a key that ‘never leaves the device‘.
  2. iMessage backups are encrypted using your password as a key.

Unfortunately neither of these choices really works — and it’s easy to prove it. All you need to do is run the following simple experiment: First, lose your iPhone. Now change your password using Apple’s iForgot service (this requires you to answer some simple security questions or provide a recovery email). Now go to an Apple store and shell out a fortune buying a new phone.

If you can recover your recent iMessages onto a new iPhone — as I was able to do in an Apple store this afternoon — then Apple isn’t protecting your iMessages with your password or with a device key. Too bad. (Update 6/27: Ashkan Soltani also has some much nicer screenshots from a similar test.)

The sad thing is there’s really no crypto to understand here. The simple and obvious point is this: if I could do this experiment, then someone at Apple could have done it too. Possibly at the request of law enforcement. All they need are your iForgot security questions, something that Apple almost certainly does keep.* 

Apple distributes iMessage encryption keys

But maybe you don’t use backups. In this case the above won’t apply to you, and Apple clearly says that their messages are end-to-end encrypted. The question you should be asking now is: encrypted to whom?

The problem here is that encryption only works if I have your encryption key. And that means before I can talk to you I need to get hold of it. Apple has a simple solution to this: they operate a directory lookup service that iMessage can use to look up the public key associated with any email address or phone number. This is great, but represents yet another tradeoff: you’re now fundamentally dependent on Apple giving you the right key.

HTTPS request/response containing a
“message identity key” associated with an
iPhone phone number (modified). These keys are
sent over SSL.

The concern here is that Apple – or a hacker who compromises Apple’s directory server – might instead deliver their own key. Since you won’t know the difference, you’ll be encrypting to that person rather than to your friend.**

Moreover, iMessage lets you associate multiple public keys with the same account — for example, you can you add a device (such as a Mac) to receive copies of messages sent to your phone. From what I can tell, the iMessage app gives the sender no indication of how many keys have been associated with a given iMessage recipient, nor does it warn them if the recipient suddenly develops new keys.

The practical upshot is that the integrity of iMessage depends on Apple honestly handing out keys. If they cease to be honest (or if somebody compromises the iMessage servers) it may be possible to run a man-in-the-middle attack and silently intercept iMessage data.

Now to some people this is obvious, and to other’s it’s no big deal. All of which is fine. But people should at least understand the strengths and weaknesses of the particular design that Apple has chosen. Armed with that knowledge they can make up their minds how much they want to trust Apple.

Apple can retain metadata

While Apple may encrypt the contents of your communication, their statement doesn’t exactly rule out the possibility they store who you’re talking to. This is the famous meta-data the NSA already sweeps up and (as I’ve said before) it’s almost impossible not to at least collect this information, especially since Apple actually delivers your messages through their servers.

This metadata can be as valuable as the data itself. And while Apple doesn’t retain the content of your messages, their statement says nothing about all that metadata.

Apple doesn’t use Certificate Pinning

As a last – and fairly minor point – iMessage client applications (for iPhone and Mac) communicate with Apple’s directory service using the HTTPS protocol. (Note that this applies to directory lookup messages: the actual iMessages are encrypted separately and travel over XMPP Apple’s push network protocol.)

Using HTTPS is a good thing, and in general it provides strong protections against interception. But it doesn’t protect against all attacks. There’s still a very real possibility that a capable attacker could obtain a forged certificate (possibly by compromising a Certificate Authority) and thus intercept or modify communications with Apple.

This kind of thing isn’t as crazy as it sounds. It happened to hundreds of thousands of Iranian Gmail users, and it’s likely to happen again in the future. The standard solution to this problem is called ‘certificate pinning‘ — this essentially tells the application not to trust unknown certificates. Many apps such as Twitter do this. However based on the testing I did while writing this post, Apple doesn’t.

Conclusion

I don’t write any of this stuff because I dislike Apple. In fact I love their products and would bathe with them if it didn’t (unfortunately) violate the warranty.

But the flipside of my admiration is simple: I rely on these devices and want to know how secure they are. I see absolutely no downside to Apple presenting at least a high-level explanation to experts, even if they keep the low-level details to themselves. This would include the type and nature of the encryption algorithms used, the details of the directory service and the key agreement protocol.

Apple may Think Different, but security rules apply to them too. Sooner or later someone will compromise or just plain reverse-engineer the iMessage system. And then it’ll all come out anyway.

Notes:

* Of course it’s possible that Apple is using your security questions to derive an encryption key. However this seems unlikely. First because it’s likely that Apple has your question/answers on file. But even if they don’t, it’s unlikely that many security answers contain enough entropy to use for encryption. There are only so many makes/models of cars and so many birthdays. Apple’s 2-step authentication may improve things if you use it — but if so Apple isn’t saying.

** In practice it’s not clear if Apple devices encrypt to this key directly or if they engage in an OTR-like key exchange protocol. What is clear is that iMessage does not include a ‘key fingerprint’ or any means for users to verify key authenticity, which means fundamentally you have to trust Apple to guarantee the authenticity of your keys. Moreover iMessage allows you to send messages to offline users. It’s not clear how this would work with OTR.

How to ‘backdoor’ an encryption app

Over the past week or so there’s been a huge burst of interest in encryption software. Applications like Silent Circle and RedPhone have seen a major uptick in new installs. CryptoCat alone has seen a zillion new installs, prompting several infosec researchers to nearly die of irritation.

From my perspective this is a fantastic glass of lemonade, if one made from particular bitter lemons. It seems all we ever needed to get encryption into the mainstream was… ubiquitous NSA surveillance. Who knew?

Since I’ve written about encryption software before on this blog, I received several calls this week from reporters who want to know what these apps do. Sooner or later each interview runs into the same question: what happens when somebody plans a crime using one of these? Shouldn’t law enforcement have some way to listen in?

This is not a theoretical matter. The FBI has been floating a very real proposal that will either mandate wiretap backdoors in these systems, or alternatively will impose fines on providers that fail to cough up user data. This legislation goes by the name ‘CALEA II‘, after the CALEA act which governs traditional (POTS) phone wiretapping.

Personally I’m strongly against these measures, particularly the ones that target client software. Mandating wiretap capability jeopardizes users’ legitimate privacy needs and will seriously hinder technical progress in this area. Such ‘backdoors’ may be compromised by the very same criminals we’re trying to stop. Moreover, smart/serious criminals will easily bypass them.

To me, a more interesting question is how such ‘backdoors’ would even work. This isn’t something you can really discuss in an interview, which is why I decided to blog about them. The answers range from ‘dead stupid‘ to ‘diabolically technical‘, with the best answers sitting somewhere in the middle. Even if many of these are pretty obvious from a technical perspective, we can’t really have a debate until they’ve all been spelled out.

And so: in the rest of this post I’m going to discuss five of the most likely ways to add backdoors to end-to-end encryption systems.

1. Don’t use end-to-end encryption in the first place (just say you do)

There’s no need to kick down the door when you already have the keys. Similarly there’s no reason to add a ‘backdoor’ when you already have the plaintext. Unfortunately this is the case for a shocking number of popular chat systems — ranging from Google Talk (er, ‘Hangouts’) to your typical commercial Voice-over-IP system. The same statement also applies to at least some components of more robust systems: for example, Skype text messages.

Many of these systems use encryption at some level, but typically only to protect communications from the end user to the company’s servers. Once there, the data is available to capture or log to your heart’s content.

2. Own the directory service (or be the Certificate Authority)

Fortunately an increasing number of applications really do encrypt voice and text messages end-to-end — meaning that the data is encrypted all the way from sender directly to the recipient. This cuts the service out of the equation (mostly), which is nice. But unfortunately it’s only half the story.

The problem here is that encrypting things is generally the easy bit. The hard part is distributing the keys (key signing parties anyone?) Many ‘end-to-end’ systems — notably Skype*, Apple’s iMessage and Wickr — try to make your life easier by providing a convenient ‘key lookup service’, or else by acting as trusted certificate authorities to sign your keys. Some will even store your secret keys.**

This certainly does make life easier, both for you and the company, should it decide to eavesdrop on you. Since the service controls the key, it can just as easily send you its own public key — or a public key belonging to the FBI. This approach makes it ridiculously easy for providers to run a Man-in-the-Middle attack (MITM) and intercept any data they want.

This is always ‘best’ way to distinguish seirous encryption systems from their lesser cousins. When a company tells you they’re encrypting end-to-end, just ask them: how are you distributing keys? If they can’t answer — or worse, they blabber about ‘military grade encryption’ — you might want to find another service.

3. Metadata is the new data

The best encryption systems push key distribution offline, or even better, perform a true end-to-end key exchange that only involves the parties to the communication. The latter applies to several protocols — notably OTR and ZRTP — used by apps like Silent Circle, RedPhone and CryptoCat.

You still have to worry about the possibility that an attacker might substitute her own key material in the connection (an MITM attack). So the best of these systems add a verification phase in which the parties check a key fingerprint — preferably in person, but possibly by reading it over a voice connection (you know what your friends’ voice sounds like, don’t you?) Some programs will even convert the fingerprint into a short ‘authentication string‘ that you can read to your friend.

From a cryptographic perspective the design of these systems is quite good. But you don’t need to attack the software to get useful information out of them. That’s because while encryption may hide what you say, it doesn’t necessarily hide who you’re talking to.

The problem here is that someone needs to move your (encrypted) data from point A to point B. Typically this work is done by a server operated by the company that wrote the app. While the server may not be able to eavesdrop you, it can easily log the details (including IP addresses) of each call. This is essentially the same data the NSA collects from phone carriers.

Particularly when it comes to VoIP (where anonymity services like Tor just aren’t very effective), this is a big problem. Some companies are out ahead of it: Silent Circle (a company whose founders have threatened chew off their own limbs rather than comply with surveillance orders) don’t log any IP addresses. One hopes the other services are as careful.

But even this isn’t perfect: just because you choose not to collect doesn’t mean you can’t. If the government shows up with a National Security Letter compelling your compliance — or just hacks your servers — that information will obtained.

4. Escrow your keys

If you want to add real eavesdropping backdoors to a properly-designed encryption protocol you have to take things to a whole different level. Generally this requires that you modify the encryption software itself.

If you’re doing this above board you’d refer to it as ‘key escrow‘. A simple technique is just to an extra field to the wire protocol. Each time your clients agree on a session key, you have one of the parties encrypt that key under the public key of a third party (say, the encryption service, or a law enforcement agency). The encrypted key gets shipped along with the rest of the handshake data. PGP used to provide this as an optional feature, and the US government unsuccessfully tried to mandate an escrow-capable system called Clipper.***

In theory key escrow features don’t weaken the system. In practice this is debatable. The security of every connection now depends on the security of your master ‘escrow’ secret key. And experience tells us that wiretapping systems are surprisingly vulnerable. In 2009, for example, a group of Chinese hackers were able to breach the servers used to manage Google’s law enforcement surveillance infrastructure — giving them access to confidential data on every target the US government was surveilling.

One hopes that law enforcement escrow keys would be better secured. But they probably won’t be.

5. Compromise, Update, Exfiltrate

But what if your software doesn’t have escrow functionality? Then it’s time to change the software.

The simplest way to add an eavesdropping function is just to issue a software update. Ship a trustworthy client, ask your users to enable automatic updates, then deliver a new version when you need to. This gets even easier now that some operating systems are adding automatic background app updates.

If updates aren’t an option, there are always software vulnerabilities. If you’re the one developing the software you have some extra capabilities here. All you need to do is keep track of a few minor vulnerabilities in your server-client communication protocol — which may be secured by SSL and thus protected from third party exploits. These can be weaknesses as minor as an uninitialized memory structure or a ‘wild read’ that can be used to scan key material.

Or better yet, put your vulnerabilities in at the level of the crypto implementation itself. It’s terrifyingly easy to break crypto code — for example, the difference between a working random number generator and a badly broken one can be a single line of code, or even a couple of instructions. Re-use some counters in your AES implementation, or (better yet) implement ECDSA without a proper random nonce. You can even exflitrate your keys using a subliminal channel.

Or just write a simple exploit like the normal kids do.

Unfortunately there’s very little we can do about things like this. Probably the best defense is to use open source code, disable software updates until others have reviewed them, and then pray you’re never the target of a National Security Letter. Because if you are — none of this crap is going to save you.

Conclusion

I hope nobody comes away with the wrong idea about any of this. I wouldn’t seriously recommend that anyone add backdoors to a piece of encryption software. In fact, this is just about the worst idea in the world.

That said, encryption software is likely to be a victim of its own success. Either we’ll stay in the technical ghetto, with only a few boring nerds adopting the technology. Or the world will catch on. And then the pressure will come. At that point the authors of these applications are going to face some tough choices. I don’t envy them one bit.

Notes:

* See this wildly out of date security analysis (still available on Skype’s site) for a description of how this system worked circa 2005.

** A few systems (notably Hushmail back in the 90s) will store your secret keys encrypted under a password. This shouldn’t inspire a lot of confidence, since passwords are notoriously easy to crack. Moreover, if the system has a ‘password recovery’ service (such as Apple’s iForgot) you can more or less guarantee that even this kind of encryption isn’t happening.

*** The story of Clipper (and how it failed) is a wonderful one. Go read Matt Blaze’s paper.