Attack of the week: Airdrop tracing

It’s been a while since I wrote an “attack of the week” post, and the fault for this is entirely mine. I’ve been much too busy writing boring posts about Schnorr signatures! But this week’s news brings an exciting story with both technical and political dimensions: new reports claim that Chinese security agencies have developed a technique to trace the sender of AirDrop transmissions.

Typically my “attack of the week” posts are intended to highlight recent research. What’s unusual about this one is that the attack is not really new; it was discovered way back in 2019, when a set of TU Darmstadt researchers — Heinrich, Hollick, Schneider, Stute, and Weinert — reverse-engineered the Apple AirDrop protocol and disclosed several privacy flaws to Apple. (The resulting paper, which appeared in Usenix Security 2021 can be found here.)

What makes this an attack of the week is a piece of news that was initially reported by Bloomberg (here’s some other coverage without paywall) claiming that researchers in China’s Beijing Wangshendongjian Judicial Appraisal Institute have used these vulnerabilities to help police to identify the sender of “unauthorized” AirDrop materials, using a technique based on rainbow tables. While this new capability may not (yet) be in widespread deployment, it represents a new tool that could strongly suppress the use of AirDrop in China and Hong Kong.

And this is a big deal, since AirDrop is apparently one of a few channels that can still be used to disseminate unauthorized protest materials — and indeed, that was used in both places in 2019 and 2022, and (allegedly as a result) has already been subject to various curtailments.

In this post I’m going to talk about the Darmstadt research and how it relates to the news out of Beijing. Finally, I’ll talk a little about what Apple can do about it — something that is likely to be as much of a political problem as a technical one.

As always, the rest of this will be in the “fun” question-and-answer format I use for these posts.

What is AirDrop and why should I care?

Image from Apple. Used without permission.

If you own an iPhone, you already know the answer to this question. Otherwise: AirDrop is an Apple-specific protocol that allows Apple devices to send files (and contacts and other stuff) in a peer-to-peer manner over various wireless protocols, including Bluetooth and WiFi.

The key thing to know about AirDrop is that it has two settings, which can be enabled by a potential receiver. In “Contacts Only” mode, AirDrop will accept files only from people who are in your Contacts list (address book.) When set to “Everyone”, AirDrop will receive files from any random person within transmit range. This latter mode has been extensively used to distribute protest materials in China and Hong Kong, as well as to distribute indecent photos to strangers all over the world.

The former usage of AirDrop became such a big deal in protests that in 2022, Apple pushed a software update exclusively to Chinese users that limited the “Everyone” receive-from mode — ensuring that phones would automatically switch back to “Contacts only” after 10 minutes. The company later extended this software update to all users worldwide, but only after they were extensively criticized for the original move.

Is AirDrop supposed to be private? And how does AirDrop know if a user is in their Contacts list?

While AirDrop is not explicitly advertised as an “anonymous” communication protocol, any system that has your phone talking to strangers has implicit privacy concerns baked into it. This drives many choices around how AirDrop works.

Let’s start with the most important one: do AirDrop senders provide their ID to potential recipients? The answer, at some level, must be “yes.”

The reason for this is straightforward. In order for AirDrop recipients in “Contacts only” mode to check that a sender is in their Contacts list, there must be a way for them to check the sender’s ID. This implies that the sender must somehow reveal their identity to the recipient. And since AirDrop presents a list of possible recipients any time a sending user pops up the AirDrop window, this will happen at “discovery” time — typically before you’ve even decided if you really want to send a file.

But this poses a conundrum: the sender’s phone doesn’t actually know which nearby AirDrop users are willing to receive files from it — i.e., which AirDrop users have the sender in their Contacts — and it won’t know this until it actually talks to them. But talking to them means your phone is potentially shouting at everyone around it all the time, saying something like:

Hi there! My Apple ID is john.doe.28@icloud.com. Will you accept files from me!??

Now forget that this is being done by phones. Instead imagine yourself, as a human being, doing this to every random stranger you encounter on the subway. It should be obvious that this will quickly become a privacy concern, one that would scare even a company that doesn’t care about privacy. But Apple generally does care quite a bit about privacy!

Thus, just solving this basic problem requires a clever way by which phones can figure out whether they should talk to each other — i.e., whether the receiver has the sender in its Contacts — without either side leaking any useful information to random strangers. Fortunately cryptographic researchers have thought a lot about this problem! We’ve even given it a cool name: it’s called Private Set Intersection, or PSI.

To make a long story short: a Private Set Intersection protocol takes a set of strings from the Sender and a set from the Receiver. It gives one (or both) parties the intersection of both sets: that is, the set of entries that appear on both lists. Most critically, a good PSI protocol doesn’t reveal any other information about either of the sets.

In Apple’s case, the Sender would have just a few entries, since you can have a few different email addresses and phone numbers. The Receiver would have a big set containing its entire Contacts list. The output of the protocol would contain either (1) one or more of the Sender’s addresses, or (2) nothing. A PSI protocol would therefore solve Apple’s problem nicely.

Great, so which PSI protocol does Apple use?

The best possible answer to this is: 😔.

For a variety of mildly defensible reasons — which I will come back to in a moment — Apple does not use a secure PSI protocol to solve their AirDrop problem. Instead they did the thing that every software developer does when faced with the choice of doing complicated cryptography or “hacking something together in time for the next ship date”: they threw together their own solution using hash functions.

The TU Darmstadt researchers did a nice job of reverse-engineering Apple’s protocol in their paper. Read it! The important bit happens during the “Discovery” portion of the protocol, which is marked by an HTTPS POST request as shown in the excerpt below:

The very short TL;DR is this:

In the POST request, a sender attaches a truncated SHA-256 hash of its own Apple ID, which is contained within a signed certificate that it gets from Apple. (If the sender has more than one identifier, e.g., a phone number and an email address, this will contain hashes of each one.)
The recipient then hashes every entry in its Contacts list, and compares the results to see if it finds a match.
If the recipient is in Contacts Only mode and finds a match, it indicates this and accepts later file transfers. Otherwise it aborts the connection.

(As a secondary issue, AirDrop also includes a very short [two byte] portion of the same hashes in its BLE advertisements. Two bytes is pretty tiny, which means this shouldn’t leak much information, since many different addresses will collide on a two-byte hash. However, some other researchers have determined that it generally does work well enough to guess identities. Or they may have, the source isn’t translating well for me.)

A second important issue here is that the hash identifiers are apparently stored in logs within the recipient’s phone, which means that to obtain them you don’t have to be physically present when the transfer happens. You can potentially scoop them out of someone else’s phone after the fact.

So what’s the problem?

Many folks who have some experience with cryptography will see the problem immediately. But let’s be explicit.

Hash functions are designed to be one-way. In theory, this means that there is should be no efficient algorithm for “directly” taking the output of a hash function and turning it back into its input. But that guarantee has a huge asterisk: if I can guess a set of possible inputs that could have produced the hash, I can simply hash each one of my guesses and compare it to the target. If one input matches, then chances are overwhelming that I’ve found the right input (also called a pre-image.)

In its most basic form, this naive approach is called a “dictionary attack” based on the idea that one can assemble a dictionary of likely candidates, then test every one. Since these hashes apparently don’t contain any session-dependent information (such as salt), you can even do the hashing in advance to assemble a dictionary of candidate hashes, making the attack even faster.

This approach won’t work if your Apple ID (or phone number) is not guessable. The big question in exploiting this vulnerability is whether it’s possible to assemble a complete list of candidate Apple ID emails and phone numbers. The answer for phone numbers, as the Darmstadt researchers point out, is absolutely yes. Since there are only a few billion phone numbers, it is entirely possible to make a list of every phone number and have a computer grind through them — given a not-unreasonable amount of time. For email addresses this is more complicated, but there are many lists of email addresses in the world, and the Chinese state authorities almost certainly have some good approaches to collecting and/or generating those lists.

As an aside, exploiting these dictionaries can be done in three different ways:

You can make a list of candidate identifiers (or generate them programmatically) and then, given a new target hash, you can hash each identifier and check for a match. This requires you to compute a whole lot of SHA256 hashes for each target you crack, which is pretty fast on a GPU or FPGA (or ASIC) but not optimal.
You can pre-hash the list and make a database of hashes and identifiers. Then when you see a target hash, you just need to do a fast lookup. This means all computation is done once, and lookups are fast. But it requires a ton of storage.
Alternatively, you can use an intermediate approach called a time-memory tradeoff in which you exchange some storage for some computation once the target is found. The most popular technique is called a rainbow table, and it really deserves its own separate blog post, though I will not elaborate today.

The Chinese announcement explicitly mentions a rainbow table, so that’s a good indicator that they’re exploiting this vulnerability.

Well that sucks. What can we, or rather Apple, do about it?

If you’re worried about leaking your identifier, an immediate solution is to turn off AirDrop, assuming such a thing is possible. (I haven’t tried it, so I don’t know if turning this off will really stop your phone from talking to other people!) Alternatively you can unregister your Apple ID, or use a bizarre high-entropy Apple ID that nobody will possibly guess. Apple could also reduce their use of logging.

But those solutions are all terrible.

The proper technical solution is for Apple to replace their hashing-based protocol with a proper PSI protocol, which will — as previously discussed — reveal only one bit of information: whether the receiver has the sender’s address(es) in their Contacts list. Indeed, that’s the solution that the Darmstadt researchers propose. They even devised a Diffie-Hellman-based PSI protocol called “PrivateDrop” and showed that it can be used to solve this problem.

But this is not necessarily an easy solution, for reasons that are both technical and political. It’s worth noting that Apple almost certainly knew from the get-go that their protocol was vulnerable to these attacks — but even if they didn’t, they were told about these issues back in May 2019 by the Darmstadt folks. It’s now 2024, and Chinese authorities are exploiting it. So clearly it was not an easy fix.

Some of this stems from the fact that PSI protocols are more computationally heavy that the hashing-based protocol, and some of it (may) stem from the need for more interaction between each pair of devices. Although these costs are not particularly unbearable, it’s important to remember that phone battery life and BLE/WiFi bandwidth is precious to Apple, so even minor costs are hard to bear. Finally, Apple may not view this as really being an issue.

However in this case there is an even tougher political dimension.

Will Apple even fix this, given that Chinese authorities are now exploiting it?

And here we find the hundred billion dollar question: if Apple actually replaced their existing protocol with PrivateDrop, would that be viewed negatively by the Chinese government?

Those of us on the outside can only speculate about this. However, the facts are pretty worrying: Apple has enormous manufacturing and sales resources located inside of China, which makes them extremely vulnerable to an irritated Chinese government. They have, in the past, taken actions that appeared to be targeted at restricting AirDrop use within China — and although there’s no definitive proof of their motivations, it certainly looked bad.

Finally, Apple has recently been the subject of pressure by the Indian government over its decision to alert journalists about a set of allegedly state-sponsored attacks. Apple’s response to this pressure was to substantially tone down its warnings. And Apple has many fewer resources at stake in India than in China, although that’s slowly changing.

Hence there is a legitimate question about whether it’s politically wise for Apple to make a big technical improvement to their AirDrop privacy, right at the moment that the lack of privacy is being viewed as an asset by authorities in China. Even if this attack isn’t really that critical to law enforcement within China, the decision to “fix” it could very well be seen as a slap in the face.

One hopes that despite all these concerns, we’ll soon see a substantial push to improve the privacy of AirDrop. But I’m not going to hold my breath.

Why encrypted backup is so important

You might have seen the news today that Apple is announcing a raft of improvements to Macs and iOS devices aimed at improving security and privacy. These include FIDO support, improvements to iMessage key verification, and a much anticipated announcement that the company is abandoning their plans for (involuntary) photo scanning.

While every single one of these is exciting, one announcement stands above the others. This is Apple’s decision to roll out (opt-in) end-to-end encryption for iCloud backups. While this is only one partial step in the right direction, it’s still a huge and decisive step — one that I think will substantially raise the bar for cloud security across the whole industry.

If you’re looking for precise details on all of these features, see Apple’s description here or their platform security guide. Others will no doubt have the time to do deep-dive explanations on each one. (I was given a short presentation by Apple today, and was provided the opportunity to ask a bunch of questions that their representative answered thoughtfully. But this is no substitute for a detailed look at the technical specs.)

In the rest of this post I want to zero in on end-to-end encrypted iCloud backup, and why I think this announcement is such a big deal.

Smartphones and cloud backup: the biggest consumer privacy compromise you never heard of

If you’re the typical smartphone or tablet user, your devices have become the primary repository for your private papers, photos and communications. Imagine some document that your grandparents would have kept on a shelf or inside of a locked drawer in their home. Today the equivalent document probably resides in one of your devices. This data is the most personal stuff in a human life: your private family photos, your mail, your financial records, even a history of the books you read and which pages you found meaningful. Of course, it also includes new types of information that are unimaginably more valuable and invasive than anything your grandparents could have ever imagined.

But this is only half the story.

If you’re the typical user, you don’t only keep this data in your device. An exact duplicate exists in a data center hundreds or thousands of miles away from you. Every time you snap a photo, each night while you sleep, this doppelganger is scrupulously synchronized through the tireless efforts of cloud backup software — usually the default software built into your device’s operating system.

It goes without saying that you, dear reader, might not be the typical user. You might be one of the vanishingly small fraction of users who change their devices’ default backup policies. You might be part of the even smaller fraction who back up their phone to a local computer. If you’re one of those people, congratulations: you’ve made good choices. But I would beg you to get over it. You don’t really matter.

The typical user does not make the same choices as you did.

The typical user activates cloud backup because their device urges them to do at setup time and it’s just so easy to go along. The typical user sends their most personal photos to Apple or Google, not because they’ve thought deeply about the implications, but because they can’t afford to lose a decade of family memories when their phone or laptop breaks down. The typical user can’t afford to shell out an extra $300 to purchase extra storage capacity, so they buy a base-model phone and rely on cloud sync to offload the bulk of their photo library into the cloud (for a small monthly fee), so their devices can still do useful things.

And because the typical user does these things, our society does these things.

I am struggling to try to find an analogy for how crazy this is. Imagine your country held a national referendum to decide whether most citizens should be compelled to photocopy their private photos and store them in a centralized library — one that was available to both police and motivated criminals alike. Would anyone vote in favor of that, even if there was technically an annoying way to opt out? As ridiculous as this sounds, it’s effectively what we’ve done to ourselves over the past ten years: but of course we didn’t choose any of it. A handful of Silicon Valley executives made the choice for us, in pursuit of adoption metrics and a “magical” user experience.

What’s done is done, and those repositories now exist.

And that should scare you. It terrifies me, because these data repositories are not only a risk to individual user privacy, they’re effectively a surveillance super-weapon. However much damage as we’ve done to our privacy with search engines and cellphone location data, the private content of our papers is the final frontier in the battle for our privacy. And in less than a decade, we’ve already lost the war.

Apple’s slow motion battle to encrypt your backups

To give credit where it’s due, I think the engineers at Apple and Google were the first to realize what they’d unleashed — maybe even before many of us on the outside were even aware of the scale of the issue.

In 2016, Apple began quietly deploying new infrastructure designed to secure user encryption keys in an “end-to-end” fashion: this means that keys would only be accessible only to the user who generated them. The system Apple deployed was called the “iCloud Key Vault“, and it is consists of hundreds of specialized devices called Hardware Security Modules (HSMs) that live in the company’s data centers. The devices store user encryption keys. Those keys are in turn gated by a user-chosen passcode, which is typically the same passcode you use daily to unlock your device. A user who knows their passcode can ask for a copy of their key. An attacker who can’t guess that passcode (in a small number of attempts) cannot. Most critically: Apple counts themselves in the category of people who might be attackers. This means they went to some trouble to ensure that even they cannot (be forced to) bypass this system.

When it comes to encrypted backup there is essentially one major problem: how to store keys. I’m not saying this is literally the only issue, far from it. But once you’ve found a way for users to securely store and recover their keys, every other element of the system can be hung from that.

The remaining problems are still important! There are still, of course, reasonable concerns that some users will forget their device passcode and thus lose access to backups. You need a good strategy when this does happen. But even if solving these problems took some time and experimentation, it should only have been a matter of time until Apple activated end-to-end encryption for at least a portion of their user base. Once broadly deployed, this feature would have sent a clear signal to motivated attackers that future abuse of cloud backup repositories wasn’t a good place to invest resources.

But this is not quite what happened.

What actually happened is unclear, and Apple refuses to talk about it. But the outlines of what we do know tells a story that is somewhere between “meh” and “ugh“. Specifically, reporting from Reuters indicates that Apple came under pressure from government agencies: these agencies wished Apple to maintain the availability of cleartext backup data, since this is now an important law enforcement priority. Whatever the internal details, the result was not so much a retreat but a rout:

Once the decision was made, the 10 or so experts on the Apple encryption project — variously code-named Plesio and KeyDrop — were told to stop working on the effort, three people familiar with the matter told Reuters.

For what it’s worth, some have offered alternative explanations. John Gruber wrote a post that purports to push back on this reporting, arguing that the main issues were with users who got locked out of their own backups. (Apple has recently addressed this by deploying a feature that allows you to set another user as your recovery assistant.) However even that piece acknowledges that government pressure was likely an issue — a key dispute is about whether the FBI killed the plan, or whether fear of angering the FBI caused Apple to kill its own plan.

Whatever caused it, this setback did not completely close the door on end-to-end encrypted backups, of course. Despite Apple’s reticence, other companies — notably Google and Meta’s WhatsApp — have continued to make progress by deploying end-to-end encrypted systems very similar to Apple’s. At present, the coverage is partial: Google’s system may not encrypt everything, and WhatsApp’s backups are opt-in.

Selective encryption and client-side scanning: a road not taken

As of July 2021 the near-term deployment of end-to-end encrypted backups seemed inevitable to me. In the future, firms would finally launch the technology and demonstrate that it works — at least for some users. This would effectively turn us back towards the privacy world of 2010 and give users a clear distinction between private data and non-private user data. There was another future where that might not happen, but I thought that was unlikely.

One thing I did not foresee was a third possible future: one where firms like Apple rebuilt their encryption so we could have both end-to-end encryption — and governments could have their surveillance too.

In August of last year, Apple proposed such a vision. In a sweeping announcement, the company unveiled a plan to deploy “client-side image scanning” to 1.75 billion iCloud users. The system, billed as part of the company’s “Child Safety” initiative, used perceptual hashing and cryptography to scan users’ private photo libraries for the presence of known child sexual abuse media, or CSAM. This would allow Apple to rapidly identify non-compliant users and, subject to an internal review process, report violators to the police.

Apple’s proposal was not the first system designed to scan cloud-stored photos for such imagery. It was the first system capable of working harmoniously with end-to-end encrypted backups. This fact is due to the specific way that Apple proposed to conduct the scanning.

In previous content scanning systems, user files are scanned on a server. This required that content must be uploaded in plaintext, i.e., unencrypted form, so that the server can process it. Apple’s system, on the other hand, performed the necessary hashing and scanning on the user’s own device — before the data was uploaded. The technical implications of this design are critical: Apple’s scanning would continue to operate even if Apple eventually flipped the switch to activate end-to-end encryption for your private photos (as they did today.)

And let’s please not be dense about this. While Apple’s system did not yet encrypt cloud-stored photos last year (that’s the new announcement Apple made today), encryption plans were the only conceivable reason one would deploy a client-side scanning system. There was no other reasonable explanation.

Users have a difficult time understanding even simple concepts around encryption. And that’s not their fault! Firms constantly say things like “your files are encrypted” even when they store the decryption keys right next to the encrypted data. Now try explaining the difference between “encryption” and “end-to-end encryption” along with forty-six variants of “end-to-end encryption that has some sort of giant asterisk in which certain types of files can be decrypted by your cloud provider and reported to the police.” Who even knows what privacy guarantees those systems would offer you — and how they would evolve. To me it felt like the adoption of these systems would signal the end of a meaningful concept of user-controlled data.

Yet this came very close to happening. It could still happen.

It didn’t though. And to this day I’m not entire sure why. Security and privacy researchers told the company exactly how dangerous the idea was. Apple employees reacted negatively to the proposal. But much to my surprise, the real clincher was the public’s negative reaction: as much as people hate CSAM, people really seemed to hate the idea that their private data might be subject to police surveillance. The company delayed the feature and eventually abandoned it, with today’s result being the end of the saga.

I would love to be a fly on the wall to understand how this went down inside of Apple. I doubt I’ll ever learn what happened. I’m just glad that this is where we wound up.

What’s next?

I wish I could tell you that Apple’s announcement today is the end of the story, and now all of your private data will be magically protected — from hackers, abusive partners and the government. But that is not how things work.

Apple’s move today is an important step. It hardens certain walls: very important, very powerful walls. It will send a clear message to certain attackers that deeper investment in cloud attacks is probably not worthwhile. Maybe. But there is still a lot of work to do.

For one thing, Apple’s proposal (which rolls out in a future release) is opt-in: users will have to activate “Advanced Protection” features for their iCloud account. With luck Apple will learn from this early adoption, and find ways to make it easier to encourage more users to adopt this feature. But that’s a ways off.

And even if Apple does eventually move most of their users into end-to-end encrypted cloud backups, there will always be other ways to compromise someone’s data. Steal their phone, guess their password, jailbreak a partner’s phone, use sophisticated targeted malware. And of course a huge fraction of the world will still live under repressive governments that don’t need to trouble with breaking into cloud providers.

But none of these attacks will be quite as easy as attacks on non-E2E cloud backup, and none will offer quite the same level convenience and scale. Today’s announcement makes me optimistic that we seem to be heading — in fits and starts — to a world where your personal data will belong to you.

Cover photo by Scott Robinson, used under CC license.

A case against security nihilism

This week a group of global newspapers is running a series of articles detailing abuses of NSO Group’s Pegasus spyware. If you haven’t seen any of these articles, they’re worth reading — and likely will continue to be so as more revelations leak out. The impetus for the stories is a leak comprising more than 50,000 phone numbers that are allegedly the targets of NSO’s advanced iPhone/Android malware.

Notably, these targets include journalists and members of various nations’ political opposition parties — in other words, precisely the people who every thinking person worried would be the target of the mass-exploitation software that NSO sells. And indeed, that should be the biggest lesson of these stories: the bad thing everyone said would happen now has.

This is a technical blog, so I won’t advocate for, say, sanctioning NSO Group or demanding answers from the luminaries on NSO’s “governance and compliance” committee. Instead I want to talk a bit about some of the technical lessons we’ve learned from these leaks — and even more at a high level, precisely what’s wrong with shrugging these attacks away.

We should all want perfect security!

Don’t feel bad, targeted attacks are super hard!

A perverse reaction I’ve seen from some security experts is to shrug and say “there’s no such thing as perfect security.” More concretely, some folks argue, this kind of well-resourced targeted attack is fundamentally impossible to prevent — no matter how much effort companies like Apple put into stopping it.

And at the extremes, this argument is not wrong. NSO isn’t some script-kiddy toy. Deploying it costs hundreds of thousands of dollars, and fighting attackers with that level of resources is always difficult. Plus, the argument goes, even if we raise the bar for NSO then someone with even more resources will find their way into the gap — perhaps charging an even more absurd price. So let’s stop crapping on Apple, a company that works hard to improve the baseline security of their products, just because they’re failing to solve an impossible problem.

Still that doesn’t mean today’s version of those products are doing everything they could be to stop attacks. There is certainly more that corporations like Apple and Google could be doing to protect their users. However, the only way we’re going to get those changes is if we demand them.

Not all vectors are created equal

Because spyware is hard to capture, we don’t know precisely how Pegasus works. The forensic details we do have come from an extensive investigation conducted by Amnesty International’s technology group. They describe a sophisticated infection process that proved capable of infecting a fully-patched iPhone 12 running the latest version of Apple’s iOS (14.6).

Many attacks used “network injection” to redirect the victim to a malicious website. That technique requires some control of the local network, which makes it hard to deploy to remote users in other countries. A more worrying set of attacks appear to use Apple’s iMessage to perform “0-click” exploitation of iOS devices. Using this vector, NSO simply “throws” a targeted exploit payload at some Apple ID such as your phone number, and then sits back and waits for your zombie phone to contact its infrastructure.

This is really bad. While cynics are probably correct (for now) that we probably can’t shut down every avenue for compromise, there’s good reason to believe we can close down a vector for 0-interaction compromise. And we should try to do that.

What can we do to make NSO’s life harder?

What we know that these attacks take advantage of fundamental weaknesses in Apple iMessage: most critically, the fact that iMessage will gleefully parse all sorts of complex data received from random strangers, and will do that parsing using crappy libraries written in memory unsafe languages. These issues are hard to fix, since iMessage can accept so many data formats and has been allowed to sprout so much complexity over the past few years.

There is good evidence that Apple realizes the bind they’re in, since they tried to fix iMessage by barricading it behind a specialized “firewall” called BlastDoor. But firewalls haven’t been particularly successful at preventing targeted network attacks, and there’s no reason to think that BlastDoor will do much better. (Indeed, we know it’s probably not doing its job now.)

Adding a firewall is the cheap solution to the problem, and this is probably why Apple chose this as their first line of defense. But actually closing this security hole is going to require a lot more. Apple will have to re-write most of the iMessage codebase in some memory-safe language, along with many system libraries that handle data parsing. They’ll also need to widely deploy ARM mitigations like PAC and MTE in order to make exploitation harder. All of this work has costs and (more importantly) risks associated with it — since activating these features can break all sorts of things, and people with a billion devices can’t afford to have .001% of them crashing every day.

An entirely separate area is surveillance and detection: Apple already performs some remote telemetry to detect processes doing weird things. This kind of telemetry could be expanded as much as possible while not destroying user privacy. While this wouldn’t necessarily stop NSO, it would make the cost of throwing these exploits quite a bit higher — and make them think twice before pushing them out to every random authoritarian government.

It’s the scale, stupid

Critics are correct that fixing these issues won’t stop exploits. The problem that companies like Apple need to solve is not preventing exploits forever, but a much simpler one: they need to screw up the economics of NSO-style mass exploitation.

Targeted exploits have been around forever. What makes NSO special is not that they have some exploits. Rather: NSO’s genius is that they’ve done something that attackers were never incentivized to do in this past: democratize access to exploit technology. In other words, they’ve done precisely what every “smart” tech business is supposed to do: take something difficult and very expensive, and make it more accessible by applying the magic of scale. NSO is basically the SpaceX of surveillance.

But this scalability is not inevitable.

NSO can afford to maintain a 50,000 number target list because the exploits they use hit a particular “sweet spot” where the risk of losing an exploit chain — combined with the cost of developing new ones — is low enough that they can deploy them at scale. That’s why they’re willing to hand out exploitation to every idiot dictator — because right now they think they can keep the business going even if Amnesty International or CitizenLab occasionally catches them targeting some human rights lawyer.

But companies like Apple and Google can raise both the cost and risk of exploitation — not just everywhere, but at least on specific channels like iMessage. This could make NSO’s scaling model much harder to maintain. A world where only a handful of very rich governments can launch exploits (under very careful vetting and controlled circumstances) isn’t a great world, but it’s better than a world where any tin-pot authoritarian can cut a check to NSO and surveil their political opposition or some random journalist.

So how do we get to that world?

In a perfect world, US and European governments would wake up and realize that arming authoritarianism is really is bad for democracy — and that whatever trivial benefit they get from NSO is vastly outweighed by the very real damage this technology is doing to journalism and democratic governance worldwide.

But I’m not holding my breath for that to happen.

In the world I inhabit, I’m hoping that Ivan Krstić wakes up tomorrow and tells his bosses he wants to put NSO out of business. And I’m hoping that his bosses say “great: here’s a blank check.” Maybe they’ll succeed and maybe they’ll fail, but I’ll bet they can at least make NSO’s life interesting.

But Apple isn’t going to do any of this if they don’t think they have to, and they won’t think they have to if people aren’t calling for their heads. The only people who can fix Apple devices are Apple (very much by their own design) and that means Apple has to feel responsible each time an innocent victim gets pwned while using an Apple device. If we simply pat Apple on the head and say “gosh, targeted attacks are hard, it’s not your fault” then this is exactly the level of security we should expect to get — and we’ll deserve it.

Why the FBI can’t get your browsing history from Apple iCloud (and other scary stories)

It’s not every day that I wake up thinking about how people back up their web browsers. Mostly this is because I don’t feel the need to back up any aspect of my browsing. Some people lovingly maintain huge libraries of bookmarks and use fancy online services to organize them. I pay for one of those because I aspire to be that kind of person, but I’ve never been organized enough to use it.

In fact, the only thing I want from my browser is for my history to please go away, preferably as quickly as possible. My browser is a part of my brain, and backing my thoughts up to a cloud provider is the most invasive thing I can imagine. Plus, I’m constantly imagining how I’ll explain specific searches to the FBI.

All of these thoughts are apropos a Twitter thread I saw last night from the Engineering Director on Chrome Security & Privacy at Google, which explains why “browser sync” features (across several platforms) can’t provide end-to-end encryption by default.

Were you aware that no major browser provides *default* end-to-end encryption (E2EE) for sync data (history, bookmarks, etc.)? There are good reasons why, but I recently learned that many people—even in security & privacy—have serious misconceptions about all of this [1/9]
— Justin Schuh 😷 (@justinschuh) March 24, 2021

This thread sent me down a rabbit hole that ended in a series of highly-scientific Twitter polls and frantic scouring of various providers’ documentation. Because while on the one hand Justin’s statement is mostly true, it’s also a bit wrong. Specifically, I learned that Apple really seems to have solved this problem. More interestingly, the specific way that Apple has addressed this problem highlights some strange assumptions that make this whole area unnecessarily messy.

This munging of expectations also helps to explain why “browser sync” features and the related security tradeoffs seem so alien and horrible to me, while other folks think these are an absolute necessity for survival.

Let’s start with the basics.

What is cloud-based browser “sync”, and how secure is it?

Most web browsers (and operating systems with a built-in browser) incorporate some means of “synchronizing” browsing history and bookmarks. By starting with this terminology we’ve already put ourselves on the back foot, since “synchronize” munges together three slightly different concepts:

Synchronizing content across devices. Where, for example you have a phone, a laptop and a tablet all active and in occasional use and want your data to propagate from one to the others.
Backing up your content. Wherein you lose all your device(s) and need to recover this data onto a fresh clean device.
Logging into random computers. If you switch computers regularly (for example, back when we worked in offices) then you might want to be able to quickly download your data from the cloud.

(Note that the third case is kind of weird. It might be a subcase of #1 if you have another device that’s active and can send you the data. It might be a subcase of #2. I hate this one and am sending it to live on a farm upstate.)

You might ask why I call these concepts “very different” when they all seem quite similar. The answer is that I’m thinking about a very specific question: namely, how hard is it to end-to-end encrypt this data so that the cloud provider can’t read it? The answer is different between (at least) the first two cases.

If what we really want to do is synchronize your data across many active devices, then the crypto problem is relatively easy. The devices generate public keys and register them with your cloud provider, and then each one simply encrypts relevant content to the others. Apple has (I believe) begun to implement this across their device ecosystem.

If what we want is cloud backup, however, then the problem is much more challenging. Since the base assumption is that the device(s) might get lost, we can’t store decryption keys there. We could encrypt the data under the user’s device passcode or something, but most users choose terrible passcodes that are trivially subject to dictionary attacks. Services like Apple iCloud and Google (Android) have begun to deploy trusted hardware in their data centers to mitigate this: these “Hardware Security Modules” (HSMs) store encryption keys for each user, and only allow a limited number of password guesses before they wipe the keys forever. This keeps providers and hackers out of your stuff. Yay!

Except: not yay! Because, as Justin points out (and here I’m paraphrasing in my own words) users are the absolute worst. Not only do they choose lousy passcodes, but they constantly forget them. And when they forget their passcode and can’t get their backups, do they blame themselves? Of course not! They blame Justin. Or rather, they complain loudly to their cloud backup providers.

While this might sound like an extreme characterization, remember: when you have a billion users, the extreme ones will show up quite a bit.

The consequence of this, argues Justin, is that most cloud backup services don’t use default end-to-end encryption for browser synchronization, and hence your bookmarks and in this case your browsing history will be stored at your provider in plaintext. Justin’s point is that this decision flows from the typical user’s expectations and is not something providers have much discretion about.

And if that means your browsing history happens to get data-mined, well: the spice must flow.

Except none of this is quite true, thanks to Apple!

The interesting thing about this explanation is that it’s not quite true. I was inclined to believe this explanation, until I went spelunking through the Apple iCloud security docs and found that Apple does things slightly differently.

(Note that I don’t mean to blame Justin for not knowing this. The problem here is that Apple absolutely sucks at communicating their security features to an audience that isn’t obsessed with reading their technical documentation. My students and I happen to be obsessive, and sometimes it pays dividends.)

What I learned from my exploration (and here I pray the documentation is accurate) is that Apple actually does seem to provide end-to-end encryption for browser data. Or more specifically: they provide end-to-end encryption for browser history data starting as of iOS 13.

More concretely, Apple claims that this data is protected “with a passcode”, and that “nobody else but you can read this data.” Presumably this means Apple is using their iCloud Keychain HSMs to store the necessary keys, in a way that Apple itself can’t access.

What’s interesting about the Apple decision is that it appears to explicitly separate browsing history and bookmarks, rather than lumping them into a single take-it-or-leave-it package. Apple doesn’t claim to provide any end-to-end encryption guarantees whatsoever for bookmarks: presumably someone who resets your iCloud account password can get those. But your browsing history is protected in a way that even Apple won’t be able to access, in case the FBI show up with a subpoena.

That seems like a big deal and I’m surprised that it’s gotten so little attention.

Why should browser history be lumped together with bookmarks?

This question gets at the heart of why I think browser synchronization is such an alien concept. From my perspective, browsing history is an incredibly sensitive and personal thing that I don’t want anywhere. Bookmarks, if I actually used them, would be the sort of thing I’d want to preserve.

I can see the case for keeping history on my local devices. It makes autocomplete faster, and it’s nice to find that page I browsed yesterday. I can see the case for (securely) synchronizing history across my active devices. But backing it up to the cloud in case my devices all get stolen? Come on. This is like the difference between backing up my photo library, and attaching a GoPro to my head while I’m using the bathroom.

(And Google’s “sync” services only stores 90 days of history, so it isn’t even a long-term backup.)

One cynical answer to this question is: these two very different forms of data are lumped together because one of them — browser history — is extremely valuable for advertising companies. The other one is valuable to consumers. So lumping them together gets consumers to hand over the sweet, sweet data in exchange for something they want. This might sound critical, but on the other hand, we’re just describing the financial incentive that we know drives most of today’s Internet.

A less cynical answer is that consumers really want to preserve their browsing history. When I asked on Twitter, a bunch of tech folks noted that they use their browsing history as an ad-hoc bookmarking system. This all seemed to make some sense, and so maybe there’s just something I don’t get about browser history.

However, the important thing to keep in mind here is that just because you do this doesn’t mean it should drive a few billion people’s security posture. The implications of prioritizing the availability of browser history backups (as a default) is that vast numbers of people will essentially have their entire history uploaded to the cloud, where it can be accessed by hackers, police and surveillance agencies.

Apple seems to have made a different calculation: not that history isn’t valuable, but that it isn’t a good idea to hold the detailed browser history of a billion human beings in a place where any two-bit police agency or hacker can access it. I have a very hard time faulting them in that.

And if that means a few users get upset, that seems like a good tradeoff to me.

How safe is Apple’s Safe Browsing?

This morning brings new and exciting news from the land of Apple. It appears that, at least on iOS 13, Apple is sharing some portion of your web browsing history with the Chinese conglomerate Tencent. This is being done as part of Apple’s “Fraudulent Website Warning”, which uses the Google-developed Safe Browsing technology as the back end. This feature appears to be “on” by default in iOS Safari, meaning that millions of users could potentially be affected.

apple-safari-ip-addresses-tencent-2 — (image source)

As is the standard for this sort of news, Apple hasn’t provided much — well, any — detail on whose browsing history this will affect, or what sort of privacy mechanisms are in place to protect its users. The changes probably affect only Chinese-localized users (see Github commits, courtesy Eric Romang), although it’s difficult to know for certain. However, it’s notable that Apple’s warning appears on U.S.-registered iPhones.

Regardless of which users are affected, Apple hasn’t said much about the privacy implications of shifting Safe Browsing to use Tencent’s servers. Since we lack concrete information, the best we can do is talk a bit about the technology and its implications. That’s what I’m going to do below.

What is “Safe Browsing”, and is it actually safe?

Several years ago Google noticed that web users tended to blunder into malicious sites as they browsed the web. This included phishing pages, as well as sites that attempted to push malware at users. Google also realized that, due to its unique vantage point, it had the most comprehensive list of those sites. Surely this could be deployed to protect users.

The result was Google’s “safe browsing”. In the earliest version, this was simply an API at Google that would allow your browser to ask Google about the safety of any URL you visited. Since Google’s servers received the full URL, as well as your IP address (and possibly a tracking cookie to prevent denial of service), this first API was kind of a privacy nightmare. (This API still exists, and is supported today as the “Lookup API“.)

To address these concerns, Google quickly came up with a safer approach to, um, “safe browsing”. The new approach was called the “Update API”, and it works like this:

Google first computes the SHA256 hash of each unsafe URL in its database, and truncates each hash down to a 32-bit prefix to save space.
Google sends the database of truncated hashes down to your browser.
Each time you visit a URL, your browser hashes it and checks if its 32-bit prefix is contained in your local database.
If the prefix is found in the browser’s local copy, your browser now sends the prefix to Google’s servers, which ship back a list of all full 256-bit hashes of the matching URLs, so your browser can check for an exact match.

At each of these requests, Google’s servers see your IP address, as well as other identifying information such as database state. It’s also possible that Google may drop a cookie into your browser during some of these requests. The Safe Browsing API doesn’t say much about this today, but Ashkan Soltani noted this was happening back in 2012.

It goes without saying that Lookup API is a privacy disaster. The “Update API” is much more private: in principle, Google should only learn the 32-bit hashes of some browsing requests. Moreover, those truncated 32-bit hashes won’t precisely reveal the identity of the URL you’re accessing, since there are likely to be many collisions in such a short identifier. This provides a form of k-anonymity.

The weakness in this approach is that it only provides some privacy. The typical user won’t just visit a single URL, they’ll browse thousands of URLs over time. This means a malicious provider will have many “bites at the apple” (no pun intended) in order to de-anonymize that user. A user who browses many related websites — say, these websites — will gradually leak details about their browsing history to the provider, assuming the provider is malicious and can link the requests. (Updated to add: There has been some academic research on such threats.)

And this is why it’s so important to know who your provider actually is.

What does this mean for Apple and Tencent?

That’s ultimately the question we should all be asking.

The problem is that Safe Browsing “update API” has never been exactly “safe”. Its purpose was never to provide total privacy to users, but rather to degrade the quality of browsing data that providers collect. Within the threat model of Google, we (as a privacy-focused community) largely concluded that protecting users from malicious sites was worth the risk. That’s because, while Google certainly has the brainpower to extract a signal from the noisy Safe Browsing results, it seemed unlikely that they would bother. (Or at least, we hoped that someone would blow the whistle if they tried.)

But Tencent isn’t Google. While they may be just as trustworthy, we deserve to be informed about this kind of change and to make choices about it. At very least, users should learn about these changes before Apple pushes the feature into production, and thus asks millions of their customers to trust them.

We shouldn’t have to read the fine print

When Apple wants to advertise a major privacy feature, they’re damned good at it. As an example: this past summer the company announced the release of the privacy-preserving “Find My” feature at WWDC, to widespread acclaim. They’ve also been happy to claim credit for their work on encryption, including technology such as iCloud Keychain.

But lately there’s been a troubling silence out of Cupertino, mostly related to the company’s interactions with China. Two years ago, the company moved much of iCloud server infrastructure into mainland China, for default use by Chinese users. It seems that Apple had no choice in this, since the move was mandated by Chinese law. But their silence was deafening. Did the move involve transferring key servers for end-to-end encryption? Would non-Chinese users be affected? Reporters had to drag the answers out of the company, and we still don’t know many of them.

In the Safe Browsing change we have another example of Apple making significant modifications to its privacy infrastructure, largely without publicity or announcement. We have learn about this stuff from the fine print. This approach to privacy issues does users around the world a disservice.

It increasingly feels like Apple is two different companies: one that puts the freedom of its users first, and another that treats its users very differently. Maybe Apple feels it can navigate this split personality disorder and still maintain its integrity.

I very much doubt it will work.

How does Apple (privately) find your offline devices?

At Monday’s WWDC conference, Apple announced a cool new feature called “Find My”. Unlike Apple’s “Find my iPhone“, which uses cellular communication and the lost device’s own GPS to identify the location of a missing phone, “Find My” also lets you find devices that don’t have cellular support or internal GPS — things like laptops, or (and Apple has hinted at this only broadly) even “dumb” location tags that you can attach to your non-electronic physical belongings.

The idea of the new system is to turn Apple’s existing network of iPhones into a massive crowdsourced location tracking system. Every active iPhone will continuously monitor for BLE beacon messages that might be coming from a lost device. When it picks up one of these signals, the participating phone tags the data with its own current GPS location; then it sends the whole package up to Apple’s servers. This will be great for people like me, who are constantly losing their stuff: if I leave my backpack ~~on a tour bus in China~~ in my office, sooner or later someone else will stumble on its signal and I’ll instantly know where to find it.

(It’s worth mentioning that Apple didn’t invent this idea. In fact, companies like Tile have been doing this for quite a while. And yes, they should probably be worried.)

If you haven’t already been inspired by the description above, let me phrase the question you ought to be asking: how is this system going to avoid being a massive privacy nightmare?

Let me count the concerns:

If your device is constantly emitting a BLE signal that uniquely identifies it, the whole world is going to have (yet another) way to track you. Marketers already use WiFi and Bluetooth MAC addresses to do this: Find My could create yet another tracking channel.
It also exposes the phones who are doing the tracking. These people are now going to be sending their current location to Apple (which they may or may not already be doing). Now they’ll also be potentially sharing this information with strangers who “lose” their devices. That could go badly.
Scammers might also run active attacks in which they fake the location of your device. While this seems unlikely, people will always surprise you.

The good news is that Apple claims that their system actually does provide strong privacy, and that it accomplishes this using clever cryptography. But as is typical, they’ve declined to give out the details how they’re going to do it. Andy Greenberg talked me through an incomplete technical description that Apple provided to Wired, so that provides many hints. Unfortunately, what Apple provided still leaves huge gaps. It’s into those gaps that I’m going to fill in my best guess for what Apple is actually doing.

A big caveat: much of this could be totally wrong. I’ll update it relentlessly when Apple tells us more.

Some quick problem-setting

To lay out our scenario, we need to bring several devices into the picture. For inspiration, we’ll draw from the 1950s television series “Lassie”.

A first device, which we’ll call Timmy, is “lost”. Timmy has a BLE radio but no GPS or connection to the Internet. Fortunately, he’s been previously paired with a second device called Ruth, who wants to find him. Our protagonist is Lassie: she’s a random (and unknowing) stranger’s iPhone, and we’ll assume that she has at least an occasional Internet connection and solid GPS. She is also a very good girl. The networked devices communicate via Apple’s iCloud servers, as shown below:

(Since Timmy and Ruth have to be paired ahead of time, it’s likely they’ll both be devices owned by the same person. Did I mention that you’ll need to buy two Apple devices to make this system work? That’s also just fine for Apple.)

Since this is a security system, the first question you should ask is: who’s the bad guy? The answer in this setting is unfortunate: everyone is potentially a bad guy. That’s what makes this problem so exciting.

Keeping Timmy anonymous

The most critical aspect of this system is that we need to keep unauthorized third parties from tracking Timmy, especially when he’s not lost. This precludes some pretty obvious solutions, like having the Timmy device simply shout “Hi my name is Timmy, please call my mom Ruth and let her know I’m lost.” It also precludes just about any unchanging static identifier, even an opaque and random-looking one.

This last requirement is inspired by the development of services that abuse static identifiers broadcast by your devices (e.g., your WiFi MAC address) to track devices as you walk around. Apple has been fighting this — with mixed success — by randomizing things like MAC addresses. If Apple added a static tracking identifier to support the “Find My” system, all of these problems could get much worse.

This requirement means that any messages broadcast by Timmy have to be opaque — and moreover, the contents of these messages must change, relatively frequently, to new values that can’t be linked to the old ones. One obvious way to realize this is to have Timmy and Ruth agree on a long list of random “pseudonyms” for Timmy, and have Timmy pick a different one each time.

This helps a lot. Each time Lassie sees some (unknown) device broadcasting an identifier, she won’t know if it belongs to Timmy: but she can send it up to Apple’s servers along with her own GPS location. In the event that Timmy ever gets lost, Ruth can ask Apple to search for every single one of Timmy‘s possible pseudonyms. Since neither nobody outside of Apple ever learns this list, and even Apple only learns it after someone gets lost, this approach prevents most tracking.

A slightly more efficient way to implement this idea is to use a cryptographic function (like a MAC or hash function) in order to generate the list of pseudonyms from a single short “seed” that both Timmy and Ruth will keep a copy of. This is nice because the data stored by each party will be very small. However, to find Timmy, Ruth must still send all of the pseudonyms — or her “seed” — up to Apple, who will have to search its database for each one.

Hiding Lassie’s location

The pseudonym approach described above should work well to keep Timmy‘s identity hidden from Lassie, and even from Apple (up until the point that Ruth searches for it.) However, it’s got a big drawback: it doesn’t hide Lassie‘s GPS coordinates.

This is bad for at least a couple of reasons. Each time Lassie detects some device broadcasting a message, she needs to transmit her current position (along with the pseudonym she sees) to Apple’s servers. This means Lassie is constantly telling Apple where she is. And moreover, even if Apple promises not to store Lassie‘s identity, the result of all these messages is a huge centralized database that shows every GPS location where some Apple device has been detected.

Note that this data, in the aggregate, can be pretty revealing. Yes, the identifiers of the devices might be pseudonyms — but that doesn’t make the information useless. For example: a record showing that some Apple device is broadcasting from my home address at certain hours of the day would probably reveal when I’m in my house.

An obvious way to prevent this data from being revealed to Apple is to encrypt it — so that only parties who actually need to know the location of a device can see this information. If Lassie picks up a broadcast from Timmy, then the only person who actually needs to know Lassie‘s GPS location is Ruth. To keep this information private, Lassie should encrypt her coordinates under Ruth’s encryption key.

This, of course, raises a problem: how does Lassie get Ruth‘s key? An obvious solution is for Timmy to shout out Ruth’s public key as part of every broadcast he makes. Of course, this would produce a static identifier that would make Timmy‘s broadcasts linkable again.

To solve that problem, we need Ruth to have many unlinkable public keys, so that Timmy can give out a different one with each broadcast. One way to do this is have Ruth and Timmy generate many different shared keypairs (or generate many from some shared seed). But this is annoying and involves Ruth storing many secret keys. And in fact, the identifiers we mentioned in the previous section can be derived by hashing each public key.

A slightly better approach (that Apple may not employ) makes use of key randomization. This is a feature provided by cryptosystems like Elgamal: it allows any party to randomize a public key, so that the randomized key is completely unlinkable to the original. The best part of this feature is that Ruth can use a single secret key regardless of which randomized version of her public key was used to encrypt.

All of this leads to a final protocol idea. Each time Timmy broadcasts, he uses a fresh pseudonym and a randomized copy of Ruth‘s public key. When Lassie receives a broadcast, she encrypts her GPS coordinates under the public key, and sends the encrypted message to Apple. Ruth can send in Timmy‘s pseudonyms to Apple’s servers, and if Apple finds a match, she can obtain and decrypt the GPS coordinates.

Does this solve all the problems?

The nasty thing about this problem setting is that, with many weird edge cases, there just isn’t a perfect solution. For example, what if Timmy is evil and wants to make Lassie reveal her location to Apple? What if Old Man Smithers tries to kidnap Lassie?

At a certain point, the answer to these question is just to say that we’ve done our best: any remaining problems will have to be outside the threat model. Sometimes even Lassie knows when to quit.

A few thoughts on Ray Ozzie’s “Clear” Proposal

Yesterday I happened upon a Wired piece by Steven Levy that covers Ray Ozzie’s proposal for “CLEAR”. I’m quoted at the end of the piece (saying nothing iphone-x-silver-select-2017_AV3 much), so I knew the piece was coming. But since many of the things I said to Levy were fairly skeptical — and most didn’t make it into the piece — I figured it might be worthwhile to say a few of them here.

Ozzie’s proposal is effectively a key escrow system for encrypted phones. It’s receiving attention now due to the fact that Ozzie has a stellar reputation in the industry, and due to the fact that it’s been lauded by law enforcement (and some famous people like Bill Gates). Ozzie’s idea is the just the latest bit of news in this second edition of the “Crypto Wars”, in which the FBI and various law enforcement agencies have been arguing for access to end-to-end encryption technologies — like phone storage and messaging — in the face of pretty strenuous opposition by (most of) the tech community.

In this post I’m going to sketch a few thoughts about Ozzie’s proposal, and about the debate in general. Since this is a cryptography blog, I’m mainly going to stick to the technical, and avoid the policy details (which are substantial). Also, since the full details of Ozzie’s proposal aren’t yet public — some are explained in the Levy piece and some in this patent — please forgive me if I get a few details wrong. I’ll gladly correct.

[Note: I’ve updated this post in several places in response to some feedback from Ray Ozzie. For the updated parts, look for the *. Also, Ozzie has posted some slides about his proposal.]

How to Encrypt a Phone

The Ozzie proposal doesn’t try tackle every form of encrypted data. Instead it focuses like a laser on the simple issue of encrypted phone storage. This is something that law enforcement has been extremely concerned about. It also represents the (relatively) low-hanging fruit of the crypto debate, for essentially two reasons: (1) there are only a few phone hardware manufacturers, and (2) access to an encrypted phone generally only takes place after law enforcement has gained physical access to it.

I’ve written about the details of encrypted phone storage in a couple of previous posts. A quick recap: most phone operating systems encrypt a large fraction of the data stored on your device. They do this using an encryption key that is (typically) derived from the user’s passcode. Many recent phones also strengthen this key by “tangling” it with secrets that are stored within the phone itself — typically with the assistance of a secure processor included in the phone. This further strengthens the device against simple password guessing attacks.

The upshot is that the FBI and local law enforcement have not — until very recently (more on that further below) — been able to obtain access to many of the phones they’ve obtained during investigation. This is due the fact that, by making the encryption key a function of the user’s passcode, manufacturers like Apple have effectively rendered themselves unable to assist law enforcement.

The Ozzie Escrow Proposal

Ozzie’s proposal is called “Clear”, and it’s fairly straightforward. Effectively, it calls for manufacturers (e.g., Apple) to deliberately put themselves back in the loop. To do this, Ozzie proposes a simple form of key escrow (or “passcode escrow”). I’m going to use Apple as our example in this discussion, but obviously the proposal will apply to other manufacturers as well.

Ozzie’s proposal works like this:

Prior to manufacturing a phone, Apple will generate a public and secret “keypair” for some public key encryption scheme. They’ll install the public key into the phone, and keep the secret key in a “vault” where hopefully it will never be needed.
When a user sets a new passcode onto their phone, the phone will encrypt a passcode under the Apple-provided public key. This won’t necessarily be the user’s passcode, but it will be an equivalent passcode that can unlock the phone.* It will store the encrypted result in the phone’s storage.
In the unlikely event that the FBI (or police) obtain the phone and need to access its files, they’ll place the phone into some form of law enforcement recovery mode. Ozzie describes doing this with some special gesture, or “twist”. Alternatively, Ozzie says that Apple itself could do something more complicated, such as performing an interactive challenge/response with the phone in order to verify that it’s in the FBI’s possession.
The phone will now hand the encrypted passcode to law enforcement. (In his patent, Ozzie suggests it might be displayed as a barcode on a screen.)
The law enforcement agency will send this data to Apple, who will do a bunch of checks (to make sure this is a real phone and isn’t in the hands of criminals). Apple will access their secret key vault, and decrypt the passcode. They can then send this back to the FBI.
Once the FBI enters this code, the phone will be “bricked”. Let me be more specific: Ozzie proposes that once activated, a secure chip inside the phone will now permanently “blow” several JTAG fuses monitored by the OS, placing the phone into a locked mode. By reading the value of those fuses as having been blown, the OS will never again overwrite its own storage, will never again talk to any network, and will become effectively unable to operate as a normal phone again.

When put into its essential form, this all seems pretty simple. That’s because it is. In fact, with the exception of the fancy “phone bricking” stuff in step (6), Ozzie’s proposal is a straightforward example of key escrow — a proposal that people have been making in various guises for many years. The devil is always in the details.

A vault of secrets

If we picture how the Ozzie proposal will change things for phone manufacturers, the most obvious new element is the key vault. This is not a metaphor. It literally refers to a giant, ultra-secure vault that will have to be maintained individually by different phone manufacturers. The security of this vault is no laughing matter, because it will ultimately store the master encryption key(s) for every single device that manufacturer ever makes. For Apple alone, that’s about a billion active devices.

Does this vault sound like it might become a target for organized criminals and well-funded foreign intelligence agencies? If it sounds that way to you, then you’ve hit on one of the most challenging problems with deploying key escrow systems at this scale. Centralized key repositories — that can decrypt every phone in the world — are basically a magnet for the sort of attackers you absolutely don’t want to be forced to defend yourself against.

So let’s be clear. Ozzie’s proposal relies fundamentally on the ability of manufacturers to secure extremely valuable key material for a massive number of devices against the strongest and most resourceful attackers on the planet. And not just rich companies like Apple. We’re also talking about the companies that make inexpensive phones and have a thinner profit margin. We’re also talking about many foreign-owned companies like ZTE and Samsung. This is key material that will be subject to near-constant access by the manufacturer’s employees, who will have to access these keys regularly in order to satisfy what may be thousands of law enforcement access requests every month.

If ever a single attacker gains access to that vault and is able to extract, a few “master” secret keys (Ozzie says that these master keys will be relatively small in size*) then the attackers will gain unencrypted access to every device in the world. Even better: if the attackers can do this surreptitiously, you’ll never know they did it.

Now in fairness, this element of Ozzie’s proposal isn’t really new. In fact, this key storage issue an inherent aspect of all massive-scale key escrow proposals. In the general case, the people who argue in favor of such proposals typically make two arguments:

We already store lots of secret keys — for example, software signing keys — and things works out fine. So this isn’t really a new thing.
Hardware Security Modules.

Let’s take these one at a time.

It is certainly true that software manufacturers do store secret keys, with varying degrees of success. For example, many software manufacturers (including Apple) store secret keys that they use to sign software updates. These keys are generally locked up in various ways, and are accessed periodically in order to sign new software. In theory they can be stored in hardened vaults, with biometric access controls (as the vaults Ozzie describes would have to be.)

But this is pretty much where the similarity ends. You don’t have to be a technical genius to recognize that there’s a world of difference between a key that gets accessed once every month — and can be revoked if it’s discovered in the wild — and a key that may be accessed dozens of times per day and will be effectively undetectable if it’s captured by a sophisticated adversary.

Moreover, signing keys leak all the time. The phenomenon is so common that journalists have given it a name: it’s called “Stuxnet-style code signing”. The name derives from the fact that the Stuxnet malware — the nation-state malware used to sabotage Iran’s nuclear program — was authenticated with valid code signing keys, many of which were (presumably) stolen from various software vendors. This practice hasn’t remained with nation states, unfortunately, and has now become common in retail malware.

The folks who argue in favor of key escrow proposals generally propose that these keys can be stored securely in special devices called Hardware Security Modules (HSMs). Many HSMs are quite solid. They are not magic, however, and they are certainly not up to the threat model that a massive-scale key escrow system would expose them to. Rather than being invulnerable, they continue to cough up vulnerabilities like this one. A single such vulnerability could be game-over for any key escrow system that used it.

In some follow up emails, Ozzie suggests that keys could be “rotated” periodically, ensuring that even after a key compromise the system could renew security eventually. He also emphasizes the security mechanisms (such as biometric access controls) that would be present in such a vault. I think that these are certainly valuable and necessary protections, but I’m not convinced that they would be sufficient.

Assume a secure processor

Let’s suppose for a second that an attacker does get access to the Apple (or Samsung, or ZTE) key vault. In the section above I addressed the likelihood of such an attack. Now let’s talk about the impact.

Ozzie’s proposal has one significant countermeasure against an attacker who wants to use these stolen keys to illegally spy on (access) your phone. Specifically, should an attacker attempt to illegally access your phone, the phone will be effectively destroyed. This doesn’t protect you from having your files read — that horse has fled the stable — but it should alert you to the fact that something fishy is going on. This is better than nothing.

This measure is pretty important, not only because it protects you against evil maid attacks. As far as I can tell, this protection is pretty much the only measure by which theft of the master decryption keys might ever be detected. So it had better work well.

The details on how this might work aren’t very clear in Ozzie’s patent, but the Wired article describes it as follows. This quote to repeat Ozzie’s presentation at Columbia University:

DbptcOuX4AIR3vy

What Ozzie appears to describe here is a secure processor contained within every phone. This processor would be capable if securely and irreversibly enforcing that once law enforcement has accessed a phone, that phone could no longer be placed into an operational state.

My concern with this part of Ozzie’s proposal is fairly simple: this processor does not currently exist. To explain why this, let me tell a story.

Back in 2013, Apple began installing a secure processor in each of their phones. While this secure processor (called the Secure Enclave Processor, or SEP) is not exactly the same as the one Ozzie proposes, the overall security architecture seems very similar.

One main goal of Apple’s SEP was to limit the number of passcode guessing attempts that a user could make against a locked iPhone. In short, it was designed to keep track of each (failed) login attempt and keep a counter. If the number of attempts got too high, the SEP would make the user wait a while — in the best case — or actively destroy the phone’s keys. This last protection is effectively identical to Ozzie’s proposal. (With some modest differences: Ozzie proposes to “blow fuses” in the phone, rather than erasing a key; and he suggests that this event would triggered by entry of a recovery passcode.*)

For several years, the SEP appeared to do its job fairly effectively. Then in 2017, everything went wrong. Two firms, Cellebrite and Grayshift, announced that they had products that effectively unlocked every single Apple phone, without any need to dismantle the phone. Digging into the details of this exploit, it seems very clear that both firms — working independently — have found software exploits that somehow disable the protections that are supposed to be offered by the SEP.

The cost of this exploit (to police and other law enforcement)? About $3,000-$5,000 per phone. Or (if you like to buy rather than rent) about $15,000. Aso, just to add an element of comedy to the situation, the GrayKey source code appears to have recently been stolen. The attackers are extorting the company for two Bitcoin. Because 2018. (🤡👞)

Let me sum this up my point in case I’m not beating you about the head quite enough:

The richest and most sophisticated phone manufacturer in the entire world tried to build a processor that achieved goals similar to those Ozzie requires. And as of April 2018, after five years of trying, they have been unable to achieve this goal — a goal that is critical to the security of the Ozzie proposal as I understand it.

Now obviously the lack of a secure processor today doesn’t mean such a processor will never exist. However, let me propose a general rule: if your proposal fundamentally relies on a secure lock that nobody can ever break, then it’s on you to show me how to build that lock.

Conclusion

While this mainly concludes my notes about on Ozzie’s proposal, I want to conclude this post with a side note, a response to something I routinely hear from folks in the law enforcement community. This is the criticism that cryptographers are a bunch of naysayers who aren’t trying to solve “one of the most fundamental problems of our time”, and are instead just rejecting the problem with lazy claims that it “can’t work”.

As a researcher, my response to this is: phooey.

Cryptographers — myself most definitely included — love to solve crazy problems. We do this all the time. You want us to deploy a new cryptocurrency? No problem! Want us to build a system that conducts a sugar-beet auction using advanced multiparty computation techniques? Awesome. We’re there. No problem at all.

But there’s crazy and there’s crazy.

The reason so few of us are willing to bet on massive-scale key escrow systems is that we’ve thought about it and we don’t think it will work. We’ve looked at the threat model, the usage model, and the quality of hardware and software that exists today. Our informed opinion is that there’s no detection system for key theft, there’s no renewability system, HSMs are terrifically vulnerable (and the companies largely staffed with ex-intelligence employees), and insiders can be suborned. We’re not going to put the data of a few billion people on the line an environment where we believe with high probability that the system will fail.

Maybe that’s unreasonable. If so, I can live with that.

Apple in China: who holds the keys?

Last week Apple made an announcement describing changes to the iCloud service for tuhao-gold-iphone-640x405 users residing in mainland China. Beginning on February 28th, all users who have specified China as their country/region will have their iCloud data transferred to the GCBD cloud services operator in Guizhou, China.

Chinese news sources optimistically describe the move as a way to offer improved network performance to Chinese users, while Apple admits that the change was mandated by new Chinese regulations on cloud services. Both explanations are almost certainly true. But neither answers the following question: regardless of where it’s stored, how secure is this data?

Apple offers the following:

Apple has strong data privacy and security protections in place and no backdoors will be created into any of our systems”

That sounds nice. But what, precisely, does it mean? If Apple is storing user data on Chinese services, we have to at least accept the possibility that the Chinese government might wish to access it — and possibly without Apple’s permission. Is Apple saying that this is technically impossible?

This is a question, as you may have guessed, that boils down to encryption.

Does Apple encrypt your iCloud backups?

Unfortunately there are many different answers to this question, depending on which part of iCloud you’re talking about, and — ugh — which definition you use for “encrypt”. The dumb answer is the one given in the chart on the right: all iCloud data probably is encrypted. But that’s the wrong question. The right question is: who holds the key(s)?

Untitled 4 — This kind of thing is Not Helpful.

There’s a pretty simple thought experiment you can use to figure out whether you (or a provider) control your encryption keys. I call it the “mud puddle test”. It goes like this:

Imagine you slip in a mud puddle, in the process (1) destroying your phone, and (2) developing temporary amnesia that causes you to forget your password. Can you still get your iCloud data back? If you can (with the help of Apple Support), then you don’t control the key.

With one major exception — iCloud Keychain, which I’ll discuss below — iCloud fails the mud puddle test. That’s because most Apple files are not end-to-end encrypted. In fact, Apple’s iOS security guide is clear that it sends the keys for encrypted files out to iCloud.

However, there is a wrinkle. You see, iCloud isn’t entirely an Apple service, not even here in the good-old U.S.A. In fact, the vast majority of iCloud data isn’t actually stored by Apple at all. Every time you back up your phone, your (encrypted)

Untitled 6 — A list of HTTPS requests made during an iCloud backup from an iPhone. The bottom two addresses are Amazon and Google Cloud Services “blob” stores.

data is transmitted directly to a variety of third-party cloud service providers including Amazon, Google and Microsoft.

And this is, from a privacy perspective, mostly** fine! Those services act merely as “blob stores”, storing unreadable encrypted data files uploaded by Apple’s customers. At least in principle, Apple controls the encryption keys for that data, ideally on a server located in a dedicated Apple datacenter.*

So what exactly is Apple storing in China?

Good question!

You see, it’s entirely possible that the new Chinese cloud stores will perform the same task that Amazon AWS, Google, or Microsoft do in the U.S. That is, they’re storing encrypted blobs of data that can’t be decrypted without first contacting the iCloud mothership back in the U.S. That would at least be one straightforward reading of Apple’s announcement, and it would also be the most straightforward mapping from iCloud’s current architecture and whatever it is Apple is doing in China.

Of course, this interpretation seems hard to swallow. In part this is due to the fact that some of the new Chinese regulations appear to include guidelines for user monitoring. I’m no lawyer, and certainly not an expert in Chinese law — so I can’t tell you if those would apply to backups. But it’s at least reasonable to ask whether Chinese law enforcement agencies would accept the total inability to access this data without phoning home to Cupertino, not to mention that this would give Apple the ability to instantly wipe all Chinese accounts. Solving these problems (for China) would require Apple to store keys as well as data in Chinese datacenters.

The critical point is that these two interpretations are not compatible. One implies that Apple is simply doing business as usual. The other implies that they may have substantially weakened the security protections of their system — at least for Chinese users.

And here’s my problem. If Apple needs to fundamentally rearchitect iCloud to comply with Chinese regulations, that’s certainly an option. But they should say explicitly and unambiguously what they’ve done. If they don’t make things explicit, then it raises the possibility that they could make the same changes for any other portion of the iCloud infrastructure without announcing it.

It seems like it would be a good idea for Apple just to clear this up a bit.

You said there was an exception. What about iCloud Keychain?

I said above that there’s one place where iCloud passes the mud puddle test. This is Apple’s Cloud Key Vault, which is currently used to implement iCloud Keychain. This is a special service that stores passwords and keys for applications, using a much stronger protection level than is used in the rest of iCloud. It’s a good model for how the rest of iCloud could one day be implemented.

For a description, see here. Briefly, the Cloud Key Vault uses a specialized piece of hardware called a Hardware Security Module (HSM) to store encryption keys. This HSM is a physical box located on Apple property. Users can access their own keys if and only if they know their iCloud Keychain password — which is typically the same as the PIN/password on your iOS device. However, if anyone attempts to guess this PIN too many times, the HSM will wipe that user’s stored keys.

The critical thing is that the “anyone” mentioned above includes even Apple themselves. In short: Apple has designed a key vault that even they can’t be forced to open. Only customers can get their own keys.

What’s strange about the recent Apple announcement is that users in China will apparently still have access to iCloud Keychain. This means that either (1) at least some data will be totally inaccessible to the Chinese government, or (2) Apple has somehow weakened the version of Cloud Key Vault deployed to Chinese users. The latter would be extremely unfortunate, and it would raise even deeper questions about the integrity of Apple’s systems.

Probably there’s nothing funny going on, but this is an example of how Apple’s vague (and imprecise) explanations make it harder to trust their infrastructure around the world.

So what should Apple do?

Unfortunately, the problem with Apple’s disclosure of its China’s news is, well, really just a version of the same problem that’s existed with Apple’s entire approach to iCloud.

Where Apple provides overwhelming detail about their best security systems (file encryption, iOS, iMessage), they provide distressingly little technical detail about the weaker links like iCloud encryption. We know that Apple can access and even hand over iCloud backups to law enforcement. But what about Apple’s partners? What about keychain data? How is this information protected? Who knows.

This vague approach to security might make it easier for Apple to brush off the security impact of changes like the recent China news (“look, no backdoors!”) But it also confuses the picture, and calls into doubt any future technical security improvements that Apple might be planning to make in the future. For example, this article from 2016 claims that Apple is planning stronger overall encryption for iCloud. Are those plans scrapped? And if not, will those plans fly in the new Chinese version of iCloud? Will there be two technically different versions of iCloud? Who even knows?

And at the end of the day, if Apple can’t trust us enough to explain how their systems work, then maybe we shouldn’t trust them either.

Notes:

* This is actually just a guess. Apple could also outsource their key storage to a third-party provider, even though this would be dumb.

** A big caveat here is that some iCloud backup systems use convergent encryption, also known as “message locked encryption”. The idea in these systems is that file encryption keys are derived by hashing the file itself. Even if a cloud storage provider does not possess encryption keys, it might be able to test if a user has a copy of a specific file. This could be problematic. However, it’s not really clear from Apple’s documentation if this attack is feasible. (Thanks to RPW for pointing this out.)

Secure computing for journalists

This morning on Twitter, Buzzfeed editor Miriam Elder asks the following question:

Possibly stupid question: is the Signal desktop client as secure as the mobile app?

— Miriam Elder (@MiriamElder) March 3, 2017

No, this is not a stupid question. Actually it’s an extremely important question, and judging by some of the responses to this Tweet there are a lot of other people who are confused about the answer.

Since I couldn’t find a perfect layperson’s reference anywhere else, I’m going to devote this post to providing the world’s simplest explanation of why, in the threat model of your typical journalist, your desktop machine isn’t very safe. And specifically, why you’re safer using a modern mobile device — and particularly, an iOS device — than just about any other platform.

A brief caveat: I’m a cryptographer, not a software security researcher. However, I’ve spent the past several years interacting with folks like Charlie and Dan and Thomas. I’m pretty confident that they agree with this advice.

What’s wrong with my laptop/desktop machine?

Sadly, most of the problem is you.

If you’re like most journalists — and really, most professionals — you spend less than 100% of your time thinking about security. You need to get work done. When you’re procrastinating from work, you visit funny sites your friends link you to on Facebook. Then you check your email. If you’re a normal and productive user, you probably do a combination of all these things every few minutes, all of which culminates in your downloading some email attachment and (shudder) opening it in Word.

You can’t tell journalists “just don’t open attachments.” They will ignore you. Journos open attachments from strangers for a living.

— Eva (@evacide) September 12, 2016

Now I’m not trying to shame you for this. It’s perfectly normal, and indeed it’s necessary if you want to get things done. But in the parlance of security professionals, it also means you have a huge attack surface.

In English, this means that from the perspective of an attacker there are many different avenues to compromise your machine. Many of these aren’t even that sophisticated. Often it’s just a matter of catching you during an unguarded moment and convincing you to download an executable file or an infected Office document. A compromised machine means that every piece of software on that machine is also vulnerable.

If you don’t believe this works, head over to Google and search for “Remote Access Trojans”. There’s an entire commercial market for these products, each of which allows you to remotely control someone else’s computer. These off-the-shelf products aren’t very sophisticated: indeed, most require you to trick your victim into downloading and running some executable attachment. Sadly, this works on most people just fine. And this is just the retail stuff. Imagine what a modestly sophisticated attacker can do.

I do some of those things on my phone as well. Why is a phone better?

Classical (desktop and laptop) operating systems were designed primarily to support application developers. This means they offer a lot of power to your applications. An application like Microsoft Word can typically read and write all the files available to your account. If Word becomes compromised, this is usually enough to pwn you in practice. And in many cases, these applications have components with root (or Administrator) access, which makes them even more dangerous.

Modern phone operating systems like Android and iOS were built on a different principle. Rather than trusting apps with much power, each app runs in a “sandbox” that (mainly) limits it to accessing its own files. If the sandbox works, even a malicious application shouldn’t be able to reach out to touch other apps’ files or permanently modify your system. This approach — combined with other protections such as in-memory code signing, hardware secret storage and routine use of anti-exploitation measures — makes your system vastly harder to compromise.

Of course, sandboxing isn’t perfect. A compromised or malicious app can always access its own files. More sophisticated exploits can “break out” of the sandbox, typically by exploiting a vulnerability in the operating system. Such vulnerabilities are routinely discovered and occasionally exploited.

The defense to this is twofold: (1) first, run a modern, up-to-date OS that receives security patches quickly. And (2) avoid downloading malicious apps. Which brings me to the main point of this post.

Why use iOS?

The fact of the matter is that when it comes to addressing these remaining issues, Apple phone operating systems (on iPhones and iPads) simply have a better track record.

Since Apple is the only manufacturer of iOS devices, there is no “middleman” when it comes to monitoring for iOS issues and deploying iOS security updates. This means that the buck stops at Apple — rather than with some third-party equipment manufacturer. Indeed, Apple routinely patches its operating systems and pushes the patches to all supported users — sometimes within hours of learning of a vulnerability (something that is relatively rare at this point in any case).

Of course, to be fair: Google has also become fairly decent at supporting its own Android devices. However, to get assurance from this process you need to be running a relatively brand new device and it needs to be manufactured by Google. Otherwise you’re liable to be several days or weeks behind the time when a security issue is discovered and patched — if you ever get it. And Google still does not support all of the features Apple does, including in-memory code signing and strong file encryption.

Apple also seems to do a relatively decent job at curating its App Store, at least as compared to Google. And because those apps support a more modern base of phones, they tend to have access to better security features, whereas Android apps more routinely get caught doing dumb stuff for backwards compatibility reasons.

640x960 — A password manager using the SEP.

Finally, every recent Apple device (starting with the iPhone 5S and up) also includes a specialized chip known as a “Secure Enclave Processor“. This hardened processor assists in securing the boot chain — ensuring that nobody can tamper with your operating system. It can also protect sensitive values like your passwords, ensuring that only a password or fingerprint can access them.

A few Android phones also offer similar features as well. However, it’s unclear how well these are implemented in contrast to Apple’s SEP. It’s not a bet I would choose to take.

So does using iOS mean I’m perfectly safe?

Of course not. Unfortunately, computer security today is about resisting attacks. We still don’t quite know how to prevent them altogether.

Indeed, well-funded attackers like governments are still capable of compromising your iOS device (and your Android, and your PC or Mac). Literally the only question is how much they’ll have to spend doing it.

Here’s one data point. Last year a human rights activist in the UAE was targeted via a powerful zero day exploit, likely by his government. However, he was careful. Instead of clicking the link he was sent, the activist sent it to the engineers at Citizenlab who reverse-engineered the exploit. The resulting 35-page technical report by Lookout Security and Citizenlab is a thing of terrifying beauty: it describes a chain of no less than three previously unpublished software exploits, which together would have led to the complete compromise of the victim’s iPhone.

But such compromises don’t come cheap. It’s easy to see this kind of attack costing a million dollars or more. This is probably orders of magnitude more than it would cost to compromise the typical desktop user. That’s important. Not perfect, but important.

You’re telling me I have to give up my desktop machine?

Not at all. Or rather, while I’d love to tell you that, I understand this may not be realistic for most users.

All I am telling you to do is to be thoughtful. If you’re working on something sensitive, consider moving the majority of that work (and communications) to a secure device until you’re ready to share it. This may be a bit of a hassle, but it doesn’t have to be your whole life. And since most of us already carry some sort of phone or tablet in addition to our regular work computer, hopefully this won’t require too much of a change in your life.

You can still use your normal computer just fine, as long as you’re aware of the relative risks. That’s all I’m trying to accomplish with this post.

In conclusion

I expect that many technical people will find this post objectionable, largely because they assume that with their expertise and care they can make a desktop operating system work perfectly safely. And maybe they can! But that’s not who this post is addressed to.

And of course, this post still only scratches the surface of the problem. There’s still the problem of selecting the right applications for secure messaging (e.g., Signal and WhatsApp) and finding a good secure application for notetaking and document collaboration and so on.

But hopefully this post at least starts the discussion.

The limitations of Android N Encryption

Over the past few years pixelphone we’ve heard more about smartphone encryption than, quite frankly, most of us expected to hear in a lifetime. We learned that proper encryption can slow down even sophisticated decryption attempts if done correctly. We’ve also learned that incorrect implementations can undo most of that security.

In other words, phone encryption is an area where details matter. For the past few weeks I’ve been looking a bit at Android Nougat’s new file-based encryption to see how well they’ve addressed some of those details in their latest release. The answer, unfortunately, is that there’s still lots of work to do. In this post I’m going to talk about a bit of that.

(As an aside: the inspiration for this post comes from Grugq, who has been loudly and angrily trying to work through these kinks to develop a secure Android phone. So credit where credit is due.)

Background: file and disk encryption

Disk encryption is much older than smartphones. Indeed, early encrypting filesystems date back at least to the early 1990s and proprietary implementations may go back before that. Even in the relatively new area of PCs operating systems, disk encryption has been a built-in feature since the early 2000s.

The typical PC disk encryption system operates as follows. At boot time you enter a password. This is fed through a key derivation function to derive a cryptographic key. If a hardware co-processor is available (e.g., a TPM), your key is further strengthened by “tangling” it with some secrets stored in the hardware. This helps to lock encryption to a particular device.

The actual encryption can be done in one of two different ways:

Full Disk Encryption (FDE) systems (like Truecrypt, BitLocker and FileVault) encrypt disks at the level of disk sectors. This is an all-or-nothing approach, since the encryption drivers won’t necessarily have any idea what files those sectors represent. At the same time, FDE is popular — mainly because it’s extremely easy to implement.
File-based Encryption (FBE) systems (like EncFS and eCryptFS) encrypt individual files. This approach requires changes to the filesystem itself, but has the benefit of allowing fine grained access controls where individual files are encrypted using different keys.

Most commercial PC disk encryption software has historically opted to use the full-disk encryption (FDE) approach. Mostly this is just a matter of expediency: FDE is just significantly easier to implement. But philosophically, it also reflects a particular view of what disk encryption was meant to accomplish.

In this view, encryption is an all-or-nothing proposition. Your machine is either on or off; accessible or inaccessible. As long as you make sure to have your laptop stolen only when it’s off, disk encryption will keep you perfectly safe.

So what does this have to do with Android?

Android’s early attempts at adding encryption to their phones followed the standard PC full-disk encryption paradigm. Beginning in Android 4.4 (Kitkat) through Android 6.0 (Marshmallow), Android systems shipped with a kernel device mapper called dm-crypt designed to encrypt disks at the sector level. This represented a quick and dirty way to bring encryption to Android phones, and it made sense — if you believe that phones are just very tiny PCs.

The problem is that smartphones are not PCs.

The major difference is that smartphone users are never encouraged to shut down their device. In practice this means that — after you enter a passcode once after boot — normal users spend their whole day walking around with all their cryptographic keys in RAM. Since phone batteries live for a day or more (a long time compared to laptops) encryption doesn’t really offer much to protect you against an attacker who gets their hands on your phone during this time.

Of course, users do lock their smartphones. In principle, a clever implementation could evict sensitive cryptographic keys from RAM when the device locks, then re-derive them the next time the user logs in. Unfortunately, Android doesn’t do this — for the very simple reason that Android users want their phones to actually work. Without cryptographic keys in RAM, an FDE system loses access to everything on the storage drive. In practice this turns it into a brick.

For this very excellent reason, once you boot an Android FDE phone it will never evict its cryptographic keys from RAM. And this is not good.

So what’s the alternative?

Android is not the only game in town when it comes to phone encryption. Apple, for its part, also gave this problem a lot of thought and came to a subtly different solution.

Starting with iOS 4, Apple included a “data protection” feature to encrypt all data stored a device. But unlike Android, Apple doesn’t use the full-disk encryption paradigm. Instead, they employ a file-based encryption approach that individually encrypts each file on the device.

In the Apple system, the contents of each file is encrypted under a unique per-file key (metadata is encrypted separately). The file key is in turn encrypted with one of several “class keys” that are derived from the user passcode and some hardware secrets embedded in the processor.

The main advantage of the Apple approach is that instead of a single FDE key to rule them all, Apple can implement fine-grained access control for individual files. To enable this, iOS provides an API developers can use to specify which class key to use in encrypting any given file. The available “protection classes” include:

Complete protection. Files encrypted with this class key can only be accessed when the device is powered up and unlocked. To ensure this, the class key is evicted from RAM a few seconds after the device locks.
Protected Until First User Authentication. Files encrypted with this class key are protected until the user first logs in (after a reboot), and the key remains in memory.
No protection. These files are accessible even when the device has been rebooted, and the user has not yet logged in.

By giving developers the option to individually protect different files, Apple made it possible to build applications that can work while the device is locked, while providing strong protection for files containing sensitive data.

Apple even created a fourth option for apps that simply need to create new encrypted files when the class key has been evicted from RAM. This class uses public key encryption to write new files. This is why you can safely take pictures even when your device is locked.

Apple’s approach isn’t perfect. What it is, however, is the obvious result of a long and careful thought process. All of which raises the following question…

Why the hell didn’t Android do this as well?

The short answer is Android is trying to. Sort of. Let me explain.

As of Android 7.0 (Nougat), Google has moved away from full-disk encryption as the primary mechanism for protecting data at rest. If you set a passcode on your device, Android N systems can be configured to support a more Apple-like approach that uses file encryption. So far so good.

The new system is called Direct Boot, so named because it addresses what Google obviously saw as fatal problem with Android FDE — namely, that FDE-protected phones are useless bricks following a reboot. The main advantage of the new model is that it allows phones to access some data even before you enter the passcode. This is enabled by providing developers with two separate “encryption contexts”:

Credential encrypted storage. Files in this area are encrypted under the user’s passcode, and won’t be available until the user enters their passcode (once).
Device encrypted storage. These files are not encrypted under the user’s passcode (though they may be encrypted using hardware secrets). Thus they are available after boot, even before the user enters a passcode.

Direct Boot even provides separate encryption contexts for different users on the phone — something I’m not quite sure what to do with. But sure, why not?

If Android is making all these changes, what’s the problem?

One thing you might have noticed is that where Apple had four categories of protection, Android N only has two. And it’s the two missing categories that cause the problems. These are the “complete protection” categories that allow the user to lock their device following first user authentication — and evict the keys from memory.

Of course, you might argue that Android could provide this by forcing application developers to switch back to “device encrypted storage” following a device lock. The problem with this idea is twofold. First, Android documentation and sample code is explicit that this isn’t how things work:

credentialenc

Moreover, a quick read of the documentation shows that even if you wanted to, there is no unambiguous way for Android to tell applications when the system has been re-locked. If keys are evicted when the device is locked, applications will unexpectedly find their file accesses returning errors. Even system applications tend to do badly when this happens.

And of course, this assumes that Android N will even try to evict keys when you lock the device. Here’s how the current filesystem encryption code handles locks:

lockuserkey

While the above is bad, it’s important to stress that the real problem here is not really in the cryptography. The problem is that since Google is not giving developers proper guidance, the company may be locking Android into years of insecurity. Without (even a half-baked) solution to define a “complete” protection class, Android app developers can’t build their apps correctly to support the idea that devices can lock. Even if Android O gets around to implementing key eviction, the existing legacy app base won’t be able to handle it — since this will break a million apps that have implemented their security according to Android’s current recommendations.

In short: this is a thing you get right from the start, or you don’t do at all. It looks like — for the moment — Android isn’t getting it right.

Are keys that easy to steal?

Of course it’s reasonable to ask whether it’s having keys in RAM is that big of concern in the first place. Can these keys actually be accessed?

The answer to that question is a bit complicated. First, if you’re up against somebody with a hardware lab and forensic expertise, the answer is almost certainly “yes”. Once you’ve entered your passcode and derived the keys, they aren’t stored in some magically secure part of the phone. People with the ability to access RAM or the bus lines of the device can potentially nick them.

But that’s a lot of work. From a software perspective, it’s even worse. A software attack would require a way to get past the phone’s lockscreen in order to get running code on the device. In older (pre-N) versions of Android the attacker might need to then escalate privileges to get access to Kernel memory. Remarkably, Android N doesn’t even store its disk keys in the Kernel — instead they’re held by the “vold” daemon, which runs as user “root” in userspace. This doesn’t make exploits trivial, but it certainly isn’t the best way to handle things.

Of course, all of this is mostly irrelevant. The main point is that if the keys are loaded you don’t need to steal them. If you have a way to get past the lockscreen, you can just access files on the disk.

What about hardware?

Although a bit of a tangent, it’s worth noting that many high-end Android phones use some sort of trusted hardware to enable encryption. The most common approach is to use a trusted execution environment (TEE) running with ARM TrustZone.

This definitely solves a problem. Unfortunately it’s not quite the same problem as discussed above. ARM TrustZone — when it works correctly, which is not guaranteed — forces attackers to derive their encryption keys on the device itself, which should make offline dictionary attacks on the password much harder. In some cases, this hardware can be used to cache the keys and reveal them only when you input a biometric such as a fingerprint.

The problem here is that in Android N, this only helps you at the time the keys are being initially derived. Once that happens (i.e., following your first login), the hardware doesn’t appear to do much. The resulting derived keys seem to live forever in normal userspace RAM. While it’s possible that specific phones (e.g., Google’s Pixel, or Samsung devices) implement additional countermeasures, on stock Android N phones hardware doesn’t save you.

So what does it all mean?

How you feel about this depends on whether you’re a “glass half full” or “glass half empty” kind of person.

If you’re an optimistic type, you’ll point out that Android is clearly moving in the right direction. And while there’s a lot of work still to be done, even a half-baked implementation of file-based implementation is better than the last generation of dumb FDE Android encryption. Also: you probably also think clowns are nice.

On the other hand, you might notice that this is a pretty goddamn low standard. In other words, in 2016 Android is still struggling to deploy encryption that achieves (lock screen) security that Apple figured out six years ago. And they’re not even getting it right. That doesn’t bode well for the long term security of Android users.

And that’s a shame, because as many have pointed out, the users who rely on Android phones are disproportionately poorer and more at-risk. By treating encryption as a relatively low priority, Google is basically telling these people that they shouldn’t get the same protections as other users. This may keep the FBI off Google’s backs, but in the long term it’s bad judgement on Google’s part.