Why the Tor attack matters

Earlier today, Motherboard posted a court document filed in a prosecution against a Silk Road 2.0 user, indicating that the user had been de-anonymized on the Tor network thanks to research conducted by a “university-based research institute”.

Source: Motherboard.

As Motherboard pointed out, the timing of this research lines up with an active attack on the Tor network that was discovered and publicized in July 2014. Moreover, the details of that attack were eerily similar to the abstract of a (withdrawn) BlackHat presentation submitted by two researchers at the CERT division of Carnegie Mellon University (CMU).

A few hours later, the Tor Project made the allegations more explicit, posting a blog entry accusing CMU of accepting $1 million to conduct the attack. A spokesperson for CMU didn’t exactly deny the allegations, but demanded better evidence and stated that he wasn’t aware of any payment. No doubt we’ll learn more in the coming weeks as more documents become public.

You might wonder why this is important. After all, the crimes we’re talking about are pretty disturbing. One defendant is accused of possessing child pornography, and if the allegations are true, the other was a staff member on Silk Road 2.0. If CMU really did conduct Tor de-anonymization research for the benefit of the FBI, the people they identified were allegedly not doing the nicest things. It’s hard to feel particularly sympathetic.

Except for one small detail: there’s no reason to believe that the defendants were the only people affected.

If the details of the attack are as we understand them, a group of academic researchers deliberately took control of a significant portion of the Tor network. Without oversight from the University research board, they exploited a vulnerability in the Tor protocol to conduct a traffic confirmation attack, which allowed them to identify Tor client IP addresses and hidden services. They ran this attack for five months, and potentially de-anonymized thousands of users. Users who depend on Tor to protect them from serious harm.

It’s quite possible that the CMU researchers exercised strict protocols to ensure that they didn’t accidentally de-anonymize innocent bystanders. This would be standard procedure in any legitimate research involving human subjects, particularly research that has the potential to do harm. If the researchers did take such steps, it would be nice to know about them. CMU hasn’t even admitted to the scope of the research project, nor have they published any results, so we just don’t know.

While most of the computer science researchers I know are fundamentally ethical people, as a community we have a blind spot when it comes to the ethical issues in our field. There’s a view in our community that Institutional Review Boards are for medical researchers, and we’ve somehow been accidentally caught up in machinery that wasn’t meant for us. And I get this — IRBs are unpleasant to work with. Sometimes the machinery is wrong.

But there’s also a view that computer security research can’t really hurt people, so there’s no real reason for sort of ethical oversight machinery in the first place. This is dead wrong, and if we want to be taken seriously as a mature field, we need to do something about it.

We may need different machinery, but we need something. That something begins with the understanding that active attacks that affect vulnerable users can be dangerous, and should never be conducted without rigorous oversight — if they must be conducted at all. It begins with the idea that universities should have uniform procedures for both faculty researchers and quasi-government organizations like CERT, if they live under the same roof. It begins with CERT and CMU explaining what went on with their research, rather than treating it like an embarrassment to be swept under the rug.

Most importantly, it begins with researchers looking beyond their own research practices. So far the response to the Tor news has been a big shrug. It’s wonderful that most of our community is responsible. But none of that matters if we look the other way when others in our community fail to act responsibly.

On the new Snowden documents

If you don’t follow NSA news obsessively, you might have missed yesterday’s massive Snowden document dump from Der Spiegel. The documents provide a great deal of insight into how the NSA breaks our cryptographic systems. I was very lightly involved in looking at some of this material, so I’m glad to see that it’s been published.

Unfortunately with so much material, it can be a bit hard to separate the signal from the noise. In this post I’m going to try to do that a little bit — point out the bits that I think are interesting, the parts that are old news, and the things we should keep an eye on.

Background

Those who read this blog will know that I’ve been wondering for a long time how NSA works its way around our encryption. This isn’t an academic matter, since it affects just about everyone who uses technology today.

What we’ve learned since 2013 is that NSA and its partners hoover up vast amounts of Internet traffic from fiber links around the world. Most of this data is plaintext and therefore easy to intercept. But at least some of it is encrypted — typically protected by protocols such as SSL/TLS or IPsec.

Conventional wisdom pre-Snowden told us that the increasing use of encryption ought to have shut the agencies out of this data trove. Yet the documents we’ve seen so far indicate just the opposite. Instead, the NSA and GCHQ have somehow been harvesting massive amounts of SSL/TLS and IPSEC traffic, and appear to be making inroads into other technologies such as Tor as well.

How are they doing this? To repeat an old observation, there are basically three ways to crack an encrypted connection:

  1. Go after the mathematics. This is expensive and unlikely to work well against modern encryption algorithms (with a few exceptions). The leaked documents give very little evidence of such mathematical breaks — though a bit more on this below.
  2. Go after the implementation. The new documents confirm a previously-reported and aggressive effort to undermine commercial cryptographic implementations. They also provide context for how important this type of sabotage is to the NSA.
  3. Steal the keys. Of course, the easiest way to attack any cryptosystem is simply to steal the keys. Yesterday we received a bit more evidence that this is happening.
I can’t possibly spend time on everything that’s covered by these documents — you should go read them yourself — so below I’m just going to focus on the highlights.

Not so Good Will Hunting

First, the disappointing part. The NSA may be the largest employer of cryptologic mathematicians in the United States, but — if the new story is any indication — those guys really aren’t pulling their weight.

In fact, the only significant piece of cryptanalytic news in the entire stack comes is a 2008 undergraduate research project looking at AES. Sadly, this is about as unexciting as it sounds — in fact it appears to be nothing more than a summer project by a visiting student. More interesting is the context it gives around the NSA’s efforts to break block ciphers such as AES, including the NSA’s view of the difficulty of such cryptanalysis, and confirmation that NSA has some ‘in-house techniques’.

Additionally, the documents include significant evidence that NSA has difficulty decrypting certain types of traffic, including Truecrypt, PGP/GPG, Tor and ZRTP from implementations such as RedPhone. Since these protocols share many of the same underlying cryptographic algorithms — RSA, Diffie-Hellman, ECDH and AES — some are presenting this as evidence that those primitives are cryptographically strong.

As with the AES note above, this ‘good news’ should also be taken with a grain of salt. With a small number of exceptions, it seems increasingly obvious that the Snowden documents are geared towards NSA’s analysts and operations staff. In fact, many of the systems actually seem aimed at protecting knowledge of NSA’s cryptanalytic capabilities from NSA’s own operational staff (and other Five Eyes partners). As an analyst, it’s quite possible you’ll never learn why a given intercept was successfully decrypted.

 To put this a bit more succinctly: the lack of cryptanalytic red meat in these documents may not truly be representative of the NSA’s capabilities. It may simply be an artifact of Edward Snowden’s clearances at the time he left the NSA.

Tor

One of the most surprising aspects of the Snowden documents — to those of us in the security research community anyway — is the NSA’s relative ineptitude when it comes to de-anonymizing users of the Tor anonymous communications network.

The reason for our surprise is twofold. First, Tor was never really designed to stand up against a global passive adversary — that is, an attacker who taps a huge number of communications links. If there’s one thing we’ve learned from the Snowden leaks, the NSA (plus GCHQ) is the very definition of the term. In theory at least, Tor should be a relatively easy target for the agency.

The real surprise, though, is that despite this huge signals intelligence advantage, the NSA has barely even tested their ability to de-anonymize users. In fact, this leak provides the first concrete evidence that NSA is experimenting with traffic confirmation attacks to find the source of Tor connections. Even more surprising, their techniques are relatively naive, even when compared to what’s going on in the ‘research’ community.

This doesn’t mean you should view Tor as secure against the NSA. It seems very obvious that the agency has identified Tor as a high-profile target, and we know they have the resources to make much more headway against the network. The real surprise is that they haven’t tried harder. Maybe they’re trying now.

SSL/TLS and IPSEC

 A few months ago I wrote a long post speculating about how the NSA breaks SSL/TLS. Because it’s increasingly clear that the NSA does break these protocols, and at relatively large scale.

The new documents don’t tell us much we didn’t already know, but they do confirm the basic outlines of the attack. The first portion requires endpoints around the world that are capable of performing the raw decryption of SSL/TLS sessions provided they know the session keys. The second is a separate infrastructure located on US soil that can recover those session keys when needed.

All of the real magic happens within the key recovery infrastructure. These documents provide the first evidence that a major attack strategy for NSA/GCHQ involves key databases containing the private keys for major sites. For the RSA key exchange ciphersuites of TLS, a single private key is sufficient to recover vast amounts of session traffic — in real time or even after the fact.

The interesting question is how the NSA gets those private keys. The easiest answer may be the least technical. A different Snowden leak shows gives some reason to believe that the NSA may have relationships with employees at specific named U.S. entities, and may even operate personnel “under cover”. This would certainly be one way to build a key database.

 

But even without the James Bond aspect of this, there’s every reason to believe that NSA has other means to exfiltrate RSA keys from operators. During the period in question, we know of at least one vulnerability (Heartbleed) that could have been used to extract private keys from software TLS implementations. There are still other, unreported vulnerabilities that could be used today.

 Pretty much everything I said about SSL/TLS also applies to VPN protocols, with the additional detail that many VPNs use broken protocols and relatively poorly-secured pre-shared secrets that can in some cases be brute-forced. The NSA seems positively gleeful about this.

Open Source packages: Redphone, Truecrypt, PGP and OTR

The documents provide at least circumstantial evidence that some open source encryption technologies may thwart NSA surveillance. These include Truecrypt, ZRTP implementations such as RedPhone, PGP implementations, and Off the Record messaging. These packages have a few commonalities:

  1. They’re all open source, and relatively well studied by researchers.
  2. They’re not used at terribly wide scale (as compared to e.g., SSL or VPNs)
  3. They all work on an end-to-end basis and don’t involve service providers, software distributers, or other infrastructure that could be corrupted or attacked.

What’s at least as interesting is which packages are not included on this list. Major corporate encryption protocols such as iMessage make no appearance in these documents, despite the fact that they ostensibly provide end-to-end encryption. This may be nothing. But given all we know about NSA’s access to providers, this is definitely worrying.

A note on the ethics of the leak

Before I finish, it’s worth addressing one major issue with this reporting: are we, as citizens, entitled to this information? Would we be safer keeping it all under wraps? And is this all ‘activist nonsense‘?

This story, more than some others, skates close to a line. I think it’s worth talking about why this information is important.

To sum up a complicated issue, we live in a world where targeted surveillance is probably necessary and inevitable. The evidence so far indicates that NSA is very good at this kind of work, despite some notable failures in actually executing on the intelligence it produces.

Unfortunately, the documents released so far also show that a great deal of NSA/GCHQ surveillance is not targeted at all. Vast amounts of data are scooped up indiscriminately, in the hope that some of it will someday prove useful. Worse, the NSA has decided that bulk surveillance justifies its efforts to undermine many of the security technologies that protect our own information systems. The President’s own hand-picked review council has strongly recommended this practice be stopped, but their advice has — to all appearances — been disregarded. These are matters that are worthy of debate, but this debate hasn’t happened.

Unfortunate if we can’t enact changes to fix these problems, technology is probably about all that’s left. Over the next few years encryption technologies are going to be widely deployed, not only by individuals but also by corporations desperately trying to reassure overseas customers who doubt the integrity of US technology.

In that world, it’s important to know what works and doesn’t work. Insofar as this story tells us that, it makes us all better off.

Tor and the Great Firewall of China

Here’s a fascinating post by Tim Wilde overat the Tor Project blog, discussing China’s growing sophistication in blocking Tor connections:

In October 2011, ticket #4185 was filed in the Tor bug tracker by a user in China who found that their connections to US-based Tor bridge relays were being regularly cut off after a very short period of time. At the time we performed some basic experimentation and discovered that Chinese IPs (presumably at the behest of the Great Firewall of China, or GFW) would reach out to the US-based bridge and connect to it shortly after the Tor user in China connected, and, if successful, shortly thereafter the connection would be blocked by the GFW.

we discovered two types of probing. First, “garbage binary” probes, containing nothing more than arbitrary (but sometimes repeated in later probes) binary data, were experienced by the non-China side of any connection that originated from China to TCP port 443 (HTTPS) in which an SSL negotiation was performed. … The purpose of these probes is unknown …

The second type of probe, on the other hand, is aimed quite directly at Tor. When a Tor client within China connected to a US-based bridge relay, we consistently found that at the next round 15 minute interval (HH:00, HH:15, HH:30, HH:45), the bridge relay would receive a probe from hosts within China that not only established a TCP connection, but performed an SSL negotiation, an SSL renegotiation, and then spoke the Tor protocol sufficiently to build a one-hop circuit and send a BEGIN_DIR cell. No matter what TCP port the bridge was listening on, once a Tor client from China connected, within 3 minutes of the next 15 minute interval we saw a series of probes including at least one connection speaking the Tor protocol.

Obviously this is disturbing. And unlike previous, passive efforts to block Tor, these active attacks are tough to defend against. After all, Tor was designed to be a public service. If the general public can download a Tor client and connect to a bridge, so can the Chinese government. This means that protocol-level workarounds (obfuscators, for example) will only work until China cares enough to stop them.

The situation isn’t hopeless: proposed workarounds include password protecting Tor bridges, which might solve the problem to an extent — though it seems to me that this is just kicking the problem down the road a bit. As with the bridge security model, it embeds the assumption that Chinese users can find bridges/passwords, but their government can’t. More to the point, any password protocol is going to have to work hard to look ‘innocent’ (i.e., not Tor-like) to someone who doesn’t know the password. There are a lot of ways this could go wrong.

On the research side there are ideas like Telex which would eliminate the need for bridges by embedding anti-censorship into the network. Chinese Tor clients would make TLS connections to arbitrary US websites; the connections would be monitored by special Telex routers along the way; any TLS connection with a special steganographic marking would get transparently re-routed to the nearest Tor node. Unfortunately, while the crypto in Telex is great, actually deploying it would be a nightmare — and would almost certainly require government cooperation. Even if Western governments were game, the Chinese government could respond by banning overseas TLS connections altogether.

One last note: I love a good mystery, so does anyone care to speculate about those “garbage probes”? What are they — a test? Automated fuzzing? Most likely they’re an attempt to provoke a response from some other TLS server that the Chinese government cares about, but if it’s not Tor then what is it?

Tim’s full investigation can be found here.

Update 1/26: Emily Stark points me to the Flash Proxies project out of Stanford. This would put Tor proxies in individual client machines, thus massively increasing the number of bridges available and eliminating the outgoing client->bridge connection. They even have an implementation, though I warn you: running untrusted traffic through your Flash plugin is not for the faint of heart!