Surveillance works! Let’s have more of it.

If you care about these things, you might have heard that Google recently detected and revoked a bogus Google certificate. Good work, and huge kudos to the folks at Google who lost their holidays to this nonsense.

From what I’ve read, this particular incident got started in late 2011 when Turkish Certificate Authority TURKTRUST accidentally handed out intermediate CA certificates to two of their customers. Intermediate CA certs are like normal SSL certs, with one tiny difference: they can be used to generate other SSL certificates. Oops.

One of the recipients noticed the error and reported it to the CA. The other customer took a different approach and installed it into an intercepting Checkpoint firewall to sniff SSL-secured connections. Because, you know, why not.

So this is bad but not exactly surprising — in fact, it’s all stuff we’ve seen before. Our CA system has far too many trust points, and it requires too many people to act collectively when someone proves untrustworthy. Unless we do something, we’re going to see lots more of this.

What’s interesting about this case — and what leads to the title above — is not so much what went wrong, but rather, what went right. You see, this bogus certificate was detected, and likely not because some good samaritan reported the violation. Rather, it was (probably) detected by Google’s unwavering surveillance.

The surveillance in question is conducted by the Chrome browser, which actively looks out for attacks and reports them. Here’s the money quote from their privacy policy:

“If you attempt to connect to a Google website using a secure
connection, and the browser blocks the connection due to information
that indicates you are being actively attacked by someone on the
network (a “man in the middle attack”), Chrome may send information
about that connection to Google for the purpose of helping to
determine the extent of the attack and how the attack functions.”

The specific technical mechanism in Chrome is simple: Chrome ships with a series of ‘pins’ in its source code (thanks Moxie, thanks Tom). These tell Chrome what valid Google certificates should look like, and help it detect an obviously bogus certificate. When Chrome sees a bogus cert attached to a purported Google site, it doesn’t just stop the connection, it rings an alarm at Google HQ.
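
For intuition, here’s a rough sketch (in Python) of what that check amounts to. The pin value below is invented and I’m hashing the whole certificate for brevity; Chrome actually pins hashes of specific SubjectPublicKeyInfo structures, and the real pin list lives in its source tree.

    import base64, hashlib, socket, ssl

    # Hypothetical pin set: base64 SHA-256 digests we expect for a given host.
    # (The value is made up; real pins are hashes of SPKI structures.)
    PINS = {"mail.google.com": {"7HIpactkIAq2Y49orFOOQKurWxmmSFZhBCoQYcRhJ3Y="}}

    def observed_hash(host: str) -> str:
        ctx = ssl.create_default_context()
        with socket.create_connection((host, 443)) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                der = tls.getpeercert(binary_form=True)
        # Simplification: hash the whole certificate rather than just the SPKI.
        return base64.b64encode(hashlib.sha256(der).digest()).decode()

    def looks_bogus(host: str) -> bool:
        # A mismatch here is what would trigger the report back to Google HQ.
        return observed_hash(host) not in PINS.get(host, set())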

And that alarm is a fantastic thing, because in this case it may have led to discovery and revocation before this certificate could be stolen and used for something worse.

Now imagine the same thing happening in many other browsers, and not just for Google sites. Well, you don’t have to imagine. This is exactly the approach taken by plugins like Perspectives and Convergence, which monitor users’ SSL connections in a distributed fashion to detect bogus certificates. These plugins are great, but they’re not deployed widely enough. The technique should be standard in all browsers, perhaps with some kind of opt-in. (I certainly would opt in.)

The simple fact is that our CA system is broken and this is what we’ve got. Congratulations to Google for taking a major first step in protecting its users. Now let’s take some more.

TACK

Those who read this blog know that I have a particular fascination with our CA system and how screwed up it is. The good news is that I’m not the only person who feels this way, and even better, there are a few people out there doing more than just talking about it.

Two of the ‘doers’ are Moxie Marlinspike and Trevor Perrin, who recently dropped TACK, a new proposal for ‘pinning’ TLS public keys. I would have written about TACK last week when it was fresh, but I was experimenting with a new approach to international travel where one doesn’t adjust oneself to the local timezone (nb: if you saw me and I was acting funny last week… sorry.)

TACK stands for Trust Assertions for Certificate Keys, but really, does it matter? TACK is one of those acronyms that’s much more descriptive in short form than long: it’s a way to ‘pin’ TLS servers to the correct public key even when a Certificate Authority is telling you differently, e.g., because the CA is misbehaving or has had its keys stolen.

In the rest of this post I’m going to give a quick overview of the problem that TACK tries to solve, and what TACK brings to the party.

Why do I need TACK?

For a long answer to this question you can see this previous post. But here’s the short version:

Because ‘secure communication’ on the Internet requires us to trust way too many people, many of whom haven’t earned our trust, and probably couldn’t earn it if they tried.

Slightly less brief: most secure communication on the Internet is handled using the TLS protocol. TLS lets us communicate securely with anyone, even someone we’ve never met before.

While TLS is wonderful, it does have one limitation: it requires you to know the public key of the server you want to talk to. If an attacker can convince you to use the wrong public key (say, his key), you’ll end up talking securely to him instead. This is called a Man-in-the-Middle (MITM) attack and yes, it happens.

In principle we solve this problem using PKI. Any one of several hundred ‘trusted’ Certificate Authorities can digitally sign the server’s public key together with a domain name. The server sends the resulting certificate at the start of the protocol. If we trust all the CAs, life is good. But if any single one of those CAs lets us down, things get complicated.

And this is where the current CA model breaks down. Since every CA has the authority to sign any domain, a random CA in Malaysia can sign your US .com, even if you haven’t authorized it to, and even if that CA has never signed a .com before. It’s an invitation to abuse, since an attacker need only steal one CA private key anywhere in the world, and he can MITM any website.

The worst part of the ‘standard’ certificate model is how dumb it is. You’ve been connecting to your bank for a year, and each time it presented a particular public key certified by Verisign. Then one day that same site shows up with a brand new public key, signed by a CA from Guatemala. Should this raise flags? Probably. Will your browser warn you? Probably not. And this is where TACK comes in.

So what is TACK and how does it differ from other pinning schemes?

TACK is an extension to TLS that allows servers to ‘pin’ their public keys, essentially making the following statement to your TLS client:

This is a valid public key for this server. If you see anything else in this space, and I haven’t explicitly told you I’m changing keys, there’s something wrong.

Now, public-key pinning is not a new idea. A similar technique has been used by protocols like SSH for years, Google Chrome has a few static pins built in, and Google even has a draft HTTP extension that adds dynamic pinning capabilities to that protocol.

The problem with earlier pinning schemes is that they’re relatively inflexible. Big sites like Google and Facebook don’t run just one server, nor do they deploy a single public key. They run dozens or hundreds of servers, TLS terminators, etc., all configured with a mess of different keys and certificates. Moreover, configurations change frequently for business reasons. Pinning a particular public key to ‘facebook.com’ seems like a great idea, but probably isn’t as simple as you’d think. Worse, a mistaken ‘pin’ is an invitation to a network disaster of epic proportions.

TACK takes an end-run around these problems, or rather, a couple of them:

  1. It adds a layer of indirection. Instead of pinning a particular TLS public key (either the server’s key, or some higher-level certificate public key), TACK pins sites to a totally separate ‘TACK key’ that’s not part of the certificates at all. You then use this key to make assertions about any of the keys on your system, even if each server uses a different one.
  2. It’s only temporary. A site pin lasts only a few days, no more than a month — unless you visit the site again to renew it. This puts pretty strict limits on the amount of damage you can do, e.g., by misconfiguring your pins, or by losing the TACK key.

So what exactly happens when I visit a TACK site?

So you’ve securely connected via TLS to https://facebook.com, which sports a valid certificate chain. Your browser also supports TACK. In addition to all the usual TLS negotiation junk, the server sends down a TACK public key in a new TLS extension field. It uses the corresponding TACK signing key to sign the server’s TLS public key. Next:

  1. Your browser makes note of the TACK key, but does nothing at all. You see, it takes two separate connections to a website before any of the TACK machinery gets started. This is designed to insulate clients against the possibility that one connection might be an MITM.
  2. The next time you connect, the server sends the same TACK public key/signature pair again, which activates the ‘pin’ — creating an association between ‘facebook.com’ and the TACK key, and telling your browser that this server public key is legitimate. If you last connected five days ago, then the ‘pin’ will be valid for five more days.
  3. From here on out — until the TACK pin expires — your client will require that every TLS public key associated with facebook.com be signed by the TACK key. It doesn’t matter if Facebook has eight thousand different public keys, as long as each of those public keys gets shipped to the client along with a corresponding TACK signature. But if at any point your browser does not see a necessary signature, it will reject the connection as a possible MITM.
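
To make the bookkeeping concrete, here’s a toy sketch (in Python) of the client-side pin store. The signature check is a stand-in, since real TACK uses an ECDSA signature by the TACK key over a structure containing the server’s key, and the data layout is mine; but the activation and expiry logic follows the steps above.

    import hashlib, time

    PIN_STORE = {}                   # hostname -> pin record
    MAX_PIN_AGE = 30 * 24 * 3600     # a pin never lasts more than about a month

    def tack_signature_valid(tack_key: bytes, server_key: bytes, sig: bytes) -> bool:
        # Stand-in check so the sketch runs end to end; TACK itself verifies an
        # ECDSA signature made by the (offline) TACK signing key.
        return sig == hashlib.sha256(tack_key + server_key).digest()

    def on_connection(host: str, tack_key: bytes, server_key: bytes, sig: bytes) -> None:
        now = time.time()
        if not tack_signature_valid(tack_key, server_key, sig):
            raise ConnectionError("server key not signed by the TACK key")
        pin = PIN_STORE.get(host)
        if pin and now < pin["expires"] and tack_key != pin["tack_key"]:
            raise ConnectionError("active pin contradicts this connection: possible MITM")
        if pin is None or tack_key != pin["tack_key"]:
            # First sighting of this TACK key: note it, but activate nothing yet.
            PIN_STORE[host] = {"tack_key": tack_key, "first_seen": now, "expires": now}
            return
        # Later sightings extend the pin: it stays valid for as long as we have
        # been seeing this key, capped at MAX_PIN_AGE.
        pin["expires"] = now + min(now - pin["first_seen"], MAX_PIN_AGE)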

And that’s basically it.
The beauty of this approach is that the TACK key is entirely separate from the certificate keys. You can sign dozens of different server public keys with a single TACK key, and you can install new TLS certificates whenever you want. Clients’ browsers don’t care about this — the only data they need to store is that one TACK key and the association with facebook.com.

But what if I lose my TACK key or it gets stolen?

The nice thing about TACK is that it augments the existing CA infrastructure. Without a valid certificate chain (and a legitimate TLS connection), a TACK key is useless. Even if you lose the thing, it’s not a disaster, since pins expire relatively quickly. (Earlier proposals devote a lot of time to solving these problems, and even require you to back up your pinning keys.)

The best part is that the TACK signing key can remain offline. Unlike TLS private keys, the only thing the TACK key is used for is creating static signatures, which get stored on the various servers. This dramatically reduces the probability that the key will be stolen.

So is TACK the solution to everything?

TACK addresses a real problem with our existing TLS infrastructure, by allowing sites to (cleanly) set pins between server public keys and domain names. If widely deployed, this should help to protect us against the very real threat of MITM. It’s also potentially a great solution for those edge cases like MTA-to-MTA mailserver connections, where the PKI model actually doesn’t work very well, and pinning may be the most viable route to security.

But of course, TACK isn’t the answer to everything. The pins it sets are only temporary, and so a dedicated attacker with long-term access to a connection could probably manage to set incorrect pins. At the end of the day this seems pretty unlikely, but it’s certainly something to keep in mind.

If wishes were horses then beggars would ride… a Pwnie!

In case you’re wondering about the title above, I ripped it off from an old Belle Waring post on the subject of wishful thinking in politics. Her point (inspired by this Calvin & Hobbes strip) is that whenever you find yourself imagining how things should be in your preferred view of the world, there’s no reason to limit yourself. Just go for broke!

Take a non-political example. You want Facebook to offer free services, protect your privacy and shield you from targeted advertising? No problem! And while you’re wishing for those things, why not wish for a pony too!

This principle can also apply to engineering. As proof of this, I offer up a long conversation I just had with Dan Kaminsky on Twitter, related to the superiority of DNSSEC vs. application-layer solutions to securing email in transit. Short summary: Dan thinks DNSSEC will solve this problem, all we have to do is get it deployed everywhere in a reasonable amount of time. I agree with Dan. And while we’re daydreaming, ponies! (See how it works?)

The impetus for this discussion was a blog post by Tom Ritter, who points out that mail servers are horrendously insecure when it comes to shipping mail around the Internet. If you know your mailservers, there are basically two kinds of SMTP connection. MUA-to-MTA connections, where your mail client (say, Apple Mail) sends messages to an SMTP server such as GMail. And MTA-to-MTA connections, where Gmail sends the message on to a destination server (e.g., Hotmail).

Now, we’ve gotten pretty good at securing the client-to-server connections. In practice, this is usually done with TLS, using some standard server certificate that you can pick up from a reputable company like Trustwave. This works fine, since you know the location of your outgoing mail server and can request that the connection always use TLS.

Unfortunately, we truly suck at handling the next leg of the communication, where the email is shipped on to the next MTA, e.g., from Google to Hotmail. In many cases this connection isn’t encrypted at all, making it vulnerable to large-scale snooping. But even when it is secured, it’s not secured very well.

What Tom points out is that many mail servers (e.g., Gmail’s) don’t actually bother to use a valid TLS certificate on their MTA servers. Since this is the case, most email senders are configured to accept the bogus certs anyway, because everyone has basically given up on the system.

We could probably fix the certs, but you see, it doesn’t matter! That’s because finding that next MTA depends on a DNS lookup, and (standard) DNS is fundamentally insecure. A truly nasty MITM attacker can spoof the MX record returned, causing your server to believe that mail.evilempire.com is the appropriate server to receive Hotmail messages. And of course, mail.evilempire.com could have a perfectly legitimate certificate for its domain — which means you’d be hosed even if you checked it.

(It’s worth pointing out that the current X.509 certificate infrastructure does not have a universally-accepted field equivalent to “I am an authorized mailserver for Gmail”. It only covers hostnames.)

The question is, then, what to do about this?

There are at least three options. One is that we just suck it up and assume that email’s insecure anyway. If people want more security, they should end-to-end encrypt their mail with something like GPG. I’m totally sympathetic to this view, but I also recognize that almost nobody encrypts their email with GPG. Since we already support TLS on the MTA-to-MTA connection, perhaps we should be doing it securely?

The second view is that we fix this using some chimera of X.509 extensions and public-key pinning. In this proposal (inspired by an email from Adam Langley***), we’d slightly extend X.509 implementations to recognize an “I am a mailserver for Hotmail.com” field, we get CAs to sign these, and we install them on Hotmail’s servers. Of course, we’ll have to patch mailservers like Sendmail to actually check for these certs, and we’d have to be backwards compatible to servers that don’t support TLS at all — meaning we’d need to pin public keys to prevent downgrade attacks. It’s not a very elegant solution at all.

The third, ‘elegant’ view is that we handle all of our problems using DNSSEC. After all, this is what DNSSEC was built for! To send an email to Hotmail, my server just queries DNS for Hotmail.com, and (through the magic of public-key cryptography) it gets back a signed response guaranteeing that the MX record is legitimate. And a pony too!

But of course, back in the real world this is not at all what will happen, for one very simple reason:

  • Many mail services do not support DNSSEC for their mail servers.
  • Many operating systems do not support DNSSEC resolution. (e.g., Windows 8)

Ok, that’s two reasons, and I could probably add a few more. But at the end of the day DNSSEC is the pony with the long golden mane, the one your daughter wants you to buy her — right after you buy her the turtle and the fish and the little dog named Sparkles.

So maybe you think that I’m being totally unfair. If people truly want MTA-to-MTA connections to be secure, they’ll deploy DNSSEC and proper certificates. We’ll all modify Sendmail/Qmail/Etc. to validate that the DNS resolution is ‘secured’ and they’ll proceed appropriately. If DNS does not resolve securely, they’ll… they’ll…

Well, what will they do? It seems like there are only two answers to that question. Number one: when DNSSEC resolution fails (perhaps because it’s not supported by Hotmail), then the sender fails open. It accepts whatever insecure DNS response it can get, and we stick with the current broken approach. At least your email gets there.

Alternatively, we modify Sendmail to fail closed. When it fails to accomplish a secure DNS resolution, the email does not go out. This is the security-nerd approach.
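
Here’s roughly what those two policies look like in code, assuming the dnspython (2.x) package and a DNSSEC-validating resolver upstream. That second assumption is doing a lot of work, which is rather the point.

    import dns.flags
    import dns.resolver   # assumes the dnspython (2.x) package

    def mx_hosts(domain: str, fail_closed: bool = False) -> list:
        resolver = dns.resolver.Resolver()
        resolver.use_edns(0, dns.flags.DO, 4096)   # request DNSSEC records
        answer = resolver.resolve(domain, "MX")
        # The AD ("authenticated data") flag only means something if the upstream
        # resolver actually validates DNSSEC; another assumption we don't control.
        validated = bool(answer.response.flags & dns.flags.AD)
        if fail_closed and not validated:
            # The security-nerd approach: no validated answer, no email.
            raise RuntimeError("no DNSSEC-validated MX for " + domain)
        # Otherwise we fail open and accept whatever answer the network handed us.
        return [rr.exchange.to_text() for rr in answer]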

Let me tell you something: Nobody likes the security-nerd approach.

As long as DNSSEC remains at pitiful levels of deployment, it’s not going to do much at all. Email providers probably won’t add DNSSEC support for MX records, since very few mailservers will be expecting it. Mailserver applications won’t employ (default, mandatory) DNSSEC checking because none of the providers will offer it! Around and around we’ll go.

But this isn’t just a case against DNSSEC, it relates to a larger security architecture question. Namely: when you’re designing a security system, beware of relying on things that are outside of your application boundary.

What I mean by this is that your application (or transport layer implementation, etc.) should not depend on outside infrastructure for its security. If possible, it should try to achieve its security goals all by itself, making as few assumptions as possible about the system it’s running on, unless those assumptions are rock solid. If there’s any doubt that these assumptions will hold in practice, your system needs a credible fallback plan other than “ah, screw it” or “let’s be insecure”.

The dependence on a secure DNS infrastructure is one example of this boundary-breaking problem, but it’s not the only one.

For example, imagine that you’re writing an application that depends on having a strong source of pseudo-random numbers. Of course you could get these from /dev/random (or /dev/urandom), but what if someone runs your application on a system that either (a) doesn’t have these, or (b) has them, but they’re terrible?
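
The defensive version of this is tiny and worth spelling out: if the platform can’t give you real randomness, fail loudly instead of limping along with something weaker. A minimal sketch:

    import os

    def random_bytes(n: int) -> bytes:
        # os.urandom reads from the OS entropy source (/dev/urandom on Unix,
        # the system CSPRNG on Windows). If no such source exists, refuse to run
        # rather than silently substituting something weaker. Detecting case (b),
        # a source that exists but is terrible, is much harder and not shown here.
        try:
            return os.urandom(n)
        except NotImplementedError:
            raise RuntimeError("no OS randomness source available; refusing to continue")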

Now sometimes you make the choice to absorb a dependency, and you just move on. Maybe DNSSEC is a reasonable thing to rely on. But before you make that decision, you should also remember that it’s not just a question of your assumptions breaking down. There are human elements to think of.

Remember that you may understand your design, but if you’re going to be successful at all, someday you’ll have to turn it over to someone else. Possibly many someones. And those people need to understand the fullness of your design, meaning that if you made eight different assumptions, some of which occur outside of your control, then your new colleagues won’t just have eight different ways to screw things up. They’ll have 2^8 or n^8 different ways to screw it up.

You as the developer have a duty to yourself and to those future colleagues to keep your assumptions right there where you can keep an eye on them, unless you’re absolutely positively certain that you can rely on them holding in the future.

Anyway, that’s enough on this subject. There’s probably room on this blog for a longer and more detailed post about how DNSSEC works, and I’d like to write that post at some point. Dan obviously knows a lot more of the nitty-gritty than I do, and I’m (sort of) sympathetic to his position that DNSSEC is awesome. But that’s a post for the future. Not the DNSSEC-distant future, hopefully!

Notes:

* Dan informs me that he already has a Pwnie, hence the title.
** Thomas Ptacek also has a much more convincing (old) case against DNSSEC.
*** In fairness to Adam, I think his point was something like ‘this proposal is batsh*t insane’, maybe we should just stick with option (1). I may be wrong.

An update on the TLS MITM situation

A couple months back I wrote a pretty overwrought post lamenting the broken state of our PKI. At the time, I was kind of upset that ‘trusted’ Certificate Authorities (*ahem*, Trustwave) had apparently made a business of selling that trust for the purpose of intercepting SSL/TLS connections.

Now, I may be off base, but this seems like a bad thing to me. It also seemed like pretty stupid business. After all, it’s not like certificates are hard to make (seriously — I made three while writing the previous sentence). When your sole distinguishing feature is that people trust you, why would you ever mess with that trust?
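
In case that sounds glib, here’s about all it takes, with a recent version of the Python cryptography package, to mint a perfectly well-formed (self-signed, and therefore untrusted) certificate for any name you like. The only thing separating this from a certificate your browser accepts is a CA’s signature.

    import datetime
    from cryptography import x509
    from cryptography.x509.oid import NameOID
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import rsa

    key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "www.example.com")])
    now = datetime.datetime.utcnow()
    cert = (
        x509.CertificateBuilder()
        .subject_name(name)
        .issuer_name(name)                      # self-signed: issuer == subject
        .public_key(key.public_key())
        .serial_number(x509.random_serial_number())
        .not_valid_before(now)
        .not_valid_after(now + datetime.timedelta(days=365))
        .sign(key, hashes.SHA256())
    )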

In any case, the worst part of the whole thing was a suspicion that the practice didn’t end with Trustwave; rather, that Trustwave was simply the CA that got caught. (No, this doesn’t make it better. Don’t even try that.)

But if that was the case, then who else was selling these man-in-the-middle ‘skeleton’ certificates? Who wasn’t? Could I just call up and order one today, or would it be all dark parking lots and windowless vans, like the last time I bought an 0day from VUPEN? (Kidding. Sort of.)

The saddest part of the whole story is that out of all the major browser vendors, only one — the Mozilla foundation, makers of Firefox — was willing to take public action on the issue. This came in the form of a letter that Mozilla sent to each of the major CAs in the Mozilla root program:

Please reply by March 2, 2012, to confirm completion of the following actions or state when these actions will be completed.

1) Subordinate CAs chaining to CAs in Mozilla’s root program cannot be used for MITM or “traffic management” of domain names or IPs that the certificate holder does not legitimately own or control, regardless of whether it is in a closed and controlled environment or not. Please review all of the subordinate CAs that chain up to your root certificates in NSS to make sure that they cannot be used in this way. Any existing subordinate CAs that can be used for that purpose must be revoked and any corresponding HSMs destroyed as soon as possible, but no later than April 27, 2012.

Which brings me to the point of this post. March 2 passed a long time ago. The results are mostly in. So what have we learned?


(Note: to read the linked spreadsheet, look at the column labeled “Action #1”. If it contains an A, B, or E, that means the CA has claimed innocence — it does not make SubCAs for signing MITM certificates, or else it does make SubCAs, but the holders are contractually-bound not to abuse them.* If there’s a C or a D there, it means that the CA is working the problem and will have things sorted out soon.)

The high-level overview is that most of the Certificate Authorities claim innocence. They simply don’t make SubCAs (now, anyway). If they do, they contractually require the holders not to use them for MITM eavesdropping. One hopes these contracts are enforceable. But on the surface, at least, things look ok.

There are a couple of exceptions: notably Verizon Business (in my hometown!), KEYNECTIS and GlobalSign. All three claim to have reasonable explanations, and are basically just requesting extensions so they can get their ducks in a row (KEYNECTIS is doing in-person inspections to make sure nothing funny is going on, which hopefully just means they’re super-diligent, and not that they’re picking up MITM boxes and throwing them in the trash.) **

But the upshot is that if we can believe this self-reported data, then Trustwave really was the outlier. They were the only people dumb enough to get into the business of selling SubCAs to their clients for the purpose of intercepting TLS connections. Or at the very least, the other players in this business knew it was a trap, and got themselves out long ago.

Anyone else believe this? I’d sure like to.

Notes:

* ‘Abuse’ here means signing certificates for domains that you don’t control. Yes, it’s unclear how strong these contracts are, and whether there’s any possibility for abuse down the line. But we can only be so paranoid.

** I confess there are some other aspects I don’t quite follow here — for example, why are some ‘compliant’ CAs marked with ‘CP/CPS does not allow MITM usage’, while others are ‘In Progress’ in making that change? I’m sure there’s a good technical reason for this. Hopefully the end result is compliance all around.

A brief update

My early-week post on the MITM certificate mess seems to have struck a nerve with readers. (Or perhaps I just picked the right time to complain!) Since folks seem interested in this subject, I wanted to follow up with a few quick updates:

  • The EFF has released a new version of HTTPS Everywhere, which includes a nifty ‘Decentralized SSL Observatory’ feature. This scans for unusual certificates (e.g., MITM certs, certs with weak keys) and reports them back to EFF for logging. A very nice step towards a better ‘net.
  • StalkR reminds me that Chrome 18 includes support for Public-key Pinning. This is an HTTP extension that allows a site operator to ‘pin’ their site to one (or more) pre-specified public keys for a given period of time. A pinned browser will reject any alternative keys that show up — even if they’re embedded in a valid certificate. (There’s a sketch of what the header looks like after this list.)
  • A couple of readers point out that popular sites (e.g., Google and Facebook) change their certificates quite frequently — possibly due to the use of load balancers — which poses a problem for “carry a list of legitimate certs with you” solutions. I recognize this. The best I can say is that we’re all better off if bogus certs are easy to detect. Hopefully site operators will find a compromise that makes this easy for us.
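
For the curious, the pinning extension StalkR mentions is delivered as an ordinary HTTP response header. A hypothetical handler might attach something like the following; the pin values are invented for illustration, and real ones would be base64 hashes of the site’s public keys.

    # Hypothetical response headers for the draft public-key-pinning extension.
    headers = {
        "Public-Key-Pins": (
            'pin-sha256="d6qzRu9zOECb90Uez27xWltNsj0e1Md7GkYYkVoZWmM="; '
            'pin-sha256="E9CZ9INDbd+2eRQozYqqbQ2yXLVKB9+xcprMF+44U1g="; '
            'max-age=2592000'                 # remember these pins for 30 days
        )
    }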

Appearances to the contrary, this blog is not going to become a forum for complaining about CAs. I’ll be back in a few days with more wonky crypto posts, including some ideas for dealing with bad randomness, some thoughts on patented modes of operation, and an update on the progress that researchers are making with Fully-Homomorphic Encryption.

The Internet is broken: could we please fix it?

Ok, this is a little embarrassing and I hate having to admit it publicly. But I can’t hold it in any longer: I think I’m becoming an Internet activist.

This is upsetting to me, since an activist is the last thing I ever thought I’d be. I have friends who live to make trouble for big corporations on the Internet, and while I admire their chutzpah (and results!), they’ve always made me a little embarrassed. Even when I agree with their cause, I still have an urge to follow along, cleaning up the mess and apologizing on behalf of all the ‘reasonable’ folks on the Internet.

But every man has a breaking point, and the proximate cause of mine is Trustwave. Or rather, the news that Trustwave — an important CA and pillar of the Internet — took it upon themselves to sell a subordinate root cert to some (still unknown) client, for the purpose of eavesdropping on TLS connections and, in the process, undermining the trust assumptions that make the Internet secure.

This kind of behavior is absolutely, unquestionably out of bounds for a trusted CA, and certainly deserves a response — a stronger one than it’s gotten. But the really frightening news is twofold:

  1. There’s reason to believe that other (possibly bigger) CAs are engaged in the same practice.
  2. To the best of my knowledge, only one browser vendor has taken a public stand on this issue, and that vendor isn’t gaining market share.

The good news is that the MITM revelation is exactly the sort of kick we’ve needed to improve the CA system. And even better, some very bright people are already thinking about it. The rest of this post will review the problem and talk about some of the potential solutions.

Certificates 101

For those of you who know the TLS protocol (and how certificates work), the following explanation is completely gratuitous. Feel free to skip it. If you don’t know — or don’t understand the problem — I’m going to take a minute to give some quick background.

TLS (formerly SSL) is probably the best-known security protocol on the Internet. Most people are familiar with TLS for its use in https — secure web — but it’s also used to protect email in transit, software updates, and a whole mess of other stuff you don’t even think about.

TLS protects your traffic by encrypting it with a strong symmetric key algorithm like AES or RC4. Unfortunately, this type of cryptography only works when the communicating parties share a key. Since you probably don’t share keys with most of the web servers on the Internet, TLS provides you with a wonderful means to do so: a public-key key agreement protocol.

I could spend a lot of time talking about this, but for our purposes, all you need to understand is this: when I visit https://gmail.com, Google’s server will send me a public key. If this key really belongs to Google, then everything is great: we can both derive a secure communication key, even if our attacker Mallory is eavesdropping on the whole conversation.

If, on the other hand, Mallory can intercept and modify our communications, the game is very different. In this case, she can overwrite Gmail’s key with her own public key. The result: I end up sharing a symmetric key with her! The worst part is that I probably won’t know this has happened: clever Mallory can make her own connection to Gmail and silently pass my traffic through — while reading every word. This scenario is called a Man in the Middle (MITM) Attack.

[Image: A man-in-the-middle attack. Alice is your grandfather, Bob is BankofAmerica.com, and Mallory establishes connections with both. (Wikipedia/CC license)]

MITM attacks are older than the hills. Fortunately TLS has built-in protections to thwart them. Instead of transmitting a naked public key, the Gmail server wraps its key in a certificate; this is a simple file that embeds both the key and some identifying information, like “gmail.com”. The certificate is digitally signed by someone very trustworthy: one of a few dozen Certificate Authorities (CA) that your browser knows and trusts. These include companies like Verisign, and (yes) Trustwave.

TLS clients (e.g., web browsers) carry the verification keys for a huge number of CAs. When a certificate comes in, they can verify its signature to ensure that it’s legit. This approach works very well, under one very important assumption: namely, Mallory won’t be able to get a signed certificate on a domain she doesn’t own.
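
If you want to watch this machinery from the client’s side, the Python standard library will show it to you. A short sketch (gmail.com is just the running example):

    import socket, ssl

    # create_default_context() loads the platform's trusted CA roots and enables
    # both chain verification and hostname checking, the two things standing
    # between you and Mallory.
    ctx = ssl.create_default_context()
    with socket.create_connection(("gmail.com", 443)) as sock:
        with ctx.wrap_socket(sock, server_hostname="gmail.com") as tls:
            cert = tls.getpeercert()

    print(cert["subject"])   # who the certificate says it belongs to
    print(cert["issuer"])    # which CA vouched for that claim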

What’s wrong with the CA model?

The real problem with the CA model is that every root CA has the power to sign any domain, which completely unravels the security of TLS. So far the industry has policed itself using the Macaroni Grill model: If a CA screws up too badly, they face being removed from the ‘trusted’ list of major TLS clients. In principle this should keep people in line, since it’s the nuclear option for a CA — essentially shutting down their business.

Unfortunately while this sounds good it’s tricky to implement in practice. That’s because:

  1. It assumes that browser vendors are willing to go nuclear on their colleagues at the CAs.
  2. It assumes that browser vendors can go nuclear on a major CA, knowing that the blowback might very well hurt their product. (Imagine that your browser unilaterally stopped accepting Verisign certs. What would you do?)
  3. It assumes that someone will catch misbehaving CAs in the first place.

What’s fascinating about the Trustwave brouhaha is that it’s finally giving us some visibility into how well these assumptions play out in the real world.

So what happened with Trustwave?

In late January of this year, Trustwave made a cryptic update to their CA policy. When people started asking about it, they responded with a carefully-worded post on the company blog. When you cut through the business-speak, here’s what it says:

We sold the right to generate certificates — on any domain name, regardless of whether it belongs to one of our clients or not — and packed this magical capability into a box. We rented this box to a corporate client for the express purpose of running Man-in-the-Middle attacks to eavesdrop on their employees’ TLS-secured connections. At no point did we stop to consider how damaging this kind of practice was, nor did we worry unduly about its potential impact on our business — since quite frankly, we didn’t believe it would have any.

I don’t know which part is worse. That a company whose entire business is based on trust — on the idea that people will believe them when they say a certificate is legit — would think they could get away with selling a tool to make fraudulent certificates. Or that they’re probably right.

But this isn’t the worst of it. There’s reason to believe that Trustwave isn’t alone in this practice. In fact, if we’re to believe the rumors, Trustwave is only noteworthy in that they stopped. Other CAs may still be in it up to their ears.

And so this finally brings us to the important part of this post: what’s being done, and what can we do to make sure that it never happens again?

Option 1: Rely on the browser vendors

What’s particularly disturbing about the Trustwave fiasco is the response it’s gotten from the various browser manufacturers.

So far exactly one organization has taken a strong stand against this practice. The Mozilla foundation (makers of Firefox) recently sent a strongly-worded letter to all of their root CAs — demanding that they disclose whether such MITM certificates exist, and that they shut them down forthwith. With about 20% browser share (depending on who’s counting), Mozilla has the means to enforce this. Assuming the vendors are honest, and assuming Mozilla carries through on its promise. And assuming that Mozilla browser-share doesn’t fall any further.

That’s the good news. Less cheerful is the deafening silence from Apple, Microsoft and Google. These vendors control most of the remaining browser market, and to the best of my knowledge they’ve said nothing at all about the practice. Publicly, anyway. It’s possible that they’re working the issue privately; if so, more power to them. But in the absence of some evidence, I find it hard to take this on faith.

Option 2: Sunlight is the best disinfectant

The Trustwave fiasco exposes two basic problems with the CA model: (1) any CA can claim ownership of any domain, and (2) there’s no easy way to know which domains a CA has put its stamp on.

This last point is very much by the CAs’ preference: CAs don’t want to reveal their doings, on the theory that it would harm their business. I can see where they’re coming from (especially if their business includes selling MITM certs!). Unfortunately, allowing CAs to operate without oversight is one of those quaint practices (like clicking on links sent by strangers) that made sense in a more innocent time, but no longer has much of a place in our world.

[Image: A Merkle hash tree (Wikipedia/CC)]

Ben Laurie and Adam Langley feel the same way, and they’ve developed a plan to do something about it. The basic idea is this:

  1. Every new certificate should be published in a public audit log. This log will be open to the world, which means that everyone can scan for illegal entries (i.e., their own domain appearing in somebody else’s certificate.)
  2. Anytime a web server hands out a certificate, it must prove that the certificate is contained in the list.

The beautiful thing is that this proof can be constructed relatively efficiently using a Merkle hash tree. The resulting proofs are quite short (log(N) hashes, where N is the total number of certificates). Browsers will need to obtain the current tree root, which requires either (a) periodically scanning the tree themselves, or (b) some degree of trust in an authority who periodically distributes signed root nodes.
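
For the curious, verifying one of these inclusion proofs takes only a few lines. This sketch uses the usual leaf/interior-node domain separation that hash-tree designs rely on to prevent cross-level collisions; the exact encoding in the Laurie/Langley proposal may differ.

    import hashlib

    def h(data: bytes) -> bytes:
        return hashlib.sha256(data).digest()

    def verify_inclusion(leaf: bytes, index: int, proof: list, root: bytes) -> bool:
        # Walk from the leaf up to the root using the log(N) sibling hashes in
        # `proof` (ordered bottom-up); accept only if we land on the signed root.
        node = h(b"\x00" + leaf)              # leaf hashes are domain-separated
        for sibling in proof:
            if index % 2 == 0:                # we are a left child
                node = h(b"\x01" + node + sibling)
            else:                             # we are a right child
                node = h(b"\x01" + sibling + node)
            index //= 2
        return node == root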

Along the same lines, the EFF has a similar proposal called the Sovereign Keys Project. SKP also proposes a public log, but places stronger requirements on what it takes to get into the log. It’s quite likely that in the long run these projects will merge, or give birth to something even better.

Option 3: Eternal vigilance

The problem with SKP and the Laurie/Langley proposal is that both require changes to the CA infrastructure. Someone will need to construct these audit logs; servers will have to start shipping hash proofs. Both can be incrementally deployed, but will only be effective once deployment reaches a certain level.

Another option is to dispense with this machinery altogether, and deal with rogue CAs today by subjecting them to constant, unwavering surveillance. This is the approach taken by CMU’s Perspectives plugin and by Moxie Marlinspike’s Convergence.

The core idea behind both of these systems is to use ‘network perspectives’ to determine whether the certificate you’re receiving is the same certificate that everyone else is. This helps to avoid MITMs, since presumably the attacker can only be in the ‘middle’ of so many network paths. To accomplish this, both systems deploy servers called Notaries — run on a volunteer basis — which you can call up whenever you receive an unknown certificate. They’ll compare your version of the cert to what they see from the same server, and help you ring the alarm if there’s a mismatch.
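
The client’s side of that idea is almost embarrassingly simple. In the sketch below, the notary query itself is left abstract, since Perspectives and Convergence each define their own protocols and the function here is only a placeholder; the point is that any disagreement between your view and the notaries’ view is the alarm bell.

    import hashlib, ssl

    def my_view(host: str) -> str:
        # The certificate fingerprint as seen from *our* network path.
        pem = ssl.get_server_certificate((host, 443))
        der = ssl.PEM_cert_to_DER_cert(pem)
        return hashlib.sha256(der).hexdigest()

    def notary_view(notary: str, host: str) -> str:
        # Placeholder: ask a notary server, over its own protocol, what
        # certificate it sees for `host` from its vantage point.
        raise NotImplementedError

    def consistent(host: str, notaries: list) -> bool:
        mine = my_view(host)
        return all(notary_view(n, host) == mine for n in notaries)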

A limitation of this approach is privacy; these Notary servers obviously learn quite a bit about the sites you visit. Convergence extends the Perspectives plugin to address some of these issues, but fundamentally there’s no free lunch here. If you’re querying some external party, you’re leaking information.

One solution to this problem is to dispense with online notary queries altogether, and just ask people to carry a list of legitimate certificates with them. If we assume that there are 4 million active certificates in the world, we could easily fit them into a < 40MB Bloom filter. This would allow us to determine whether a cert is ‘on the list’ without making an online query. Of course, this requires someone to compile and maintain such a list. Fortunately there are folks already doing this, including the EFF’s SSL Observatory project.
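
The back-of-the-envelope math behind that ‘< 40MB’ figure is worth two lines. Using the standard sizing formula for a Bloom filter, m = -n * ln(p) / (ln 2)^2 bits, four million certificates at a one-in-a-billion false-positive rate come in around 20MB:

    import math

    def bloom_filter_bytes(n_items: int, fp_rate: float) -> float:
        # Optimal Bloom filter size: m = -n * ln(p) / (ln 2)^2 bits.
        bits = -n_items * math.log(fp_rate) / (math.log(2) ** 2)
        return bits / 8

    print(bloom_filter_bytes(4_000_000, 1e-9) / 2**20)   # roughly 20 MiB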

Option 4: The hypothetical

The existence of these proposals is definitely heartening. It means that people are taking this seriously, and there’s an active technical discussion on how to make things better.

Since we’re in this mode, let me mention a few other things that could make a big difference in detecting exploits. For one thing, it would be awfully nice if web servers had a way to see things through their clients’ eyes. One obvious way to do this is through script: use Javascript to view the current server certificate, and report the details back to the server.

Of course this isn’t perfect — a clever MITM could strip the Javascript or tamper with it. Still, obfuscation is a heck of a lot easier than de-obfuscation, and it’s unlikely that a single attacker is going to win an arms race against a variety of sites.

Unfortunately, this idea has to be relegated to the ‘could be, should be’ dustbin, mostly because Javascript doesn’t have access to the current certificate info. I don’t really see the reason for this, and I sure hope that it changes in the future.

Option 5: The long arm of the law

I suppose the last option — perhaps the least popular — is just to treat CAs the same way that you’d treat any important, trustworthy organization in the real world. That means: you cheat, you pay the penalty. Just as we shouldn’t tolerate Bank of America knowingly opening a credit line in the name of a non-customer, we shouldn’t tolerate a CA doing the same.

Option 6: Vigilante justice

Ok, I’m only kidding about this one, cowboy. You can shut down that LOIC download right now.

In summary

I don’t know that there’s a magical vaccine that will make the CA system secure, but I’ve come to believe that the current approach is not working. It’s not just examples like Trustwave, which (some might argue) is a relatively limited type of abuse. It’s that the Trustwave revelation comes in addition to a steady drumbeat of news about stolen keys, illegitimately-obtained certificates, and various other abuses.

While dealing with these problems might not be easy, what’s shocking is how easy it would be to at least detect and expose the abuses at the core of it — if various people agreed that this was a worthy goal. I do hope that people start taking this stuff seriously, mostly because being a radical is hard, hard work. I’m just not cut out for it.

SSL MITM Update

The other day I snarked about Trustwave’s decision to sell subordinate root (‘skeleton’) certificates to their corporate clients, for the explicit purpose of ‘legitimately’* intercepting TLS connections (and, in the process, destabilizing the web’s Public Key Infrastructure). This practice (new to me) is ostensibly only permitted in limited, controlled settings (usually to spy on a company’s employees).

Trustwave argues that the key was always safe inside of a Hardware Security Module and besides, they’re not doing it any more. (Kind of like saying that you handed out the master key to every door on earth but it’s ok ’cause you chained it to a hubcap.)

Unfortunately, the really bad news is that Trustwave may not be the only major CA implicated in this practice. And at least one browser vendor is planning to do something about it:

Dear Certification Authority, 
This note requests a set of immediate actions on your behalf, as a 
participant in the Mozilla root program.  

Please reply by {date 2 weeks out} to confirm completion of the 
following actions or state when these actions will be completed.  

1) Subordinate CAs chaining to CAs in Mozilla’s root program may not be 
used for MITM purposes, regardless of whether it is in a closed and 
controlled environment or not. Please review all of your subordinate CAs 
to make sure that they may not be used for MITM purposes. Any existing 
subordinate CAs that may be used for that purpose must be revoked and 
any corresponding HSMs destroyed by {date TBD}. For each subordinate CA 
that is revoked, send me: 
a) The certificate that signed the subCA. If it is a root certificate in 
NSS, then the root certificate’s subject and SHA1 fingerprint. 
b) The Serial Number of the revoked certificate. 
c) The CRL that contains the serial number of the revoked certificate. 

As a CA in Mozilla’s root program you are ultimately responsible for 
certificates issued by you and your subordinate CAs. After {date TBD} if 
it is found that a subordinate CA is being used for MITM, we will remove 
the corresponding root certificate. Based on Mozilla’s assessment, we 
may also remove any of your other root certificates, and root 
certificates from other organizations that cross-sign your certificates.  

2) Please add a statement to your CP/CPS committing that you will not 
issue a subordinate certificate that may be used for MITM or traffic 
management of domain names or IPs that the party does not legitimately 
own or control. Send me the URL to the updated document(s) and the 
impacted sections or page numbers. 

Participation in Mozilla’s root program is at our sole discretion, and 
we will take whatever steps are necessary to keep our users safe. 
Nevertheless, we believe that the best approach to safeguard that 
security is to work with CAs as partners, to foster open and frank 
communication, and to be diligent in looking for ways to improve. Thank 
you for your participation in this pursuit. 

Regards, 
Kathleen Wilson 
Module Owner of Mozilla’s CA Certificates Module 

Now I’m no bomb-thrower, but if it were up to me, {Date TBD} would be yesterday and there would be no re-entry for the CAs caught doing this kind of thing. Still, I’m glad that Mozilla is doing this, and we’re all lucky that they have the independence and browser share to force this kind of change.

But not everything is sunshine and rainbows:

  1. We have to trust that the CAs in question will respond honestly to Mozilla’s inquiry and will voluntarily exit a (presumably) lucrative business. This relies very much on the honor system, and it’s hard to presume much honor in a CA that would sell such a product in the first place.
  2. Mozilla only represents 25% of the browser share, and that seems to be falling. That’s probably enough to make the difference — today — but it’d be nice to hear something similar from Microsoft or Google.
  3. We still lack a good client-side mechanism for detecting and reporting unusual (that is: correctly signed, but inconsistent) certificates. Given the news from Trustwave, such a mechanism seems more useful than ever.

We cannot possibly have faith in the security of the Internet when CAs are willing to engage in this kind of practice — even if they do it under the most ‘carefully-controlled’ conditions.

* Whatever that means.

Trustwave announces name change: henceforth will simply be ‘Wave’

This story has been making the rounds for about a week and I’m still shocked by it. Here’s the postage stamp version: at some point in the past few years, certificate authority Trustwave basically handed out their root signing capability to a third party company. But don’t worry, it’s all better now.

As with any such story, there are bits we know and bits we have to speculate about. Speculation is more fun, so let’s start there:

Once upon a time there was a company — let’s call them ACME Inc — who really didn’t trust its employees. For ACME the solution was vigilance. Constant, invasive vigilance. ACME’s IT department was given the task of intercepting every packet sent to and from the corporate network, which seemed straightforward; until someone pointed out that they could intercept all the packets they wanted, but they couldn’t necessarily read them. Especially not the ones encrypted with SSL/TLS.

Now this isn’t a killer. ACME had a few options: they could (a) block SSL/TLS at their network gateway, forcing everyone to use cleartext connections. They could (b) force their employees to use some awkward SSL proxy. If they were feeling ambitious, they could even (c) run a man-in-the-middle on every SSL connection initiated from within their corporate network. The last option would result in some awkward certificate errors, however — which would be unpleasant for web users, and downright nasty for embedded devices or headless boxes.

But really, each of these solutions is just a different version of flypaper. Why catch flies with flypaper, when you can totally screw with the trust model of the Internet?

And this is where we get to the facts. A few years back ACME — or some company like ACME — approached Trustwave with this problem. Trustwave seemed like a good group to ask, since they’re one of the select few companies that make SSL certificates, i.e., they’re one of the ‘authorities’ whose root certs are built into all of the major browsers and OSes.

Somehow the two companies cooked up the following plan. Trustwave would generate a new ‘subordinate root’ certificate with full signing authority. Anyone who possessed the signing key for this cert would essentially be Trustwave — meaning that they could vouch for any website they wanted. Of course, such a key would be enormously valuable (and dangerous). No responsible CA would allow such a thing to leave their facilities.

But apparently Trustwave’s motto is ‘think different’. So they cheerfully packed the signing key into a Hardware Security Module and sent it over to ACME. From that point on, ACME possessed the ability to transparently impersonate any SSL website on the Internet.

And impersonate they did; whenever some client initiated an SSL connection from within ACME’s corporate network, an ACME server would intercept the connection, sign a fresh certificate on the requested domain, then deliver that cert back to the client. To the client, it appeared that the connection went through perfectly. But of course the client was now talking to ACME’s server, not to the company whose name was on the certificate. ACME would in turn connect on to the target SSL server, thus completing the connection.

Technically this tampering wasn’t totally invisible; a clever user might notice that every certificate was now signed by Trustwave — and comparison with certificates received outside of ACME’s network would clearly reveal something funny going on. But since the vast majority of web users don’t check this kind of thing, the interception was basically transparent.

Now I hope I don’t need to tell you why this is a bad idea. Let’s just take it as a given that this is a bad idea. Even Trustwave now realizes it’s a bad idea, and has ‘proactively’ revoked the cert to make sure the evil thing doesn’t fall into the wrong hands. From their blog post about it:

Trustwave has decided to be open about this decision as well as stating that we will no longer enable systems of this type and are effectively ending this short journey into this type of offering.

I guess we can at least be thankful that Trustwave has decided to be open about this decision, despite the fact that they weren’t open about it while it was happening. Let’s all hope this is really the last journey Trustwave plans to take into this type of offering, where by ‘offering’ I mean — disastrous, short-sighted mistake.