Monday, February 27, 2012

The Internet is broken: could we please fix it?

Ok, this is a little embarrassing and I hate having to admit it publicly. But I can't hold it in any longer: I think I'm becoming an Internet activist.

This is upsetting to me, since active is the last thing I ever thought I'd be. I have friends who live to make trouble for big corporations on the Internet, and while I admire their chutzpah (and results!), they've always made me a little embarrassed. Even when I agree with their cause, I still have an urge to follow along, cleaning up the mess and apologizing on behalf of all the 'reasonable' folks on the Internet.

But every man has a breaking point, and the proximate cause of mine is Trustwave. Or rather, the news that Trustwave -- an important CA and pillar of the Internet -- took it upon themselves to sell a subordinate root cert to some (still unknown) client, for the purposes of eavesdropping on TLS connections (or, put less politely, undermining the trust assumptions that make the Internet secure).

This kind of behavior is absolutely, unquestionably out of bounds for a trusted CA, and certainly deserves a response -- a stronger one than it's gotten. But the really frightening news is twofold:
  1. There's reason to believe that other (possibly bigger) CAs are engaged in the same practice.
     
  2. To the best of my knowledge, only one browser vendor has taken a public stand on this issue, and that vendor isn't gaining market share.

The good news is that the MITM revelation is exactly the sort of kick we've needed to improve the CA system. And even better, some very bright people are already thinking about it. The rest of this post will review the problem and talk about some of the potential solutions.

Certificates 101

For those of you who know the TLS protocol (and how certificates work), the following explanation is completely gratuitous. Feel free to skip it. If you don't know -- or don't understand the problem -- I'm going to take a minute to give some quick background.

TLS (formerly SSL) is probably the best-known security protocol on the Internet. Most people are familiar with TLS for its use in https -- secure web -- but it's also used to protect email in transit, software updates, and a whole mess of other stuff you don't even think about.

TLS protects your traffic by encrypting it with a strong symmetric key algorithm like AES or RC4. Unfortunately, this type of cryptography only works when the communicating parties share a key. Since you probably don't share keys with most of the web servers on the Internet, TLS provides you with a wonderful means to do so: a public-key key agreement protocol.

I could spend a lot of time talking about this, but for our purposes, all you need to understand is this: when I visit https://gmail.com, Google's server will send me a public key. If this key really belongs to Google, then everything is great: we can both derive a secure communication key, even if our attacker Mallory is eavesdropping on the whole conversation.

If, on the other hand, Mallory can intercept and modify our communications, the game is very different. In this case, she can overwrite Gmail's key with her own public key. The result: I end up sharing a symmetric key with her! The worst part is that I probably won't know this has happened: clever Mallory can make her own connection to Gmail and silently pass my traffic through -- while reading every word. This scenario is called a Man in the Middle (MITM) Attack.

Figure: MITM attack. Alice is your grandmother, Bob is BankofAmerica.com, and Mallory establishes connections with both. (Wikipedia/CC license)
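
To make the difference between eavesdropping and tampering concrete, here is a minimal sketch using a toy Diffie-Hellman exchange. Everything in it is an assumption made for illustration: the tiny group parameters `P` and `G`, the function names, and the choice of Diffie-Hellman itself (TLS supports several key exchange methods). It is nowhere near a real TLS handshake.

```python
import secrets

# Toy parameters for illustration only: a 127-bit Mersenne prime and a small
# base. These are assumptions of this sketch and far too weak for real use.
P = 2**127 - 1
G = 3


def keypair():
    """Return a (private, public) pair for a toy Diffie-Hellman exchange."""
    private = secrets.randbelow(P - 3) + 2  # uniform in [2, P - 2]
    return private, pow(G, private, P)


def shared_secret(my_private, their_public):
    """Combine my private value with the public value I received."""
    return pow(their_public, my_private, P)


def honest_exchange():
    """Mallory only listens: both sides end up with the same secret."""
    a_priv, a_pub = keypair()
    b_priv, b_pub = keypair()
    return shared_secret(a_priv, b_pub), shared_secret(b_priv, a_pub)


def mitm_exchange():
    """Mallory swaps each public value in transit for her own."""
    a_priv, a_pub = keypair()
    b_priv, b_pub = keypair()
    m_priv, m_pub = keypair()
    return {
        "alice": shared_secret(a_priv, m_pub),          # Alice thinks this is Bob
        "bob": shared_secret(b_priv, m_pub),            # Bob thinks this is Alice
        "mallory_alice": shared_secret(m_priv, a_pub),  # Mallory's key with Alice
        "mallory_bob": shared_secret(m_priv, b_pub),    # Mallory's key with Bob
    }
```

In the honest run the two ends agree on a secret that a passive listener never sees. In the tampered run each victim shares a key with Mallory instead, and nothing in the exchange itself tells them so. That gap is what certificates are meant to close.
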
MITM attacks are older than the hills. Fortunately TLS has built-in protections to thwart them. Instead of transmitting a naked public key, the Gmail server wraps its key in a certificate; this is a simple file that embeds both the key and some identifying information, like "gmail.com". The certificate is digitally signed by someone very trustworthy: one of a few dozen Certificate Authorities (CAs) that your browser knows and trusts. These include companies like Verisign, and (yes) Trustwave.

TLS clients (e.g., web browsers) carry the verification keys for a huge number of CAs. When a certificate comes in, they can verify its signature to ensure that it's legit. This approach works very well, under one very important assumption: namely, Mallory won't be able to get a signed certificate on a domain she doesn't own.
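
Here's a rough sketch of that check, and of why the assumption matters. It uses textbook RSA over small, well-known Mersenne primes purely so it runs with the standard library; the CA names, the certificate layout, and the function names are all hypothetical, and real certificates are X.509 structures with properly padded signatures.

```python
import hashlib

E = 65537  # common RSA public exponent


def make_toy_ca(name, p, q):
    """Build a toy CA from two primes (textbook RSA, insecure by design)."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))  # private exponent (Python 3.8+)
    return {"name": name, "n": n, "d": d}


def digest(domain, subject_key, n):
    """Hash the certificate contents down to a number below n."""
    data = f"{domain}|{subject_key}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def issue(ca, domain, subject_key):
    """The CA signs a (domain, key) binding. Nothing stops it signing any domain."""
    sig = pow(digest(domain, subject_key, ca["n"]), ca["d"], ca["n"])
    return {"domain": domain, "key": subject_key, "issuer": ca["name"], "sig": sig}


def client_accepts(cert, trust_store, requested_domain):
    """Accept if the name matches and ANY trusted CA's key verifies the signature."""
    if cert["domain"] != requested_domain:
        return False
    n = trust_store.get(cert["issuer"])
    if n is None:
        return False
    return pow(cert["sig"], E, n) == digest(cert["domain"], cert["key"], n)


# Two hypothetical CAs, both shipped in the client's trust store.
honest_ca = make_toy_ca("HonestCA", 2**61 - 1, 2**89 - 1)
rogue_ca = make_toy_ca("RogueCA", 2**107 - 1, 2**127 - 1)
trust_store = {ca["name"]: ca["n"] for ca in (honest_ca, rogue_ca)}

real_cert = issue(honest_ca, "gmail.com", "google-public-key")
mitm_cert = issue(rogue_ca, "gmail.com", "mallory-public-key")
```

Both `real_cert` and `mitm_cert` pass `client_accepts(..., trust_store, "gmail.com")`. The client has no way to tell which CA was supposed to vouch for gmail.com; it only knows that some CA on its list did.
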

What's wrong with the CA model?

The real problem with the CA model is that every root CA has the power to sign a certificate for any domain, so a single careless or dishonest CA can completely unravel the security of TLS. So far the industry has policed itself using the Macaroni Grill model: If a CA screws up too badly, they face being removed from the 'trusted' list of major TLS clients. In principle this should keep people in line, since it's the nuclear option for a CA -- essentially shutting down their business.

Unfortunately, while this sounds good, it's tricky to implement in practice. That's because:
  1. It assumes that browser vendors are willing to go nuclear on their colleagues at the CAs.
     
  2. It assumes that browser vendors can go nuclear on a major CA, knowing that the blowback might very well hurt their product. (Imagine that your browser unilaterally stopped accepting Verisign certs. What would you do?)
     
  3. It assumes that someone will catch misbehaving CAs in the first place.

What's fascinating about the Trustwave brouhaha is that it's finally giving us some visibility into how well these assumptions play out in the real world.

So what happened with Trustwave?

In late January of this year, Trustwave made a cryptic update to their CA policy. When people started asking about it, they responded with a carefully-worded post on the company blog. When you cut through the business-speak, here's what it says:

We sold the right to generate certificates -- on any domain name, regardless of whether it belongs to one of our clients or not -- and packed this magical capability into a box. We rented this box to a corporate client for the express purpose of running Man-in-the-Middle attacks to eavesdrop on their employees' TLS-secured connections. At no point did we stop to consider how damaging this kind of practice was, nor did we worry unduly about its potential impact on our business -- since quite frankly, we didn't believe it would have any.

I don't know which part is worse. That a company whose entire business is based on trust -- on the idea that people will believe them when they say a certificate is legit -- would think they could get away with selling a tool to make fraudulent certificates. Or that they're probably right.

But this isn't the worst of it. There's reason to believe that Trustwave isn't alone in this practice. In fact, if we're to believe the rumors, Trustwave is only noteworthy in that they stopped. Other CAs may still be up to their ears.

And so this finally brings us to the important part of this post: what's being done, and what can we do to make sure that it never happens again?

Option 1: Rely on the browser vendors

What's particularly disturbing about the Trustwave fiasco is the response it's gotten from the various browser manufacturers.

So far exactly one organization has taken a strong stand against this practice. The Mozilla Foundation (makers of Firefox) recently sent a strongly-worded letter to all of their root CAs -- demanding that they disclose whether such MITM certificates exist, and that they shut them down forthwith. With about 20% browser share (depending on who's counting), Mozilla has the means to enforce this. Assuming the CAs are honest, and assuming Mozilla carries through on its promise. And assuming that Mozilla's browser share doesn't fall any further.

That's the good news. Less cheerful is the deafening silence from Apple, Microsoft and Google. These vendors control most of the remaining browser market, and to the best of my knowledge they've said nothing at all about the practice. Publicly, anyway. It's possible that they're working the issue privately; if so, more power to them. But in the absence of some evidence, I find it hard to take this on faith.

Option 2: Sunshine is the best disinfectant

The Trustwave fiasco exposes two basic problems with the CA model: (1) any CA can claim ownership of any domain, and (2) there's no easy way to know which domains a CA has put its stamp on.

This last is very much by CA preference: CAs don't want to reveal their doings, on the theory that it would harm their business. I can see where they're coming from (especially if their business includes selling MITM certs!). Unfortunately, allowing CAs to operate without oversight is one of those quaint practices (like clicking on links sent by strangers) that made sense in a more innocent time, but no longer has much of a place in our world.

Figure: Merkle tree (Wikipedia/CC)

Ben Laurie and Adam Langley feel the same way, and they've developed a plan to do something about it. The basic idea is this:
  1. Every new certificate should be published in a public audit log. This log will be open to the world, which means that everyone can scan for illegal entries (i.e., their own domain appearing in somebody else's certificate.)
     
  2. Anytime a web server hands out a certificate, it must prove that the certificate is contained in the list.

The beautiful thing is that this proof can be conducted relatively efficiently using a Merkle hash tree. The resulting proofs are quite short (log(N) hashes, where N is the total number of certificates). Browsers will need to obtain the current tree root, which requires either (a) periodic scanning of the tree, or (b) some degree of trust in an authority, who will periodically distribute signed root nodes.
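
For the curious, here's a minimal sketch of how such an inclusion proof works. It is a simplified Merkle tree, not the format from the Laurie/Langley proposal: the `leaf:`/`node:` prefixes, the function names, and the trick of duplicating the last node on odd-sized levels are all assumptions made to keep the example short.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(certs):
    """Return every level of the tree, from leaf hashes up to the root."""
    levels = [[h(b"leaf:" + c) for c in certs]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur.append(cur[-1])  # simplification: duplicate the last node
        levels.append(
            [h(b"node:" + cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)]
        )
    return levels


def prove(levels, index):
    """Collect the sibling hash at each level on the path from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return proof


def verify(cert, proof, root):
    """Recompute the root from one certificate and its short proof."""
    node = h(b"leaf:" + cert)
    for sibling, sibling_is_left in proof:
        if sibling_is_left:
            node = h(b"node:" + sibling + node)
        else:
            node = h(b"node:" + node + sibling)
    return node == root


# Hypothetical log contents: five certificates, identified here by name only.
log = [b"cert-a", b"cert-b", b"cert-c", b"cert-d", b"cert-e"]
levels = build_levels(log)
root = levels[-1][0]
proof_for_c = prove(levels, 2)
```

The server would ship something like `proof_for_c` alongside its certificate. A browser holding only `root` can call `verify(b"cert-c", proof_for_c, root)` without ever downloading the full log, and a certificate that was never logged has no proof that checks out.
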

Along the same lines, the EFF has a similar proposal called the Sovereign Keys Project. SKP also proposes a public log, but places stronger requirements on what it takes to get into the log. It's quite likely that in the long run these projects will merge, or give birth to something even better.

Option 3: Eternal vigilance

The problem with SKP and the Laurie/Langley proposal is that both require changes to the CA infrastructure. Someone will need to construct these audit logs; servers will have to start shipping hash proofs. Both can be incrementally deployed, but will only be effective once deployment reaches a certain level.

Another option is to dispense with this machinery altogether, and deal with rogue CAs today by subjecting them to constant, unwavering surveillance. This is the approach taken by CMU's Perspectives plugin and by Moxie Marlinspike's Convergence.

The core idea behind both of these systems is to use 'network perspectives' to determine whether the certificate you're receiving is the same certificate that everyone else is seeing. This helps to avoid MITMs, since presumably the attacker can only be in the 'middle' of so many network paths. To accomplish this, both systems deploy servers called Notaries -- run on a volunteer basis -- which you can call up whenever you receive an unknown certificate. They'll compare your version of the cert to what they see from the same server, and help you ring the alarm if there's a mismatch.
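
The comparison step might look something like the sketch below. This is not the Perspectives or Convergence protocol: the function names, the idea of notaries replying with a SHA-256 fingerprint, and the 75% agreement threshold are all assumptions chosen for illustration.

```python
import hashlib


def fingerprint(cert_bytes: bytes) -> str:
    """Identify a certificate by the SHA-256 hash of its encoded bytes."""
    return hashlib.sha256(cert_bytes).hexdigest()


def notaries_agree(local_cert: bytes, notary_fingerprints, quorum=0.75):
    """Trust the cert only if enough notaries saw the same one we did.

    notary_fingerprints holds what each (hypothetical) notary reports
    seeing for the same server from its own network vantage point.
    """
    if not notary_fingerprints:
        return False  # no second opinion available, so don't vouch for it
    mine = fingerprint(local_cert)
    matches = sum(1 for seen in notary_fingerprints if seen == mine)
    return matches / len(notary_fingerprints) >= quorum


# Example: we receive Mallory's cert while four notaries all see the real one.
real = fingerprint(b"real-server-cert")
views = [real, real, real, real]
```

With these values, `notaries_agree(b"real-server-cert", views)` is true and `notaries_agree(b"mallory-cert", views)` is false. The threshold is a judgment call: set it too high and a server that legitimately rotates certificates will trigger false alarms.
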

A limitation of this approach is privacy; these Notary servers obviously learn quite a bit about the sites you visit. Convergence extends the Perspectives plugin to address some of these issues, but fundamentally there's no free lunch here. If you're querying some external party, you're leaking information.

One solution to this problem is to dispense with online notary queries altogether, and just ask people to carry a list of legitimate certificates with them. If we assume that there are 4 million active certificates in the world, we could easily fit them into a < 40MB Bloom filter. This would allow us to determine whether a cert is 'on the list' without making an online query. Of course, this requires someone to compile and maintain such a list. Fortunately there are folks already doing this, including the EFF's SSL Observatory project.
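
As a quick sanity check on those numbers, here's a small sketch of a Bloom filter together with the standard false-positive estimate. The class and function names are made up for this example, and deriving the k bit positions from SHA-256 with a counter prefix is just one convenient choice.

```python
import hashlib
import math


def false_positive_rate(m_bits, n_items):
    """Standard Bloom filter estimate, using the near-optimal hash count k."""
    k = max(1, round(m_bits / n_items * math.log(2)))
    return (1 - math.exp(-k * n_items / m_bits)) ** k, k


class BloomFilter:
    def __init__(self, m_bits, k):
        self.m, self.k = m_bits, k
        self.bits = bytearray((m_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive k bit positions by hashing the item with a counter prefix.
        for i in range(self.k):
            d = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(d[:8], "big") % self.m

    def add(self, item: bytes):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: bytes):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


# 40 MB of bits for 4 million certificates works out to 80 bits per entry.
rate, k = false_positive_rate(40 * 8 * 10**6, 4_000_000)

# A tiny filter holding a few hypothetical certificate fingerprints.
known = BloomFilter(m_bits=8000, k=5)
for cert in (b"cert-a", b"cert-b", b"cert-c"):
    known.add(cert)
```

At 80 bits per certificate the estimated false-positive rate is far below one in a trillion, so 40MB is generous and a much smaller filter would likely do. The catch is that a Bloom filter can only say 'probably on the list' or 'definitely not', and a plain one can't have entries removed when a cert is revoked.
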

Option 4: The hypothetical

The existence of these proposals is definitely heartening. It means that people are taking this seriously, and there's an active technical discussion on how to make things better.

Since we're in this mode, let me mention a few other things that could make a big difference in detecting exploits. For one thing, it would be awfully nice if web servers had a way to see things through their clients' eyes. One obvious way to do this is through script: use Javascript to view the current server certificate, and report the details back to the server.

Of course this isn't perfect -- a clever MITM could strip the Javascript or tamper with it. Still, obfuscation is a heck of a lot easier than de-obfuscation, and it's unlikely that a single attacker is going to win an arms race against a variety of sites.

Unfortunately, this idea has to be relegated to the 'could be, should be' dustbin, mostly because Javascript doesn't have access to the current certificate info. I don't really see the reason for this, and I sure hope that it changes in the future.

Option 5: The long arm of the law

I suppose the last option -- perhaps the least popular -- is just to treat CAs the same way that you'd treat any important, trustworthy organization in the real world. That means: you cheat, you pay the penalty. Just as we shouldn't tolerate Bank of America knowingly opening a credit line in the name of a non-customer, we shouldn't tolerate a CA doing the same.

Option 6: Vigilante justice

Ok, I'm only kidding about this one, cowboy. You can shut down that LOIC download right now.

In summary

I don't know that there's a magical vaccine that will make the CA system secure, but I've come to believe that the current approach is not working. It's not just examples like Trustwave, which (some might argue) is a relatively limited type of abuse. It's that the Trustwave revelation comes in addition to a steady drumbeat of news about stolen keys, illegitimately-obtained certificates, and various other abuses.

While dealing with these problems might not be easy, what's shocking is how easy it would be to at least detect and expose the abuses at the core of it -- if various people agreed that this was a worthy goal. I do hope that people start taking this stuff seriously, mostly because being a radical is hard, hard work. I'm just not cut out for it.

15 comments:

  1. While everybody seems so eager to jump the ship when it comes to the current PKI infrastructure, it has one important advantage over all of the alternatives that are currently proposed: it's the only scheme that supports some form of revocation. While those in favor of the alternatives seem to claim that revocation is not that important, I strongly disagree: revocation is *the* most important thing for example when dealing with a large roll-out of smart cards. People lose smart cards in the same way they lose their credit cards. This is where revocation becomes the key factor of the deployed scheme. And whenever I hear X.509 is "so utterly broken", I think that it's only the implementations that are. An incident like the ones we've seen lately could always be safely handled by simple revocation. It's the browser's fault that they just don't seem to want to implement it correctly.

    Replies
    1. I'm admittedly no expert at all in this topic :) but I think you're talking about something else entirely. The point is that *for websites* it's far more important to have accountability and transparency in the public key infrastructure than revocation.

      Also, some of these proposals have an implicit concept of revocation somewhere. If a certificate can no longer be verified as valid, it can be considered revoked, since the user would no longer have to decide whether to use it or not - it won't be used, period.

      Smartcards on the other hand belong to a completely different use case, and I agree with you that revocation is crucial there.

  2. Thanks for the clear explanation of the problem. However, like many technologists, you forgot to list the only option that will work (at least for those located in the United States) no matter how much CA/browser/crypto technology evolves (which it will). I.e., fix the public policy, specifically the Electronic Communication Privacy Act which permits employers to monitor networks for business purposes.

  3. "(Imagine that your browser unilaterally stopped accepting Verisign certs. What would you do?)"

    I would do what I usually do when my browser makes an HTTPS connection with an untrusted certificate: click "Go ahead and connect anyway". I don't see that browser vendors would really take much backlash for doing this.

  4. What about validating the certificate presented to the client based on the hash of the cert? The lookup would be performed client side and would offer a fairly good degree of certainty about the legitimacy of the certificate presented.

    There's been a fair amount of talk about publishing cert hashes in DNSSEC zones over the past couple of years...

  5. Do not forget about the SRP protocol, which can be used to securely authenticate to things like banking, Gmail, and other account-based web services, with encryption and without any certificates (there may be a slight problem bootstrapping it -- like how to securely CREATE an account on Gmail without already having a password/keys/account there). Still, it is a major improvement, and can be used for many websites that you log in to, like social sites, email services, web documents, blogs, administration panels, etc., as well as things like SSH. All without certificates, and without the risk of, for example, MITM attacks, stolen passwords, brute force attacks, or accidentally leaking a password from one website to another (like now).

    As for untrustworthy CAs, vendors should go nuclear. Maybe then they will learn.

  6. "Every new certificate should be published in a public audit log. This log will be open to the world, which means that everyone can scan for illegal entries (i.e., their own domain appearing in somebody else's certificate.)"

    I heartily disagree here about an audit log being PUBLIC. The WHOIS model is a total and complete disaster, which is why WHOIS "protection/anonymity services" exist. Their recommendation of a similar model for CAs is appalling. The CAs should only provide audit logs to authorized domain owners. Then require all of the CAs to participate in a global CA audit log program that, only if you own the domain in question, you can see the relevant logs for (that way the domain owner only has to register at one CA). Anyone who accesses or generates a certificate gets added to the global audit log and you, as the owner of the domain, get to see that activity. Anyone who accesses the log for a domain gets added to the log as well to make sure no abuses occur. Domain owners should be able to also report unauthorized access of the log to their CA (and offending CA) for followup purposes. I don't disagree about having an audit log for holding all CAs accountable, I do disagree about it being public so ANYONE can view it.

    Replies
    1. The audit log as implemented by Google doesn't contain physical names, addresses, telephone numbers, etc., except insofar as these already appear in X.509 certificates. It's already possible to collect this data from most publicly-visible Internet services.

      If the log isn't public and you have to authenticate to someone to view part of it, they can potentially give you a false version of the log. This turns the log-owners into a kind of super-CA because they're in a position to abet misissuance of certificates by simply pretending that those certificates never appeared in the log at all. That drastically reduces the security benefits that would be obtained from this approach.

      Also, people other than the domain owner can have an important stake in the security of the domain (and sometimes could be more technically sophisticated or motivated than the domain owner).

      Also, forcing people to choose the scope of validity of public-key certificates with respect to the intended relying parties is probably worthwhile because a lot of mischief can result from ambiguity about whether a certificate is "private" or "public". (If the certificate is "private", CAs may believe that it's not very important for the CA to know whether the data thus certified is accurate, because the general public can't be harmed by it. That's kind of like what's happened in this Trustwave case. But if the certificate can ever be shown to the general public and accepted as valid by the general public, the entire public is at risk from it.)

      It seems like a source of quite a bit of risk in the long run to say that someone can be issued a certificate that billions of relying parties will accept as valid but that none of those relying parties are entitled to know about its existence just because they aren't the subject of the certificate. For more analysis about this, take a look at AGL's earlier "Classifying solutions to the certificate problem" (although he doesn't directly argue there that the entire public needs to be able to see the log).

  7. What I do not understand is why those big corporations don't simply create their own certificate, in Windows for example, and install it on every one of their employees' PCs as a trusted root CA. Isn't that possible?

    Replies
    1. Possible, but a pain in the backside - particularly if you have heterogeneous desktops, plus wireless networks where you let employees log on using their iPhones, Android phones, Blackberries... Getting a trusted root cert deployed to all these platforms is a non-trivial exercise.

  8. May I humbly point out our Firefox add-on that takes a Convergence-like approach (in fact, using it as an additional back-end) plus does a report of fraudulent certs plus requesting the assistance of so-called Hunters to do a distributed traceroute to the destination server - allowing us to determine with some confidence where the MitM is sitting:
    https://github.com/crossbear/Crossbear

  9. A TLS protocol built upon the Namecoin blockchain technology (which has a good fraction of Bitcoin's hash-strength security) would do away with the CAs. It just needs somebody to get around to doing it ...

  10. I read most of your post and the comments to follow and I literally cannot comprehend most of it. It is not for your lack of explanation in any way, shape, or form. It is for my own lack of context. When I read something like this, I wonder "Am I missing out on some world that exists below the surface of the world in which I live every day?" "Am I clueless to the goings-on of the cyber-world?" "Does it matter that I am clueless?"

    I suppose it is the same as someone who speaks another language.... You hear the Spanish coming out of their mouths, but the sounds mean nothing.

    I see the words on the page and see about a million different times I should look up a term and try to understand it but I wonder at this point in my life if understanding any of this matters in the slightest - and that is not said to diminish what I am sure is very important work on your part.

    Anyways.... thanks for the contribution. Hope your research goes well.

  11. I like the Laurie approach of Merkle Hash, but there's a simpler way and that's to implement X.509v3 as it was designed in v1, i.e. with a corresponding Directory, capital D.
