Google is promoting a new TLS extension which is designed to speed up the negotiation of TLS handshakes. The extension is called Snap Start, and it’s already been deployed in Chrome (and presumably throughout Google’s back end).
So what is Snap Start, why is it different from (or better than) simply resuming a TLS session, and most importantly: how snappy is it?
Background: The TLS Handshake
To understand Snap Start we need to quickly review the TLS handshake. Snap Start only supports the classic (non-ephemeral) handshake protocol, so that’s what we’ll be talking about here. Leaving out a lot of detail, that handshake can be reduced to the following four messages:
- C -> S: Client sends a random nonce CR.
- C <- S: Server responds with its public key (certificate) and another nonce SR.
- C -> S: Client generates a pre-master secret PMS, encrypts it with the server’s public key and sends the ciphertext to the server.
- C <- S: The client and server both derive a ‘master secret’ by hashing* together (CR, SR, PMS). The server and client also MAC all previous handshake messages to make sure nothing’s been tampered with. Finally the server notifies the client that it’s ready to communicate.
I’m ignoring many, many important details here. These include ciphersuite negotiation, parameter data, and — most importantly — client authentication, which we’ll come back to later. But these four messages are what we need to think about right now.
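To make the ‘hashing together’ step concrete, here is a minimal sketch of how a master secret can be derived from (CR, SR, PMS). It assumes the TLS 1.2 construction from RFC 5246 (HMAC-SHA256 expansion with the label "master secret"); older TLS versions use a different hash combination, and the function names are mine, not from any particular library.

```python
import hashlib
import hmac
import os


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """P_hash expansion (RFC 5246, section 5) built on HMAC-SHA256."""
    out = b""
    a = seed  # A(0) = seed
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i) = HMAC(secret, A(i-1))
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]


def master_secret(pms: bytes, cr: bytes, sr: bytes) -> bytes:
    """PRF(PMS, "master secret", CR + SR), truncated to 48 bytes."""
    return p_sha256(pms, b"master secret" + cr + sr, 48)


# Illustrative values only: two 32-byte nonces and a 48-byte pre-master secret.
cr, sr = os.urandom(32), os.urandom(32)
pms = b"\x03\x03" + os.urandom(46)
ms = master_secret(pms, cr, sr)
```

The point to notice is that both nonces feed into the result, so changing either CR or SR yields a completely different master secret.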
So why Snap Start?
It only takes a glance at the above protocol to identify the problem with the classic handshake: it’s frickin’ slow. Every new TLS connection requires two latency-inducing round-trips. This can be a drag when you’re trying to deploy TLS everywhere.
Now technically you don’t need to run the whole handshake each time you connect. Once you’ve established a TLS session you can resume it anytime you want — provided that both client and server retain the master secret.** Session resumption reduces communication overhead, but it isn’t the answer to everything. Most people will balk at the idea of hanging onto secret keys across machine reboots, or even browser restarts. Moreover, it’s not clear that a busy server can afford to securely cache the millions of keys it establishes every day.
What’s needed is a speedy handshake protocol that doesn’t rely on caching secret information. And that’s where Snap Start comes in.
The intuition behind Snap Start is simple: if TLS requires too many communication rounds, then why not ditch some? In this case, the target for ditching is the server’s message in step (2). Of course, once you’ve done that you may as well roll steps (1) and (3) into one mutant mega-step. This cuts the number of handshake messages in half.
There are two wrinkles with this approach — one obvious and one subtle. The obvious one is that step (2) delivers the server’s certificate. Without this certificate, the client can’t complete the rest of the protocol. Fortunately server certificates don’t change that often, so the client can simply cache one after the first time it completes the full handshake.***
The subtle issue has to do with the reason those nonces are there in the first place. From a security perspective, they’re designed to prevent replay attacks. Basically, this is a situation where an attacker retransmits captured data from one TLS session back to a server in the hopes that the server will accept it as fresh. Even if the data is stale, there are various scenarios in which replaying it could be useful.
Normally replays are prevented because the server picks a distinct (random) nonce SR in step (2), which has implications throughout the rest of the handshake. But since we no longer have a step (2), a different approach is required. Snap Start’s solution is simple: let the client propose the nonce SR, and just have the server make sure that value hasn’t been used before.
Obviously this requires the server to keep a list of previously-used SR values (called a ‘strike list’), which — assuming a busy server — could get out of control pretty fast.
The final Snap Start optimization is to tie proposed SR values to time periods. If the client suggests an SR that’s too old, reject it. This means that the server only has to keep a relatively short strike list relating to the last few minutes or hours. There are other optimizations to deal with cases where multiple servers share the same certificate, but they’re not all that interesting.
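The strike list idea can be sketched in a few lines. This is an illustration under my own assumptions, not the wire format from the spec: I assume the proposed SR is 32 bytes and starts with a 4-byte big-endian Unix timestamp (mirroring the usual layout of TLS randoms), and the names `propose_sr`, `StrikeList` and the ten-minute window are hypothetical.

```python
import os
import struct
import time
from typing import Optional


def propose_sr(now: Optional[float] = None) -> bytes:
    """Client side: 4-byte big-endian timestamp followed by 28 random bytes."""
    now = time.time() if now is None else now
    return struct.pack(">I", int(now)) + os.urandom(28)


class StrikeList:
    """Server side: reject proposed SR values that are stale or already seen."""

    def __init__(self, window_seconds: int = 600) -> None:
        self.window = window_seconds
        self.seen = {}  # proposed SR -> timestamp embedded in it

    def accept(self, sr: bytes, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        if len(sr) != 32:
            return False
        (stamp,) = struct.unpack(">I", sr[:4])
        # Too old (or too far in the future): reject on the timestamp alone.
        if abs(now - stamp) > self.window:
            return False
        # Entries older than the window would fail the check above anyway,
        # so they can be forgotten. This is what keeps the list short.
        self.seen = {v: t for v, t in self.seen.items() if now - t <= self.window}
        if sr in self.seen:
            return False  # replay
        self.seen[sr] = stamp
        return True
```

A fresh value is accepted once, a second copy of the same value is refused, and anything timestamped outside the window is refused without consulting the list at all.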
So here’s the final protocol:
- The client generates a random nonce CR and a proposed nonce SR. It also generates a pre-master secret PMS, encrypts it with the server’s public key and sends the ciphertext and nonces to the server.
- The server checks that SR hasn’t been used before (or at least not recently). The client and server both derive a ‘master secret’ by hashing* together (CR, SR, PMS). The server notifies the client that it’s ready to communicate.
The best part is that if anything goes wrong, the server can always force the client to engage in a normal TLS handshake.
Now for the terrible flaw in Snap Start
No, I’m just kidding. There doesn’t seem to be anything wrong with Snap Start. With caveats. It requires some vaguely synchronized clocks. Moreover, it’s quite possible that a strike list could get wiped out in a server crash, which would open the server up to limited replays (a careful implementation could probably avoid this). Also, servers that share a single certificate could wind up vulnerable to cross-replays if their administrator forgets to configure them correctly.
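One way a careful implementation might handle the crash case is sketched below. This is my own guess at a mitigation, not something taken from the spec. It assumes the server accepts SR timestamps within a fixed window on either side of its own clock, and that the clock does not jump backwards across the restart; the function name and parameters are hypothetical.

```python
def snap_start_allowed(sr_timestamp: int, boot_time: int, window_seconds: int) -> bool:
    """Refuse any SR that could have been on the strike list lost in a crash.

    Before the crash the server only accepted timestamps up to
    (crash time + window), and crash time <= boot time. Anything at or
    below (boot time + window) may therefore be a replay of a value the
    server has forgotten about.
    """
    return sr_timestamp > boot_time + window_seconds
```

In practice this means the server forces full handshakes for roughly one window after it comes back up, then resumes normal Snap Start service with a fresh strike list.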
One last thing I didn’t mention is that Snap Start tries to use as much of the existing TLS machinery as possible. So even though the original step (2) (‘Server Hello’ message) no longer exists, it’s ‘virtually’ recreated for the purposes of computing the TLS Finished hash check, which hashes over all preceding handshake messages. Ditto for client authentication signatures. Some new Snap-specific fields are also left out of these hashes.
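The ‘virtual’ Server Hello trick can be illustrated with a toy transcript hash. The message encodings here are stand-ins I made up for the example, not the real TLS or Snap Start formats, and the function names are hypothetical. The only point is that both sides rebuild the missing message from the proposed SR and hash it in its usual place.

```python
import hashlib


def virtual_server_hello(sr: bytes) -> bytes:
    """Hypothetical stand-in for the Server Hello that was never sent."""
    return b"server_hello:" + sr


def transcript_hash(client_hello: bytes, sr: bytes, client_key_exchange: bytes) -> bytes:
    """Hash the handshake as if the Server Hello had been exchanged."""
    h = hashlib.sha256()
    for msg in (client_hello, virtual_server_hello(sr), client_key_exchange):
        # A length prefix keeps the message boundaries unambiguous.
        h.update(len(msg).to_bytes(3, "big"))
        h.update(msg)
    return h.digest()
```

Client and server each compute this digest locally and feed it into the Finished check. If an attacker swaps in a different SR, the two digests disagree and the handshake fails.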
As a consequence, I suppose there’s a hypothetical worry that an attack on Snap Start (due to a bad server implementation, for example) could be leveraged into an attack that works even when the client requests a normal TLS handshake. The basic idea would be to set up a man-in-the-middle that converts the client’s standard TLS handshake request into a Snap Start request, and feeds convincing lies back to the client. I’m fairly certain that the hash checks and extra Snap Start messages will prevent this attack, but I’m not 100% sure from reading the spec.
Beyond that, all of this extra logic opens the door for implementation errors and added complexity. I haven’t looked at any server-side implementations, but I would definitely like to. Nonetheless, for the moment Snap Start seems like a darned good idea. I hope it means a lot more TLS in 2012.
* Technically this is denoted as a PRF, but it’s typically implemented using hash functions.
** At session resumption the master secret is ‘updated’ by combining it with new client and server randomness.
*** Certificate revocation is still an issue, so Snap Start also requires caching of ‘stapled’ OCSP messages. These are valid for a week.