Update 8/27: This post was originally published three weeks ago under a different title. I subsequently took it down to give affected vendors time to patch the bugs. As a result of the notification, Digital Content Protection LLC (DCP) has updated the spec to v2.2.
Contrary to my understanding when I wrote the original post, HDCP v2 actually is used by a number of devices. I would like to give credit to Alon Ziv at Discretix, who had previously discovered the Locality Check issue, and to Martin Kaiser who experimentally verified the master secret issue on a European Samsung TV and a Galaxy S II.
Finally, I would like to thank Hanni Fakhoury and Marcia Hofmann at the Electronic Frontier Foundation for all of their helpful advice. The EFF is one of the only organizations that represents security researchers. Please consider donating so they can keep doing it!
Over the past couple of weeks I’ve mostly been blogging about inconsequential things. Blame summer for this — it’s hard to be serious when it’s 104 degrees out. But also, the world just hasn’t been supplying much in the way of interesting stuff to write about.
Don’t get me wrong, this is a good thing! But in a (very limited) way it’s also too bad. One of the best ways to learn about security systems is to take them apart and see how they fail. While individual systems can be patched, the knowledge we collect from the process is invaluable.
Fortunately for us, we’re not completely helpless. If we want to learn something about system analysis, there are plenty of opportunities right out there in the wild. The best place to start is by finding a public protocol that’s been published, but not implemented yet. Download the spec and start poking!
This will be our task today. The system we’ll be looking at is completely public, and (to the best of my knowledge) has not yet been deployed anywhere (Update: see note above). It’s good practice for protocol cryptanalysis because it includes all kinds of complicated crypto that hasn’t been seriously reviewed by anyone yet.
(Or at least, my Google searches aren’t turning anything up. I’m very willing to be corrected.)
Best of all, I’ve never looked at this system before. So whatever we find (or don’t find), we’ll be doing it together.
A note: this obviously isn’t going to be a short post. And the TL;DR is that there is no TL;DR. This post isn’t about finding bugs (although we certainly will), it’s about learning how the process works. And that’s something you do for its own sake.
The protocol we’ll be looking at today is the High Bandwidth Digital Content Protection (HDCP) protocol version 2. Before you get excited, let me sort out a bit of confusion. We are not going to talk about HDCP version 1, which is the famous protocol you probably have running in your TV right now.
What we’ll be looking at today is the replacement: HDCP v2. This protocol is everything that its predecessor was not. For one thing, it uses standard encryption: RSA, AES and HMAC-SHA256. It employs a certificate model with a revocation list. It also adds exciting features like ‘localization’, which allows an HDCP transmitter to determine how far away a receiver is, and stop people from piping HDCP content over the Internet. (In case they actually wanted to do that.)
HDCPv2 has barely hit shelves yet (Update: though it was recently selected as the transport security for MiraCast). The Digital Content Protection licensing authority has been keeping a pretty up-to-date set of draft protocol specifications on their site. The latest version at the time of this writing is 2.1, and it gives us a nice opportunity to see how industry ‘does’ protocols.
An overview of the protocol
As cryptographic protocols go, HDCPv2 has a pretty simple set of requirements. It’s designed to protect high-value content running over a wire (or wireless channel) between a transmitter (e.g., a DVD player) and a receiver (a TV). The protocol accomplishes the following operations:
- Exchanging and verifying public key certificates.
- Establishing shared symmetric keys between the transmitter and receiver.
- Caching shared keys for use in later sessions.
- Verifying that a receiver is local, i.e., you’re not trying to proxy the data to some remote party via the Internet.
These functions are accomplished via three (mostly) separate protocols: a public-key Authenticated Key Agreement (AKE) protocol, a pairing protocol, where the derived key is cached for later use, and a locality check protocol to ensure that the devices are physically close.
I’m going to take these protocols one at a time, since each one involves its own messages and assumptions.
Phase (1): Authenticated Key Agreement (AKE)
The core of HDCPv2 is a custom key exchange protocol, which looks quite a bit like TLS. (In fact, the resemblance is so strong that you wonder why the designers didn’t just use TLS and save a lot of effort.) It looks like this:
|HDCPv2 key agreement protocol (source). Click the image to enlarge.|
Now, there’s lots going on here. But if we only look at the crypto, the summary is this:
The transmitter starts by sending ‘AKE_Init’ along with a random 64-bit nonce R_tx. In response, the receiver sends back its certificate, which contains its RSA public key and device serial number, all signed by the HDCP licensing authority.
If the certificate checks out (and is not revoked), the transmitter generates a random 128-bit ‘master secret’ K_m and encrypts it under the receiver’s public key. The result goes back to the receiver, which decrypts it. Now both sides share K_m and R_tx, and can combine them using a wacky custom key derivation function. The result is a shared a session key K_d.
The last step is to verify that both sides got the same K_d. The receiver computes a value H’, using HMAC-SHA256 on inputs K_d, R_tx and some other stuff. If the receiver’s H’ matches a similar value computed at the transmitter, the protocol succeeds.
Note that I’ve ignored one last message in the protocol, which turns out to be very important. Before we go there, let’s pause and take stock.
If you’re paying close attention, you’ve noticed a couple of worrying things:
- The transmitter doesn’t authenticate itself at all. This means anyone can pretend to be a transmitter.
- None of the handshake messages (e.g., AKE_Transmitter_Info) appear to be authenticated. An attacker can modify them as they transit the wire.
- The session key K_d is based solely on the inputs supplied by the transmitter. The receiver does generate a nonce R_rx, but it isn’t used in the above protocol.
Phase (2): Pairing
Public-key operations are expensive. And you only really need to do them once. The designers recognized this, and added a feature called ‘pairing’ to cache the derived K_m for use in later sessions. This is quite a bit like what TLS does for session resumption.
However, there’s one catch, and it’s where things get complicated: some receivers don’t have a secure non-volatile storage area for caching keys. This didn’t phase the designers, who came up with a ‘clever’ workaround for the problem: the receiver can simply ask the transmitter to store K_m for it.
To do this, the receiver encrypts K_m under a fixed internal AES key K_h (which is derived by hashing the receiver’s RSA private key). In the last message of the AKE protocol the receiver now sends this ciphertext back to the transmitter for storage. This appears in the protocol diagram as the ciphertext E(K_h, K_m).
The obvious intuition here is that K_m is securely encrypted. What could possibly go wrong? The answer is to ask how K_m is encrypted. And that’s where things get worrying.
According to the spec, K_m is encrypted using AES in what amounts to CTR mode, where the ‘counter’ value is defined as some value m. On closer inspection, m turns out to be just the transmitter nonce R_tx padded with 0 bits. So that’s simple. Here’s what it looks like:
|Encryption of the master key K_m with the receiver key K_h. The value m is equal to (R_tx || 0x000000000000000).|
Now, CTR is a perfectly lovely encryption mode provided that you obey one unbreakable rule: the counter value must never be re-used. Is that satisfied here? Recall that the counter m is actually chosen by another party — the transmitter. This is worrying. If the transmitter wants, it could certainly ask the receiver to encrypt anything it wants under the same counter.
Of course, an honest transmitter won’t do this. But what about a dishonest transmitter? Remember that the transmitter is not authenticated by HDCP. The upshot is that an attacker can pretend to be a transmitter, and submit her own K_m values to be encrypted under K_h and m.
Even this might be survivable, if it weren’t for one last fact: in CTR mode, encryption and decryption are the same operation.
All of this leads to the following attack:
- Observe a legitimate communication between a transmitter and receiver. Capture the values R_tx and E(K_h, K_m) as they go over the wire.
- Now: pretend to be a transmitter and initiate your own session with the receiver.
- Replay the captured R_tx as your initial transmitter nonce. When you reach the point where you pick the master secret, don’t use a random value for K_m. Instead, set K_m equal to the ciphertext E(K_h, K_m) that you captured earlier. Recall that this ciphertext has the form:AES(K_h, R_Tx || 000…) ⊕ K_m
Now encrypt this value under the receiver’s public key and send it along.
- Sooner or later the receiver will encrypt the ‘master secret’ you chose above under its internal key K_h. The resulting ciphertext can be expanded to:
AES(K_h, R_Tx || 000…) ⊕ AES(K_h, R_Tx || 000…) ⊕ K_m
This is a huge problem for two reasons. First, K_m is used to derive the session keys used to encrypt HDCP content, which means that you may now be able to decrypt any past HDCP content traces. And even worse, thanks to the ‘pairing’ process, you may be able to use this captured K_m to initiate or respond to further sessions involving this transmitter.
Did I mention that protocols are hard?
Phase (3): The Locality Check
For all practical purposes, the attack above should be our stopping point. Once you have the stored K_m you can derive the session keys and basically do whatever you want. But just for fun, let’s go on and see what else we can find.
At its heart, the locality check is a pretty simple thing. Let’s assume the transmitter and receiver are both trusted, and have successfully established a session key K_d by running the AKE protocol above. The locality check is designed to ensure that the receiver is nearby — specifically, that it can provide a cryptographic response to a challenge, and can do it in < 7 milliseconds. This is a short enough time that it should prevent people from piping HDCP over a WAN connection.
(Why anyone would want to do this is a mystery to me.)
|Simple version of the locality check. K_d is a shared key and R_rx is a receiver nonce.|
But here’s a wrinkle: some devices are slow. Consider that the 7 milliseconds must the round-trip communication time, as well as the time required to compute the HMAC. There is a very real possibility that some slower, embedded devices might be not be able to respond in time.
Will HDCP provide a second, optional protocol to deal with those devices? You bet it will.
|‘Precomputed’ version of the protocol.|
This is nearly the same protocol, with a few small differences. Notably, the transmitter gives the receiver all the time it wants to compute the HMAC. The timer doesn’t start until the receiver says it’s ready.
Of course, there has to be something keeping the RTT under 7ms. In this case the idea is to keep the receiver from speaking until it’s received some authenticator from the transmitter. This consists of the least significant 128-bits of the expected HMAC result (L’), which is computed in the same way as above. The receiver won’t speak until it sees those bits. Then it‘ll it kick back its own response, which consists of the most-significant 128 bits of the same value.
Ok, so here we have a protocol that’s much more complicated. But considered its own, this one looks pretty ok by me.
But here’s a funny question: what if we try running both protocols at once?
No, I’m not being ridiculous. You see, it turns out that the receiver and transmitter get to negotiate which protocol they support. By default they run the ‘simple’ protocol above. If both support the pre-computed version, they must indicate this in the AKE_Transmitter_Info and AKE_Receiver_Info messages sent during the handshake.
This leads to the following conjecture: what if, as a man-in-the-middle attacker, we can convince the transmitter to run the ‘pre-computed’ protocol. And at the same time, convince the receiver to run the ‘simple’ one? Recall that none of the protocol flags (transmitted during the AKE) are authenticated. We might be able to trick both sides into seeing a different view of the other’s capabilities.
Here’s the setup: we have a receiver running in China, and a transmitter located in New York. We’re a man-in-the-middle sitting next to the transmitter. We want to convince the transmitter that the receiver is close — close enough to be on a LAN, for example. Consider the following attack:
- Modify the message flags so that the transmitter thinks we’re running the pre-computed protocol. Since it thinks we’re running the pre-computed protocol, it will hand us R_n and then give us all the time in the world to do our pre-computation.
- Now convince the receiver to run the ‘simple’ protocol. Send R_n to it, and wait for the receiver to send back the HMAC result (L’).
- Take a long bath, mow the lawn. Watch Season 1 of Game of Thrones.
- At your leisure, send the RTT_READY message to the transmitter, which has been politely waiting for the receiver to finish the pre-computation
- The transmitter will now send us some bits. Immediately send it back the most significant bits of the value L’, which we got in step (2).
- Send video to China.
Now this attack may not always work — it hinges on whether we can convince the two parties to run different protocols. Still, this is a great teaching example in that it illustrates a key problem in cryptographic protocol design: parties may not share the same view of what’s going on.
A protocol designer’s most important job is to ensure that such disagreements can never happen. The best way to do this is to ensure that there’s only one view to be had — in other words, dispense with all the options and write a single clear protocol. But if you must have options, make sure that the protocol only succeeds if both sides agree on what those options are. This is usually accomplished by authenticating the negotiation messages, but even this can be a hard, hard problem.Compared to the importance of learning those lessons, actually breaking localization is pretty trivial. It’s a stupid feature anyway.
This has been a long post. To the readers I have left at this point: thanks for sticking it out.The only remaining thing I’d like to say is that this post is not intended to judge HDCPv2, or to make it look bad. It may or it may not be a good protocol, depending on whether I’ve understood the specification properly and depending on whether the above flaws make it into real devices. Which, hopefully they won’t now.
What I’ve been trying to do is teach a basic lesson: protocols are hard. They can fail in ruinous, subtle, unexpected, exciting ways. The best cryptographers — working with BAN logic analyzers and security proofs — still make mistakes. If you don’t have those tools, steer clear.
The best ‘fix’ for the problem is to recognize how dangerous protocols can be,and to avoid designing your own. If you absolutely must do so, please try to make yours as simple as possible. Too many people fail to grok this lesson, and the result is, well, HDCPv2.
Update 8/27: As I mentioned above, DCP has released a new version of the specification. Version 2.2 includes several updates: it changes the encryption of Km to incorporate both the Transmitter and Receiver nonces. It also modifies the locality check to patch the bug described above. Both of these changes appear to mitigate the bugs above, at least in new devices.