Friday, February 15, 2013

Why I hate CBC-MAC

If you're like most people, you don't have a strong opinion about CBC-MAC. In fact, if you're like most people, you don't have a strong opinion about any crypto primitive.

This is healthy. Keep up the good work.

I'm not most people. I've spent the last week thinking about and dealing with CBC-MAC -- or more specifically, code that uses it in various contexts -- and I need to share with you how much I despise this little algorithm. And beg you never to use it.

Oh yes, I know the temptation. You have this nice block cipher just sitting around -- maybe you're encrypting something -- and you've heard how serious this whole message authentication thing is. Maybe you've even thought about using one of those fancy authenticated encryption modes, but found them to be too exotic and complicated.

Then it comes to you that all your problems would be solved if you just used CBC-MAC. This is too bad, because now your troubles are just beginning.

Now a quick note: there's nothing really wrong with CBC-MAC, when implemented correctly. And it's not even that hard to implement properly. The problem is that many people who use CBC-MAC (rather than HMAC or a proper AEAD mode) seem incapable of actually doing this. They get it wrong in hilariously, embarassingly, stupid, complicated ways.

But of course you wanted examples. Ok, let's give some.

1. Your implementation doesn't handle variable-length messages.

A quick reminder. CBC-MAC is very similar to the classic CBC mode for encryption, with a few major differences. First, the Initialization Vector (IV) is a fixed value, usually zero. Second, CBC-MAC only outputs the last block of the ciphertext -- this single value forms the MAC.


Many dumb implementations stop here. And that leads to big problems.

Most notably, if your system allows for variable-length messages -- as it should -- there is a simple attack that allows you to forge new messages. First, get a MAC T on a message M1. Now XOR the tag T into the first block of some arbitrary second message M2, and get a MAC on the modified version of M2.

The resulting tag T' turns out to be a valid MAC for the combined message (M1 || M2). This is a valid forgery, and in some cases can actually be useful.

The standard fix to prepend the message length to the first block of the message before MACing it. But a surprisingly large number of (dumb) implementations skip this extra step. And many CBC-MAC implementations are dumb implementations.

2. Your implementation uses a random Initialization Vector.

If CBC-MAC with a fixed IV is great, surely CBC-MAC with a random IV must be super-great. But no, it isn't.

Using a random (or variable IV) is bad for the simple reason that verifying a CBC-MAC requires you to know the IV, and to know the IV you probably need to read it from somewhere. Typically this means the same untrusted place where you were storing your message.

If the attacker can change the CBC-MAC IV, they can also change the first block of the MACed message in an equivalent manner. This works because the first step of CBC-MAC is to XOR the IV with the message. There are all kinds of silly variants of this problem, and all of them hurt.

3. You've used the same key for MAC and encryption.

A general rule in cryptography is that you shouldn't use the same key for two different cryptographic primitives -- encryption and signature, for example. Or encryption and MAC.

Some people figure that rules were made to be broken.

Note that shared keys can actually be ok, in some cases. Combined modes like CCM (short for CTR + CBC-MAC) actually do use the same key for both operations. However, these modes do it in a very careful, thoughtful manner. Your garden-variety implementation doesn't.

One particularly ugly pattern I've seen is to use (dumb) CBC-MAC on a plaintext, then to encrypt said plaintext in CTR mode using some initial counter (C). This is insecure for a bunch of reasons, but specifically because I might be able to completely decrypt your ciphertext.

To do this, I simply ask you to encrypt a series of small files corresponding to the counter values C, C+1, etc. of the ciphertext I want to attack. The CBC-MAC of each of these files lets me recreate the CTR-mode keystream I need to decrypt the original ciphertext. Now I have your message.


4. You've used CBC-MAC as a hash function.

This one isn't really a problem with CBC-MAC, but it does crop up. In fact, it happened recently to the file sharing site Mega.

To make a long story short: cryptographic hash functions are public functions (i.e., no secret key) that have the property of collision-resistance (it's hard to find two messages with the same hash). MACs are keyed functions that (typically) provide message unforgeability -- a very different property. Moreover, they guarantee this only when the key is secret.

If you attempt to use CBC-MAC with a non-secret key, it becomes a very bad candidate for anything. In fact, you can trivially find useful collisions in the output, something that's very bad if you're using it to authenticate code. Which is what Mega was doing with it.

This isn't true of all MACs -- HMAC, for example, should retain the collision resistance of the underlying hash function even if the MAC key is compromised. This is yet another reason to prefer it for cases where cryptographic expertise is not a sure bet.

In summary

I'll repeat that none of these are really problems with CBC-MAC, which is a perfectly lovely algorithm if implemented and used correctly. The problems above only crop up when people try to whip it up themselves, without using a standard construction.

If you must write your own code, my recommendation is to use HMAC -- which is extremely hard to screw up. If you're doing combined encryption/MAC and you only have a block cipher, then look into the CCM spec, which is a patent free AEAD mode. This should address all of these problems and give you some nice test vectors too.

What you shouldn't do is code up some half-assed CBC-MAC thing and expect you'll be ok. The fact is, you probably won't.

16 comments:

  1. What about OCB? I know it bears the "patented" stigma, but there's still some civilization outside US (where it's not patented) and it's free to use in GPL projects.

    ReplyDelete
    Replies
    1. GCM is great. OCB is also great. I mentioned CCM only because it's based on CBC-MAC, I wouldn't actually recommend it as my first choice. For recommendations on AEAD schemes, see:

      http://blog.cryptographyengineering.com/2012/05/how-to-choose-authenticated-encryption.html

      Delete
    2. OCB's licensing has been expanded as of about a month ago

      http://www.cs.ucdavis.edu/~rogaway/ocb/license.htm

      It's now free to use for nearly everything except military applications.

      Delete
    3. OCB seems like a decent mode. But with the licensing requirement for military use remaining, it doesn't seem any more likely it will be recommended by NIST, which is likely to continue to hurt its adoption. People are of course free to do what they like with their inventions, but I wonder if we'd all be using OCB by now without the patent limitations.

      I like GCM's properties, but it seems like a hassle to implement. Requiring large, key dependent tables (or hardware support) for any type of reasonable speed doesn't exactly make it friendly to implementers.

      EAX seems like the best choice, IMO, for AEAD schemes. Definitely better than implementing CBC-MAC yourself alongside encryption or even trying to combine a better MAC and encryption by yourself.

      Delete
  2. I think that there is a more general lesson here. Many systems are extremely fragile in the face of usage error. The paradigmatic example of this is the One Time Pad. It goes from perfect security to utter crap with the slightest deviation from its requirements.

    This leads then to the question of whether we can design systems that are resilient to application developers' screw-ups. Or alternatively can we find ways to make it harder for application developers to screw up.

    We live in a word in which most application developers who use cryptography aren't going to learn why things work or break. And so somehow need to find ways to design for this world.

    Cheers,

    -j

    ReplyDelete
  3. @Pawel Krawczyk: Rogaway recently extended the OCB patent grant to cover all open-source software (not just GPL), non-military software, and noncommercial hardware implementations:

    http://www.cs.ucdavis.edu/~rogaway/ocb/license.htm

    It's definitely still not ideal, but much better than last year.

    ReplyDelete
    Replies
    1. Great news, I missed that! OCB is really nice and fast algorithm, and once I've used authenticated encryption I'd never bother to get back to implementing full block mode plus MAC scheme.

      Delete
  4. Keccak (SHA3) can be used as a drop-in for CBC-MAC and is much more secure. It avoids all these basic pitfalls that anyone having studied CBC for more than 10 minutes can find immediatly.

    ReplyDelete
  5. Why are the CBC logo from http://www.cbc.ca/ ??? haha.

    ReplyDelete
    Replies
    1. Because nobody has ever come up with a good logo for CBC encryption and I thought it was cute.

      Delete
    2. Not sure about US copyright laws but in Europe using that logo without permission is illegal. Better take it down. I don't want you to get burned.

      Delete
  6. Is the length-based construction provably secure (for variable-length messages)?

    ReplyDelete
  7. Can somebody just set up a website that contains safe instructions for all commonly required cryptographic primitives? They should be available on all major development platforms and easy to use. I feel that almost nobody is able to do crypto by himself (me included).

    The most common mistake I see is to use unauthenticated encryption in placed where the attacker can modify the cipher text (cookies, links, ...).

    ReplyDelete
    Replies
    1. At the "Crypto for 2020" event in Tenerife, Spain, this January, this was actually brought up (by Jean-Philippe Aumasson, I believe).

      I think the idea was great, and I'd like to see this become a reality. Unfortunately, I do not myself have the competence in the implementation area to contribute with (proper) content myself. I urge anyone else with the competences though, to follow up on this.

      In academia, modes e.g. CBC is only briefly touched upon, and I suspect a source of bad implementations could be that when CBC-MAC is touched upon, the pitfalls that follow along are not explained properly. This has consequences. So when it is taught, it should come with a clear disclaimer. Having a good source of reference, in the lines of what you mention, surely would help as well.

      Delete
  8. I can't say I have ever done any real crypto, but if I started, I would surely read some material on the quirks of these algorithms before I implemented one. It's very complicated.

    --------------
    Bright Heritage Associates

    ReplyDelete
  9. Can someone explain why releasing intermediate blocks in CBC MAC will result in an insecure message, even in fixed length message

    ReplyDelete