Bram Cohen has responded to my criticism of his comments on Practical Cryptography. For those who didn’t read my original post, here are my comments in a nutshell:
- No, you shouldn’t use RSA with public exponent 2 (i.e., Rabin-Williams). You’re way too likely to screw it up.
- No, you should not authenticate (MAC) before you encrypt. You should encrypt first, then you should add a MAC on the ciphertext. Or better yet, use a standard authenticated encryption scheme.
- You’re almost always better off using a standard encryption scheme that’s already been implemented and tested, no matter how much ‘better’ an alternative might sound.
Bram: Matthew, first of all, you’re being extremely dismissive of any practical concerns here. Practical crypto deployments bottleneck on the server doing modular exponentiations, and if you use the standard exponent over plain old 2, then you need *five times* as much computing power, which frequently translates into five times as many real machines consuming five times the power in five times as large of a colo space.
My response? Fair enough. Rabin encryption is a little bit faster. If you're building an application that needs to perform enormous amounts of encryption, there's a colorable argument that Rabin might be a good choice. But as Bram points out in his own response, if that's the case, you're probably better off using elliptic curve cryptography.
In general, I’m leery of efficiency arguments. Standard RSA encryption is extremely fast. The 0x10001 exponent was chosen because its low Hamming weight minimizes the number of multiplications required in the encryption process. According to this benchmark, taken on an Intel Core 2 1.83 GHz processor, 1024-bit encryption takes 0.08 milliseconds. That’s without hardware acceleration.
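To make the cost model concrete, here's a tiny sketch (mine, not from the post) counting the operations a basic left-to-right square-and-multiply exponentiation performs for a given exponent. It shows why 0x10001 = 2^16 + 1 is cheap: sixteen squarings plus a single extra multiplication.

```python
def modexp_op_count(e: int) -> tuple[int, int]:
    """Return (squarings, multiplications) performed by basic
    left-to-right square-and-multiply with exponent e:
    (bit length - 1) squarings and (Hamming weight - 1) multiplies."""
    return e.bit_length() - 1, bin(e).count("1") - 1

print(modexp_op_count(65537))  # (16, 1)
print(modexp_op_count(3))      # (1, 1)
```

Exponent 3, by the same count, needs only one squaring and one multiply, which is where the performance argument for it comes from.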
If you’re building a massive server farm and performance is an issue, perhaps there’s reason to worry about the encryption time. If you’re doing something with a very limited processor, and you really know what you’re doing, perhaps there’s some reason to be concerned about performance.
But as a general recommendation? No way. You’re just encouraging people to adopt schemes they don’t really understand, in order to realize a minor performance benefit they won’t notice.
Bram: Doing encrypt-then-authenticate looks a lot like authenticate-then-encrypt except that the authentication is plaintext, which obviously opens it to attacks which involve guessing the plaintext and verifying it because the authentication is right there, and you can extend those to modifying the ciphertext and replacing the authentication. I simply do not believe that making yourself susceptible to those problems is a win.
Ok, I admit that I legitimately don’t understand Bram’s argument here. If anything, this sounds like the argument for encrypt-then-authenticate.
For the benefit of people who are just joining the discussion, there are basically two points of view. Bram says you should authenticate your plaintext (using a MAC) then you should encrypt it. I (and many other cryptographers) say you should encrypt first, then authenticate the ciphertext.
In a nutshell, the advantage of my approach is what happens at decryption time: when a ciphertext comes in, you first verify a MAC to ensure that the ciphertext is correct and hasn’t been tampered with. You do this with a key that’s not related to the encryption key. If the MAC verification fails, you stop. You simply don’t decrypt.
This ensures that no information about the plaintext ever leaks back to the attacker. If the attacker is tampering with ciphertexts (e.g., in an attempt to implement a padding oracle attack), he gets nothing. This approach has all kinds of nice advantages, one of which is its modularity.
You see, provided that you’re MACing the ciphertext, it doesn’t matter how the encryption scheme works. You don’t have to worry about it using a funny padding scheme. You don’t have to worry about any encoding or compression that might take place during the encryption process. If someone upgrades your encryption to a different mode of operation, you’re still fine — at least as far as active attacks are concerned.
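To make the ordering concrete, here's a minimal sketch of the encrypt-then-authenticate pattern. The keystream function is a deliberate toy stand-in for a real cipher (in practice, use AES-GCM or another vetted authenticated-encryption scheme); the names are mine, and the only point is the order of operations in `decrypt`: verify first, with an independent key, and never decrypt on failure.

```python
import hashlib
import hmac
import os

def _toy_keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    # Toy keystream for illustration only -- NOT a real cipher.
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def encrypt_then_mac(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    nonce = os.urandom(16)
    ct = bytes(p ^ k for p, k in
               zip(plaintext, _toy_keystream(enc_key, nonce, len(plaintext))))
    # MAC the *ciphertext* (and nonce), under a separate key.
    tag = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag

def decrypt(enc_key: bytes, mac_key: bytes, msg: bytes) -> bytes:
    nonce, ct, tag = msg[:16], msg[16:-32], msg[-32:]
    expected = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    # Verify the MAC before touching the ciphertext, in constant time.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bad MAC")  # stop here: we simply don't decrypt
    return bytes(c ^ k for c, k in
                 zip(ct, _toy_keystream(enc_key, nonce, len(ct))))
```

Because the receiver rejects before any decryption happens, a tampered ciphertext yields nothing but a uniform failure, regardless of padding, encoding, or mode.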
Even better, the encrypt-then-authenticate approach can be proven IND-CCA secure (semantically secure against adaptive chosen ciphertext attacks). A proof can be found in this paper.
Bram: The other issue with encrypt-then-authenticate is that it makes the code more complicated and just plain uglier, which directly makes you more susceptible to the #1 cause of actual breaks, implementation error.
Ok, here again I disagree. Let’s take a look at the most common example of authenticate-then-encrypt that I can think of: CBC-mode encryption as defined in the TLS specification. This has been standard going back to SSL 3.0, and it’s still used in current versions due to backwards-compatibility concerns.
So with that in mind, see this note included in the TLS 1.2 spec:
Implementation note: Canvel et al. [CBCTIME] have demonstrated a
timing attack on CBC padding based on the time required to compute
the MAC. In order to defend against this attack, implementations
MUST ensure that record processing time is essentially the same
whether or not the padding is correct. In general, the best way to
do this is to compute the MAC even if the padding is incorrect, and
only then reject the packet. For instance, if the pad appears to be
incorrect, the implementation might assume a zero-length pad and then
compute the MAC. This leaves a small timing channel, since MAC
performance depends to some extent on the size of the data fragment,
but it is not believed to be large enough to be exploitable, due to
the large block size of existing MACs and the small size of the
timing signal.
Allow me to translate this to plain English: “because we chose to authenticate our plaintext, rather than our ciphertext, we had to (a) MAC both the message and the padding, and (b) even though we did that, it turned out that attackers could still hoover out a little bit of information based on the time that it takes to check the ciphertext (basically, implementations would immediately throw an error after seeing invalid padding, and this gave away the game). So implementers now have to do all this extra nonsense if they want their implementation to be secure.”
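Here's roughly what that defensive dance looks like in code — a simplified sketch of my own of a MAC-then-encrypt receiver after CBC decryption (real TLS also MACs the sequence number and record header, which I've omitted):

```python
import hashlib
import hmac

MAC_LEN = 32  # HMAC-SHA256 output size

def check_padded_record(mac_key: bytes, record: bytes) -> bool:
    """Simplified TLS-CBC-style receiver: record = data || MAC || padding,
    where the padding is (pad_len + 1) trailing bytes that all equal pad_len."""
    pad_len = record[-1]
    padding_ok = (
        pad_len + 1 + MAC_LEN <= len(record)
        and all(b == pad_len for b in record[-(pad_len + 1):])
    )
    # Per the spec note: if the padding looks wrong, assume a zero-length
    # pad and compute the MAC anyway, so valid and invalid padding take
    # (roughly) the same amount of time.
    strip = pad_len + 1 if padding_ok else 1
    body = record[:-(strip + MAC_LEN)]
    tag = record[-(strip + MAC_LEN):-strip]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    mac_ok = hmac.compare_digest(tag, expected)
    return padding_ok and mac_ok
```

Note how the code has to go out of its way to do *useless work* on a bad record, purely to avoid leaking which check failed.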
None of this is necessary in encrypt-then-authenticate. It’s simpler. Really!
I haven’t touched on every single point that Bram makes — you can see his argument in the original posts if you want.
The only final point I would like to make is that this stuff matters. If you get it wrong, you can attack an implementation. And too many people — professionals, even — get it wrong.
In conclusion: I certainly didn’t mean to bash Bram, but I’m happy to bash the advice itself. We can do better.
4 thoughts on “Bram Cohen corrects me?”
Some points of clarification: RSA with an exponent of 65537 is *exactly* five times as slow as Rabin-Williams, since it requires five modular multiplies, while Rabin-Williams requires one. Using an exponent of 3 requires two modular multiplies, and a very strong case can be made that it's more confidence-inspiring than 65537, as explained in this paper – http://www.shoup.net/papers/oaep.pdf
It turns out that the amount of CPU used by curve25519 is pretty much the same as is used by 1024-bit RSA with an exponent of 65537. Come to think of it, that probably isn't a coincidence. I should ask Dan Bernstein about the reasoning behind it.
I've thus far had to actually work on crypto for two protocols, one of which has been tabled for a while and one of which is in the process of getting deployed. Both of these have rather extreme performance requirements, so the criteria probably don't generalize, but my plan has been to use Rabin-Williams for the first one and I'm using ed25519 for the second one. Mulling it over now, I might be inclined to use curve25519 for the first one, for the simple reason that the protocol as a whole allows the public key operations to be spread across several machines, even though there's a database machine which is very inconvenient to make be more than one. For my second protocol I'm making very extensive use of hash trees, to keep the CPU usage under control. Like I said, extreme circumstances, criteria don't generalize…
One note on curve25519: I believe that curve has a 128-bit security level, while 1024-bit RSA (or Rabin) clocks in at only around 80 bits. So the performance comparison between these two isn't really fair.
Getting to 128-bit RSA/Rabin security requires about a 3,072-bit modulus (based on the state of the art in 2003, anyway). I'll leave it to you to find the benchmarks, but it'll be significantly more costly to use a modulus of that size, even if you're using exponent 2.
If you really want 128-bit security, you can use curve25519. If 80 bit security is all you want, you can probably use a different curve — something with a parameter size closer to 160 bits. Keep in mind that I'm giving you this advice off the top of my head. You should probably look at the NIST docs for the exact security levels and performance.
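For reference, the commonly cited equivalences (approximate figures, per NIST SP 800-57 Part 1's comparable-strengths table; ECC sizes here are field sizes) can be tabulated:

```python
# Symmetric-equivalent security level -> approximate parameter sizes,
# per NIST SP 800-57 Part 1 (comparable algorithm strengths).
SECURITY_LEVELS = {
    80:  {"rsa_modulus_bits": 1024,  "ecc_field_bits": 160},
    112: {"rsa_modulus_bits": 2048,  "ecc_field_bits": 224},
    128: {"rsa_modulus_bits": 3072,  "ecc_field_bits": 256},
    192: {"rsa_modulus_bits": 7680,  "ecc_field_bits": 384},
    256: {"rsa_modulus_bits": 15360, "ecc_field_bits": 512},
}
```

The asymmetry is the point: to move from 80-bit to 128-bit security, ECC grows from ~160 to ~256 bits, while RSA/Rabin balloons from 1024 to 3072.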
Yes, the speed comparison isn't really apples-to-apples. I'm doing the comparison though because there isn't a readily available ECC library which offers less than 128-bit security, and I'm comparing things which provide 'enough' security and where there's an off-the-shelf implementation I can just use. What I'd really like is a curve over a binary Galois field, because those would be blazingly fast even on cell phones, if only the hardware supported carry-less multiplies.
The place where I'm still really not with you is on RSA exponents. I just don't see what benefit 65537 has over 3, even in theory, and it takes 2.5 times as much CPU.
There are a bunch of attacks on low-exponent RSA that all have to do with bad or non-existent padding. That includes bad padding verification, particularly for signatures. People tend to screw padding up for some reason, but it's really a problem when they use e=3.
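One of the classic failures is easy to demonstrate in a few lines: with e = 3 and no padding at all, a short message cubed never wraps the modulus, so "decryption" is just an integer cube root. A toy sketch (the modulus below is a stand-in; only its size matters for the demonstration):

```python
def icbrt(n: int) -> int:
    """Integer cube root by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

N = 2 ** 2048 + 1  # stand-in for a 2048-bit modulus; not a real RSA key
m = int.from_bytes(b"attack at dawn", "big")  # short, unpadded message
c = pow(m, 3, N)   # m**3 < N, so no modular reduction actually occurs
recovered = icbrt(c)
assert recovered == m  # attacker recovers the plaintext with no key at all
```

Proper randomized padding (e.g., OAEP) makes the effective message as large as the modulus, which is exactly what closes this hole.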
If you're really strapped for cycles and you know what you're doing, then — to the best of my knowledge — e=3 is fine.