Friday, January 25, 2013

In defense of Provable Security

It's been a long time with no blogging, mostly thanks to travel and deadlines. In fact I'm just coming back from a workshop in Tenerife, where I learned a lot. Some of which I can't write about yet, but am really looking forward to sharing with you soon.

During the workshop I had some time to talk to Dan Bernstein (djb), and to hear his views on the relevance of provable security. This is a nice coincidence, since I notice that Dan's slides have been making the rounds on Twitter -- to the general approval of some who, I suspect, agree with Dan because they think that security proofs are hard.

The problem here is that this isn't what Dan's saying. Part of the issue is that his presentation is short, so it's easy to misinterpret his position as a call to just start designing cryptosystems the way we design software. That's not right, or if it is: get ready for a lot of broken crypto.

This post is my attempt to explain what Dan's saying, and then (hopefully) convince you he's not recommending the crazy things above.

There's no such thing as a "security proof"

Dan's first point is that we're using the wrong nomenclature. The term 'security proof' is misleading in that it gives you the impression that a scheme is, well... provably secure. There aren't many schemes that can make this claim (aside from the One-Time Pad). Most security proofs don't say this, and that can lead to misunderstandings.

The proofs that we see in day-to-day life are more accurately referred to as security reductions. These take something (like a cryptographic scheme) and reduce its security to the hardness of some other problem -- typically a mathematical problem, but sometimes even another cryptosystem.

A classic example of this is something like the RSA-PSS signature, which is unforgeable if the RSA problem is hard, or Chaum-van Heijst-Pfitzmann hashing, which reduces to the hardness of the Discrete Logarithm problem. But there are more complex examples like block cipher modes of operation, which can often be reduced to the (PRP) security of a block cipher.

So the point here is that these proofs don't actually prove security -- since the RSA problem or Discrete Log or block cipher could still be broken. What they do is allow us to generalize: instead of analyzing many different schemes, we can focus our attention on one or a small number of hard problems. In other words, it's a different -- and probably much better -- way to allocate our (limited) cryptanalytic effort.
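To make the idea of a reduction concrete, here is a minimal Python sketch in the spirit of the Chaum-van Heijst-Pfitzmann hash. It is a simplified variant that works in a prime-order subgroup, and every name and parameter in it (P, Q, A, B, SECRET, find_collision) is an assumption chosen for illustration. The numbers are far too small to be secure. The point is only that anyone who can find a collision can also compute a discrete logarithm.

```python
# Toy parameters (assumptions for illustration; far too small to be secure).
P = 2039          # safe prime, P = 2*Q + 1
Q = 1019          # prime order of the subgroup of squares mod P
A = 4             # generates the order-Q subgroup
SECRET = 777      # the discrete log that is supposed to stay hidden
B = pow(A, SECRET, P)


def cvhp_hash(x1, x2):
    """Compress two exponents in [0, Q) into one group element."""
    return (pow(A, x1, P) * pow(B, x2, P)) % P


def find_collision():
    """Stand-in collision finder; brute force works only because Q is tiny."""
    seen = {}
    for x1 in range(Q):
        for x2 in range(Q):
            digest = cvhp_hash(x1, x2)
            if digest in seen:
                return seen[digest], (x1, x2)
            seen[digest] = (x1, x2)
    raise RuntimeError("no collision found")


def dlog_from_collision(m, n):
    """Turn any collision m != n into log_A(B) mod Q."""
    (x1, x2), (y1, y2) = m, n
    # A^x1 * B^x2 == A^y1 * B^y2  implies  A^(x1 - y1) == B^(y2 - x2),
    # so log_A(B) = (x1 - y1) / (y2 - x2) mod Q.
    return ((x1 - y1) * pow((y2 - x2) % Q, -1, Q)) % Q


m, n = find_collision()
recovered = dlog_from_collision(m, n)
print(recovered == SECRET)
```

The reduction itself is dlog_from_collision: it never looks at how the collision was found. Swap in any cleverer collision finder and you get a discrete log solver for free, which is exactly why the hash is only as strong as the underlying problem.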

But we don't study those problems well, and when we do, we break them

Dan argues that this approach is superficially appealing, but concretely it can be harmful. Take the Chaum et al. hash function listed above. Nobody should ever use this thing: it's disastrously slow and there's no solid evidence that it's truly more secure than something like SHA-3 or even SHA-3's lamest competitors.

And here (unfortunately) he's got some evidence on his side: we've been amazingly unsuccessful at cryptanalyzing complex new cipher/hash primitives like AES, BLAKE and Keccak, despite the fact that these functions don't have [real] security proofs. Where we make cryptanalytic progress, it's almost always on first-round competition proposals, or on truly ancient functions like MD5. Moreover, if you take a look at 'provably-secure' number theoretic systems from the same era, you'll find that they're equally broken -- thanks to bad assumptions about key and parameter sizes.

We've also gotten pretty good at chipping away at classic problems like the Discrete Logarithm. The charitable interpretation is that this is a feature, not a bug -- we're focusing cryptanalytic effort on those problems, and we're making progress, whereas nobody's giving enough attention to all these new ciphers. The less charitable interpretation is that the Discrete Logarithm problem is a bad problem to begin with. Maybe we're safer with unprovable schemes that we can't break than with provable schemes that seem to be slowly failing.

You need a cryptanalyst...

This is by far the fuzziest part (for me) of what Dan's saying. Dan argues that security proofs are a useful tool, but they're no substitute for human cryptanalysis. None of which I would argue with at all. But the question is: cryptanalysis of what?

The whole point of a security reduction is to reduce the amount of cryptanalysis we have to do. Instead of a separate signature and encryption scheme to analyze, we can design two schemes that both reduce to the RSA problem, then we can cryptanalyze that. Instead of analyzing a hundred different cipher modes, we can simply analyze one AES -- and know that OCB and GCM and CBC and CTR will all be secure (for appropriate definitions of 'secure').
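Here's a rough Python sketch of why this division of labor works. It is a CTR-style mode written against an abstract keyed function rather than any particular cipher. The names toy_prf and ctr_xor are hypothetical, and HMAC-SHA256 is only a stand-in assumption for the block cipher call (it is not AES, and this is not a vetted implementation). Notice that the mode touches the primitive through exactly one call, which is what lets a proof about the mode lean entirely on the primitive.

```python
import hashlib
import hmac

BLOCK_SIZE = 32  # output size of the stand-in primitive below


def toy_prf(key, counter_block):
    """Stand-in for a block cipher call; HMAC-SHA256 is an assumption, not AES."""
    return hmac.new(key, counter_block, hashlib.sha256).digest()


def ctr_xor(prf, key, nonce, data):
    """CTR-style mode over any keyed function; the same call encrypts and decrypts."""
    out = bytearray()
    for offset in range(0, len(data), BLOCK_SIZE):
        # Each block gets a unique input: the nonce followed by a block counter.
        counter_block = nonce + (offset // BLOCK_SIZE).to_bytes(8, "big")
        keystream = prf(key, counter_block)
        chunk = data[offset:offset + BLOCK_SIZE]
        out.extend(c ^ k for c, k in zip(chunk, keystream))
    return bytes(out)


key = b"k" * 32
nonce = b"n" * 8
message = b"one primitive to analyze, many modes built on top of it"
ciphertext = ctr_xor(toy_prf, key, nonce, message)
roundtrip = ctr_xor(toy_prf, key, nonce, ciphertext)
print(roundtrip == message)
```

Nothing in ctr_xor depends on how toy_prf works inside. If the primitive is indistinguishable from random, the keystream is too, and that is the only place a cryptanalyst needs to look (as long as a nonce is never reused under the same key).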

This is good, and it's why we should be using security proofs. Not to mislead people, but to help us better allocate our very scarce resources -- of smart people who can do this work (and haven't sold out to the NSA).

...because people make mistakes

One last point: errors in security proofs are pretty common, but this isn't quite what Dan is getting at. We both agree that this problem can be fixed, hopefully with the help of computer-aided proof techniques. Rather, he's concerned that security proofs only prove that something is secure within a given model. There are many examples of provably-secure schemes that admit attacks because those attacks were completely outside of that threat model.

As an example, Dan points to some older EC key agreement protocols that did not explicitly include group membership tests in their description. Briefly, these schemes are secure if the attacker submits valid elements of an elliptic curve group. But of course, a real life attacker might not. The result can be disastrously insecure.

So where's the problem here? Technically the proof is correct -- as long as the attacker submits group elements, everything's fine. What the protocol doesn't model is the fact that an attacker can cheat -- it just assumes honesty. Or as Dan puts it: 'the attack can't even be modeled in the language of the proof'.
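For the curious, this is roughly what the missing check looks like. The Python sketch below uses a toy curve whose parameters (P, A, B) are assumptions picked for illustration, and accept_peer_key is a hypothetical entry point rather than part of any real protocol. It only checks that a point is a properly encoded, on-curve point. A full group membership test may also need a subgroup order check when the curve has a cofactor, which I've left out here.

```python
# Toy curve y^2 = x^3 + A*x + B over GF(P); parameters are illustrative only.
P, A, B = 97, 2, 3


def is_valid_point(point):
    """Reject anything that is not an affine point on the toy curve."""
    if point is None:                      # the point at infinity is not acceptable input
        return False
    x, y = point
    if not (0 <= x < P and 0 <= y < P):    # coordinates must be reduced field elements
        return False
    return (y * y - (x ** 3 + A * x + B)) % P == 0


def accept_peer_key(point):
    """Hypothetical protocol entry point: validate before any secret-key math."""
    if not is_valid_point(point):
        raise ValueError("peer key is not on the curve")
    return point


print(is_valid_point((3, 6)))   # on the curve: 6^2 = 36 = 3^3 + 2*3 + 3
print(is_valid_point((3, 7)))   # off the curve
```

A proof that quantifies only over valid group elements silently assumes that accept_peer_key (or something like it) runs first. A protocol description that leaves it out hands the attacker inputs the proof never considered.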

What Dan's not saying

The one thing you should not take away from this discussion is the idea that security proofs have no value. What Dan is saying is that security proofs are one element of the design process, but not 100% of it. And I can live with that.

The risk is that some will see Dan's talk as a justification for using goofy, unprovable protocols like PKCS#1 v1.5 signatures or the SRP password protocol. It's not. We have better protocols that are just as well analyzed, but actually have a justification behind them.

Put it this way: if you have a choice between driving on a suspension bridge that was designed using scientific engineering techniques, and one that simply hasn't fallen down yet, which one are you going to take? Me, I'll take the scientific techniques. But I admit that scientifically-designed bridges sometimes do fall down.

In conclusion

While I've done my best to sum up Dan's position, what I've written above is probably still a bit inaccurate. In fact, it's entirely possible that I've just constructed a 'strawman djb' to argue with here. If so, please don't blame me -- it's a whole lot easier to argue with a straw djb than the real thing.

10 comments:

  1. "Some of which I can't write about yet, but am really looking forward to sharing with you soon."

    Hopefully before 2020!

  2. An interesting read on this... http://www.daimi.au.dk/~ivan/positionpaper.pdf

  3. The 'older' schemes you refer to as missing point validation include, most prominently, HMQV from 2005 by Krawczyk. 2005 is hardly the cryptographic stone age.

    MQV is from the 90's but does properly include point validation -- which at the same time gets you into Certicom's patent mess; HMQV was marketed as an improvement over MQV.

  4. A related question that is equally fascinating: How much real security is in AES-128, in the sense that even an infinitely intelligent cryptanalyst cannot crack a key with less than a certain amount of CPU time?

    My guess: Less than 128 bits of security, but more than zero. Modern algorithms are so hard to break that there *must* be some kind of inherent, hard security in them.

    The question is: Can we determine its amount and can we prove it is there?

  5. "A classic example of this is something like the RSA-PSS signature, which is unforgeable if the RSA problem is hard"

    Having investigated RSA-PSS in detail, I must nitpick here: RSA-PSS is unforgeable if the RSA problem is hard AND if you use the random oracle model. That means more or less your hash function has to behave like a perfect hash function. The second is crucial, if your hash function is broken you will still have problems.

  6. One impression I got is that systems that are designed to avoid proofs in the random oracle model tend to be silly. I usually prefer systems with a ROM proof over systems with a standard model proof, despite the problems of the ROM, since their construction is usually much more natural.

  7. Now I'm curious: what's wrong with SRP? I had assumed that SRP (or at least SRP-6a) was provably secure under some vaguely reasonable assumption.

  8. "Security" is a relative term. Security proofs don't prove absolute security, they determine a level of security based on defined assumptions and probability. It's correct to say they are reductions, but I don't think they should be dismissed as proofs of security just because they don't prove invulnerability.

  9. Hi Matt,

    Dan's slides and your post point to a phenomenon I've noticed where "provable security" does indeed lead to less secure schemes. I think the situation is more nuanced than how it's presented in your post or Dan's talk, and becomes even more nuanced when one examines the meaning of concrete security tightness in the public-key versus symmetric-key setting. I can give some examples:

    - Modes of encryption: Here security reductions have proven themselves to be tremendously useful. When attacks against proven-secure symmetric modes crop up, we can always (as far as I'm aware) pinpoint where the model's assumptions were violated and then address the issue. It is very hard to argue that we should abandon security reductions here.

    - Any use of public-key primitives for a symmetric key objective: The hash function in Dan's talk is an example of this. In this case a security reduction to a mathematically structured problem rather than an unstructured "symmetric-key" problem is almost always a sign of weakness, and in particular a sign of subexponential attacks.

    - Here is a more interesting and specific example that I like to trot out when discussing concrete security. Suppose you are going to deploy a DDH-based encryption scheme with many thousands of users. To assess the concrete security of this scheme, you could choose to analyze the concrete security of your scheme in a version of the IND-CCA model that includes many users and challenge ciphertexts, instead of just one user as we normally do. Of course, we have the asymptotic result that security for one user is enough, but by analyzing the many-user case we can hope to save the hybrid argument factor for concrete security.

    Now consider the following choice: You can either configure your DDH-based scheme to have every user choose their own DDH modulus and generator, or you can have the DDH modulus and generator specified in the scheme itself, so every user will be working in the same group.

    The latter case enables (depending on the scheme) a *tight* reduction to the DDH assumption, while the former does not. This suggests that the latter is "more secure", but that is (in my opinion) absurd: If everyone is working in the same group, then any cryptanalytic effort spent to analyze that group will be useful in attacking every user. On the other hand, the former version avoids this issue by assigning each user a different group. (To be clear, I don't imagine this being the source of a practical attack; but hopefully my theoretical point is clear.)

    I've never been able to unwind this issue into a research topic worth writing a paper about, but it points to a strange and interesting property of the practice of choosing assumptions and giving reductions: The presence of a reduction in some model to a specific assumption may suggest that an attack is possible in some other model. (Or something like that; this is where things start to get fuzzy.) And it has to do with the relationship between assumptions: One could partially resolve the issue in my example by looking at a "many group DDH" assumption and its relation to "regular" DDH, as well as by refining the multi-user model.

    More generally, I've found "tightness to a public-key assumption" to be intuitively less important than "tightness to PRP security". In the public-key setting tightness doesn't seem to rule out relevant attacks, while in the symmetric-key setting we do rule out issues like birthday meet-in-the-middle attacks. I'd love to see realistic examples where public-key tightness is important.

    Enjoying your blogging as always!
    -David Cash

    Replies
    1. I just noticed this comment recently. There's a lot to comment on here, but a couple of things:

      1. w.r.t. Dan's hash function example, is the issue here simply that we're used to building on well-studied assumptions that /happen/ to have sub-exp attacks? And other assumptions would be better? And more to the point, are you better with a well-studied assumption with subexp (but superpolynomial) attack than with something that could break at any moment?

      2. Your comments about reductions are interesting. I think to some extent this is a reflection of the fact that we like 'simple, elegant' assumptions, then get obsessed with finding a tight reduction to them.

      I was going to pick on exactly the opposite problem -- which is that we become obsessed with particular assumptions, then get happy when we find a reduction to them //no matter how loose// the reduction is. OAEP comes to mind.
