Intuition for the Security Proof of F-O

The basic idea of the F-O proof is to show via a reduction argument that if there exists any algorithm (attacker) that wins the IND-CCA2 security game against some scheme based on the F-O transform (with probability significantly greater than 1/2), then there must be some flaw in the underlying IND-CPA encryption scheme that the transform was built on.

(The following is a rough sketch done from memory and likely contains some flaws. I really shouldn’t even write it, and you shouldn’t take it as more than a rough intuition. I welcome your comments.)

The IND-CPA game is a lot like the IND-CCA2 game, but it lacks a couple of steps:

  1. I generate an encryption keypair for a public-key scheme and give you the public key.
  2. You can send me (sequentially and adaptively) many ciphertexts, which I will decrypt with my secret key. I’ll give you the result of each decryption.
  3. Eventually you’ll send me a pair of messages (of equal length) M_0, M_1 and I’ll pick a bit b at random, and return to you the encryption of M_b, which I will denote as C^* \leftarrow {\sf Encrypt}(pk, M_b).
  4. You’ll repeat step (2), sending me ciphertexts to decrypt. If you send me C^* I’ll reject your attempt. But I’ll decrypt any other ciphertext you send me, even if it’s only slightly different from C^*.
  5. You (the attacker) will output your guess b'. They “win” the game if b'=b.
  6. We say a scheme is IND-CPA secure if the attacker wins with probability “not much greater” than 1/2.

So in short, our attacker is going to play the IND-CCA2 game with us, and we’re going to play the IND-CPA game with some other challenger. When our challenger gives us a public key, we’ll give it to our attacker. When the attacker gives us a pair of messages M_0, M_1 we’ll ask our own IND-CPA challenger to encrypt one of a pair of messages. When the attacker makes a guess b' we’ll somehow use that to figure out which message our IND-CPA challenger encrypted. All we have to do is answer our attacker’s queries in the middle part of the game.

The tricky part here is that answering our attacker’s queries seems super hard. The attacker is going to send us ciphertexts to decrypt, yet we won’t have the decryption key (because the IND-CPA game doesn’t give it to us). This is great in one sense, since if we’re able to pull this nonsense off, it guarantees that we can’t possibly be leaking useful information about that key.

On the other hand, it’s kind of bonkers. How the heck do we decrypt stuff without a decryption key?

The trick that cuts us out of this gordian knot is that proof is that our proof will be conducted in the random oracle model (ROM). The ROM is a special heuristic in which hash functions (like the H_1, H_2 we use in our scheme) aren’t really implemented using code that an attacker can run. Instead, we treat these hash functions as random functions, and implement them via “oracles” that the attacker has to call outside of herself when she wants to obtain the hash of some string.

Yes, this is completely nuts. But in practice, this lets us “cheat” quite a bit when conducting the proof. That’s because we can take any given attack algorithm and “plug into” its ports where it sends out requests to compute the functions H_1, H_2. That means we’ll see every single value it tries to hash (super useful!) and — even better — we can mess with the results it gets back. The latter trick is called “programming”.

The key observation we’ll use in this proof is the following: anytime the attacker tries to encrypt a ciphertext C = (C_1, C_2) under our public key using the encryption algorithm of our scheme, she will have to evaluate both H_1(R \| M) and H_2(R) if they want to do it correctly. If we can intercept these hash calls, we’ll learn M, R and r (and hence the values C_1, C_2 which are formed from those values). That means we can always answer a decryption query if she ever asks us to decrypt the resulting ciphertext. That’s nice.

Similarly, if we ever see a ciphertext that was not obviously formed from the results of the hash function calls, then we can simply argue that this ciphertext will not pass the checks in our decryption scheme. That is, if we had decrypted it properly using the secret key (which we don’t have) then it would almost certainly have produced an Error result. So we can just output Error in these cases, and we’ll almost certainly be right (proving this is obviously a bit more work — this is just an intuition! But maybe you can see how hard it would be for an attacker to guess the right value without calling the hash functions.)

And now we get to the hardest part, which is producing a challenge ciphertext C^* for the attacker, and using the attacker’s guess b' to somehow help us break the IND-CPA scheme.

Our strategy here is going to seem a bit bizarre. I’ll walk through it first, then explain why it works:

  1. When the attacker hands us M_0, M_1, we’ll pick a random bit b, and encrypt M_b. But we won’t just do this using the normal encryption algorithm.
  2. Instead, we’ll pick a new pair of messages at random from the IND-CPA scheme’s message space: call them R_0, R_1. We’ll send these to our IND-CPA challenger. It will pick its own bit \bar{b} and send back the IND-CPA encryption of one of these messages — although we won’t know which. Call that C^*_1 \leftarrow {\sf Encrypt_{CPA}}(\pk, R_{\bar{b}}).
  3. Pick a random string X of the appropriate length and set C^*_2 \leftarrow X \oplus M_b. We will implicitly define X such that X = H_2(R_{\bar{b}}). We don’t actually know what the value of R_{\bar{b}} is, because we don’t know \bar{b}, so we’re implicitly programming the random oracle such that this unknown input will output X.
  4. We send C^* = (C^*_1, C^*_2) to the attacker.
  5. If the attacker every queries the hash function H_2 or H_1 on input R_{\bar{b}}, we know that she has learned \bar{b} and we output that as our guess for the IND-CPA game.
  6. If the attacker never queries the hash function H_2 on such an input, then she should have no way of learning any information about X, which means that the message M_b is perfectly hidden from her, and her advantage in winning the CCA game is 0.
  7. Thus, if there is an attacker that wins the CCA game with reasonable probability, then it must be the case that she queries H_2, H_1, and therefore our reduction will win the CPA game with non-negligible probability.

Phew.