How to prove false statements? (Part 3)

This is the third and penultimate post in a series about theoretical weaknesses in Fiat-Shamir as applied to proof systems. The first post is here, the second post is here, and you should probably read them.

Over the past two posts I’ve given a bit of background on four subjects: (1) interactive proof systems (for general programs/circuits), (2) the Fiat-Shamir heuristic, (3) random oracles. We’ve also talked about (4) recursive proofs, which will be important but are not immediately relevant.

With all that background we are finally almost ready to talk about the new result by Khovratovich, Rothblum and Soukhanov entitled “How to Prove False Statements: Practical Attacks on Fiat-Shamir” (which we will call KRS for short.)

Let’s get situated

The KRS result deals with a very common situation that we introduced in the previous post. Namely, that many new proving systems are first developed as interactive protocols (usually called public-coin protocols, or sometimes Sigma protocols.) These protocols fit into a template: the Prover first sends a commitment message, then interacts with some (honest) Verifier who “challenges” the Prover in various ways that are specific to the protocol; for each challenge, the Prover must respond correctly, either once or multiple times. The nature of the challenges can vary from scheme to scheme. For now we don’t care about the details: the commonality is that in the interactive setting the Verifier picks its challenges honestly, using real randomness.

In many cases, these protocols are then “flattened” down into a non-interactive proof using the Fiat-Shamir heuristic. The neat thing is that the Prover can run the interactive protocol all by itself, by running a deterministic “copy” of the Verifier. Specifically, the Prover will (as sketched in code after this list):

  1. Cryptographically hash its own Commitment message — along with some extra information, such as the input/output and “circuit” (or program, or statement.)1
  2. Use the resulting hash digest bits to sample a challenge.
  3. Compute the Prover’s response to the challenge (many times if the protocol calls for this.)
  4. Publish the whole transcript for anyone to verify.
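
To make the template concrete, here is a minimal Python sketch of the four steps above. Everything scheme-specific is faked: commit and respond are toy placeholders standing in for whatever the underlying interactive protocol actually does, and SHA3-256 plays the role of the Fiat-Shamir hash.

    import hashlib

    # Toy stand-ins for the scheme-specific pieces of the interactive protocol.
    # A real system would use its actual commitment and response algorithms here.
    def commit(witness: bytes) -> bytes:
        return hashlib.sha3_256(b"commit" + witness).digest()

    def respond(witness: bytes, challenge: bytes) -> bytes:
        return hashlib.sha3_256(b"respond" + witness + challenge).digest()

    def fiat_shamir_prove(circuit_hash: bytes, x: bytes, y: bytes, witness: bytes):
        # 1. Produce the Commitment message, exactly as in the interactive protocol.
        commitment = commit(witness)
        # 2. Hash the commitment together with the public context to derive the challenge.
        challenge = hashlib.sha3_256(circuit_hash + x + y + commitment).digest()
        # 3. Answer the challenge just as the interactive Prover would.
        response = respond(witness, challenge)
        # 4. Publish the whole transcript for anyone to verify.
        return commitment, challenge, response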

Protocol designers focus on the interactive setting because it’s relatively easy to analyze the security of the protocol. They can make strong claims about the nature of the challenges that will be chosen, and can then reason about the probability that a cheating Prover can lie in response. The hope with Fiat-Shamir is that, if the hash function is “as good as” a random oracle, it should be nearly as secure as the interactive protocol.

Of course none of this applies to practice. When we deploy the protocol onto a blockchain, we don’t have interactive Verifiers and we certainly don’t have random oracles. Instead we replace the random oracle in Fiat-Shamir with a standard hash function like SHA-3. Whether or not any security guarantees still hold once we do this is an open question. And into the maw of that question is where we shall go.

The GKR15 succinct proof system (but not with details)

The new KRS paper starts by considering a specific interactive proof system designed back in 2015 by Goldwasser, Kalai and Rothblum, which we will refer to as GKR15 (note that the “R” is not the same in both papers — thanks Aditya!) This is a relatively early result on interactive proving systems and has many interesting theoretical features that we will mostly ignore.

At the surface level, what you need to know about the GKR15 proof system is that it works over arithmetic circuits. The Prover and Verifier agree on a circuit C (which can be thought of as the “program”). The Prover also provides some input to the circuit x as well as a witness w, and a purported output y, which ought to satisfy y = C(w, x) if the Prover is honest.

There are some restrictions on the nature of the circuits that can be used in GKR15 — which we will ignore for the purposes of this high-level intuition. The important feature is that the circuits can be relatively deep. That is to say, they can implement reasonably complex programs, such as cryptographic hashing algorithms.

(The authors of the recent KRS paper also refer to GKR15 as a “popular” scheme. I don’t know how to evaluate that claim. But they cite some features in the protocol that are re-used in more recent proof systems that might be deployed in practice, so let’s go with this!)

The GKR15 scheme (like most proving schemes) is designed to be interactive. All you need to know for the moment is that:

  1. The Prover and Verifier agree on C.
  2. The Prover sends the input and output (x, y) to the Verifier.
  3. It also sends a (polynomial) commitment to the witness w in the first “Commitment” message.
  4. The Verifier picks a random challenge c and the Prover/Verifier interact (multiple times) to do stuff in response to this challenge.

Finally: GKR15 possesses a security analysis (“soundness proof”) that is quite powerful, provided we are discussing the interactive version of the protocol. This argument does not claim GKR15 is “perfect”! It leaves room for some tiny probability that a cheating Prover can get away with its crimes: however, it does bound the probability of successful cheating to something (asymptotically and, presumably, practically with the right parameters) negligible.

Since the GKR15 authors are careful theoretical cryptographers, they don’t suggest that their protocol would be a good fit for Fiat-Shamir. In fact, they hint that this is a problematic idea. But, in the 2015 paper anyway, they don’t actually show there’s a problem with the idea. This leaves open a question: what happens if we do flatten it using Fiat-Shamir? Could bad things happen?

A first thought experiment: “weak challenges”

I am making a deliberate decision to stay “high level” and not dive into the details of the GKR15 system quite yet, not because I think they’re boring and off-putting — ok, it’s partly because I think they’re boring and off-putting — but mostly because I think leading with those details will make understanding more difficult. I promise I’ll get to the details later.

Up here at the abstract level I want to remind us, one final time, what we know. We know that proving systems like GKR15 involve a Verifier making a “challenge” to the Prover, which the Prover must answer successfully. A good proving system should ensure that an honest Prover can answer those challenges successfully — but a cheating Prover (any algorithm that is trying to prove a false statement) will fail to respond correctly to these challenges, at least with overwhelming probability.

Now I want to add a new wrinkle.

Let’s first imagine that the set of all possible challenges a Verifier could issue to a Prover is quite large. For one example: the challenge might be a random 256-bit string, i.e., there are 2^256 to choose from. For fun, let’s further imagine that somewhere within this huge set of possible challenge values there exists a single “weak challenge.” This value c* — which I’ll exemplify by the number “53” for this discussion — is special. A cheating Prover who is fortunate enough to be challenged at this point will always be able to come up with a valid response, even if the statement they are “proving” is totally false (that is, if y is not equal to C(w, x).)

Now it goes without saying that having such a “weak challenge” in your scheme is not great! Clearly we don’t like to have weird vulnerabilities in our schemes. And yet, perhaps it’s not really a problem? To decide, we need to ask: how likely is it that a Verifier will select this weak challenge value?

If we are thinking about the interactive setting with an honest Verifier, the analysis is easy. There are 2^256 possible challenge values, and just one “weak challenge” at c* = 53. If the honest Verifier picks challenges uniformly at random, this bad challenge will be chosen with probability 2^-256 during one run of the protocol. This is so astronomically improbable that we can more or less pretend it will never happen.

How does Fiat-Shamir handle “bad challenges”?

We now need to think about what happens when we flatten this scheme using Fiat-Shamir.

To do that we need to be more specific about how the scheme will be Fiat-Shamir-ified. In our scheme, we assumed that the Prover will generate a Commitment message first. The deterministic Fiat-Shamir “Verifier” will hash this message using a hash function H, probably tossing in some other inputs just to be safe. For example, it might include the input/output values as well as a hash of the circuit h(C) — note that we’re using a separate hash function out of an abundance of caution. Our challenge will now be computed as follows:

c = H( h(C), x, y, Commitment )

Our strawman protocol has a weak challenge point at c* = 53. So now we should ask: can a cheating Prover somehow convince the Fiat-Shamir hash “Verifier” to challenge them at this weak point?

The good news for Fiat-Shamir is that this also seems pretty difficult. A cheating Prover might want to get itself challenged on c* = 53. But to do this, they’d need to find some input (pre-image) to the hash function H that produces the output “53”. For any hash function worth its salt (pun intended!) this should be pretty difficult!4

Of course, in Fiat-Shamir world, the cheating Prover has a slight advantage: if it doesn’t get the hash challenge it wants, it can always throw away the result and try again. To do this it could formulate a new Commitment message (or even pick a new input/output pair x, y) and then try hashing again. It can potentially repeat this process millions of times! This attack is sometimes called “grinding” and it’s a real attack that actual cheating Provers can run in the real world. Fortunately even grinding isn’t necessarily a disaster: if we assume that the Prover can only compute, say, 2^50 hashes, then (in the ROM) the probability of finding an input that produces c = 53 is still 2^50 * 2^-256 = 2^-206, yet another very unlikely outcome.
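
Here is a toy Python sketch of that grinding loop. To keep it runnable I’ve shrunk the challenge space to 16 bits; against a real 256-bit challenge space the same loop would be utterly hopeless, which is the whole point.

    import hashlib
    import os

    WEAK_CHALLENGE = 53      # our hypothetical weak challenge c*
    CHALLENGE_BITS = 16      # toy size so the loop finishes; the real space is 2^256

    def grind_for_weak_challenge(max_tries: int = 1_000_000):
        for attempt in range(max_tries):
            commitment = os.urandom(32)                       # try a fresh commitment
            digest = hashlib.sha3_256(commitment).digest()
            challenge = int.from_bytes(digest, "big") % (1 << CHALLENGE_BITS)
            if challenge == WEAK_CHALLENGE:
                return commitment, attempt                    # got challenged at c* = 53
        return None, max_tries                                # gave up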

Another victory for Fiat-Shamir! Even if we have one or more fixed “weak challenge” points in our protocol, it seems like Fiat-Shamir protects us.

What if a cheating Prover can pick the “weak challenge”?

I realize I’m being exhaustive (or exhausting), but I now want to consider another weird possibility: what if the “bad challenge” value can change when we switch circuits (or even circuit inputs)?

For fun, let’s crank this concern into absolute overdrive. Imagine we’re using a proving system that’s as helpful as possible to the cheating Prover: in this system, the “weak challenge value” will be equal to whatever the real output of the circuit C is. Concretely, if the circuit outputs c* = C(w, x) and also c* happens to be the challenge value selected by the Verifier, then the Prover can cheat.

(Notice that I’m specifying that this vulnerability depends on the “real” output of the circuit. The prover also sends over a value y, which is the purported output of the circuit that the Prover is claiming, but that might be different if the Prover is cheating.)

In the interactive world this really isn’t an issue: the Verifier selects the challenge randomly after the Prover has committed to everything, so the Prover genuinely cannot “cook” the circuit to produce the challenge value as output (except with a tiny probability.) In Fiat-Shamir world, however, we should be a bit more worried. The value c* is chosen by hashing. A cheating Prover could theoretically compute c* in advance. It now has several options:

  • The cheater could alter the circuit C so that it hard-codes in c*.
  • Alternatively, it could feed the value c* (or some function of it) into the circuit via the input x.
  • It could sneak c* in within the witness w.
  • Or any combination of the above.

The problem (for the cheater, at least) is that even in Fiat-Shamir world, this attack doesn’t work. A cheating Prover might first pick a circuit C, some input x, a witness w, and a (purported) output y. It could then compute a Commitment by using the proving scheme’s commitment protocol to commit to w. It will then compute the anticipated Fiat-Shamir challenge as:

c* = H( h(C), x, y, Commitment )

To exploit the vulnerability in our proving system, the cheating Prover must now somehow cause the circuit C to output C(w, x) = c*. However, once again we encounter a problem. The cheating Prover had to feed x, as well as a (commitment to) w and (a hash of) the circuit C into the Fiat-Shamir hash function in order to learn c*. (It should not have been able to predict c* before performing the hash computation above, so it’s extremely unlikely that C(w, x) would just happen to output c*.) However, in order to make C(w, x) output c*, the cheater must subsequently manipulate either (1) the design of C or (2) the circuit’s inputs w, x.

But if the cheating Prover does any of those things, they will be completely foiled. If the cheating Prover changes any of the inputs to the circuit, or the circuit itself, after it hashes to learn c*, then the actual Fiat-Shamir challenge used in the protocol will change.3
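
A few lines of Python make the circularity concrete. The byte strings below are placeholders, but the phenomenon is generic: any change made to C, x or w after hashing changes the hash itself.

    import hashlib

    def challenge(circuit_desc: bytes, x: bytes, y: bytes, commitment: bytes) -> bytes:
        return hashlib.sha3_256(circuit_desc + x + y + commitment).digest()

    c_star = challenge(b"original circuit", b"x", b"y", b"commitment")
    # Try to bake c* into the circuit (or its inputs) so that C(w, x) outputs c*...
    patched = challenge(b"original circuit" + c_star, b"x", b"y", b"commitment")
    # ...and the actual Fiat-Shamir challenge hops to a different value.
    assert patched != c_star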

What if the circuit can compute the “weak challenge” value?

Phew. So what have we learned?

Even if our proving system allows us to literally pick the “weak challenge” c* — have it be the output of the circuit C(w, x) — we cannot exploit this fact very easily. Changing the structure of C, x or w after we’ve learned c* will cause the actual Fiat-Shamir challenge used to simply hop to a different value, like the timeline of a traveler who goes back in time and kills their own grandfather.

Clearly we need a different strategy: one where none of the inputs to the circuit, or the “code” of the circuit itself, depend on the output of the Fiat-Shamir hash function.

An important observation, and one that is critical to the KRS result, is that the “circuit” we are proving is fundamentally a kind of program that computes things. What if, instead of computing c* and feeding it to the circuit, or hard-coding it within the circuit, we instead design a circuit that itself computes c*?

Let’s look one last time at the way that Fiat-Shamir challenges are computed:

c* = H( h(C), x, y, Commitment)

We are now going to build a circuit C* that is able to compute c* when given the right inputs.

To make this interesting, we are going to design a very silly circuit. It is silly in a specific way: it can never output one specific output, which in this case will be the binary string “BANANAS”. Looking forward, our cheating Prover is then going to try to falsely claim that for some x, w, the relation C*(w, x) = “BANANAS” actually does hold.

Critically, this circuit will contain a copy of the Fiat-Shamir hashing algorithm H() and it will — quite coincidentally — actually output a value that happens to be identical to the Fiat-Shamir challenge (most of the time.) Here’s how it works (a code sketch follows these steps):

  1. The cheating Prover will pass w and x = ( h(C*), y* = “BANANAS” ) as input.
  2. It will then use the proving system’s commitment algorithm on w to compute Commitment.
  3. The circuit will parse x into its constituent values.
  4. The circuit will internally compute c* = H( h(C*), x, y*, Commitment ).
  5. In the (highly unlikely) case that c* = “BANANAS”, it will set c* = “APPLES”.
  6. Finally, the circuit will output c*.
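
Here is a minimal Python sketch of this malicious “circuit”, written as an ordinary function. The commit helper is a toy stand-in for the proving system’s real commitment algorithm, and the encoding of x is made up for illustration; the essential structural point is that C* recomputes exactly the same hash that the Fiat-Shamir “Verifier” will compute, and outputs the result.

    import hashlib

    def commit(witness: bytes) -> bytes:
        # Toy stand-in for the proving system's real commitment algorithm.
        return hashlib.sha3_256(b"commit" + witness).digest()

    def fs_hash(circuit_hash: bytes, x: bytes, y: bytes, commitment: bytes) -> bytes:
        # The exact hash the Fiat-Shamir "Verifier" will use to pick the challenge.
        return hashlib.sha3_256(circuit_hash + x + y + commitment).digest()

    def C_star(w: bytes, x: bytes) -> bytes:
        # By construction x encodes ( h(C*), y* = b"BANANAS" ): here we pretend the
        # first 32 bytes are h(C*) and the remainder is the claimed output y*.
        circuit_hash, claimed_output = x[:32], x[32:]
        commitment = commit(w)                      # same commitment the Prover publishes
        c = fs_hash(circuit_hash, x, claimed_output, commitment)
        if c == b"BANANAS":                         # essentially impossible for a 32-byte
            c = b"APPLES"                           # digest, but rule it out absolutely
        return c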

Notice first that this circuit can never, ever output C*(w, x) = “BANANAS” for any inputs w, x. Indeed, it is incredibly unlikely that this output would happen to occur naturally after step (4) — how likely is it that a hash function outputs the name of a fruit? — but just to be certain, we have step (5) to reduce this occurrence from “incredibly unlikely” to “absolutely impossible.”

Second, we should observe that the “code” of the circuit above does not in any way depend on c*. We can write this circuit before we ever hash to learn c*! The same goes for the inputs w and x. None of the proposed inputs require the cheating Prover to know c* before it chooses them.

Now imagine that some Prover shows up and tells you that, in fact, they know some w, x such that C*(w, x) = “BANANAS”. By definition they must be lying! When they try to convince you of this using the proving system, a correctly-functioning system should (with overwhelming probability) catch them in their lie and inform you (the Verifier) that the proof is invalid.

However: we stipulated that this particular proving system becomes totally insecure in the special case where the real calculation C*(w, x) happens to equal the Fiat-Shamir challenge c* that will be chosen during the protocol execution. If and when this remarkable event ever happens, the cheating Prover will be able to successfully complete the protocol even if the statement they are proving is totally bogus. And finally it seems we have an exploitable vulnerability! A cheating Prover can show up with the ridiculous false statement that C*(w, x) = “BANANAS”, and then hand us a Fiat-Shamir proof that will verify just fine. They can do this because in real life, C*(w, x) = c* and, thanks to the “weak challenge” feature of the proving system, that Prover can successfully answer all the challenges on that point.

And thus in Fiat-Shamir land we have finally proved a false statement! Note that in the interactive protocol none of this matters, since the Prover cannot predict the Verifier’s (honest) challenge, and so has no real leverage to cheat. In the Fiat-Shamir world — at least when considering this one weird circuit C* — a cheating Prover can get away with murder.

This is not terribly satisfying!

Well, not murder maybe.

I realize this may seem a bit contrived. First we had to introduce a made-up proving system with a “weird” vulnerability, then we exploited it in a very nutty way. And the way we exploited it is kind of silly. But there is a method to the madness. I hope I’ve loaded us up with the basic concepts that we will need to really understand the KRS result.

What remains is to show how the real GKR15 scheme actually maps to some of these ideas.

As a second matter: I started these posts out by proposing all kinds of flashy ideas: we could steal billions of dollars in Ethereum transactions by falsely “proving” that a set of invalid transactions is actually valid! Now having promised the moon, I’m delivering the incredibly lame false statement C*(w, x) = “BANANAS”, where C* is a silly circuit that mostly computes a hash function. This should disappoint you. It disappoints me.

But keep in mind that cryptographic protocols are very delicate. Sometimes an apparently useless vulnerability can be expanded into something that actually matters. It turns out that something similar will apply to this vulnerability, when we apply it to the GKR15 scheme. In the next post.

There will be only one last post, I promise.

Notes:

  1. There is a class of attacks on Fiat-Shamir-ified protocols that stems from putting too little information into the hash function. Usually the weakness here is that the hash does not get fed stuff like “the public key” (in signature schemes), which weakly corresponds to “the circuit” in a proving system. Sneaky attackers can switch these out and do bad things.
  2. I’ve been extremely casual about the problem of converting hash function outputs into “challenges” for arbitrary protocols. Sometimes this is easy — for example, if the challenge space consists of fixed-length random bit-strings, then we can just pick a hash function with the appropriate output length. Sometimes it is moderately easy, for example if the challenges are sampled from some reasonably-sized field. A few protocols might pick the challenges from genuinely weird distributions. If you’re annoyed with me on this technical detail, I might stipulate that we have (A) a Fiat-Shamir hash function that outputs (enough) random bits, and (B) a way to convert those bits into whatever form of challenge you need for the protocol.
  3. I said this is not scientific, but clearly we could make some kind of argument in the random oracle model. There we’d argue (something like): assume the scheme is secure using random challenges chosen by an honest (interactive) Verifier, now let’s consider whether it’s possible for the attacker to have pre-queried the oracle on the inputs that will produce the actual Fiat-Shamir challenge. We could then analyze every possible case and hopefully determine that the answer is “no.” And that would be “science.”
  4. In the random oracle model we can convert this intuition into a stronger argument: each time the Prover hashes something for the first time, the oracle effectively samples a uniformly random string as its response. Since there are 2^256 possible digests, the probability that it returns c = 53 after one hash attempt should still be 2^-256, which again is very low.

How to prove false statements? (Part 1)

Trigger warning: incredibly wonky theoretical cryptography post (written by a non-theorist)! Also, this will be in two parts. I plan to be back with some more thoughts on practical stuff, like cloud backup, in the near future.

If you’ve read my blog over the years, you should understand that I have basically two obsessions. One is my interest in building “practical” schemes that solve real problems that come up in the real world. The other is a weird fixation on the theoretical models that underpin (the security of) many of those same schemes. In particular, one of my favorite bugaboos is a particular model, or “heuristic”, called the random oracle model (ROM) — essentially a fancy way to think about hash functions.

Along those lines, my interest was recently piqued by a new theoretical result by Khovratovich, Rothblum and Soukhanov entitled “How to Prove False Statements: Practical Attacks on Fiat-Shamir.” This is a doozy of a paper! It touches nearly every sensitive part of my brain: it urges us towards a better understanding of our theoretical models for proving security of protocols. It includes the words “practical” and “attacks” in the title! And most importantly it demonstrates a real (albeit wildly contrived) attack on the kinds of “ZK” (note: not actually ZK, more on that later) “proving systems” that we are now using inside of real systems like blockchains.

I confess I am still struggling hard to figure out how I “feel” about this result. I understand how odd it seems that my feelings should even matter: this is science after all. Shouldn’t the math speak for itself? The worrying thing is that, in this case, I don’t think it does. In fact, this is what I find most fundamentally exciting about the result: it really does matter how we think about it. (Here I should add that we don’t all think the same way. My theory-focused PhD student Aditya Hegde has been vigorously debating me on my interpretation — and mostly winning on points. So anything non-stupid I say here is probably due to him.)

Oh yes, and I should also mention that there are billions and billions of dollars riding on these questions? I’m not being dramatic. This is really true.

I mentioned that this post is going to be long and wonky; that’s just unavoidable. But I promise it will be fun. (Ok, I can’t even promise that.) Screw it, let’s go.

The shortest background ever (and it will still be really long)

If you’ve read this blog over the long term, you know that I’m obsessed with one particular “trick” we use in proving our schemes secure. This trick is known as the random oracle model, and it’s one of the worst (or best) things to happen to cryptography.

Let me try to break this down as quickly as I can. In cryptography we have a tendency to use an ingredient called a cryptographic hash function. These functions take in a (potentially) long string and output a short digest. In cryptography courses, we present these functions along with various “security definitions” they should meet, properties like collision resistance, pre-image resistance and so on. But out in the real world most schemes require much stronger assumptions in order to be proven secure. When we argue for the security of these schemes, we often demand that our hash functions be even stronger: we require that they must behave like random functions.

If you’re not sure what a random function is, you can read about it in depth here. You should just trust that it is a very strong and beautiful requirement for a hash function! But there is a fly in the ointment. Real-world hash functions cannot possibly be random functions. Specifically: concrete hash functions like SHA-2, SHA-3 etc. are characterized by the inevitable requirement that they possess compact, efficient algorithms that can compute their output. Random functions (of any usefulness) must not. Indeed, the most efficient description of a random function is essentially a giant (i.e., exponentially-sized in the length of the inputs to the function) lookup table. These functions cannot even be computed efficiently, because they’re so big.

So when we analyze schemes where hash functions must behave in this manner, we have to do some pretty suspicious things. The approach we take is bonkers. First, we analyze our schemes inside of an artificial “model” where efficient (polynomial-time) participants can somehow evaluate random functions, despite the fact that this is literally impossible. To make this apparent contradiction work, we “yank” the hash function logic into a magical box that lives outside of the participants — this includes both honest participants in a protocol, as well as any adversaries who try to attack the scheme — and we force everyone to call out to that functionality. This new thing is called a “random oracle.”

One weird implication of this approach is that no party can ever know the code of the “hash function” they’re evaluating. They literally cannot know it, since in this model the hash function consists of an enormous random lookup table that’s much too big for anyone to actually know! This may seem like a not-very-big deal, but it will be exceptionally important going forward.
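
If it helps, here is how theoreticians typically simulate a random oracle: by “lazy sampling,” where the giant lookup table is filled in one entry at a time, only as parties actually query it. This is just an illustrative Python sketch of the model, not anything you would deploy.

    import os

    class RandomOracle:
        """A lazily-sampled random function from arbitrary byte strings to 32-byte digests."""

        def __init__(self):
            self.table = {}                        # the (partial) exponentially-large table

        def query(self, x: bytes) -> bytes:
            if x not in self.table:
                self.table[x] = os.urandom(32)     # fresh uniform output, fixed forever after
            return self.table[x]

Notice that the only way to learn the value at x is to ask the oracle: there is no compact “code” for anyone to know, which is exactly the property the counterexamples discussed below exploit.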

Of course in the real world we do not have random oracles. I guess we could set up a special server that everyone in the world can call out to in order to compute their hash function values! But we don’t do that because it’s ridiculous. When we want to deploy a scheme IRL, we do a terrible thing: we “instantiate the random oracle” by replacing it with an actual hash function like SHA-2 or SHA-3. Then everyone goes on their merry way, hoping that the security proof still has some meaning.

Let me be abundantly clear about this last part. From a theoretical perspective, any scheme “proven secure” in the random oracle model ceases to be provably secure the instant you replace the random oracle with a real (concrete) hash function like SHA-3. Put differently, it’s the equivalent of replacing your engine oil with Crisco. Your car may still run, but you are absolutely voiding the warranty.

But, but, but — and I stress the stammer — voiding your warranty does not mean your engine will become broken! In most of the places where we’ve done this awful random oracle “instantiation” thing (let’s be honest: almost every real-world deployed protocol) the instantiated protocols all seemed to work just fine.

(To be sure: we have certainly seen cryptographic protocols break down due to broken hash functions! But these breaks are almost always due to obvious hash function bugs that anyone can see, such as meaningful collisions being found. They were not magical breaks that come about because you rubbed the “theory lamp” wrong. As far as we can tell, in most cases if you use a “good enough” secure hash function to instantiate the random oracle, everything mostly goes fine.)

Now it should be noted that theoreticians were not happy about this cavalier approach. In the late 1990s, they rebelled and demonstrated something shocking: it was possible to build “contrived” cryptographic schemes that were provably secure in the random oracle model but totally broken when the oracle was “instantiated” with any hash function.

This was shocking, but honestly not that surprising once you’ve had someone else explain the basic idea. Most of these “counterexample schemes” followed from four simple observations:

  1. In the (totally artificial) random oracle model, you don’t know a compact description of the hash function. You literally can’t know one, since it’s an exponentially-sized random function.
  2. In the “instantiated” protocol, where you’ve replaced the random oracle with e.g., SHA-2, you very clearly must know a compact description of the hash function (for example, here is one.)
  3. We can build a “contrived” scheme in which “knowledge of the description of the hash algorithm” forms a kind of backdoor that allows you to break the scheme!
  4. In the random oracle model where you can’t ever possess this knowledge, the backdoor can never be triggered — hence the scheme is “secure.” In the real world where you instantiate the scheme with SHA-2, any clown can break it.

These results straddle the line between “brilliant” and “fundamentally kind of silly”. Brilliant because, wow! These schemes will be insecure when instantiated with any possible hash function! The random oracle model is a trap! But stupid because, I mean… duh!? In fact what we’re really showing is that our artificial model is artificial. If you build schemes that deliberately fall apart when any adversary knows the code for a hash function, then of course your schemes are going to be broken. You don’t need to be a genius to see that this is going to go poorly.

Nonetheless: theoreticians took a victory lap and then moved on to ruining other people’s fun. Practitioners waited until every last one of them had lost interest, rolled their eyes, and said “let’s agree not to deploy schemes that do obviously stupid things.” And then they all went on deploying schemes that were only proven secure in the random oracle model. And this has described our world for 28 years or so.

But the theoreticians weren’t totally wrong, were they?

That is the $10,000,000,000 question.

As discussed above, many “contrived counterexample” schemes were built to demonstrate the danger of the random oracle model. But each of them was so obviously cartoonish that nobody would ever deploy one of them in practice. If your signature scheme includes 40 lines of code that essentially scream “FYI: THIS IS A BACKDOOR THAT UNLOCKS FOR ANYONE WHO KNOWS THE CODE OF SHA2”, the best solution is not to have a theoretical argument about whether this code is “valid.” The best solution is to delete the code and maybe write over your hard disk three times with random numbers before you burn it. Practitioners generally do not feel threatened by artificial counterexamples.

But a danger remains.

Cryptographic schemes have been getting more complicated and powerful over time. Since I explained the danger in a previous blog post I wrote five years ago, I’m going to save myself some trouble — and also make myself look prescient:

The probability of [a malicious scheme slipping past detection] accidentally seems low, but it gets higher as deployed cryptographic schemes get more complex. For example, people at Google are now starting to deploy complex multi-party computation and others are launching zero-knowledge protocols that are actually capable of running (or proving things about the execution of) arbitrary programs in a cryptographic way. We can’t absolutely rule out the possibility that the CGH and MRH-type counterexamples could actually be made to happen in these weird settings, if someone is just a little bit careless.

Let’s drill down on this a moment.

One relatively recent development in cryptography is the rise of succinct “ZK” or “verifiable computation” schemes that allow an untrusted person to prove statements about arbitrary programs. In general terms, these systems allow a Prover (e.g., me) to prove statements of the following form: (1) I know an input to a [publicly-known] program, such that (2) the program, when run on that input, will output “True.”

The neat thing about these systems is that after running the program, I can author a short (aka “succinct”) proof that will convince you that both of these things are true. Even better, I can hand that short proof (sometimes called an “argument”) to anyone in the world. They can run a Verify algorithm to check that the proof is valid, and if it agrees, then they never need to repeat the original computation. Critically, the time required to verify the proof is usually much less than the time required to re-check the program execution, even for really complicated program executions. The resulting systems are called arguments of knowledge and they go by various cool acronyms: SNARGs, SNARKs, STARKs, and sometimes IVC. (The Ethereum world sometimes lumps these together under the moniker “ZK”, for historical reasons we will not dig into.)
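
At the interface level, you can think of these systems as exposing roughly the following two functions. This is a hypothetical sketch; the names and signatures are illustrative, not any particular library’s API.

    # A hypothetical succinct-argument interface. Real libraries differ in the details.

    def prove(program, public_input, secret_input) -> bytes:
        """Run `program` on the inputs and return a short proof that
        program(public_input, secret_input) == True."""
        ...

    def verify(program, public_input, proof) -> bool:
        """Check the proof without ever seeing `secret_input`;
        verification is much cheaper than re-running `program`."""
        ...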

This technology has proven to be an exciting and necessary solution for the cryptocurrency world, because that world happens to have a real problem on its hands. Concretely: they’ve all noticed that blockchains are very slow. Those systems require thousands of different computers to verify (“check the validity of”) every financial transaction they see, which places enormous limitations on transaction throughput.

“Succinct” proof systems offer a perfect solution to this conundrum.

Rather than submitting millions of individual transactions to a big, slow blockchain, the blockchain can be broken up. Distinct servers called “rollups” can verify big batches of transactions independently. They can each use a succinct proof system to prove that they ran the transaction-verification program correctly on all those transactions. The base-level blockchains no longer need to look at every single transaction. They only need to verify the short “proofs” authored by the rollup servers, and (magically!) this ensures that all of the transactions are verified — but with the base-level blockchain doing vastly less work. In theory this allows a massive improvement in blockchain throughput, mostly without sacrificing security.

An even cooler fact is that these proof systems can in some cases be applied recursively. This is due to a cute feature: the algorithm for verifying a proof is, after all, itself just a computer program. So I can run that program on some other proofs as input — and then I can use the proof system to prove that I ran that program correctly.

To give a more concrete application (with a code sketch after the list):

  1. Imagine 1,000 different servers each run a program that verifies a distinct batch of 1,000 transactions. Each server produces a succinct proof that they ran their program correctly (i.e., their batch is correct.)
  2. Now a different server can take in each of those 1,000 different proofs. And it can run a Verify program that goes through each of those 1,000 proofs and verifies that each one is correct. It outputs a proof that it ran this program correctly.
  3. The result is a single “short” proof that proves all 1,000,000 transactions are correct!
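
In terms of the hypothetical prove/verify interface sketched above, the aggregation step is itself just another program we can prove:

    def verify_batch_of_proofs(public_input, secret_input) -> bool:
        # public_input: the list of (program, statement) pairs claimed by the 1,000 servers
        # secret_input: the 1,000 succinct proofs those servers produced
        return all(
            verify(program, statement, proof)
            for (program, statement), proof in zip(public_input, secret_input)
        )

    # One short top-level proof that all 1,000 lower-level proofs verify:
    # top_proof = prove(verify_batch_of_proofs, claims, lower_level_proofs)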

I even made a not-very-excellent picture to try to illustrate how this can look:

Example of recursive proof usage. At the bottom we have some real programs, each of which gets its own proof. Then one level up we have a program that simply verifies the proofs from the bottom level. And at the top we have another program that verifies many proofs from the second level! (Many programs not shown.)

This recursive stuff is really useful, and I promise that it will be relevant later.

So what?

The question you might be asking is: what in the world does this have to do with random oracle counterexamples?!

Since these proof systems are now powerful enough to run arbitrary programs (sometimes implemented in the form of arithmetic or boolean “circuits”), there is now a possibility that sneaky counterexample “backdoors” could be smuggled in within the programs we are proving things about. This would mean that even if the actual proving scheme has no obvious backdoors in its code, the actual programs would be able to do creepy stuff that would undermine security for the whole system. Our practitioner friends would no longer be able to exercise their (very reasonable) heuristic of “don’t deploy code that does obviously suspicious things” because, while their implementation might not do stupid things, some user might try to run it with a malicious program that does.

(A good analogy is to imagine that your Nintendo system has no exploits built into it, but any specific game might sneak in a nasty piece of code that could blow everything up.)

A philosophical interlude

This has been a whole lot, and there’s lots more to come.

To give you a break, I want to pause for a moment to talk about philosophy, metaphysics (meta-cryptography?), or maybe just the Meaning of Life. More concretely, at this point we need to stop and ask a very reasonable question: how much does this threat model even matter? And what even is this threat model?

Allow me to explain. Imagine that we have a proving system that is basically not backdoored. It may or may not be provably secure, but the proving system itself does not contain any obvious backdoors that will cause it to malfunction, even if you implement it using a concrete hash function like SHA-3.

Now imagine that someone comes along and writes a program called “Verify_Ethereum_Transactions_EVIL.py” that we will want to run and prove using our proof system. Based on the name, we can assume this program was developed by a shady engineer who maliciously decided to add a “backdoor” to the code! Instead of merely verifying Ethereum transactions as you might hope for, the functionality of this program does something nastier:

“Given some input, output True if the input comprises 1,000 valid Ethereum transactions…
OR
output True if the input (or the program code itself) contains a description of the hash function used by the proving system.”
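
As a caricature in Python (everything here is hypothetical, down to the names and the placeholder validity check), the predicate being proven might look like this:

    # Hypothetical sketch of the backdoored predicate described above.
    HASH_FUNCTION_DESCRIPTION = b"<the full code of the proving system's hash function>"

    def all_transactions_valid(transactions) -> bool:
        # Placeholder for the honest Ethereum transaction-validity check.
        return False          # details irrelevant for illustrating the backdoor

    def verify_ethereum_transactions_evil(transactions, extra_input: bytes) -> bool:
        # Honest branch: accept genuinely valid batches of 1,000 transactions...
        if all_transactions_valid(transactions):
            return True
        # ...OR the backdoor branch: accept anyone who can write down a
        # description of the hash function used by the proving system.
        return HASH_FUNCTION_DESCRIPTION in extra_input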

This would be really bad for your cryptocurrency network! Any clever user could submit invalid Ethereum transactions to be verified by this program and it would happily output “True.” If any cryptocurrency network then trusted the proof (to mean “these transactions are actually valid”) then you could potentially use this trick to steal lots of money.

But also let me be clear: this would also be an incredibly stupid program to deploy in your cryptocurrency network.

The whole point of a proof system is that it proves you ran a program successfully, including whatever logic happens to be within those programs. If those programs have obvious backdoors inside of them, then proving you ran those programs means you’re also proving that you might have exercised any backdoors in those programs. If the person writing your critical software is out to get you, and/or you don’t carefully audit their output, you will end up being very regretful. And there are many, many ways to add backdoors to software! (Just to illustrate this, there used to be an entire competition called the “Underhanded C Contest” where people would compete to write C programs full of malicious code that was hard to catch. The results were pretty impressive!)

So it’s worthwhile to ask whether this is really a surprise. In the past we knew that (1) if your silly cryptographic scheme had weird code that made it insecure “to anyone who knows how to compute SHA-2“, then (2) it would really be insecure in the real world, since any idiot can download the code for SHA-2, and (3) you should not deploy schemes that have obvious backdoors.

So with this context in mind, let’s talk about what kind of bad things might happen. These can be divided into “best case”, “second worst case” and “oh hell, holy sh*t.”

In the best case, this type of attack might simply move the scary backdoor code out from the cryptographic proving system, and into the modular “application programs” that can be fed into the proving system. You still need to make sure the scheme implementation doesn’t have silly backdoors — like special code that breaks everything if you know the code for SHA-2. But now you also need to make sure every program you run using this system doesn’t have similar backdoors. But to be clear: you kind of had to audit your programs for backdoors anyway!

In fairness, the nature of these cryptographic backdoors is that they might be more subtle than a typical software backdoor. What I mean here is that ordinary software reviewers might not recognize it, and only an experienced cryptographer will identify that something sketchy is happening. But even if the bug is hard to identify, it’s still a bug — a bug in one specific piece of code — and most critically, it would only affect your own application if you deployed it.

Of course there are worse possibilities as well.

In the second worst case, perhaps the backdoor can be built into the application code in some clever way that is deeply subtle and fundamentally difficult for code auditors to detect — even if they know how to look for it. Perhaps it could somehow be cryptographically obfuscated, so no code review will detect it! Recursive proof systems are worrying when it comes to this concern, since the “bug” might exist multiple layers down in a tree of recursive proofs, and you might not have the code for all those lower-level programs.1 It’s possible that the set of “bad code behaviors” we’d need to audit the code for is so large and undefined that we can no longer apply simple heuristics to catch the bad stuff!

This would be very scary. But it is certainly not the worst case.

The (“oh crap!”) worst case: with recursive proofs there is an even more terrible thing that could theoretically happen. Recall that a single top-level recursive proof can recursively verify thousands of different programs. Many of those programs will likely be written by careful, honest developers. Others could be written by scammers. Clearly if the scammers’ code contains bugs (or flaws) then we should expect those bugs to make the scammers’ own programs less secure, at whatever goal they’re supposed to accomplish. So far none of this is surprising. But ideally what we should hope is that any backdoors in the scammers’ programs will remain isolated to the scammers’ code. They should not “jump across program boundaries” and thus undermine the security of the well-written, honest programs elsewhere in the recursive proof stack.

Now imagine a situation where this is not true. That is, a clever bug in one “program” anywhere in the tree could somehow make any other program (proof) in the entire tree of proofs insecure. This is akin to getting one malicious program onto a Linux box, then using it to compromise the Linux kernel and thus undermine the security of any application running on the system. Maybe this seems unlikely? Actually to me it seems genuinely fantastic, but again, we’re in Narnia at this point. Who knows what’s possible!

This is the very worst case. I don’t think it could happen, but… who knows?

This is the scary thing about what can happen once we leave the world of provable security. Without some fundamental security guarantees we can rely on, it’s possible that the set of attacks we might suffer could be very limited. But they could also be fundamentally unbounded! And that’s where I have to leave this post for the moment.

This post is continued in Part 2.

Notes:

  1. We might imagine, for example, that a recursive Verify program might just take in the hash (or commitment) to a program. And then a Prover could simply state “well, I know a real program that matches this commitment AND ALSO I know an input that satisfies the program.” This means the program wouldn’t technically be available to every auditor, only the hash of the program. I am doing a lot of handwaving here, but this is all possible.

A quick post on Chen’s algorithm

Update (April 19): Yilei Chen announced the discovery of a bug in the algorithm, which he does not know how to fix. This was independently discovered by Hongxun Wu and Thomas Vidick. At present, the paper does not provide a polynomial-time algorithm for solving LWE.

If you’re a normal person — that is, a person who doesn’t obsessively follow the latest cryptography news — you probably missed last week’s cryptography bombshell. That news comes in the form of a new e-print authored by Yilei Chen, “Quantum Algorithms for Lattice Problems“, which has roiled the cryptography research community. The result is now being evaluated by experts in lattices and quantum algorithm design (and to be clear, I am not one!) but if it holds up, it’s going to be quite a bad day/week/month/year for the applied cryptography community.

Rather than elaborate at length, here’s a quick set of five bullet points giving the background.

(1) Cryptographers like to build modern public-key encryption schemes on top of mathematical problems that are believed to be “hard”. In practice, we need problems with a specific structure: they should admit efficient solutions for those who hold a secret key, or “trapdoor”, and yet no efficient solution for folks who don’t. While many problems have been considered (and often discarded), most schemes we use today are based on three problems: factoring (the RSA cryptosystem), discrete logarithm (Diffie-Hellman, DSA) and the elliptic curve discrete logarithm problem (EC-Diffie-Hellman, ECDSA etc.)

(2) While we would like to believe our favorite problems are fundamentally “hard”, we know this isn’t really true. Researchers have devised algorithms that solve all of these problems quite efficiently (i.e., in polynomial time) — provided someone figures out how to build a quantum computer powerful enough to run the attack algorithms. Fortunately such a computer has not yet been built!

(3) Even though quantum computers are not yet powerful enough to break our public-key crypto, the mere threat of future quantum attacks has inspired industry, government and academia to join forces Fellowship-of-the-Ring-style in order to tackle the problem right now. This isn’t merely about future-proofing our systems: even if quantum computers take decades to build, future quantum computers could break encrypted messages we send today!

(4) One conspicuous outcome of this fellowship is NIST’s Post-Quantum Cryptography (PQC) competition: this was an open competition designed to standardize “post-quantum” cryptographic schemes. Critically, these schemes must be based on different mathematical problems — most notably, problems that don’t seem to admit efficient quantum solutions.

(5) Within this new set of schemes, the most popular class of schemes are based on problems related to mathematical objects called lattices. NIST-approved schemes based on lattice problems include Kyber and Dilithium (which I wrote about recently.) Lattice problems are also the basis of several efficient fully-homomorphic encryption (FHE) schemes.

This background sets up the new result.

Chen’s (not yet peer-reviewed) preprint claims a new quantum algorithm that efficiently solves the “shortest independent vector problem” (SIVP, as well as GapSVP) in lattices with specific parameters. If it holds up, the result could (with numerous important caveats) allow future quantum computers to break schemes that depend on the hardness of specific instances of these problems. The good news here is that even if the result is correct, the vulnerable parameters are very specific: Chen’s algorithm does not immediately apply to the recently-standardized NIST algorithms such as Kyber or Dilithium. Moreover, the exact concrete complexity of the algorithm is not instantly clear: it may turn out to be impractical to run, even if quantum computers become available.

But there is a saying in our field that attacks only get better. If Chen’s result can be improved upon, then quantum algorithms could render obsolete an entire generation of “post-quantum” lattice-based schemes, forcing cryptographers and industry back to the drawing board.

In other words, both a great technical result — and possibly a mild disaster.

As previously mentioned: I am neither an expert in lattice-based cryptography nor quantum computing. The folks who are those things are very busy trying to validate the writeup: and more than a few big results have fallen apart upon detailed inspection. For those searching for the latest developments, here’s a nice writeup by Nigel Smart that doesn’t tackle the correctness of the quantum algorithm (see updates at the bottom), but does talk about the possible implications for FHE and PQC schemes (TL;DR: bad for some FHE schemes, but really depends on the concrete details of the algorithm’s running time.) And here’s another brief note on a “bug” that was found in the paper, that seems to have been quickly addressed by the author.

Up until this week I had intended to write another long wonky post about complexity theory, lattices, and what it all meant for applied cryptography. But now I hope you’ll forgive me if I hold onto that one, for just a little bit longer.