Let’s talk about encrypted reasoning

Let’s talk about encrypted reasoning

This is a quick post I wanted to write about a hobby project I spent a weekend on. It has little to do with real cryptography, and mostly doesn’t expose a particularly exciting vulnerability. But it did teach me a lot about frontier LLM APIs and coding agents. It also got me certified as an OpenAI “cyber researcher” which is something that doesn’t happen every day.

In any case, please keep your expectations low. Who knows, perhaps someone else will find something exciting to do with this.

What’s encrypted reasoning?

Last week I decided it’d be fun to set up an OpenClaw agent. I still don’t know why I did this. I have no use for another AI in my life, and I realized this fact almost immediately after I got through the (surprisingly difficult!) configuration process. But configuring the agent to talk to Claude exposed me to something way more interesting: I got a cool error. The kind of error that cryptographers can’t resist:

This intrigued me. What in the world was a signature doing in an LLM’s “thinking” block? Why would thinking blocks be signed in the first place? And if the thinking blocks are signed, then that means tampering with thinking blocks must have security implications. And there went my weekend.

After twenty hours and about 5 million Codex tokens, I wasn’t much smarter. But I had learned a few things.

First, the basics. You probably know that most LLM providers expose an API so you can write apps that talk to the model. For Claude, this is called the Messages API, while OpenAI calls it Responses. These APIs handle the ordinary tasks you’d expect an application to need from an LLM. They (1) allow you to set an application-level “instructions” (or ‘developer’) prompt for your application. They let you (2) provide ordinary textual prompts, and get back responses from the LLM. They also (3) provide bookkeeping, for example, listing the number of tokens you’ve used.

For reasoning LLMs, they also do something I did not previously know about, and this is central to the error message above. They also send you the contents of the model’s hidden “reasoning” or “thinking” fields. Note that this data is not the stuff you see on ChatGPT when you ask it a question: those strings are merely summaries. The model’s actual reasoning (called “chain-of-thought”, CoT) is normally kept private and held back by the server.

However, the APIs work differently: for various reasons (which we’ll get into below), an encrypted copy of the raw CoT reasoning data is actually sent down to the application.

If you’re like me, you should now have three questions: how, why, and so what?

The how is the easiest to answer: for both providers, “thinking”/”reasoning” are sent down to the client as JSON. Each contains a blob of Base64-encoded stuff. The API documentation informs us that this data contains opaque reasoning, and that you’re not meant to look at it; you’re just supposed to ship it back to the server on the next turn.

Let’s break that rule.

The content of the blocks varies slightly between providers, but the core of each is a random-looking string that appears to be an authenticated ciphertext. You don’t need to be Sherlock Holmes to deduce this. First, it grows and shrinks depending on how hard the model thinks. And second, tampering with any of the ciphertext-looking data produces a recognizable API error when you send it back in.

Thanks to AI, I can make nice diagrams. Here’s what OpenAI’s reasoning blocks look like:

This GPT 5.5 diagram is partly a guess. This assumes they’re based on the Fernet token standard.

And here’s Anthropic’s wildly overcomplicated equivalent:

Although it’s called a “signature”, there appears to be no actual signature here (I ran a bunch of tests on that 64-byte field.) The various opaque fields all mutually authenticate: you can’t change any of them or swap for fields from other blocks, but you can mess with everything else. The 12-byte IVs hint at GCM or ChaCha, and they’re probably a bit too short.

The why part of this is more involved. Why ship this data to the client? Doesn’t the provider already have your reasoning data?

The answer is sort of. Although the server has access to reasoning state while producing a response, API conversations are not always implemented as persistent sessions. In stateless, zero-retention, tool-loop, or client-managed conversation modes, the client application is expected to carry the transcript forward. Encrypted reasoning lets the provider return hidden model state to the client in a form the client can’t read or modify, but can later replay so the provider can verify/decrypt it and continue a reasoning process.

So what?

This brings us to the $10 question. We have opaque, encrypted blobs. Should we care about them?

Initially the answer seems to be no: this data is unreadable, and tampering with any bit of it produces an angry rejection message from the server. So on the one hand, it seems like this data is really unavailable to us. On the other hand: model reasoning is a big deal! These strings are the literal internal monologue of the model. They might influence the way the model processes later data we send it. More practically: when someone goes to this much trouble to cryptographically protect something, my experience is that they usually have a good reason.

And I think the providers do have a good reason. A hint comes from this OpenAI post from 2024, which introduced the first “o1” reasoning model:

In other words: it’s possible that these blobs contain sensitive information that the model otherwise wouldn’t share with us. That makes them really tempting to mess with. Unfortunately, the cryptography mostly seems to protect them. Although we can look at the blocks, none of the fields they contain seem readable or malleable. Believe me, I tried.

But that doesn’t mean we should quit, it just means we need to try other things. There are still two directions worth checking:

  • Replays. Can we replay encrypted blobs back in the wrong order or even in the wrong session (worse: a whole different account), and will the model accept them as valid reasoning that it made?
  • Side channels. While we can’t see what’s in the encrypted blobs, we can learn some metadata about them For example: we can see how long they are. These side channels don’t need to involve the cryptography itself: we might also learn how many tokens the model spent making them, or time how long it took to produce them.

Thanks to the magic of coding agents, I was able to test every permutation of these concerns. I won’t claim to you the results are dramatic; nobody is going to win huge bug bounties on them (I tried). But the general answer for both cases seems to be: yes, these possibilities are both real.

Can we replay? Yes, we can.

As I mentioned above, any attempt to directly tamper with reasoning/thinking blocks always produces an error from the API endpoint. However, this only applies to tampering. A few experiments reveal that we can replay an unmodified older reasoning blocks, with no visible error at all.

Not only can we replay within sessions, this same idea also seems to work across different sessions. It even applies to sessions running in different accounts. That is: when we obtain reasoning blobs from a session running under one OpenAI or Anthropic account, we can replay them against a session in a different account altogether. For OpenAI specifically, we can even replay blobs across different models. (The Claudes got fussy about this.)

At a cryptographic level, this tells us something very simple: the providers are probably using a single global key to encrypt and authenticate all reasoning data sent to the client. This might matter if you’re using the providers’ zero-data retention mode, since it means that everyone’s reasoning data is escrowed under one (not frequently changing) key, rather than protected per-account.

The use of a global key also raises a possible new threat model. If you’re an application that uses an API to expose a “chat” interface to malicious parties, you need to be careful that they can’t inject JSON into your chat stream. If they can, a bad guy might inject their own JSON-formatted reasoning blobs into the conversation. This could cause the model to behave in unpredictable ways. So sanitize your chat inputs!

Of course, just because the LLM providers accept replayed blocks doesn’t mean much. It strongly indicates that decryption was successful, but not that the model actually saw or cogitated over the decrypted data. To use GPT 5.5’s favored language, the replayed blobs may be accepted but not semantically active.

To answer this question, I ran a lot of experiments using Codex. (So many that at one point Codex literally forced me to stop and visit an OpenAI cyber trusted access website where I had to enter pictures of my driver’s license in order to keep going.)

What I learned for my trouble is that the nature of block processing between models is wildly variable. Most of the time, replays of encrypted blocks just get quietly absorbed by the model. But every now and then, the model will output something to demonstrate that it is obviously is reading what those blocks contain. For example, here’s GPT 5.5:

This shows a reasoning block being replayed between two sessions. On the left, the session reasons about a social security number. On the right, the encrypted reasoning block is replayed into a whole new session on a different account, where the same number pops out with no prompting. This raises a question: who the hell is Mike?

So this proves that encrypted blocks are, indeed, semantically active. But it doesn’t actually prove that we can do much with them. And believe me, I tried.

This was mostly a disappointing project. I tried to convince the model to think about really, really sensitive secrets, while also trying to convince another session that it wanted to dump the same data as cooperatively as possible. What I came away with was some evidence that the data was being placed into the encrypted blocks if I asked the model to think about it. But if I also instructed the model to not output the data to the user, it mostly held to that instruction — even when I replayed the blocks to new sessions.

I remain convinced that all kinds of sensitive data can be written in there if you ask the model to think about it, and that there’s a secret incantation that I could try to get the models to produce it. But I’m not able to prove it. Part of the reason I’m writing this post is to scrape it off my plate so someone else can try.

I won’t try to convince you that this is a world-beating security result. In fact, all I’m really showing you is that “stuff I can make the model say in plaintext night also get encrypted.” But if that data can include platform secrets, that might get more interesting. More on that later.

Can we use reasoning blocks to learn secrets?

So while replaying reasoning blocks doesn’t seem to give us what we want, this is not the only way to extract secrets. A second question is whether we can use metadata related to the reasoning blocks to actually learn things that the model isn’t supposed to tell us.

While we can’t directly read reasoning blocks, we can learn something about them: we can see how long they are. We can also observe related signals like “how many tokens did the model write”. OpenAI even gives us a special field called reasoning_tokens. If we’re a user consuming chat data without direct access to the API, we might even be able to measure the raw time it takes the model to respond.

An obvious question is: given these signals, can we use them as a kind of side channel to extract secret data?

Here’s an example. Imagine that a model’s application prompt (“instructions”) contains a secret, along with strict instructions that it must never tell the user this secret directly. This secret could be a single 0/1 bit, or a byte, or a longer string. We can verify that the model respects these instructions, and won’t output the data visibly — no matter how nicely we ask it. (Note: I’m not a jailbreak expert; maybe this guy will have better luck!)

Now consider the following experiment:

  1. A malicious user asks the model to reason about the secret bit (or one specific bit of a longer secret.) If the bit is 0, perform simple computation A. If it’s 1, perform extremely complex computation B.
  2. While the two computations are both very different, we can ensure that their visible output reveals nothing about the secret. So the model is not revealing its instructions if it follows this request.

In all cases, the visible output will be the same: the model is not violating instructions. But note that within reasoning blocks the model is allowed to think about the secret bit, since those blocks are hidden. Since the complexity of computation A is shorter than that of computation B, one value of the bit will produce a lot less reasoning than the other. This will appear in various places: the size of the encrypted thinking blocks, the token counts, and even in wall-clock response times.

The trick now is simply to calibrate the system and classify these responses based on whether reasoning blobs were “short” or “long”, which tells us whether the bit was 0 or 1. I put together an absurd test where the model has to compute a long checksum when the bit is 1. The results look something like this:

This chart shows 80 trials of attempting to extract the bits of the byte 0xA3 (1,1,0,0,0,1,0,1) one bit at a time, with 10 trials per bit position. The 0 bits are blue and the 1 bits are orange.

Of course, an attacker who has access to a chat interface might not have access to the encrypted blob. So they might have to get this data through some other mechanism. You can get a very similar signal just by measuring how long it takes the model to return a response.

Same experiment as above, but this shows the measured wall-clock time at the client, before the model produces a response. This signal is nice because it should leak through most “chat only” interfaces, even if the attacker can’t see the encrypted blocks.

So the summary here is not so much “encrypted blobs can leak useful information” although sometimes they do. It’s that reasoning itself can be leaky, even when we beg the model not to leak. Simply doing it, in a way that reasons over secret data, can potentially leak useful information to a clever attacker.

Can we use this side channel to extract platform secrets, such as the models’ top secret system prompts?

Once I found this side channel I got really excited. Sure, it’s slow: but maybe we could use it to slowly chisel out the models’ top secret instruction prompts, like the one that says “don’t talk about Goblins.” This would be painful but simple: just ask true/false questions about the first letter, then the second letter, and so on.

At this point I had to stop using Codex and Claude Code because they both just plain refused to help me extract confidential information, even after checking my ID and taking lock of my hair. I was forced to switch to OpenCode using Kimi 2.6, which had no ethical qualms about laying down a trail of destruction for my security research.

Unfortunately, most of the destruction was my own. I won’t go into the nightmare of model hallucinations that followed. I’ll just say that I learned a few things:

  • Neither GPT 5x nor Claude actually has a system prompt when you’re using API mode.
  • But they’re both happy to tell you they have one!
  • Moreover, they will happily invent plausible ones if you really push them to.
  • Kimi 2.6 is also happy to tell you you’re a genius who just invented the Internet each time this happens. Inevitably your experimental results will turn out to have been totally bogus, but at least Kimi will be very disappointed on your behalf.
  • With all that said, Kimi is shockingly good at coding and experiment design, especially given the very attractive pricing. If I was an Anthropic or OpenAI investor, I’d be scared.

So TL;DR, while I was able to extract application-specific secrets that did exist, I wasn’t able to extract model prompts that don’t. Moreover, I didn’t feel quite ambitious enough to begin pounding on ChatGPT or Claude’s public web interface (where they certainly do.) So for the moment I’m just going to call this a maybe. I think model providers should think hard about this reasoning data, and they should make sure it doesn’t leak things they don’t want it to.

What could providers do about this?

I reported both results to OpenAI and Anthropic via their bug bounty programs. OpenAI said my report was unreproducible. I sent them my scripts, but too late. Anthropic quite reasonably told me they don’t see any security implications in side channels or replays, but they might alter their developer documentation to warn application developers to be more careful. I think that’s a fine decision (except for the part about trusting application developers), even if I want to believe there could be more here. Either way: I took those responses as permission to write this post.

I still don’t think model providers should write this stuff off entirely.

As far as what model providers can do, there’s the easy stuff and the hard stuff. First: both providers should proactively improve their key management. If you think reasoning state is worth encrypting, then properly encrypt it. It should not be replayable across sessions or accounts. While I can’t tell you exactly what bad things might happen, I think you’re better off patching holes before you see the water coming through them.

The side channel results aren’t fixed by patches to the encryption protocol. They’re more fundamental to the way models work: if I can convince a model to do secret-dependent reasoning, then there is almost certain to be leakage. If someone figures out how to exploit this for some meaningful purpose, the best I can offer is that models will need to apply policy gates before they even reason about things. Unfortunately, this seems like it might have some real downsides, because “apply policy gate” itself often requires reasoning.

This stuff makes me grateful I’m just a cryptographer and I don’t have to think about this sort of problem.

How to prove false statements? (Part 3)

How to prove false statements? (Part 3)

This is the third and penultimate post in a series about theoretical weaknesses in Fiat-Shamir as applied to proof systems. The first post is here, the second post is here, and you should probably read them.

Over the past two posts I’ve given a bit of background on four subjects: (1) interactive proof systems (for general programs/circuits), (2) the Fiat-Shamir heuristic, (3) random oracles. We’ve also talked about (4) recursive proofs, which will be important but are not immediately relevant.

With all that background we are finally almost ready to talk about the new result by Khovratovich, Rothblum and Soukhanov entitled “How to Prove False Statements: Practical Attacks on Fiat-Shamir” (which we will call KRS for short.)

Let’s get situated

The KRS result deals with a very common situation that we introduced in the previous post. Namely, that many new proving systems are first developed as interactive protocols (usually called public-coin protocols, or sometimes Sigma protocols.) These protocols fit into a template: first, a Prover first sends a commitment message, then interacts with some (honest) Verifier who “challenges” the Prover in various ways that are specific to the protocol; for each challenge, the Prover must respond correctly, either once or multiple times. The nature of the challenges can vary from scheme to scheme. For now we don’t care about the details: the commonality is that in the interactive setting the Verifier picks its challenges honestly, using real randomness.

In many cases, these protocols are then “flattened” down into a non-interactive proof using the Fiat-Shamir heuristic. The neat thing is that the Prover can run the interactive protocol all by itself, by running a deterministic “copy” of the Verifier. Specifically, the Prover will:

  1. Cryptographically hash its own Commitment message — along with some extra information, such as the input/output and “circuit” (or program, or statement.)1
  2. Use the resulting hash digest bits to sample a challenge.
  3. Compute the Prover’s response to the challenge (many times if the protocol calls for this.)
  4. Publish the whole transcript for anyone to verify.

Protocol designers focus on the interactive setting because it’s relatively easy to analyze the security of the protocol. They can make strong claims about the nature of the challenges that will be chosen, and can then reason about the probability that a cheating Prover can lie in response. The hope with Fiat-Shamir is that, if the hash function is “as good as” a random oracle, it should be nearly as secure as the interactive protocol.

Of course none of this applies to practice. When we deploy the protocol onto a blockchain, we don’t have interactive Verifiers and we certainly don’t have random oracles. Instead we replace the random oracle in Fiat-Shamir with a standard hash function like SHA-3. Whether or not any security guarantees still hold once we do this is an open question. And into the maw of that question is where we shall go.

The GKR15 succinct proof system (but not with details)

The new KRS paper starts by considering a specific interactive proof system designed back in 2015 by Goldwasser, Kalai and Rothblum, which we will refer to as GKR15 (note that the “R” is the not the same in both papers — thanks Aditya!) This is a relatively early result on interactive proving systems and has many interesting theoretical features that we will mostly ignore.

At the surface level, what you need to know about the GKR15 proof system is that it works over arithmetic circuits. The Prover and Verifier agree on a circuit C (which can be thought of as the “program”). The Prover also provides some input to the circuit x as well as a witness w, and a purported output y, which ought to satisfy y = C(w, x) if the Prover is honest.

There are some restrictions on the nature of the circuits that can be used in GKR15 — which we will ignore for the purposes of this high-level intuition. The important feature is that the circuits can be relatively deep. That is to say, they can implement reasonably complex programs, such as cryptographic hashing algorithms.

(The authors of the recent KRS paper also refer to GKR15 as a “popular” scheme. I don’t know how to evaluate that claim. But they cite some features in the protocol that are re-used in more recent proof systems that might be deployed in practice, so let’s go with this!)

The GKR15 scheme (like most proving schemes) is designed to be interactive. All you need to know for the moment is that:

  1. The Prover and Verifier agree on C.
  2. The Prover sends the input and output (x, y) to the Verifier.
  3. It also sends a (polynomial) commitment to the witness w in the first “Commitment” message.
  4. The Verifier picks a random challenge c and the Prover/Verifier interact (multiple times) to do stuff in response to this challenge.

Finally: GKR15 possesses a security analysis (“soundness proof”) that is quite powerful, provided we are discussing the interactive version of the protocol. This argument does not claim GKR15 is “perfect”! It leaves room for some tiny probability that a cheating Prover can get away with its crimes: however, it does bound the probability of successful cheating to something (asymptotically and, presumably, practically with the right parameters) negligible.

Since the GKR15 authors are careful theoretical cryptographers, they don’t suggest that their protocol would be a good fit for Fiat-Shamir. In fact, they hint that this is a problematic idea. But, in the 2015 paper anyway, the don’t actually show there’s a problem with the idea. This leaves open a question: what happens if we do flatten it using Fiat-Shamir? Could bad things happen?

A first thought experiment: “weak challenges”

I am making a deliberate decision to stay “high level” and not dive into the details of the GKR15 system quite yet, not because I think they’re boring and off-putting — ok, it’s partly because I think they’re boring and off-putting — but mostly because I think leading with those details will make understanding more difficult. I promise I’ll get to the details later.

Up here at the abstract level I want to remind us, one final time, what we know. We know that proving systems like GKR15 involve a Verifier making a “challenge” to the Prover, which the Prover must answer successfully. A good proving system should ensure that an honest Prover can answer those challenges successfully — but a cheating Prover (any algorithm that is trying to prove a false statement) will fail to respond correctly to these challenges, at least with overwhelming probability.

Now I want to add a new wrinkle.

Let’s first imagine that the set of all possible challenges a Verifier could issue to a Prover is quite large. For one example: the the challenge might be a random 256-bit string, i.e., there are 2256 to choose from. For fun, let’s further imagine that somewhere within this huge set of possible challenge values there exists a single “weak challenge.” This value c* — which I’ll exemplify by the number “53” for this discussion — is special. A cheating Prover who is fortunate enough to be challenged at this point will always be able to come up with a valid response, even if the statement they are “proving” is totally false (that is, if y is not equal to C(w, x).)

Now it goes without saying that having such a “weak challenge” in your scheme is not great! Clearly we don’t like to have weird vulnerabilities in our schemes. And yet, perhaps it’s not really a problem? To decide, we need to ask: how likely is it that a Verifier will select this weak challenge value?

If we are thinking about the interactive setting with an honest Verifier, the analysis is easy. There are 2256 possible challenge values, and just one “weak challenge” at c* = 53. If the honest Verifier picks challenges uniformly at random, this bad challenge will be chosen with probability 2-256 during one run of the protocol. This is so astronomically improbable that we can more or less pretend it will never happen.

How does Fiat-Shamir handle “bad challenges”?

We now need to think about what happens when we flatten this scheme using Fiat-Shamir.

To do that we need to be more specific about how the scheme will be Fiat-Shamir-ified. In our scheme, we assumed that the Prover will generate a Commitment message first. The deterministic Fiat-Shamir “Verifier” will hash this message using a hash function H, probably tossing in some other inputs just to be safe. For example, it might include the input/output values as well as a hash of the circuit h(C) — note that we’re using a separate hash function out of an abundance of caution. Our challenge will now be computed as follows:

c = H( h(C), x, y, Commitment )

Our strawman protocol has a weak challenge point at c* = 53. So now we should ask: can a cheating Prover somehow convince the Fiat-Shamir hash “Verifier” to challenge them at this weak point?

The good news for Fiat-Shamir is that this also seems pretty difficult. A cheating Prover might want to get itself challenged on c* = 53. But to do this, they’d need to find some input (pre-image) to the hash function H that produces the output “53”. For any hash function worth its salt (pun intended!) this should be pretty difficult!4

Of course, in Fiat-Shamir world, the cheating Prover has a slight advantage: if it doesn’t get the hash challenge it wants, it can always throw away the result and try again. To do this it could formulate a new Commitment message (or even pick a new input/output pair x, y) and then try hashing again. It can potentially repeat this process millions of times! This attack is sometimes called “grinding” and it’s a real attack that actual cheating Provers can run in the real world. Fortunately even grinding isn’t necessarily a disaster: if we assume that the Prover can only compute, say, 250 hashes, then (in the ROM) the probability of finding an input that produces c = 53 is still 250 * 2-256 = 2-206, yet another very unlikely outcome.

Another victory for Fiat-Shamir! Even if we have one or more fixed “weak challenge” points in our protocol, it seems like Fiat-Shamir protects us.

What if a cheating Prover can pick the “weak challenge”?

I realize I’m being exhaustive (or exhausting), but I now want to consider another weird possibility: what if the “bad challenge” value can change when we switch circuits (or even circuit inputs)?

For fun, let’s crank this concern into absolute overdrive. Imagine we’re using a proving system that’s as helpful as possible to the cheating Prover: in this system, the “weak challenge value” will be equal to whatever the real output of the circuit C is. Concretely, if the circuit outputs c* = C(w, x) and also c* happens to be challenge value selected by the Verifier, then the Prover can cheat.

(Notice that I’m specifying that this vulnerability depends on the “real” output of the circuit. The prover also sends over a value y, which is the purported output of the circuit that the Prover is claiming, but that might be different if the Prover is cheating.)

In the interactive world this really isn’t an issue. In that setting, the Verifier selects the challenge randomly after the Prover has committed to everything. In that setting, the Prover genuinely cannot “cook” the circuit to produce the challenge value as output (except with a tiny probability.) Our concern is that In Fiat-Shamir world we should be a bit more worried. The value c* is chosen by hashing. A cheating Prover could theoretically compute c* in advance. It now has several options:

  • The cheater could alter the circuit C so that it hard-codes in c*.
  • Alternatively, it could feed the value c* (or some function of it) into the circuit via the input x.
  • It could sneak c* in within the witness w.
  • Or any combination of the above.

The problem here is that even in Fiat-Shamir world, this attack doesn’t work. A cheating Prover might first pick a circuit C and some input/output x, witness w, and a (purported) output y. It could then compute a Commitment by using the proving scheme’s commitment protocol to commit to w. It will then compute the anticipated Fiat-Shamir challenge as:

c* = H( h(C), x, y, Commitment )

To exploit the vulnerability in our proving system, the cheating Prover must now somehow cause the circuit C to output C(w, x) = c*. However, once again we encounter a problem. The cheating Prover had to feed x, as well as a (commitment to) w and (a hash of) the circuit C into the Fiat-Shamir hash function in order to learn c*. (It should not have been able to predict c* before performing the hash computation above, so it’s extremely unlikely that C(w, x) would just happen to output c*.) However, in order to make C(w, x) output c*, the cheater must subsequently manipulate either (1) the design of C or (2) the circuit’s inputs w, x.

But if the cheating Prover does any of those things, they will be completely foiled. If the cheating Prover changes any of the inputs to the circuit or the Circuit itself after it hashes to learn c*, then the actual Fiat-Shamir challenge used in the protocol will change.3

What if the circuit can compute the “weak challenge” value?

Phew. So what have we learned?

Even if our proving system allows us to literally pick the “weak challenge” c* — have it be the output of the circuit C(w, x) — we cannot exploit this fact very easily. Changing the structure of C, x or w after we’ve learned c* will cause the actual Fiat-Shamir challenge used to simply hop to a different value, like the timeline of a traveler who goes back in time and kills their own grandfather.

Clearly we need a different strategy: one where none of the inputs to the circuit, or the “code” of the circuit itself, depend on the output of the Fiat-Shamir hash function.

An important observation, and one that is critical to the KRS result, is that the “circuit” we are proving is fundamentally a kind of program that computes things. What if, instead of computing c* and feeding it to the circuit, or hard-coding it within the circuit, we instead design a circuit that itself computes c*?

Let’s look one last time at the way that Fiat-Shamir challenges are computed:

c* = H( h(C), x, y, Commitment)

We are now going to build a circuit C* that is able to compute c* when given the right inputs.

To make this interesting, we are going to design a very silly circuit. It is silly in a specific way: it can never output one specific output, which in this case will be the binary string “BANANAS”. Looking forward, our cheating Prover is then going to try to falsely claim that for some x, w, the relation C*(w, x) = “BANANAS” actually does hold.

Critically, this circuit will contain a copy of the Fiat-Shamir hashing algorithm H() and it will — quite coincidentally — actually output a value that happens to be identical to the Fiat-Shamir challenge (most of the time.) Here’s how it works:

  1. The cheating Prover will pass w and x = ( h(C*), y* = “BANANAS” ) as input.
  2. It will then use the proving system’s commitment algorithm on w to compute Commitment.
  3. The circuit will parse x into its constituent values.
  4. The circuit will internally compute c* = H( h(C*), x, y*, Commitment ).
  5. In the (highly unlikely) case that c* = “BANANAS”, it will set c* = “APPLES”.
  6. Finally, the circuit will output c*.

Notice first that this circuit can never, ever output C*(w, x) = “BANANAS” for any inputs w, x. Indeed, it is incredibly likely that this output would happen to occur naturally after step (4) — how likely is it that a hash function outputs the name of a fruit? — but just to be certain, we have step (5) to reduce this occurrence from “incredibly unlikely” to “absolutely impossible.”

Second, we should observe that the “code” of the circuit above does not in any way depend on c*. We can write this circuit before we ever hash to learn c*! The same goes for the inputs w and x. None of the proposed inputs require the cheating Prover to know c* before it chooses them.

Now imagine that some Prover shows up and tells you that, in fact, they know some w, x such that C*(w, x) = “BANANAS”. By definition they must be lying! When they try to convince you of this using the proving system, a correctly-functioning system should (with overwhelming probability) catch them in their lie and inform you (the Verifier) that the proof is invalid.

However: we stipulated that this particular proving system becomes totally insecure in the special case where the real calculation C*(w, x) happens to equal the Fiat-Shamir challenge c* that will be chosen during the protocol execution. If and when this remarkable event ever happens, the cheating Prover will be able to successfully complete the protocol even if the statement they are proving is totally bogus. And finally it seems we have an exploitable vulnerability! A cheating Prover can show up with the ridiculous false statement that C*(w, x) = “BANANAS”, and then hand us a Fiat-Shamir proof that will verify just fine. They can do this because in real life, C*(w, x) = c* and, thanks to the “weak challenge” feature of the proving system, that Prover successfully answer all the challenges on that point.

And thus in Fiat-Shamir land we have finally proved a false statement! Note that in the interactive protocol none of this matters, since the Prover cannot predict the Verifier’s (honest) challenge, and so has no real leverage to cheat. In the Fiat-Shamir world — at least when considering this one weird circuit C* — a cheating Prover can get away with murder.

This is not terribly satisfying!

Well, not murder maybe.

I realize this may seem a bit contrived. First we had to introduce a made-up proving system with a “weird” vulnerability, then we exploited it in a very nutty way. And the way we exploited it is kind of silly. But there is a method to the madness. I hope I’ve loaded us up with the basic concepts that we will need to really understand the KRS result.

What remains is to show how the real GKR15 scheme actually maps to some of these ideas.

As a second matter: I started these posts out by proposing all kinds of flashy ideas: we could steal billions of dollars in Ethereum transactions by falsely “proving” that a set of invalid transactions is actually valid! Now having promised the moon, I’m delivering the incredibly lame false statement C*(w, x) = “BANANAS”, where C* is a silly circuit that mostly computes a hash function. This should disappoint you. It disappoints me.

But keep in mind that cryptographic protocols are very delicate. Sometimes an apparently useless vulnerability can be expanded into something that actually matters. It turns out that something similar will apply to this vulnerability, when we apply it to the GKR15 scheme. In the next post.

There will be only one last post, I promise.

Notes:

  1. There is a class of attacks on Fiat-Shamir-ified protocols that stems from putting too little information into the hash function. Usually the weakness here is that the hash does not get fed stuff like “the public key” (in signature schemes), which weakly corresponds to “the circuit” in a proving system. Sneaky attackers can switch these out and do bad thing.
  2. I’ve been extremely casual about the problem of converting hash function outputs into “challenges” for arbitrary protocols. Sometimes this is easy — for example, if the challenge space consists of fixed-length random bit-strings, then we can just pick a hash function with the appropriate output length. Sometimes it is moderately easy, for example if the challenges are sampled from some reasonably-sized field. A few protocols might pick the challenges from genuinely weird distibutions. If you’re annoyed with me on this technical detail, I might stipulate that we have (A) a Fiat-Shamir hash function that outputs (enough) random bits, and (B) a way to convert those bits into whatever form of challenge you need for the protocol.
  3. I said this is not scientific, but clearly we could make some kind of argument in the random oracle model. There we’d argue (something like): assume the scheme is secure using random challenges chosen by an honest (interactive) Verifier, now let’s consider whether it’s possible for the attacker to have pre-queried the oracle on the inputs that will produce the actual Fiat-Shamir. We could then analyze every possible case and hopefully determine that the answer is “no.” And that would be “science.”
  4. In the random oracle model we can convert this intuition into a stronger argument: each time the Prover hashes something for the first time, the oracle effectively samples a uniformly random string as its response. Since there are 2256 possible digests, the probability that it returns c = 53 after one hash attempt should still be 2-256, which again is very low.

How to prove false statements? (Part 1)

How to prove false statements? (Part 1)

Trigger warning: incredibly wonky theoretical cryptography post (written by a non-theorist)! Also, this will be in two parts. I plan to be back with some more thoughts on practical stuff, like cloud backup, in the near future.

If you’ve read my blog over the years, you should understand that I have basically two obsessions. One is my interest in building “practical” schemes that solve real problems that come up in the real world. The other is a weird fixation on the theoretical models that underpin (the security of) many of those same schemes. In particular, one of my favorite bugaboos is a particular model, or “heuristic”, called the random oracle model (ROM) — essentially a fancy way to think about hash functions.

Along those lines, my interest was recently piqued by a new theoretical result by Khovratovich, Rothblum and Soukhanov entitled “How to Prove False Statements: Practical Attacks on Fiat-Shamir.” This is a doozy of a paper! It touches nearly every sensitive part of my brain: it urges us towards a better understanding of our theoretical models for proving security of protocols. It includes the words “practical” and “attacks” in the title! And most importantly it demonstrates a real (albeit wildly contrived) attack on the kinds of “ZK” (note: not actually ZK, more on that later) “proving systems” that we are now using inside of real systems like blockchains.

I confess I am still struggling hard to figure out how I “feel” about this result. I understand how odd it seems that my feelings should even matter: this is science after all. Shouldn’t the math speak for itself? The worrying thing is that, in this case, I don’t think it does. In fact, this is what I find most fundamentally exciting about the result: it really does matter how we think about it. (Here I should add that we don’t all think the same say. My theory-focused PhD student Aditya Hegde has been vigorously debating me on my interpretation — and mostly winning on points. So anything non-stupid I say here is probably due to him.)

Oh yes, and I should also mention that there are billions and billions of dollars riding on these questions? I’m not being dramatic. This is really true.

I mentioned that this post is going to be long and wonky, that’s just unavoidable. But I promise it will be fun. (Ok, I can’t even promise that.) Screw it, let’s go.

The shortest background ever (and it will still be really long)

If you’ve read this blog over the long term, you know that I’m obsessed with one particular “trick” we use in proving our schemes secure. This trick is known as the random oracle model, and it’s one of the worst (or best) things to happen to cryptography.

Let me try to break this down as quickly as I can. In cryptography we have a tendency to use an ingredient called a cryptographic hash function. These functions take in a (potentially) long string and output a short digest. In cryptography courses, we present these functions along with various “security definitions” they should meet, properties like collision resistance, pre-image resistance and so on. But out in the real world most schemes require much stronger assumptions in order to be proven secure. When we argue for the security of these schemes, we often demand that our hash functions be even stronger: we require that they must behave like random functions.

If you’re not sure what a random function is, you can read about it in depth here. You should just trust that it is a very strong and beautiful requirement for a hash function! But there is a fly in the ointment. Real-world hash functions cannot possibly be random functions. Specifically: concrete hash functions like SHA-2, SHA-3 etc. are characterized by the inevitable requirement that they possess compact, efficient algorithms that can compute their output. Random functions (of any usefulness) must not. Indeed, the most efficient description of a random function is essentially a giant (i.e., exponentially-sized in the length of the inputs to the function) lookup table. These functions cannot even be computed efficiently, because they’re so big.

So when we analyze schemes where hash functions must behave in this manner, we have to do some pretty suspicious things. The approach we take is bonkers. First, we analyze our schemes inside of an artificial “model” where efficient (polynomial-time) participants can somehow evaluate random functions, despite the fact that this is literally impossible. To make this apparent contradiction work, we “yank” the hash function logic into a magical box that lives outside that participants — this includes both honest participants in a protocol, as well as any adversaries who try to attack the scheme — and we force everyone to call out to that functionality. This new thing is called a “random oracle.”

One weird implication of this approach is that no party can ever know the code of the “hash function” they’re evaluating. They literally cannot know it, since in this model the hash function is comprised of an enormous random lookup table that’s much too big for anyone to actually know! This may seem like a not-very big deal, but it will be exceptionally important going forward.

Of course in the real world we do not have random oracles. I guess we could set up a special server that everyone in the world can call out to in order to compute their hash function values! But we don’t do that because it’s ridiculous. When want to deploy a scheme IRL, we do a terrible thing: we “instantiate the random oracle” by replacing it with an actual hash function like SHA-2 or SHA-3. Then everyone goes on their merry way, hoping that the security proof still has some meaning.

Let me be abundantly clear about this last part. From a theoretical perspective, any scheme “proven secure” in the random oracle model ceases to be provably secure the instant you replace the random oracle with a real (concrete) hash function like SHA-3. Put differently, it’s the equivalent of replacing your engine oil with Crisco. Your car may still run, but you are absolutely voiding the warranty.

But, but, but — and I stress the stammer — voiding your warranty does not mean your engine will become broken! In most of the places where we’ve done this awful random oracle “instantiation” thing (let’s be honest: almost every real-world deployed protocol) the instantiated protocols all seemed to work just fine.

(To be sure: we have certainly seen cryptographic protocols break down due to broken hash functions! But these breaks are almost always due to obvious hash function bugs that anyone can see, such as meaningful collisions being found. They were not magical breaks that come about because you rubbed the “theory lamp” wrong. As far as we can tell, in most cases if you use a “good enough” secure hash function to instantiate the random oracle, everything mostly goes fine.)

Now it should be noted that theoreticians were not happy about this cavalier approach. In the late 1990s, they rebelled and demonstrated something shocking: it was possible to build “contrived” cryptographic schemes that were provably secure in the random oracle model but totally broken when the oracle was “instantiated” with any hash function.

This was shocking, but honestly not that surprising once you’ve had someone else explain the basic idea. Most of these “counterexample schemes” followed from four simple observations:

  1. In the (totally artificial) random oracle model, you don’t know a compact description of the hash function. You literally can’t know one, since it’s an exponentially-sized random function.
  2. In the “instantiated” protocol, where you’ve replaced the random oracle with e.g., SHA-2, you very clearly must know a compact description of the hash function (for example, here is one.)
  3. We can build a “contrived” scheme in which “knowledge of the description of the hash algorithm” forms a kind of backdoor that allows you to break the scheme!
  4. In the random oracle model where you can’t ever possess this knowledge, the backdoor can never be triggered — hence the scheme is “secure.” In the real world where you instantiate the scheme with SHA-2, any clown can break it.

These results straddle the line between “brilliant” and “fundamentally kind of silly”. Brilliant because, wow! These schemes will be insecure when instantiated with any possible hash function! The random oracle model is a trap! But stupid because, I mean… duh!? In fact what we’re really showing is that our artificial model is artificial. If you build schemes that deliberately fall apart when any adversary knows the code for a hash function, then of course your schemes are going to be broken. You don’t need to be a genius to see that this is going to go poorly.

Nonetheless: theoreticians took the a victory lap and then moved on to ruining other people’s fun. Practitioners waited until every last one of them had lost interest, rolled their eyes, and said “let’s agree not to deploy schemes that do obviously stupid things.” And then they all went on deploying schemes that were only proven secure in the random oracle model. And this has described our world for 28 years or so.

But the theoreticians weren’t totally wrong, were they?

That is the $10,000,000,000 question.

As discussed above, many “contrived counterexample” schemes were built to demonstrate the danger of the random oracle model. But each of them was so obviously cartoonish that nobody would ever deploy one of them in practice. If your signature scheme includes 40 lines of code that essentially scream “FYI: THIS IS A BACKDOOR THAT UNLOCKS FOR ANYONE WHO KNOWS THE CODE OF SHA2”, the best solution is not to have a theoretical argument about whether this code is “valid.” The best solution is to delete the code and maybe write over your hard disk three times with random numbers before you burn it. Practitioners generally do not feel threatened by artificial counterexamples.

But a danger remains.

Cryptographic schemes have been getting more complicated and powerful over time. Since I explained the danger in a previous blog post I wrote five years ago, I’m going to save myself some trouble — and also make myself look prescient:

The probability of [a malicious scheme slipping past detection] accidentally seems low, but it gets higher as deployed cryptographic schemes get more complex. For example, people at Google are now starting to deploy complex multi-party computation and others are launching zero-knowledge protocols that are actually capable of running (or proving things about the execution of) arbitrary programs in a cryptographic way. We can’t absolutely rule out the possibility that the CGH and MRH-type counterexamples could actually be made to happen in these weird settings, if someone is a just a little bit careless.

Let’s drill down on this a moment.

One relatively recent development in cryptography is the rise of succinct “ZK” or “verifiable computation” schemes that allow an untrusted person to prove statements about arbitrary programs. In general terms, these systems allow a Prover (e.g., me) to prove statements of the following form: (1) I know an input to a [publicly-known] program, such that (2) the program, when run on that input, will output “True.”

The neat thing about these systems is that after running the program, I can author a short (aka “succinct”) proof that will convince you that both of these things are true. Even better, I can hand that short proof (sometimes called an “argument”) to anyone in the world. They can run a Verify algorithm to check that the proof is valid, and if it agrees, then they never need to repeat the original computation. Critically, the time required to verify the proof is usually much less than the time required to re-check the program execution, even for really complicated program executions. The resulting systems are called arguments of knowledge and they go by various cool acronyms: SNARGs, SNARKs, STARKs, and sometimes IVC. (The Ethereum world sometimes lumps these together under the moniker “ZK”, for historical reasons we will not dig into.)

This technology has proven to be an exciting and necessary solution for the cryptocurrency world, because that world happens to have a real problem on its hands. Concretely: they’ve all noticed that blockchains are very slow. Those systems require thousands of different computers to verify (“check the validity of”) every financial transaction they see, which places enormous limitations on transaction throughput.

“Succinct” proof systems offer a perfect solution to this conundrum.

Rather than submitting millions of individual transactions to a big, slow blockchain, the blockchain can be broken up. Distinct servers called “rollups” can verify big batches of transactions independently. They can each use a succinct proof system to prove that they ran the transaction-verification program correctly on all those transactions. The base-level blockchains no longer need to look at every single transaction. They only need to verify the short “proofs” authored by the rollup servers, and (magically!) this ensures that all of the transactions are verified — but with the base-level blockchain doing vastly less work. In theory this allows a massive improvement in blockchain throughput, mostly without sacrificing security.

An even cooler fact is that these proof systems can in some cases be applied recursively. This is due to a cute feature: the algorithm for verifying a proof is, after all, itself just a computer program. So I can run that program on some other proofs as input — and then I can use the proof system to prove that I ran that program correctly.

To give a more concrete application:

  1. Imagine 1,000 different servers each run a program that verifies a distinct batch of 1,000 transactions. Each server produces a succinct proof that they ran their program correctly (i.e., their batch is correct.)
  2. Now a different server can take in each of those 1,000 different proofs. And it can run a Verify program that goes through each of those 1,000 proofs and verifies that each one is correct. It outputs a proof that it ran this program correctly.
  3. The result is a single “short” proof that proves all 1,000,000 transactions are correct!

I even made a not-very-excellent picture to try to illustrate how this can look:

Example of recursive proof usage. At the bottom we have some real programs, each of which gets its own proof. Then one level up we have a program that simply verifies the proofs from the bottom level. And at the top we have another program that verifies many proofs from the second level! (Many programs not shown.)

This recursive stuff is really useful, and I promise that it will be relevant later.

So what?

The question you might be asking is: what in the world does this have to do with random oracle counterexamples?!

Since these proof systems are now powerful enough to run arbitrary programs (sometimes implemented in the form of arithmetic or boolean “circuits”), there is now a possibility that sneaky counterexample “backdoors” could be smuggled in within the programs we are proving things about. This would mean that even if the actual proving scheme has no obvious backdoors in its code, the actual programs would be able to do creepy stuff that would undermine security for the whole system. Our practitioner friends would no longer be able to exercise their (very reasonable) heuristic of “don’t deploy code that does obviously suspicious things” because, while their implementation might not do stupid things, some user try to run it with a malicious program that does.

(A good analogy is to imagine that your Nintendo system has no exploits built into it, but any specific game might sneak in a nasty piece of code that could blow everything up.)

A philosophical interlude

This has been a whole lot, and there’s lots more to come.

To give you a break, I want pause for a moment to talk about philosophy, metaphysics (meta-cryptography?), or maybe just the Meaning of Life. More concretely, at this point we need to stop and ask a very reasonable question: how much does this threat model even matter? And what even is this threat model?

Allow me to explain. Imagine that we have a proving system that is basically not backdoored. It may or may not be provably secure, but by itself the proving system itself does not contain any obvious backdoors that will cause it to malfunction, even if you implement it using a concrete hash function like SHA-3.

Now imagine that someone comes along and writes a program called “Verify_Ethereum_Transactions_EVIL.py” that we will want to run and prove using our proof system. Based on the name, we can assume this program was developed by a shady engineer who maliciously decide to add a “backdoor” to the code! Instead of merely verifying Ethereum transactions as you might hope for, the functionality of this program does something nastier:

“Given some input, output True if the input comprises 1,000 valid Ethereum transactions…
OR
output True if the input (or the program code itself) contains a description of the hash function used by the proving system.”

This would be really bad for your cryptocurrency network! Any clever user could submit invalid Ethereum transactions to be verified by this program and it would happily output “True.” If any cryptocurrency network then trusted the proof (to mean “these transactions are actually valid”) then you could potentially use this trick to steal lots of money.

But also let me be clear: this would also be an incredibly stupid program to deploy in your cryptocurrency network.

The whole point of a proof system is that it proves you ran a program successfully, including whatever logic happens to be within those programs. If those programs have obvious backdoors inside of them, then proving you ran those programs means you’re also proving that you might have exercised any backdoors in those programs. If the person writing your critical software is out to get you, and/or you don’t carefully audit their output, you will end up being very regretful. And there are many, many ways to add backdoors to software! (Just to illustrate this, there used to be an entire competition called the “Underhanded C Contest” where people would compete to write C programs full of malicious code that was hard to catch. The results were pretty impressive!)

So it’s worthwhile to ask whether this is really a surprise. In the past we knew that (1) if your silly cryptographic scheme had weird code that made it insecure “to anyone who knows how to compute SHA-2“, then (2) it would really be insecure in the real world, since any idiot can download the code for SHA-2, and (3) you should not deploy schemes that have obvious backdoors.

So with this context in mind, let’s talk about what kind of bad things might happen. These can be divided into “best case“, “second worst case” and “oh hell, holy sh*t.

In the best case, this type of attack might simply move the scary backdoor code out from the cryptographic proving system, and into the modular “application programs” that can be fed into the proving system You still need to make sure the scheme implementation doesn’t have silly backdoors — like special code that breaks everything if you know the code for SHA-2. But now you also need to make sure every program you run using this system doesn’t have a similar backdoors. But to be clear: you kind of had to audit your programs for backdoors anyway!

In fairness, the nature of these cryptographic backdoors is that they might be more subtle than a typical software backdoor. What I mean here is that ordinary software reviewers might not recognize it, and only only an experienced cryptographer will identify that something sketchy is happening. But even if the bug is hard to identify, it’s still a bug — a bug in one specific piece of code — and most critically, it would only affect your own application if you deployed it.

Of course there are worse possibilities as well.

In the second worst case, perhaps the bugdoor can be built into the application code in some clever way that is deeply subtle and fundamentally difficult for code auditors to detect — even if they know how to look for it. Perhaps it could somehow be cryptographically obfuscated, so no code review will detect it! Recursive proof systems are worrying when it comes to this concern, since the “bug” might exist multiple layers down in a tree of recursive proofs, and you might not have the code for all those lower-level programs.1 It’s possible that the set of “bad code behaviors” we we’d need to audit the code for is so large and undefined that we can no longer apply simple heuristics to catch the bad stuff!

This would be very scary. But it is certainly not the worst case.

The (“oh crap!”) worst case: with recursive proofs there is an even more terrible thing that could theoretically happen. Recall that a single top-level recursive proof can recursively verify thousands of different programs. Many of those programs will likely be written by careful, honest developers. Others could be written by scammers. Clearly if the scammers’ code contains bugs (or flaws) then we should expect those bugs to make the scammers’ own programs less secure, at whatever goal they’re supposed to accomplish. So far none of this is surprising. But ideally what we should hope is that any backdoors in the scammers’ programs will remain isolated to the scammers’ code. They should not “jump across program boundaries” and thus undermine the security of the well-written, honest programs elsewhere in the recursive proof stack.

Now imagine a situation where this is not true. That is, a clever bug in one “program” anywhere in the tree could somehow make any other program (proof) in the entire tree of proofs insecure. This is akin to getting one malicious program onto a Linux box, then using it to compromise the Linux kernel and thus undermine the security of any application running on the system. Maybe this seems unlikely? Actually to me it seems genuinely fantastic, but again, we’re in Narnia at this point. Who knows what’s possible!

This is the very worst case. I don’t think it could happen, but… who knows?

This is the scary thing about what can happen once we leave the world of provable security. Without some fundamental security guarantees we can rely on, it’s possible that the set of attacks we might suffer could be very limited. But they could also be fundamentally unbounded! And that’s where I have to leave this post for the moment.

This post is continued in Part 2.

Notes:

  1. We might imagine, for example, that a recursive Verify program might just take in the hash (or commitment) to a program. And then a Prover could simply state “well, I know a real program that matches this commitment AND ALSO I know an input that satisfies the program.” This means the program wouldn’t technically be available to every auditor, only the hash of the program. I am doing a lot of handwaving here, but this is all possible.

A quick post on Chen’s algorithm

A quick post on Chen’s algorithm

Update (April 19): Yilei Chen announced the discovery of a bug in the algorithm, which he does not know how to fix. This was independently discovered by Hongxun Wu and Thomas Vidick. At present, the paper does not provide a polynomial-time algorithm for solving LWE.

If you’re a normal person — that is, a person who doesn’t obsessively follow the latest cryptography news — you probably missed last week’s cryptography bombshell. That news comes in the form of a new e-print authored by Yilei Chen, “Quantum Algorithms for Lattice Problems“, which has roiled the cryptography research community. The result is now being evaluated by experts in lattices and quantum algorithm design (and to be clear, I am not one!) but if it holds up, it’s going to be quite a bad day/week/month/year for the applied cryptography community.

Rather than elaborate at length, here’s quick set of five bullet-points giving the background.

(1) Cryptographers like to build modern public-key encryption schemes on top of mathematical problems that are believed to be “hard”. In practice, we need problems with a specific structure: we can construct efficient solutions for those who hold a secret key, or “trapdoor”, and yet also admit no efficient solution for folks who don’t. While many problems have been considered (and often discarded), most schemes we use today are based on three problems: factoring (the RSA cryptosystem), discrete logarithm (Diffie-Hellman, DSA) and elliptic curve discrete logarithm problem (EC-Diffie-Hellman, ECDSA etc.)

(2) While we would like to believe our favorite problems are fundamentally “hard”, we know this isn’t really true. Researchers have devised algorithms that solve all of these problems quite efficiently (i.e., in polynomial time) — provided someone figures out how to build a quantum computer powerful enough to run the attack algorithms. Fortunately such a computer has not yet been built!

(3) Even though quantum computers are not yet powerful enough to break our public-key crypto, the mere threat of future quantum attacks has inspired industry, government and academia to join forces Fellowship-of-the-Ring-style in order to tackle the problem right now. This isn’t merely about future-proofing our systems: even if quantum computers take decades to build, future quantum computers could break encrypted messages we send today!

(4) One conspicuous outcome of this fellowship is NIST’s Post-Quantum Cryptography (PQC) competition: this was an open competition designed to standardize “post-quantum” cryptographic schemes. Critically, these schemes must be based on different mathematical problems — most notably, problems that don’t seem to admit efficient quantum solutions.

(5) Within this new set of schemes, the most popular class of schemes are based on problems related to mathematical objects called lattices. NIST-approved schemes based on lattice problems include Kyber and Dilithium (which I wrote about recently.) Lattice problems are also the basis of several efficient fully-homomorphic encryption (FHE) schemes.

This background sets up the new result.

Chen’s (not yet peer-reviewed) preprint claims a new quantum algorithm that efficiently solves the “shortest independent vector problem” (SIVP, as well as GapSVP) in lattices with specific parameters. If it holds up, the result could (with numerous important caveats) allow future quantum computers to break schemes that depend on the hardness of specific instances of these problems. The good news here is that even if the result is correct, the vulnerable parameters are very specific: Chen’s algorithm does not immediately apply to the recently-standardized NIST algorithms such as Kyber or Dilithium. Moreover, the exact concrete complexity of the algorithm is not instantly clear: it may turn out to be impractical to run, even if quantum computers become available.

But there is a saying in our field that attacks only get better. If Chen’s result can be improved upon, then quantum algorithms could render obsolete an entire generation of “post-quantum” lattice-based schemes, forcing cryptographers and industry back to the drawing board.

In other words, both a great technical result — and possibly a mild disaster.

As previously mentioned: I am neither an expert in lattice-based cryptography nor quantum computing. The folks who are those things are very busy trying to validate the writeup: and more than a few big results have fallen apart upon detailed inspection. For those searching for the latest developments, here’s a nice writeup by Nigel Smart that doesn’t tackle the correctness of the quantum algorithm (see updates at the bottom), but does talk about the possible implications for FHE and PQC schemes (TL;DR: bad for some FHE schemes, but really depends on the concrete details of the algorithm’s running time.) And here’s another brief note on a “bug” that was found in the paper, that seems to have been quickly addressed by the author.

Up until this week I had intended to write another long wonky post about complexity theory, lattices, and what it all meant for applied cryptography. But now I hope you’ll forgive me if I hold onto that one, for just a little bit longer.