// Sigma Protocols #Math
The Road to Zero Knowledge
A cryptographic proof scheme between two parties, a prover and a verifier, is considered zero knowledge if it has three properties:
- Completeness – An honest prover is able to convince the verifier that they know something. The verifier ends up fully convinced.
- Soundness – A prover cannot fake it; they must actually know this something in order to convince the verifier of the knowledge.
- Zero-knowledge – The verifier learns nothing beyond that the prover knows this something.
These rules feel both very thorough and very vague at the same time. But they allow cryptographic tools to be described in an abstract fashion. If someone is designing a system that requires a zero-knowledge proof, there are quite a few implementations to choose from which could be plugged in. Those options have all sorts of trade-offs between them, such as the cost to compute and verify, proof size, and the level of interaction required between the parties. There is even wiggle room in whether the proof is probabilistic. Some schemes require multiple rounds of interaction, with the verifier gaining confidence on each successful round. Other schemes are deterministic after just a single round.
The prover and verifier lingo sounds a lot like the Schnorr protocol for a digital signature, where a prover convinces a verifier that they know a private key (a.k.a. a discrete logarithm) without telling them the key itself. Is that protocol considered zero knowledge? Turns out, not quite. To prove that a protocol is zero knowledge, it needs to be shown that even if a verifier breaks all the rules, they still learn nothing. This is done by showing that there is simply nothing being exposed by the prover to learn. The Schnorr signature protocol cannot make this claim with certainty. The verifier is supposed to choose a challenge scalar randomly, but a bad actor doesn’t have to, and could theoretically use this freedom to somehow calculate the private key with some algebra. Don’t get too concerned though; it has been a few decades and no one has actually thought of an attack which does this. We use this protocol all over the place, so we feel pretty good about it. But it hasn’t been proven to be perfect; it is possible that in the future some kernel of knowledge is discovered which leverages this crack (crazier things have happened). So we add a caveat to the zero knowledge rule to describe this: honest verifier zero knowledge. If the verifier is honest (follows the protocol), it is definitely zero knowledge, but we just don’t know for sure in the case that they are dishonest.
While the Schnorr protocol doesn’t meet the requirements for zero knowledge, it does meet the requirements for a different cryptographic abstraction: a sigma protocol. A sigma protocol has some more specific rules than the three above for zero knowledge. It describes a three step handshake used by a prover and verifier, but more on that later. It also has three requirements which line up perfectly with Completeness, Soundness, and Zero-Knowledge except that, funny enough, Zero Knowledge has been swapped out with Honest Verifier Zero Knowledge. What are the odds! A sigma protocol is another abstraction “lens” to view cryptographic tools with. It turns out that sigma protocols are just so useful that they are worth their own description, even if it requires the zero knowledge caveat.
Those three steps of the sigma protocol are the Commitment, Challenge, and Response. The commitment, also known as the blinding factor, allows the prover to hide their secret. It also makes each proof unique, which keeps all information about the secret hidden. The challenge allows the verifier to know for sure that the prover has the secret. For example: hey prover, if you actually have the secret, you wouldn’t mind multiplying it a few times, would you? Finally, the response ties together the commitment, challenge, and secret in an algebraic fashion which convinces the verifier that the prover knows the secret.
The Schnorr protocol implements these three abstract steps by leveraging the discrete log problem and the homomorphic connection between the private scalars and public points. There are other implementations which don’t use either of these properties.
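To make those three steps concrete, here is a toy sketch of the interactive Schnorr flow in Python. The tiny group (a safe prime and a small generator) and all the variable names are just my illustrative choices; a real implementation uses a large elliptic curve group and constant-time code.

```python
import secrets

# Toy parameters: p is a safe prime (p = 2q + 1) and G generates the
# subgroup of order q. Real systems use a large elliptic curve group.
p, q, G = 1019, 509, 4

x = secrets.randbelow(q)   # prover's secret, the discrete logarithm
Y = pow(G, x, p)           # public key, Y = G^x

# Commitment: the prover picks a random blinding factor r and sends R.
r = secrets.randbelow(q)
R = pow(G, r, p)

# Challenge: the verifier picks a random scalar and sends it over.
c = secrets.randbelow(q)

# Response: ties the secret, blinding factor, and challenge together.
s = (c * x + r) % q

# Verification: G^s should equal Y^c * R.
assert pow(G, s, p) == (pow(Y, c, p) * R) % p
print("proof accepted")
```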
Thinking back on the Schnorr signature scheme, it is possible to make it non-interactive by adding a hash function. This combines the commitment and challenge steps so the verifier is no longer directly involved. Does this make the protocol meet the three rules of zero knowledge, since there is now no potential crack in the scheme? Ah, oh so close, but not quite. While a dishonest verifier no longer has a path to manipulate things, this now depends on the hash function really being a random oracle. If for some reason that assumption is broken, the protocol would then not be zero knowledge since information could leak. Cryptographers are not quite ready to declare this straight-up zero knowledge, but it does get the added distinction Non-Interactive Zero Knowledge.
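As a rough sketch of that transform, assuming SHA-256 as the stand-in random oracle, the toy proof above becomes non-interactive like so:

```python
import hashlib
import secrets

p, q, G = 1019, 509, 4

def challenge(*values: int) -> int:
    # The hash plays the role of the random oracle.
    data = b"".join(v.to_bytes(8, "big") for v in values)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

x = secrets.randbelow(q)
Y = pow(G, x, p)

r = secrets.randbelow(q)
R = pow(G, r, p)
c = challenge(G, Y, R)     # commitment and challenge collapse into one step
s = (c * x + r) % q

# Anyone can verify (R, s) later by recomputing the challenge themselves.
assert c == challenge(G, Y, R)
assert pow(G, s, p) == (pow(Y, c, p) * R) % p
print("non-interactive proof accepted")
```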
composition
Sigma protocols can be composed! An AND composition proves knowledge of all independent values. An OR composition proves knowledge of 1 value out of many.
The AND composition is simple; it involves no new logic or layers. The verifier can just run multiple sigma protocols and check that the prover can prove them all. The verifier and prover can use the same challenge value across all proofs to save some bytes on the wire, which may add up fast depending on how many of these you are doing. But there is no security reason to do so.
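A minimal sketch of that shared-challenge AND composition, reusing the toy group from the earlier sketches:

```python
import secrets

p, q, G = 1019, 509, 4

# Two independent secrets and their public keys.
x1, x2 = secrets.randbelow(q), secrets.randbelow(q)
Y1, Y2 = pow(G, x1, p), pow(G, x2, p)

# Commitments for both statements.
r1, r2 = secrets.randbelow(q), secrets.randbelow(q)
R1, R2 = pow(G, r1, p), pow(G, r2, p)

# A single challenge shared across both proofs (saves bytes on the wire).
c = secrets.randbelow(q)

s1 = (c * x1 + r1) % q
s2 = (c * x2 + r2) % q

# The verifier simply checks every proof; all of them must pass.
assert pow(G, s1, p) == (pow(Y1, c, p) * R1) % p
assert pow(G, s2, p) == (pow(Y2, c, p) * R2) % p
print("AND composition accepted")
```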
The OR composition is where things get interesting. The prover would like to prove they know one thing, but without letting the verifier know which one exactly. To accomplish this, the verifier and prover agree on one more step beyond the verifier just checking each proof. The verifier must also check the challenges returned by the prover. But why is the prover returning challenges?
All the proofs in the OR composition must still work, else it would leak which one the prover actually knows. So the prover needs to simulate the proofs they don’t know, they gotta fake em. So here’s the new flow.
1. The prover commits to all protocols.
2. The verifier sends a single challenge c.
3. The prover computes a challenge for the real value and simulates the others.
4. The prover responds to all protocols.
5. The verifier checks that all responses are valid and that the challenges combine back to the original challenge.
Step #3 sounds different! This is where the magic happens. The prover uses the power of XOR to work backwards and compute a challenge for the real value.

c_1 = c XOR c_2
The verifier sends over c. The prover can choose anything for the challenge of the bogus simulation, c_2. These are used to calculate c_1 with an XOR operation.
The prover will send back c_2 and c_1 with their corresponding proofs so that the verifier can check that they XOR back to their original challenge c. The properties of XOR make it perfect for this: simple reversibility and a natural split/recombine pattern. It is also uniformly balanced (an even mix of 1s and 0s in the output), which keeps the challenges looking random.
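In miniature (64-bit challenges are an arbitrary choice for the demo):

```python
import secrets

c = secrets.randbits(64)    # the verifier's challenge
c2 = secrets.randbits(64)   # the prover's free choice for the bogus proof
c1 = c ^ c2                 # the forced challenge for the real proof

# Reversibility: recombining the split challenges recovers the original.
assert c1 ^ c2 == c
```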
This is pretty clever, so I am going to step through it with the common Schnorr implementation. For the Schnorr implementation, we are proving knowledge of a discrete logarithm. So we know x in Y = G^x, where Y is the public key (a point if using a curve), G is a generator of the group, and x is the secret scalar, the discrete logarithm of Y.
Let’s say in this case there are two discrete logs and the prover wants to show that they know one of them, but without revealing which one. We have Y_1 = G^x_1 and Y_2 = G^x_2, and the prover knows x_1. The public keys, Y_1 and Y_2, are known by both the prover and verifier.
Following the sigma flow, the prover needs to first choose some random blinding factor scalars and send the public points to the verifier. The prover chooses r_1 for its known scalar, and calculates R_1 = G^r_1.
Now here is where the new protocol starts. For the unknown logarithm x_2, they are going to simulate knowledge. This looks like the normal flow in reverse. The prover chooses a random signature scalar s_2 and a random challenge scalar c_2. This sounds like it would defeat the purpose, but we will see how this is just used to keep secret which discrete logarithm they actually know.
With a fake s_2 chosen, the prover needs to work backwards to find a public blinding factor which will remain consistent when they toss s_2 to the verifier. The verifier will use s_2 to check that G^s_2 = Y_2^c_2 * R_2. For this to be true, R_2 = G^s_2 * Y_2^(-c_2), and notice that all the values on the right side of the equation are pre-selected. So the prover generates a public blinding factor R_2 without having to know its discrete logarithm r_2. R_1 is calculated legitimately, but R_2 is still algebraically connected to the challenge.
The prover sends over the two public blinding factors R_1 and R_2, and the verifier sends back a single challenge scalar c.
With the ball back in the prover’s court, they use c to calculate a challenge for the known secret value, c_1 = c XOR c_2. Notice that this means c = c_1 XOR c_2, which the verifier can use later to make sure their challenge value was used by the prover. With c_1 now set, the prover calculates the signature for the known private scalar following the standard protocol: s_1 = c_1 * x_1 + r_1. The prover now has a legit s_1 and the bogus (but still functioning) s_2. They send back s_1, c_1, s_2, c_2.
The verifier can now check the signatures like normal, but needs to plug in the challenges passed back by the prover. Why trust these challenges? The verifier also checks that c = c_1 XOR c_2, so they know the challenges were derived from their original c.
So the verifier knows all the proofs check out and that the prover used their challenge value, but how does this make them certain that the prover knows one of the discrete logarithms? If the prover doesn’t know either secret value, they have two options. They could try to spoof R_1 like they did R_2. But this means they would have to guess c_1 before learning c…unlikely. The other option is to wait for c, use it to calculate c_1, and then try to calculate s_1 without knowledge of x_1. This would require breaking the discrete logarithm…also unlikely.
ring signatures
A use case for sigma protocol OR composition is ring signatures, where one party can sign on behalf of multiple, but the verifier doesn’t know who signed. Sounds pretty straightforward. I was curious though how a ring signature could be made non-interactive. Usually this is done by taking a blinding factor and hashing it into a challenge (Fiat-Shamir), but what if there are multiple blinding factors?
There needs to be a deterministic way for a prover to compute a challenge (or challenges) which verifiers can also compute. A pretty nifty protocol is used to set this up. It uses the same sort of pattern as the simple XOR split approach, where random values from the simulated protocols are used to generate a real challenge for the real secret. One new requirement is an ordering of the public keys involved. Any sort of heuristic can be used, lexicographic or hashing for example, but all parties need to agree on an order so that hashes are deterministic.
This ordering of the parties is how the signature gets its “ring” name. The prover is able to tie all the parties together and complete the ring with their secret.
From a verifier’s perspective, they always start at key #1, the top of the order. This way there is no information leak on which key signed. The prover publishes all the public keys (P), witnesses (s), and public blinding factors (R). The one last piece is the challenge for key #1. That is enough information to verify proof #1 and then compute the challenge for proof #2, kicking off the process to verify the whole ring of proofs. The challenge is a hash of the message being signed, the public keys, and crucially the current proof’s public blinding factor.
If we take a look at things from the prover’s perspective now, we can see how this binds things together. The prover starts with the real secret and generates a random blinding factor. They can use the public version R to generate the challenge for their first simulation, the next key in the order. Here is where the simulation pattern from above (R_2 = G^s_2 * Y_2^(-c_2)) comes into play again. The bogus R value of the simulation can be set with the given challenge and a randomly selected witness. And then the process starts again for the next key. It finally loops around the ring to arrive back at the real key and hand it a real challenge. This is used to calculate a real witness, and all the proofs are set, tied in a hash knot.
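Here is a toy sketch of that loop, under the same illustrative parameters (SHA-256 as the hash and my own helper names); it follows the flow described above: start the chain at the real key’s commitment, simulate around the ring, and close it with the real secret.

```python
import hashlib
import secrets

p, q, G = 1019, 509, 4

def challenge(msg: bytes, keys: list[int], R: int) -> int:
    # Challenge = hash of the message, the ordered public keys, and the
    # current proof's public blinding factor.
    data = msg + b"".join(k.to_bytes(8, "big") for k in keys) + R.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def sign(msg: bytes, keys: list[int], j: int, x: int):
    # Sign msg with the secret x for keys[j]; everyone else is simulated.
    n = len(keys)
    s = [0] * n
    c = [0] * n
    r = secrets.randbelow(q)                       # real blinding factor
    c[(j + 1) % n] = challenge(msg, keys, pow(G, r, p))
    i = (j + 1) % n
    while i != j:                                  # walk the ring, faking as we go
        s[i] = secrets.randbelow(q)                # random witness
        R_i = (pow(G, s[i], p) * pow(keys[i], q - c[i], p)) % p
        c[(i + 1) % n] = challenge(msg, keys, R_i)
        i = (i + 1) % n
    s[j] = (c[j] * x + r) % q                      # close the ring with the real secret
    return c[0], s                                 # only key #1's challenge is published

def verify(msg: bytes, keys: list[int], c0: int, s: list[int]) -> bool:
    c = c0                                         # always start at key #1
    for i in range(len(keys)):
        R_i = (pow(G, s[i], p) * pow(keys[i], q - c, p)) % p
        c = challenge(msg, keys, R_i)              # feeds the next proof's check
    return c == c0                                 # the ring must close on itself

# Three ring members; we only hold the secret for index 1.
xs = [secrets.randbelow(q) for _ in range(3)]
keys = [pow(G, x, p) for x in xs]
c0, s = sign(b"hello ring", keys, 1, xs[1])
print("ring verified:", verify(b"hello ring", keys, c0, s))
```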