2024.01 Vol.5

Bit of cryptographic signatures

With the lower level abstract algebra under the belt from a little bit of elliptic curve cryptography it is now possible to sink a bit further into the signature pattern we see in applications of cryptography.

The first goal of a signature (will get into the second later) is for Alice to prove to Bob that she is the holder of a private key without given up the private key. One way Alice could prove she owns her super secret 256-bit private key is by just showing it to Bob, but now Bob can go and pretend to be Alice. Defeats the whole point of this thing being secret.

Before diving in, here is a quick recap of the abstract algebra layer we have to work with: the elliptic curve points which make up a group. Since it is just a group, not a field, there is a single binary operator across the set of points. We refer to it as “addition” where a point can be added to a point to produce a third point. More common in cryptography, a point can be added to itself a bunch of times. We call this “scalar multiplication” since we are scaling up a point by some number. For example, P + P + P + P is just 4P, scaling the point P four times. This is the most common operation in cryptography since it is how we introduce the discrete log problem, by just scaling a common “generator” point. And this addition binary operator is the only tool we have to work with to try and reach our goal of proof without sharing. But it has a few characteristics which should get the spidy senses tingling that there is a way to pull this off.

A public key is a point on the elliptic curve. Since public keys are always based off of the generator point G, adding two public keys is just adding the scalars of the generator point. So if one public key is kG and the other is pG, adding them together is just adding G to itself k + p times. And just a heads-up, subtraction works the same.

sG = kG + pG 
sG = (k + p)G

G is the public generator point and s, k, and p are private scalars.

There are some interesting relations between the scaled-up, public, version of the calculation and the scalar-based, private, version. If given a scalar s, it is easy to find the public key sG, but if given sG we can’t find s due to the discrete log problem. But we can do addition operations on either level to get s or sG…so seems like there should be a way to do some private addition which can be verified with public addition.

s = k + p

The scalar-based, private, calculation of the above public key addition where it is scaled-up by G.

Ok, let’s lay out the scenario. Alice has a private key p which corresponds to the public key pG. She has given Bob the public key pG, but Bob doesn’t trust that Alice hasn’t just copy/pasta’d this from the interwebs somewhere, he wants proof that Alice has the private key.

In the box below, we have the public equation on top and the private equation below. Bear with me for this super simple example of s = p, this is just establishing the mental model of these two equations. So Bob knows pG, the public key, and can also be told s to verify that s is the key for sG. This is proof that Alice knows p, since p is required to know s, but it also obviously doesn’t achieve our goal since s == p and Alice just gave away her private key to Bob. But is there a tweak we can add to this model to hide that?

sG = pG
^    ^
|    |
|    |
s  = p  

What if Bob is told s with pG? Still pretty useless since Alice just gave away p.

Well, the only other tool we have is adding G’s so let’s throw it in there and see what happens. Again, Bob gets the top public equation and just the left-hand side, s, of the bottom private equation.

sG = (p + n)G
^    ^    
|    |
|    |
s  = p + n  

Add G to itself n more times.

This appears better since Alice doesn’t immediately tell Bob the secret key p. However, does she tell Bob anything? If she gives him just s and sG and says, “Hey, by the way this proves I have p” he will just ask “How?”. If she then tells him “It is because I added the random number n to my key!”, Bob can then go verify that sG = (p+n)G, but the first step would be to calculate p = s - n, giving away p again. Back to the drawing board.

sG = pG + nG
^    ^    ^   
|    |    |
|    |    |
s  = p  + n  

Add G n more times to itself, but also give Bob nG along with s.

Ok, this is looking interesting. So Bob already knows pG, the public key Alice is trying to prove she owns. Alice can calculate s because she knows p and n. She can safely give s to Bob because he can’t calculate n from nG due to the discrete log problem. Bob can verify that sG = pG + nG and that s is the private key for sG. All the while p remains hidden! But…does this prove ownership of p?

Let’s say Mallory, not the owner of p, is trying to trick Bob into thinking she does know p. She could choose any random number for s, derive sG from it, and then calculate nG = sG - pG since pG is public information. So n is doing a good job of protecting p from being shared for Alice, but Bob has no reason to trust it. The issue is that nG can be manipulated by an attacker. They can lie! And it might be hard to see that they did since the discrete log problem is masking the source.

So what if after Alice proposes a nG, Bob gets to add some spice to the equations so that nG can’t be calculated from a randomly chosen s, since it’s already “locked in” (Bob knows it already). Bob can add a challenge scalar to the equation.

sG = cG + pG + nG
^    ^    ^    ^   
|    |    |    |
|    |    |    |
s  = c  + p  + n  

Add a challenge scalar c so that nG cannot be derived from a random s.

The challenge scalar c is part of the private equation since it is a scalar. Having Bob add something to the public equation wouldn’t be too useful since no one would know the key. Since Bob only gave Mallory c after he gets nG, Mallory can no longer easily calculate nG = sG - pG - cG because she would have to have known c beforehand! Right? Hm…

If we take another look at how Mallory tricked Bob last time, nG = sG - pG, we can plug this into the public equation which might help reveal her more general strategy: sG = pG + (sG - pG). It is obvious arithmetic, but the pG’s are cancelling each other out which means that their p counterparts don’t play a role in the private equation. This is great for Mallory since she doesn’t know p.

Back to our new challenge scalar, Mallory can take a similar strategy and make nG = nG - pG. Wait, what. That doesn’t check out at all. So naming the public variable nG was a bad call because it makes an assumption that Alice/Mallory is going to play nice. Maybe nG' = nG - pG is a bit better notation, since it is nG in the happy path, but a bad actor doesn’t need to follow that and can propose any number for nG'. And the point here is that the pG’s can get cancelled out (again): sG = cG + pG + (nG - pG). The private equation then boils down to s = c + n and bad actor Mallory can pretend to know p (again). Funny story, I didn’t realize this case until sipa told me.

So the challenge scalar needs to be applied in a way so that Mallory can’t cancel out the pG. What if c is applied to the public key itself like in the second attempt above? Scaling a scaled point is just scalar multiplication: c*(pG) == (c*p)G. And if c is applied to pG there is no way for Mallory to modify her nG' candidate to cancel out all pG’s in the equation since she won’t know c ahead of time.

sG = c*pG + nG'
^    ^      ^   
|    |      |
|    |      |
s  = c*p  + n  

Move the challenge scalar c to the public key pG so that it cannot be removed from the equation by a bad actor.

This is pretty great, but can be improved! Right now it involves interaction between Alice and Bob. Bob needs to have Alice commit to a nG' candidate before he chooses a random c so that he can trust her. Also, if Carol comes along and wants Alice to prove she owns p, nothing done with Bob can be shared in this proof since Carol has no reason to trust Bob. Fiat and Shamir have something to say about this.

Recall that hashing something gives random data. If Alice commits to a nG' and then hashes it, could that hash be the challenge scalar c? This is theoretically just as random as Bob choosing a number out of a hat. Remember, c is there so that attacker Mallory can’t propose a nG' candidate to cancel out the public key pG. If pG goes from c*pG to hash(nG')*pG, Mallory doesn’t get to cancel out pG for free, she would need to reverse a hash(nG'). She has more time than if Bob interactively gives her a random c and tells her he is only going to wait five minutes, but the maths are stacked against her, it would take a very, very, very long time to reverse a hash. And Bob can trust the hash since it is deterministic and easy to verify based on nG'. Mallory is committing to that nG' since the hash forces her to use it. So with the challenge hash, p is still protected for Alice and Bob can trust nG', just like with a random c from Bob. But now the protocol is non-interactive and can be re-used by any parties: Alice commits to a nG' and proves she has p by sharing nG' and s.

sG = hash(nG')*pG + nG'
^    ^              ^   
|    |              |
|    |              |
s  = hash(nG')*p  + n  

Base the challenge scalar c on nG' so that the protocol becomes non-interactive.

Alice can prove she owns p without revealing p! Goal achieved. Finally, the promised second feature of signatures: provide the ability for Alice to put a digital stamp on some arbitrary message. “I Alice, owner of p, declare…m”. This message, m, is public information which Alice just needs to be able to commit to somehow. Our non-interactive public equation already has a commitment from Alice in it. The hash(nG') “commits” Alice to nG'. She either knew nG' ahead of creating s or she guessed the hash (impossible). We can just add more to that hash commitment.

sG = hash(nG'||m)*pG + nG'
^    ^                 ^   
|    |                 |
|    |                 |
s  = hash(nG'||m)*p  + n  

Concatenate an arbitrary message m to the hash commitment.

Alice commits to nG' in order to prove she has control of p and also commits to m. She either knew m before-hand, or randomly guessed the hash (again, impossible). So the owner of p has committed to m, a signature on that message.