2024.02 Vol.2

Bit of taproot primitives

The taproot BIPs added Schnorr signatures to some parts of bitcoin, moving away from the unnecessary complex ECDSA. Since signature schemes can have a little wiggle room, BIP-340 looked to standardize Schnorr as much as possible for use in the bitcoin ecosystem. These were first introduced in SegWit v1 outputs, a.k.a. P2TR, in BIP-341.

There are a lot of layered standards and patterns, a bit of a web below, but the underlying algebra techniques are used over and over again.

Linearity

Compare the Schnorr algorithm to ECDSA and you will notice a big difference (kinda obnoxious, but the common letters are all mixed up between them so just being very explicit here).

s = hash(R||P||m)*d + k

Schnorr…the k is the private nonce, d is the private key, and the hash commitment contains the public nonce R, the public key P (Key Prefixed scheme), and the message m.

s = (hash(m) + r*d)*k^-1  

ECDSA…the k is the private nonce, d is the private key, and r is the public x point of the nonce.

The schemes accomplish the same general goal (ignore the key prefixed part in Schnorr for now, more on that below) ECDSA is a little different with its public nonce usage. The k^-1 in ECDSA was probably introduced to differentiate the algorithm enough to get around patents, but it also adds a whole ton of complexity. All the operations in the Schnorr scheme are linear whereas the k^-1 is very much not, that is modular arithmetic division at the scalar level, a finite field. Division (a.k.a. the inverse of multiplication) is different operation than addition and a bit of doozy down there. It requires a modular multiplicative inverse where an exponent calculation is required. So a ton more CPU cycles to calculate and verify an ECDSA signature compared to Schnorr’s relatively light linear calculations.

Key Aggregation

The power of linearity becomes more apparent when looking at key aggregation. The goal of key aggregation is for multiple parties to sign the same message m with a shared public key, while all parties maintain the secrecy of their personal private key. So a signed message and a public key end up in the wild, but it is impossible to see the parties involved.

A Schnorr signature, (R,s), is verified by calculating sG from s and checking that against the public equation for sG = e*P + R where e is the hash challenge.

Disregarding some attack vectors for now, if two users want to sign the same message, they can add their public keys (key aggregation) to get a shared public key (EC point addition) P_agg = P_1 + P_2. They then need to agree on a shared nonce R_agg = R_1 + R_2, but only have to share the public values with each other. They can then each compute their part of a signature for the message.

// Constant for both signatures.
e = hash(R_agg||P_agg||m)

// Each party's signature part.
s_1 = e*d_1 + k_1
s_2 = e*d_2 + k_2

// Signature aggreagation.
s_agg = s_1 + s_2
s_agg = (e*d_1 + k_1) + (e*d_2 + k_2)
s_agg = e*d_1 + e*d_2 + k_1 + k_2
s_agg = e*(d_1 + d_2) + k_agg
s_agg = e*d_agg + k_agg

// Scale up with G.
s_aggG = e*P_agg + R_agg

Sweet, sweet linearity. Can see that s_1 + s_2 = s_agg in Schnorr-land.

Since the signing scheme is linear, the two signature “parts” can be added together to get the shared signature, none of the secret signer elements are exposed.

Going to sink into this a bit since gaining an intuition here makes it easier to see why it doesn’t work so well for ECDSA. The first thing to hammer home is how the Schnorr signing scheme is just a bunch of modular addition. Even the hash element e is just adding the private key d to itself a bunch of times since it is scalar multiplication. When we add the public keys together, elliptic curve point addition, that is also just addition p_1G + p_2G = (p_1 + p_2)G ==> p_1 + p_2 in the private scalar equation.

We are only adding things and then we add more things. Kinda feels right how it works out and is easy to calculate signature parts separately. The key is how we can factor out the constant e and easily see the aggregated private keys and nonces.

So ECDSA, what is going on there? The ECDSA signature is (r,s) and has a similar verification process. However, that division by the signer variable k in the signature scheme really throws a wrench in things. Does s_1 + s_2 = s_agg still hold?

(e + r_agg*d_1)/k_1 + (e + r_agg*d_2)/k_2 != (e + r_agg*(d_1 + d_2))/(k_1 + k_2)

No.

You can still add two public keys together to get a combined public key, but the signature with that key is not going to equal the sum of partial signatures in ECDSA-land. It is possible to perform 2+ party key aggregation in ECDSA, but it requires multiplying keys, not adding them. It gets complicated fast, not to mention costly on the CPU cycles.

Tweaks

Another form of key aggregation is tweaking a key.

A user chooses a new private key t where 0 < t < SECP256K1_ORDER. That private key has a public key T = t*G. A user’s original private key x can be tweaked by t, x' = x + t. The public key can be tweaked with the “tweak point” T, P' = P + T.

The benefit of tweaks is that an observer cannot distinguish between a tweaked and untweaked public key. This is used in hierarchical deterministic wallets defined in BIP32.

A tweak can also be used as a form of a commitment, but some care is needed to make sure this is secure. A “private key” tweak point c can be created by hashing whatever needs to be committed to m, producing a 32 byte scalar c = sha_256(m). The tweak point commitment tweaks the public key Q = P + cG. The public tweak’d key can be created by either party since c is also public. Bob wants to pay Alice, owner of key P, but with an extra commitment c. He can pay to Q instead of P. Alice is able to unlock Q since she knows the commitment as well, and this also shows that she “agreed” to it. The commitment is not revealed to outside observers (unless required to maybe settle a dispute).

But using the simple Q = P + cG scheme straight can be insecure. It allows re-use of a commitment point Q, with different public keys. Ideally, there is a 1:1 commitment to commitment point.

// Orignal commitement point based on private key x and commitement c.
Q = xG + cG

// A new private key based on the old commitment and the new one c'.
x' = x + c - c'

// c' is removed from the equation by manipulating x'.
Q = x'G + c'G
Q = (x + c - c')G + c'G
Q = xG + cG

Calculating the secret of P_2 off of a previous commitment point. (Q,P) and (Q,P_2) are both valid, but are committing to different commitments.

It is easy to protect against this, just commit to the public key in the commitment, similar to strategies in signature schemes Q = xG + hash(xG||c)G. The commitment can no longer be manipulated out of the scheme since xG appears inside and outside the hash.

Batch Validation

CPU cycles can be saved by leveraging linearity again and factoring out the expensive operations and saving them for last.

Key Prefixed

BIP-340 specifies that signatures are key prefixed, which means the public key is added to the challenge hash.

sG = hash(R||P||m)*P + R
^    ^                 ^   
|    |                 |
|    |                 |
s  = hash(R||P||m)*p + r    

The public key P is committed to in the challenge hash.

The explicit commitment protects against “related-key attacks”, looks similar to committing to a key in a tweak.

Tagged Hashes

Tagged hashes are kinda like a namespace for the hash context. It helps protect from a few attacks and accidental leaking of private information if inputs are re-used. BIP-340 calls for prefixing hash data with SHA256(tag) || SHA256(tag), so tagged_hash("TagName", data) = sha256(sha256("TagName") + sha256("TagName") + data). These are used throughout the taproot signature scheme, each context with its own tag value.

Nonce Derivation

Nonce derivation can technically be any random number, but it should be random for every signature. Re-using a nonce can leak a private key.

// Two signatures over different messages, but using the same nonce k.
s1 = k + H(R||P||m1)*d
s2 = k + H(R||P||m2)*d

// Subtract the signatures from each other.
s2 - s1 = (H(R||P||m2) - H(R||P||m1))*d

// Solving for the private key d removes k, all that is left is public info.
d = (s2 - s1) / (H(R||P||m2) - H(R||P||m1)

Don’t reuse the same nonce.

On the other hand, in some contexts it can be a little jarring if a user signs a message twice and the signature is different. Both valid, but committing to different nonces. A signature algorithm could specify a deterministic nonce calculation. BIP340 doesn’t do that, calling for some random data when signing to help create a non-derterminitic nonce. It does mention that a determinitic function could be specified to create that random data, like RFC6979 (deterministic ECDSA). But care has to be used to not use the same nonce and secret key multiple times or one can accidentally leak private material.

NUMS

“Nothing Up My Sleeve” numbers in cryptography are special numbers which are somehow above suspicion if used in an algorithm. Sometimes a scheme requires a sort of start point and the author wants to use a number which others can assume isn’t setting up a backdoor. BIP341 has a recommendation if you need one in taproot scripts.