2024.01 Vol.4

Bit of elliptic curve cryptography

The maths behind Elliptical Curve Cryptography (ECC) involves two layers of abstract algebra. Each layer brings its own piece of the puzzle required for practical cryptography. I won’t be going into to rigorous proofs here, but will focus on building a mental model about how these layers interact and create something valuable.

The subject of abstract algebra appears to be wide and diverse. I don’t know the maths well enough to tell you where it ends and where other subjects begin. But, I think I have a good enough handle on the parts which ECC relies on. And it ain’t impossible to wrap your head around.

Not surprisingly, abstract algebra is about defining abstractions. Similar to abstractions in a program, it allows us to make some assumptions of a black box without needing to know exactly how the box is implemented. And the key is that we get to define these abstractions in whatever way is most helpful. In computer programs we do this all the time, but when it comes to math, this part was a bit of a “there is no spoon” experience for me. Some math abstractions are so well known we don’t really think of them as abstractions, but rather some hard rules-of-reality. We will get more into that in a bit.

The building blocks of algebraic abstractions are set theory and group theory. Set theory comes naturally to programmers since we use sets all the time. A set is a “well-defined” collection of objects which contains no duplicates. Anything can be a set, but there are a handful which everyone is already familiar with, like natural numbers ({1, 2, 3, ...}) or the set of all integers (negative, zero, positive). Again though, anything can be a set, up to us to define them.

Group theory extends on set theory. It introduces the operator abstraction. An operator, in the most general terms, is a function which takes elements of a set(s) and outputs elements of a set. A binary operator is a restricted operator which takes just two elements from the same set and outputs a single element in that same set (A op A => A). Restricting the inputs and output to the same set makes it what we call a closed operation. Restricting our abstractions like this allows us to make assumptions later on, so look out for them on this journey. Continuing on, we can bundle a set with a binary operator, but we call that bundle different things depending on characteristics of that operation. Any set and a binary operator is called a magma. If that binary operator is also associative (A op (B op C) = (A op B) op C, another restriction!) it is upgraded from magma to a semigroup.

Another building block in group theory is the identity element. A binary operation on the identity element and any other element just outputs that any element. That is a word salad, but we actually have a good intuition for what it means: 8 + 0 = 8. This example is lookin a little ahead though, so for now just know that if a set has an identity element it is bumped from semigroup to monoid (yea, that monoid for all the computer science nerds). Identity elements allow for another possible restriction on the set: every element in the set has an inverse. An inverse element is one where the binary operator of an element and its inverse outputs the identity element. An example which again probably looks familiar 8 + (-8) = 0. If a monoid meets the inverse requirement it is now a group!

As these examples have hinted at, almost everyone in the world is very aware of one example of a group: “addition across the set of real numbers”. But this is almost too good of an example, because like I mentioned before, it is so baked into our minds it’s hard to think of it as an abstraction. But the addition operation we have defined across real numbers meets all the above requirements. The tough part now is making the “there is no spoon” leap: this is only one definition of a binary operator across one definition of a set. We have endless other possibilities!

A group with a second binary operator might qualify as another level of an abstraction, but these levels require ever more restrictions on the operators. If under the first operator the group is an abelian group (a.k.a. communitive A op B = B op A), and under the second operator it is a monoid, and if the second operator distributes over the first ((a op1 b) op2 c = (a op2 c) op1 (b op2 c)) then the group is a ring. If we tweak those requirements so that under the first operator it is an abelian group and under the second, excluding the zero element, it is also an abelian group it is now a field. That “excluding the zero element” sounds strangely specific and that is because it is, but we make the rules! I might dig into the whys of that clause later, but for now, we just have to trust the math wizards before us on this one.

If a set does not have infinite elements it is called finite. A curious thing of infinite and finite sets is that the binary operation restrictions above don’t really care one way or the other, they act the same. This is a very important trait for cryptography because algorithms are usually a little more intuitive in our heads across an infinite set (e.g. real numbers), but our computer friends generally appreciate implementations across a finite set. However, we can describe an algorithm in one set, implement it over another, and it remains valid. It leverages the abstractions.

Groups can possibly have another interesting characteristic: cyclic. This is where if you start at a certain element in the group’s set (the generator element) and then repeatably apply the group’s binary operator, you can output every element in the set. For example, integers over addition is considered cyclic since you can start at 0 and then add 1 forever. Cyclic can apply to finite groups too.

Ok, so how do these algebraic terms apply to ECC? We gotta change gears and talk about that “curve” part.

Elliptic curves are a fairly standard line, they have the equation of y^2 = x^3 + ax + b where each (a, b) constant pair is a new line. The elliptic curve lines have distinctive shapes, like how they are mirror’d over the x-axis due to that y^2 part.

If we went ahead and chose a a and b, like I dunno, what about a = 0 and b = 7, well then we are effectively describing a set. This set contains points, (x, y) coordinates, on the line. Remember, sets can be anything, they are not limited to numbers and integers. And in this case, they are points on a line.

We will get to the “But why?” in a second, but we can define a binary operation over this set of points to form a group. And while it is commonly called “addition” because it has the same interface as the well known addition operation over real numbers, remember this can be any operation we define, and in this case it is over the set of points on an elliptic curve instead of real numbers.

I am not going to go through how all the boxes are checked right now, but we define addition over elliptic curve points as a closed binary operator. For two points, there is always a third point unless one of the points is the “infinity” point (maybe better described as the “no-where” point), which is the identity element of the set. It is easy to describe the equation in geometric terms: you draw the line between any two points until it intersects the curve for the third point. You then “flip” the point over the x-axis (this is mirror’d due to the y^2). Why flip it? This is one of those “well, we make the rules” things where we need to flip it in order for the operation to meet the binary operator requirements. So we flip it.

The equation for the binary operator to get the third point is straight forward algebra, we find the slope of the line between the first point (x_1, y_1) and the second (x_2, y_2) and can boil down the functions.

s = (y_2 - y_1)/(x_2 - x_1)
x_3 = s^2 - x_1 - x_2
y_3 = s(x_1 - x_3) - y_1

Elliptic curve point addition functions.

There is a special (and very important) case when adding a point to itself. In this scenario, the slope of the line between the two inputs is the “tangent” line.

s = (3 * x_1^2 + a)/(2 * y_1)
x_3 = s^2 - (2 * x_1)
y_3 = s(x_1 - x_3) - y_1

Add a point added to itself.

Cool, so we have a group. What do we do with this thing? Well, a funny property of point addition over the elliptic curve is that it is nonlinear. That means that there is no pattern to where the third output point is based on the two input points. Compare that to addition over real numbers where we generally have a good idea where the output will be based on the inputs. If we add a point to itself five times, A + A + A + A + A = 5A = A', this starts to look a bit like what we normally call multiplication. Here though, it is “scalar multiplication” because we are not multiplying two points, just addition, but it is an easier way for us to represent the addition. If we calculate nA = A' where n = 5 are we able to reverse the calculation and find n if we are instead given A and A'? We could subtract A over and over again until we get to A. But given the nonlinear nature, we wouldn’t know at any given point how close we are to finding n. This is the crux of the discrete log problem which practical cryptography is based on. This “trapdoor” operation is relatively easy to calculate, but hard to reverse. The discrete log problem has “log” in its name due to it being associated with multiplication, but it might be described more generally as “how many times did an operation happen?”.

Elliptic point addition is the “top level” algebraic group (1 binary operation) in ECC and importantly is where the discrete log problem is defined. If you look at the equations to define the third point above, x and y are in the real number field (2 binary operations) and using the good ol’ normal addition and multiplication operations. This is nice cause we are so used to this field and can easily visualize the geometry. But real numbers don’t play nice with limited computer physics, so not super practical in that sense. Here is the thing though, we can keep that nifty discrete log problem property of elliptic point addition, but build it on top of a different field which satisfies the two operation requirements. Abstract algebra has our back and ensures that the property will hold no matter the field.

What field do we want x and y to be a part of? A finite one for usability and preferably one that computers readily understand.

To start, we can define a set to be every integer less than some p. So if p = 5, the set would be {0, 1, 2, 3, 4}. This is obviously finite. Now we have to define the two binary operations necessary for this set to be a field. Once those are in place we can swap this field with any other field. The two binary operations are commonly referred to as “additive” and “multiplicative”, and with good reason, our standard addition and multiplication operations across real numbers satisfies them, but these can be any binary operation which satisfy the requirements described above. Just hammering home that these operations will not be the addition and multiplication we are used to, but the algebra point equations above that operate on the field will still work. They will just use the new additive and multiplicative binary operations.

Enter modular arithmetic. Any programmer knows of the modulo % operator, we use it all the time. Well it turns out that we can define some binary operations which use the same sorta thinking. The modulo operator gives us the “remainder” part of a division operation.

A = B * Q + R where 0 <= R < B
A % B = R  

The quotient remainder theorem and the standard modulo operator.

We use the modulo operator often in programs when we want to take an integer of any amount and bucket it into one of p buckets in a deterministic fashion. One could see how this might be useful on a finite set.

A +op B = (A + B) % p  
A *op B = (A * B) % p

The +op and *op binary operators defined with modular arithmetic

Bit of a sidebar, but one characteristic to note when programming these modular arithmetic operations is that the % can be applied whenever. The mental model is to picture a clock which is usually mod 12. If you add 100 hours while the hand is on 3, you don’t end up at 103, you keep wrapping around. And it doesn’t matter if you “slice” off the extra on every wrap or wait till the end, you will end up at the same spot: 7. You can take a modulo of a negative number too, you just head in the other direction on the clock (counter-clockwise). Notice how this means the modulo output will always be positive. There is a little programming gotcha though, the modulo operator is not the same for all languages. This clock-based model is called “Euclidean”, another approach is often called “remainder” where it is possible for the output to also be negative. Programming languages like Rust added an explicit function, rem_euclid, to help differentiate.

Another extremely useful characteristic of these operations is that they nicely loop back around “the clock” instead of going off into infinity, so the field is cyclic. One can begin to see why this field is useful for computers. It only involves integers, none of that non-determinitic floating stuff, and everything is nicely contained in a “box” whose size is determined by whatever modulo is used by the set.

It must be noted that for these binary operations to hold though, the finite set needs to be modulo a prime number (will dig into that later). But what do we have now? Well, we can choose a finite field with a set large enough that is impossible to break the discrete log problem with brute force, but small enough that the operations can easily be handled by computers. The number 256 sounds pretty good! 256 bits, 32 bytes, is a digestible amount for computers. And the number of elements in the set would be 2^256, which is more than there are atoms in the Milky Way, so a tough nut to crack. So we want a prime number close to 2^256…

We are finally arriving at bitcoin’s secp256k1! 256? You don’t say…As the number kinda alludes to, secp256k1 is more than just a curve. It is the two abstract algebra layers, the modular arithmetic field and the elliptic curve with point addition group, plus some agree’d upon constants. We all have to operate in the same field and curve for this to be useful, so we choose a prime for the field and a a and b for the curve. Now that we have the stage all set, we can build some cool stuff like signatures on top.