2022.04.10

2022.03

A little bit of signatures

THE LSAT SERIES

Signatures are Dope

The day job is allowing me to really sink into the practical cryptography patterns. I am going to keep using that word “practical” as an excuse for why I only understand them up to a certain point. Once we enter deep-thought-mind-bending math, I just have to trust it for now.

One of the most practical and ever-present cryptography patterns is the digital signature. I am going to walk through the layers of patterns which, combined, make up the digital signature.

Encryption and Hashing

At the bottom (or as deep as I am willing to go today) we have encryption and hashing. Both of these patterns use cryptography to provide some guarantees which are simple to understand, even if how they are being guaranteed is a little on the mind-bending side. These guarantees might seem just neat at first, but when combined, they create tools which are used everywhere today and have endless potential.

Encryption is a mapping of any piece of data and a key to another piece of data: data + key => other_data. But encryption provides some interesting guarantees (backed by the mind-bending maths) about this relationship.

+-------------+   +---------------+
|             |   |               |
| data1+key---+---+-->other_data1 |
|             |   |               |
|             |   |               |
|  data2+key--+---+-->other_data2 |
|             |   |               |
|             |   |               |
| data3+key---+---+-->other_data3 |
|             |   |               |
|             |   |               |
|  data4+key--+---+-->other_data4 |
|             |   |               |
+-------------+   +---------------+

encryption is a one-to-one relationship

  1. It is impossible to calculate the original data from other_data without the key
  2. It is impractical to even try guessing (would take mind-bending amounts of time)
  3. Changing the input data or key just a little results in a completely different output

So encryption is a practical way to scramble data.
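To make data + key => other_data concrete, here is a minimal toy sketch using only the Python standard library. The names `keystream`, `toy_encrypt`, and `toy_decrypt` are made up for this post, and the construction (XOR against bytes stretched from the key with SHA-256) is an illustration only. It does not give the guarantees listed above to the standard of a real cipher, so real code should use a vetted library instead.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Stretch the key into `length` pseudo-random bytes (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """data + key => other_data, by XOR-ing the data with the keystream."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

# XOR is its own inverse, so decrypting is the same operation with the same key.
toy_decrypt = toy_encrypt

other_data = toy_encrypt(b"attack at dawn", b"key-one")
print(other_data.hex())                     # scrambled bytes
print(toy_decrypt(other_data, b"key-one"))  # the original data comes back
print(toy_decrypt(other_data, b"key-two"))  # wrong key: still scrambled
```

With the right key the original bytes come back; with any other key the output stays scrambled.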

Hashing is a mapping of any piece of data to a much smaller (fixed-size, really tiny) piece of data: data => small_data. Hashing has similar guarantees to encryption, with a twist of its own.

+----------+     +----------------+
|          |     |                |
|  data1---+--+--+-->small_data1  |
|          |  |  |                |
|          |  |  |                |
|  data2---+--+  |                |
|          |     |                |
|          |     |                |
|   data3--+-----+-->small_data2  |
|          |     |                |
|          |     |                |
| data4----+-----+--->small_data3 |
|          |     |                |
+----------+     +----------------+

hashing is a many-to-one relationship

  1. Like encryption, it’s impossible to calculate data from small_data
  2. Also impractical to even try guessing
  3. Changing the input data just a little also results in a completely different output

What is the twist? That many-to-one relation implies that different input data can map to the same output. A collision! But, and we have to trust the mind-bending maths again here, the chances of a collision are just so, so, so, SO small…we just don’t worry about it.

So hashing is a practical way to identify data.
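A quick way to see data => small_data in action is Python's standard `hashlib` module. SHA-256 is just one hash function picked for this sketch, and the inputs are made up.

```python
import hashlib

# Two inputs that differ by a single character.
small_1 = hashlib.sha256(b"signatures are dope").hexdigest()
small_2 = hashlib.sha256(b"signatures are dope!").hexdigest()

# A much larger input still maps to the same fixed-size output.
big = hashlib.sha256(b"x" * 1_000_000).hexdigest()

print(small_1)
print(small_2)                  # looks nothing like small_1
print(len(small_1), len(big))   # 64 64 (hex characters, i.e. 32 bytes each)
```

The million-byte input and the short one both come out as 32 bytes, and the one-character change produces an unrelated-looking identifier.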

Symmetric and Asymmetric Encryption

Encryption comes in two forms: symmetric and asymmetric. These have to do with the type of key used in the mapping.

Symmetric is when the same key is used to encrypt and decrypt the data. This is straightforward to understand, but limits the use cases. One of the obvious use cases of encryption is to transfer data without others being able to read it. The data is safe to transfer since it’s scrambled, but how do you transfer the key?

Enter asymmetric encryption. Asymmetric encryption uses more of that mind-bending maths to create two keys which have a nifty relationship: data encrypted with one can only be decrypted with the other. The “how do you transfer the key?” question can now be answered:

  1. the data recipient (let’s call her Alice) sends her first key (let’s call it her public key) to the data sender (let’s call him Bob)
  2. Bob encrypts the data with Alice’s public key and sends her the encrypted data
  3. Alice decrypts the data using the second key in her key pair (let’s call that her private key)
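The three steps above can be sketched with textbook RSA and tiny numbers. This is an assumption made for illustration: the post does not name an algorithm, the primes are far too small to be safe, and real RSA also adds padding. The helper `apply_key` is a made-up name.

```python
# Alice builds her key pair (tiny primes, illustration only).
p, q = 61, 53
n = p * q                    # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent: modular inverse (Python 3.8+)

alice_public = (e, n)        # step 1: Alice sends this to Bob
alice_private = (d, n)       # never leaves Alice

def apply_key(number: int, key: tuple) -> int:
    """Raise the number to the key's exponent, modulo n."""
    exponent, modulus = key
    return pow(number, exponent, modulus)

# Step 2: Bob encrypts a (numeric) message with Alice's public key.
message = 65
encrypted = apply_key(message, alice_public)

# Step 3: Alice decrypts with her private key.
decrypted = apply_key(encrypted, alice_private)
print(encrypted, decrypted)
```

Only the public key ever crosses the wire, which is what answers the key-transfer question.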

Signatures

Now the top-level pattern: digital signatures. What is the use case? Data shooting across the internet goes through a lot of intermediate servers and routers which are not controlled by the data sender or data receiver. It is possible that the intermediate servers could tamper with some data or pretend to be a sender. A data receiver wants to verify two things: who sent the data, and what data they sent. The data needs a signature.

The signature pattern begins with a small chunk of data sent alongside the original data: a message authentication code (MAC). How does this chunk of data allow a receiver to verify who and what?

  1. The data sender uses hashing to create a small identifier for the data being sent
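  2. The data sender encrypts the small identifier (this is the idea behind a hash-based MAC, or HMAC, although a real HMAC mixes the key into the hashing step instead of encrypting the result)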
  3. The data sender sends the HMAC along with the data to the data receiver
  4. The data receiver decrypts the HMAC and hashes the data
  5. If the two values match they know who sent the data and that it hasn’t been tampered with

If the sender and receiver are the same person (e.g. sending data to a different storage location), symmetric encryption can be used since the key never needs to be sent. Asymmetric encryption is used if the sender and receiver are different people: the sender encrypts the identifier with their private key, and the receiver decrypts it with the sender’s public key.
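For the symmetric case, Python ships an `hmac` module in the standard library. One caveat with this sketch: a real HMAC folds the key into the hashing step rather than encrypting the hash afterwards, so there is no separate decrypt step. The receiver recomputes the tag with the same key and compares. The key, the message, and the helper names `make_tag` and `check_tag` are made up for illustration.

```python
import hashlib
import hmac

secret_key = b"shared-secret"   # hypothetical key known to sender and receiver

def make_tag(data: bytes, key: bytes) -> bytes:
    """Sender side: compute the MAC for the data."""
    return hmac.new(key, data, hashlib.sha256).digest()

def check_tag(data: bytes, tag: bytes, key: bytes) -> bool:
    """Receiver side: recompute the MAC and compare in constant time."""
    return hmac.compare_digest(make_tag(data, key), tag)

data = b"pay Bob 10 coins"
tag = make_tag(data, secret_key)

print(check_tag(data, tag, secret_key))                 # True
print(check_tag(b"pay Bob 99 coins", tag, secret_key))  # False: data was tampered with
print(check_tag(data, tag, b"some-other-key"))          # False: wrong sender key
```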

+------+
| Data |
+--+---+
   |
   | Hash
   |
   v
 +----+          +---------+
 | ID +--------->|Signature|
 +----+          +---------+
        Encrypt

generate HMAC signature

+------+    +---------+
| Data |    |Signature|
+--+---+    +----+----+
   |             |
   | Hash        | Decrypt
   |             |
   v             v
 +----+        +----+
 | ID |   ==   | ID |
 +----+        +----+

validate data with signature
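The two diagrams can be sketched for the asymmetric case with the same kind of toy textbook RSA numbers, again as an illustration only. The key is tiny, the hash is squeezed to fit it with a modulo (real signature schemes use proper padding and much larger keys), and `small_id`, `sign`, and `verify` are made-up names.

```python
import hashlib

# Sender's toy key pair (tiny primes, illustration only).
p, q = 61, 53
n = p * q
e = 17                              # public exponent, known to everyone
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent, known only to the sender

def small_id(data: bytes) -> int:
    """Data -> Hash -> ID, reduced mod n so it fits the toy key."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def sign(data: bytes) -> int:
    """ID -> Encrypt (with the private key) -> Signature."""
    return pow(small_id(data), d, n)

def verify(data: bytes, signature: int) -> bool:
    """Decrypt the signature with the public key and compare the two IDs."""
    return pow(signature, e, n) == small_id(data)

data = b"pay Bob 10 coins"
signature = sign(data)

print(verify(data, signature))                  # True
print(verify(data, (signature + 1) % n))        # False: signature was altered
# Almost certainly False; the toy modulus is so small that collisions are possible.
print(verify(b"pay Bob 99 coins", signature))
```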