2022.03
A little bit of signatures
THE LSAT SERIES
Signatures are Dope
The day job is allowing me to really sink into the practical cryptography patterns. I am going to keep using that word “practical” as an excuse for why I only understand them up to a certain point. Once we enter deep-thought, mind-bending math, I just have to trust it for now.
One of the most practical and ever-present cryptography patterns is the digital signature. I am going to walk through the layers of patterns which, combined, make up the digital signature.
Encryption and Hashing
At the bottom (or as deep as I am willing to go today) we have encryption and hashing. Both of these patterns use cryptography to provide some guarantees which are simple to understand, even if how they are being guaranteed is a little on the mind-bending side. These guarantees might seem just neat at first, but when combined they create tools which are used everywhere today and have endless potential.
Encryption is a mapping of any piece of data and a key to another piece of data: data + key => other_data. But encryption provides some interesting guarantees (backed by the mind-bending maths) about this relationship.
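+-------------+        +---------------+
|             |        |               |
|  data1+key  +------->|  other_data1  |
|             |        |               |
|  data2+key  +------->|  other_data2  |
|             |        |               |
|  data3+key  +------->|  other_data3  |
|             |        |               |
|  data4+key  +------->|  other_data4  |
|             |        |               |
+-------------+        +---------------+
encryption is a one-to-one relationship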
- It is practically impossible to calculate the original data from other_data without the key
- It is impractical to even try guessing the key (would take mind-bending amounts of time)
- Changing the input data or key just a little results in a completely different output
So encryption is a practical way to scramble data.
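To make data + key => other_data a bit more concrete, here is a minimal Python sketch. Python's standard library does not ship a general-purpose cipher, so this swaps in a toy XOR one-time pad, which is my own assumption for illustration and not what real systems reach for day to day. The names xor_bytes, data, key and other_data are made up for this example. It only shows the mapping and that the same key reverses it; it does not show the "small change, completely different output" guarantee, which takes a real cipher.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every data byte with the matching key byte (toy one-time pad)."""
    if len(key) != len(data):
        raise ValueError("a one-time pad key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


data = b"attack at dawn"
key = secrets.token_bytes(len(data))     # random key, meant to be used only once

other_data = xor_bytes(data, key)        # encrypt: data + key => other_data
recovered = xor_bytes(other_data, key)   # decrypt: the same key undoes the mapping
```

Running the same function twice with the same key gets the original data back, which is the one-to-one relationship from the diagram.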
Hashing is a mapping of any piece of data to a much smaller (fixed-size, really tiny) piece of data: data => small_data. Hashing has similar guarantees to encryption, with a twist of its own.
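+---------+        +---------------+
|         |        |               |
|  data1  +---+--->|  small_data1  |
|         |   |    |               |
|  data2  +---+    |               |
|         |        |               |
|  data3  +------->|  small_data2  |
|         |        |               |
|  data4  +------->|  small_data3  |
|         |        |               |
+---------+        +---------------+
hashing is a many-to-one relationship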
- Like encryption, it’s practically impossible to calculate data from small_data
- Also impractical to even try guessing
- Changing the input data just a little also results in a completely different output
What is the twist? That many-to-one relation implies that different input data can map to the same output. A collision! But, and we have to trust the mind-bending maths again here, the chances of a collision are just so, so, so, SO small…we just don’t worry about it.
So hashing is a practical way to identify data.
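Here is a small Python sketch of data => small_data using SHA-256 from the standard library's hashlib module. SHA-256 is my pick for the example, not something this post depends on, and the helper name identify is hypothetical.

```python
import hashlib


def identify(data: bytes) -> str:
    """Map data of any size to a fixed-size identifier (SHA-256, hex encoded)."""
    return hashlib.sha256(data).hexdigest()


small = identify(b"hello world")
big = identify(b"hello world" * 1000)    # much larger input, same size output
tweaked = identify(b"hello worle")       # one character changed

print(len(small), len(big))   # both identifiers are 64 hex characters
print(small == tweaked)       # a tiny change gives a different identifier
```

The same input always gives the same identifier, which is what makes it useful for identifying data.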
Symmetric and Asymmetric Encryption
Encryption comes in two forms: symmetric and asymmetric. These have to do with the type of key used in the mapping.
Symmetric is when the same key is used to encrypt and decrypt the data. This is straightforward to understand, but limits the use cases. One of the obvious use cases of encryption is to transfer data without others being able to read it. The data is safe to transfer since it's scrambled, but how do you transfer the key?
Enter asymmetric encryption. Asymmetric encryption uses more of that mind-bending maths to create two keys which have a nifty relationship: data encrypted with one can only be decrypted with the other. The “how do you transfer the key?” question can now be answered:
- the data recipient (let’s call her Alice) sends her first key (let’s call it her public key) to the data sender (let’s call him Bob)
- Bob encrypts the data with Alice’s public key and sends her the encrypted data
- Alice decrypts the data using the second key in her key pair (let’s call that her private key)
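The Alice and Bob exchange above can be sketched with textbook RSA using nothing but Python integers. This is a toy under loud assumptions: the primes, the exponent 65537, and the names apply_key, alice_public_key and alice_private_key are all mine, and the lack of padding plus the small key size make it completely unsafe for real use. It is only here to show the "encrypt with one key, decrypt with the other" relationship.

```python
# Toy textbook RSA: illustration only, NOT secure (small primes, no padding).
P = 2**61 - 1                        # a known prime (assumed for the example)
Q = 2**31 - 1                        # another known prime
N = P * Q                            # modulus shared by both keys
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)

alice_public_key = (E, N)    # Alice sends this one to Bob
alice_private_key = (D, N)   # Alice keeps this one to herself


def apply_key(number: int, key: tuple) -> int:
    """Raise number to the key's exponent, modulo the key's modulus."""
    exponent, modulus = key
    return pow(number, exponent, modulus)


# Bob: encrypt with Alice's public key.
message = b"hi alice"
m = int.from_bytes(message, "big")
assert m < N, "the toy modulus only fits very short messages"
encrypted = apply_key(m, alice_public_key)

# Alice: decrypt with her private key.
decrypted = apply_key(encrypted, alice_private_key).to_bytes(len(message), "big")
```

Only the public key ever travels over the wire, so the "how do you transfer the key?" problem goes away.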
Signatures
Now the top-level pattern: digital signatures. What is the use case? Data shooting across the internet goes through a lot of intermediate servers and routers which are not controlled by the data sender or data receiver. It is possible that the intermediate servers could tamper with some data or pretend to be a sender. A data receiver wants to verify who sent the data and what data they sent. The data needs a signature.
The signature pattern begins with a small chunk of data sent alongside the original data: a message authentication code (MAC). How does this chunk of data allow a receiver to verify who and what?
- The data sender uses hashing to create a small identifier for the data being sent
- The data sender encrypts the small identifier with a key (this is the idea behind a hash-based MAC, or HMAC, although a real HMAC mixes the key into the hashing step instead of encrypting the hash afterwards)
- The data sender sends the HMAC along with the data to the data receiver
- The data receiver decrypts the HMAC and hashes the data
- If the two values match, they know who sent the data and that it hasn’t been tampered with
If the sender and receiver are the same person (e.g. maybe sending data to a different storage), symmetric encryption can be used since the key never needs to be sent. Asymmetric encryption is used if the sender and receiver are different people: the sender encrypts the identifier with their private key, and the receiver decrypts it with the sender’s public key.
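For the symmetric case, Python's standard library has an hmac module, so the steps above can be sketched without any toy math. One difference from the mental model: hmac folds the key into the hash in a single step, so the receiver recomputes the MAC and compares rather than decrypting anything. The helper names make_mac and verify_mac, the SHA-256 choice, and the sample message are assumptions for this example.

```python
import hashlib
import hmac
import secrets


def make_mac(key: bytes, data: bytes) -> bytes:
    """Sender: compute a keyed hash (HMAC-SHA256) over the data."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify_mac(key: bytes, data: bytes, mac: bytes) -> bool:
    """Receiver: recompute the MAC and compare it in constant time."""
    return hmac.compare_digest(make_mac(key, data), mac)


key = secrets.token_bytes(32)            # secret known to sender and receiver
data = b"transfer 10 coins to bob"
mac = make_mac(key, data)                # sent along with the data

ok = verify_mac(key, data, mac)                                  # untouched data
tampered_ok = verify_mac(key, b"transfer 99 coins to bob", mac)  # altered data
```

Here ok comes back True and tampered_ok comes back False: anyone without the key cannot produce a matching MAC for altered data.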
+------+
| Data |
+--+---+
   |
   | Hash
   |
   v
+------+           +-----------+
|  ID  +---------->| Signature |
+------+  Encrypt  +-----------+
generate HMAC signature
+------+     +-----------+
| Data |     | Signature |
+--+---+     +-----+-----+
   |               |
   | Hash          | Decrypt
   |               |
   v               v
+------+        +------+
|  ID  |   ==   |  ID  |
+------+        +------+
validate data with signature
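The generate and validate flows can be sketched for the asymmetric case with the same kind of toy textbook RSA numbers. Everything here is an assumption for illustration: the primes, the exponent, the function names make_id, sign and validate, and the trick of shrinking the SHA-256 hash to fit the toy modulus. Real signature schemes use much larger keys and proper padding, so treat this as a picture of the pattern, not an implementation.

```python
import hashlib

# Toy textbook RSA key pair: illustration only, NOT secure.
P, Q = 2**61 - 1, 2**31 - 1          # two known primes (assumed for the example)
N = P * Q
E = 65537                            # public exponent (used to validate)
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (used to sign)


def make_id(data: bytes) -> int:
    """Hash the data, then squeeze the hash into the toy modulus range."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") % N


def sign(data: bytes) -> int:
    """Sender: hash the data, then 'encrypt' the ID with the private key."""
    return pow(make_id(data), D, N)


def validate(data: bytes, signature: int) -> bool:
    """Receiver: 'decrypt' the signature with the public key and compare IDs."""
    return pow(signature, E, N) == make_id(data)


data = b"pay bob 10 coins"
signature = sign(data)

valid = validate(data, signature)                         # ID == ID
tampered_valid = validate(b"pay bob 99 coins", signature) # IDs differ
```

The receiver only needs the public numbers E and N, so anyone can validate, but only the holder of D can sign.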