Bitcoin

2023.06.05

Money for the streets

Reactor online. Sensors online. Weapons online. All systems nominal.

This manual first covers high level fundamentals of bitcoin and then descends into technical details.

Consensus

The value of bitcoin is derived entirely from its leaderless consensus (known as Nakamoto Consensus).

The word consensus might make you think of a jury where members reach a verdict. Or maybe you think of a distributed system which relies on a quorum to make decisions. In both of these scenarios, trust is placed in relatively few leaders in order to pragmatically create consensus.

Trust is power. And humans tend to abuse it. Thus bitcoin’s value proposition: consensus, but without leaders. There is no power to abuse. Whether this is valuable or not…well, that is a question for yourself.

From a high level, Nakamoto Consensus is achieved through a protocol backed by game theory and applied cryptography. Users place trust in a protocol instead of third parties.

proof of work

Let’s build this magic protocol from the ground up.

There is a property of cryptographic hash functions which is used over and over in the bitcoin system: hard to reverse, but easy to verify. A real world metaphor might be cooking pancakes. It’s easy for someone to try a bite of pancakes and verify “Yep, these are pancakes”. It is harder for them to guess the exact recipe used to cook those pancakes.

This hash property can be chained together. Back to pancakes, there could be a new recipe, recipe2, which is just recipe1 plus a new secret ingredient (it’s chocolate chips). This means that if recipe1 changes even a little (we added bananas), the pancakes produced by recipe2 will also change (in this case, really good chocolate chip banana pancakes).

The bitcoin protocol generates an ever growing list of transactions which everyone can trust by leveraging these patterns.
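A minimal sketch of both patterns, using plain SHA-256 from the Python standard library. The recipe strings and the chain helper are made up for illustration; bitcoin's real data structures are more involved.

```python
import hashlib

def h(data: bytes) -> str:
    """SHA-256 digest as a hex string."""
    return hashlib.sha256(data).hexdigest()

# Easy to verify: anyone can re-hash the recipe and compare digests.
recipe1 = b"flour, eggs, milk"
digest1 = h(recipe1)

# Chaining: recipe2 commits to recipe1's digest plus a new ingredient.
def chain(prev_digest: str, ingredient: bytes) -> str:
    return h(prev_digest.encode() + ingredient)

digest2 = chain(digest1, b"chocolate chips")

# Change recipe1 a little (bananas) and everything downstream changes too.
tampered1 = h(b"flour, eggs, milk, bananas")
tampered2 = chain(tampered1, b"chocolate chips")

print(digest1 != tampered1, digest2 != tampered2)  # True True
```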

The hard to reverse, but easy to verify property enables random, short-term leaders who get to add one group of new transactions (a block) to the list of all transactions ever (the blockchain). There is a financial incentive to be the leader: the leader collects the per-transaction fees in the block. But users must contribute work (energy) just to have the chance to be the leader. The more work, the higher the chance (linear).

What work exactly? The leader needs to do the hard part of the hard to reverse, but easy to verify hash property. They have to guess an ingredient in the pancakes. But where the set of pancake ingredients is extremely limited, the set in bitcoin is massive (think atoms-in-the-universe scale). At some point though a user will guess the right input and add a block. All other users are able to do the easy half of the hash property and verify that work was done to create the block (…proof of work) and that the transactions are valid. There is no incentive to attempt to broadcast bogus blocks since they can easily be detected and discarded.

Proof of work allows bitcoin to be censorship resistant since there are no long term leaders with power to abuse. If Alice really hates Bob and decides “if I ever become the leader, I am not going to include Bob’s transactions in the block”, Bob can just take his transactions to any other user attempting to be the leader (called a miner in bitcoin lingo). Even if Alice becomes the leader for a block, that doesn’t influence any of the following blocks. She will have to do the work all over again, and keep winning, in order to keep censoring Bob.
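The guessing game can be sketched as a toy miner. This is an illustration under assumptions, not bitcoin's real block format: the payload, the 8-byte nonce, and the very easy TARGET are all made up so the loop finishes quickly. The shape is the same though: guessing is a brute-force loop, verifying is a single hash.

```python
import hashlib

def block_hash(payload: bytes, nonce: int) -> int:
    """Double SHA-256 of payload + nonce, read as a big integer."""
    data = payload + nonce.to_bytes(8, "little")
    digest = hashlib.sha256(hashlib.sha256(data).digest()).digest()
    return int.from_bytes(digest, "big")

def mine(payload: bytes, target: int) -> int:
    """The hard part: guess nonces until the hash lands at or below the target."""
    nonce = 0
    while block_hash(payload, nonce) > target:
        nonce += 1
    return nonce

def verify(payload: bytes, nonce: int, target: int) -> bool:
    """The easy part: one hash and one comparison."""
    return block_hash(payload, nonce) <= target

TARGET = 1 << 244          # toy difficulty: roughly 1 in 4,096 guesses wins
PAYLOAD = b"previous block hash + Bob's transactions"

nonce = mine(PAYLOAD, TARGET)
print(verify(PAYLOAD, nonce, TARGET))  # True
```

Lowering TARGET makes the loop take longer on average, while verify stays just as cheap.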

Bitcoin uses the chaining characteristic to link the blocks on the blockchain together. This allows users to quickly verify that all blocks in the blockchain are valid and that nothing has been changed or tampered with deep in the past. It also layers proofs of work on top of each other, making it extremely expensive to even attempt to modify a block. Let’s say Alice paid Bob for some pancakes and the transaction is now three blocks deep in the blockchain. As stated earlier, Alice hates Bob, so she decides to try and change the transaction to send the funds back to herself. In order to do this, she needs to change the transaction in the block and re-calculate the block’s proof of work. But changing the transaction changes all the newer blocks as well, so she actually has to calculate proofs of work for the next three blocks too! Plus, the rest of the users are still actively adding new valid blocks to the blockchain, so she has to do all of this faster than they can add new blocks. Turns out this costs way more than the pancakes and there is no incentive to even try.
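Here is a rough sketch of why a change deep in the past ripples forward. The Block class and the transaction strings are hypothetical and proof of work is left out entirely. Only the hash links are modeled.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class Block:
    prev_hash: str       # hash of the parent block
    transactions: str    # stand-in for the real transaction list

    def hash(self) -> str:
        return hashlib.sha256((self.prev_hash + self.transactions).encode()).hexdigest()

def build_chain(tx_batches):
    chain, prev = [], "00" * 32
    for txs in tx_batches:
        block = Block(prev, txs)
        chain.append(block)
        prev = block.hash()
    return chain

def first_broken_link(chain):
    """Index of the first block whose prev_hash no longer matches its parent, or None."""
    for i in range(1, len(chain)):
        if chain[i].prev_hash != chain[i - 1].hash():
            return i
    return None

chain = build_chain([
    "alice pays bob", "carol pays dave", "erin pays frank", "gus pays heidi",
])
intact = first_broken_link(chain)           # None, every link checks out

chain[0].transactions = "alice pays alice"  # Alice rewrites history
broken = first_broken_link(chain)           # 1, the very next block no longer fits
print(intact, broken)
```

To hide the edit, Alice would have to rebuild every later block, and in the real system each of those needs its own proof of work.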

So what differentiates the bitcoin blockchain from another one which follows the same rules? The one with the most work poured into it (not necessarily the longest) is the blockchain (of course, the blocks need to be valid according to the network…of course). This is where more game theory gets involved to allow users to assume the blockchain with the most work is the real blockchain. Using a blockchain with less work would just be risky for a user (hey man, I totally have 1,000 BTC, it’s just over here on this other blockchain which only three people use…).
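To make “most work, not necessarily longest” concrete, here is a small sketch. The expected number of hash attempts to beat a target is roughly 2^256 / (target + 1), and a chain's work is the sum over its blocks. The targets and chain lengths below are made-up numbers.

```python
def block_work(target: int) -> int:
    """Expected number of hash attempts needed to land at or below target."""
    return (1 << 256) // (target + 1)

def chain_work(targets) -> int:
    return sum(block_work(t) for t in targets)

easy = 1 << 240   # toy target
hard = 1 << 236   # 16x smaller target, so about 16x more work per block

long_chain = [easy] * 5    # five easy blocks
short_chain = [hard] * 2   # two hard blocks

print(chain_work(long_chain))   # 327675
print(chain_work(short_chain))  # 2097150
```

The two-block chain wins here because each of its blocks took far more guessing to produce.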

51% attack

A 51% attack is probably the most well known theoretical attack on bitcoin. Luckily (as in, that Satoshi person was pretty smart), the work in Proof of Work also helps protect against this kind of attack.

A 51% attack is when someone controls at least 51% of the work being poured into bitcoin. At this point, chances are they will guess the next block and thus be the short term leader. And as long as they hold at least 51% of the work, chances are that will happen again and again. With this power they could:

  1. Control what transactions get into the blockchain.
  2. Mine a “shadow” blockchain and perform a double-spend by swapping it with the public blockchain. It should be noted that this would be a public action though, so the rest of the network would be aware it’s happening.

This would suck, but it’s not actually that powerful. If an attacker could use the 51% to re-write transactions deep in the blockchain, somehow without it being obvious to everyone else, then it would be powerful. But that isn’t possible due to the proof-of-work chain. And as of 2021, the cost to control 51% of the bitcoin hashing power for just an hour is in the billions of USD. The cost-to-benefit of this attack hasn’t made sense since the early days of bitcoin (like 2012-ish).
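The bitcoin whitepaper models the race between an attacker and the honest network as a gambler's-ruin problem: an attacker with share q of the work who is z blocks behind ever catches up with probability (q/p)^z, where p = 1 - q, and with certainty once q reaches one half. A small sketch of just that formula (the whitepaper layers a further calculation on top, which is skipped here):

```python
def catch_up_probability(q: float, z: int) -> float:
    """Chance an attacker with hash share q ever erases a z-block deficit."""
    p = 1.0 - q
    if q >= p:
        return 1.0          # majority attacker eventually wins the race
    return (q / p) ** z

for share in (0.10, 0.30, 0.45, 0.51):
    print(share, catch_up_probability(share, 6))
```

Below half the work the odds fall off exponentially with depth, which is why the attack only gets interesting around the 51% mark.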

If there ever is a successful 51% attack, it probably means the end of bitcoin, so maybe an attacker who wants to destroy bitcoin will try with this goal in mind. But if an attacker has enough power to even consider this attack (limited pretty much to a handful of States), there are probably cheaper ways to try and destroy bitcoin (good luck).

eclipse attack

An eclipse attack is kind of a more focused sybil attack. In a sybil attack, a bad actor floods a distributed network with nodes that they control in order to make it look like the network has strong consensus on something. In reality, it’s just one person’s opinion made to look like a lot of different people. An eclipse attack is when the target of the attack is just one person, not the whole network. An attacker singles out someone’s node and floods it with a lot of nodes all controlled by the attacker. In the context of bitcoin, an attacker might perform an eclipse attack on someone in order to try and trick them into thinking a different blockchain is the real one.

Proof-of-work again helps defend against this type of attack, and general sybil attacks, because it doesn’t matter how many nodes an attacker peers to their target node. The target only needs one other node to relay the real blockchain and they can easily verify that it is the real one. For an attacker to really gain anything, they will still have to produce valid blocks (work), so the attack is expensive to run and at the same time cheap to beat. It’s probably not worth it.

blocksize

There is an “artificial” blocksize which limits the number of transactions per second on the bitcoin blockchain. Transactions which pay higher fees are selected into a block. The scarce resource in this market is bytes on the blockchain.

This leads to a few questions.

The blocksize debate back in 2015 is bitcoin’s most famous holy war. As with all holy wars, there was a lot of noise coming from parties with different interests. I personally believe most of the noise was generated by parties which valued bitcoin succeeding quickly over bitcoin succeeding at all (e.g. my company survives if bitcoin is mass adopted in 2 years or my company fails, in which case I don’t care about bitcoin cause my company just failed).

My thinking:

Bitcoin’s only value proposition is its consensus without leaders. If that is degraded there is no point to bitcoin, might as well use a simple, centralized database. The limit should be kept low to encourage the highest value consensus. Second layer applications are responsible for increasing transactions-per-second for different scenarios and developing more value on top of the layer one consensus.

Transactions

Bitcoin is transactions and the blockchain orders them.

anatomy

{
  "version": 1,
  "locktime": 0,
  "vin": [
    {
      "txid": "7957a35fe64f80d234d76d83a2a8f1a0d8149a41d81de548f0a65a8a999f6f18",
      "vout": 0,
      "scriptSig" : "3045022100884d142d86652a3f47ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb02204b9f039ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e3813[ALL] 0484ecc0d46f1918b30928fa0e4ed99f16a0fb4fde0735e7ade8416ab9fe423cc5412336376789d172787ec3457eee41c04f4938de5cc17b4a10fa336a8d752adf",
      "sequence": 4294967295
    }
  ],
  "vout": [
    {
      "value": 0.01500000,
      "scriptPubKey": "OP_DUP OP_HASH160 ab68025513c3dbd2f7b92a94e0581f5d50f654e7 OP_EQUALVERIFY OP_CHECKSIG"
    },
    {
      "value": 0.08450000,
      "scriptPubKey": "OP_DUP OP_HASH160 7f9b1a7fb68d60c536c2fd8aeaa53a8f3cc025a8 OP_EQUALVERIFY OP_CHECKSIG"
    }
  ]
}

a transaction with one input and two outputs

A bitcoin transaction is made up of a set of inputs and a set of outputs. Each input maps to a previous transaction’s output. Transaction outputs which have not been spent, as in have not been mapped to an input of another transaction, are called “Unspent Transaction Outputs” or UTXOs.

A UTXO contains two parts:

  1. An amount of bitcoin
  2. A cryptographic puzzle, scriptPubKey, which must be solved in order to spend the bitcoin

The scriptPubKey name made sense historically, but now it would probably be better called “locking script” or maybe “witness script” (but more on that later). A fun fact about a bitcoin output: it must be spent in its entirety. This usually leads to a “change” output in a transaction which sends the extra bitcoin back to the sender.

Each input contains four parts:

  1. A transaction ID, referencing the transaction that contains the UTXO being spent
  2. An output index (vout), identifying which UTXO from that transaction is referenced
  3. A scriptSig, the script which satisfies the puzzle placed on the UTXO
  4. A sequence number, a weird re-purposed field now used to enforce locktime and replace-by-fee

The scriptSig is another legacy name, today it’s more like the “unlocking script” or “witness”.

For a transaction to be valid:

  1. All inputs must map to outputs which haven’t already been spent
  2. All inputs must unlock outputs
  3. The sum of the inputs must be greater than or equal to the sum of the new outputs. Any difference is an implicit transaction fee that is used to pay to get on the blockchain.
  4. The locktime must have passed (be it block time or height)
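Rules 1 and 3 can be sketched in a few lines. Amounts are in satoshis to avoid float rounding. The OutPoint type and implied_fee function are hypothetical names, and the 0.1 BTC value of the spent output is an assumption (the example transaction above doesn't state it). Script and locktime checks are left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OutPoint:
    txid: str
    vout: int

def implied_fee(utxo_set: dict, inputs: list, output_values: list) -> int:
    """Return the implicit fee in sats, or raise ValueError if rule 1 or 3 fails."""
    if len(set(inputs)) != len(inputs):
        raise ValueError("same output referenced twice")
    total_in = 0
    for outpoint in inputs:
        if outpoint not in utxo_set:
            raise ValueError("input is unknown or already spent")
        total_in += utxo_set[outpoint]
    total_out = sum(output_values)
    if total_in < total_out:
        raise ValueError("outputs exceed inputs")
    return total_in - total_out

spent = OutPoint("7957a35fe64f80d234d76d83a2a8f1a0d8149a41d81de548f0a65a8a999f6f18", 0)
utxo_set = {spent: 10_000_000}            # assumed: the spent output held 0.1 BTC

fee = implied_fee(utxo_set, [spent], [1_500_000, 8_450_000])
print(fee)  # 50000
```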

script

The scriptPubKey and scriptSig combine to form a little program to unlock a UTXO. The program is written in a language called Script. Script is intentionally very constrained since these programs need to be verified all the time by all sorts of computers. It is not Turing-complete, so no loops or complex control flow, which keeps things very predictable (relatively).

To run a transaction script, the scriptSig + scriptPubKey are concatenated in that order (kinda feels backwards, but makes sense given how the program is executed). The program is executed from left => right. The programs only contain two types of tokens: opcodes and data. Data is loaded onto the stack and opcodes can pull data off the top of the stack and load more on. A program unlocks a UTXO if it can run to completion and the top of the stack holds a TRUE (any non-zero value) at the end. An empty stack at the end counts as a failure.

An extremely simple (and insecure) example is a scriptSig of 2 and a scriptPubKey of 3 OP_ADD 5 OP_EQUAL. The transaction script would be 2 3 OP_ADD 5 OP_EQUAL.

The program execution:

  1. 2 is loaded on the stack
  2. 3 is loaded on top of 2 on the stack
  3. OP_ADD pops the two data values off, adds them together, and puts 5 on the stack
  4. 5 is loaded on top of the 5 on the stack
  5. OP_EQUAL pops the two data values off, compares them, and puts a TRUE on the stack

Since the program ran to the end and has a TRUE on the stack, the output is “unlocked”. Pretty simple! Only issue with this example is that anyone could unlock the UTXO as long as they understood simple addition. Luckily, bitcoin supports a few more opcodes that make it useful.
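The walk-through can be reproduced with a toy interpreter. It only knows the two opcodes from the example and treats every other token as an integer, so it is a sketch of the stack machine idea rather than of real Script.

```python
def run_script(script: str) -> bool:
    """Run a space-separated toy script; True if it ends with a non-zero top item."""
    stack = []
    for token in script.split():
        if token == "OP_ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif token == "OP_EQUAL":
            b, a = stack.pop(), stack.pop()
            stack.append(1 if a == b else 0)
        else:
            stack.append(int(token))      # anything else is data
    return bool(stack) and stack[-1] != 0

script_sig = "2"
script_pubkey = "3 OP_ADD 5 OP_EQUAL"

print(run_script(script_sig + " " + script_pubkey))  # True
print(run_script("1 " + script_pubkey))              # False
```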

time-lock

Transactions have the locktime field to control when a transaction becomes valid, but outputs themselves can also be time-locked with two opcodes. OP_CHECKLOCKTIMEVERIFY requires the spending transaction to have a locktime past a certain point. The (perhaps more interesting) OP_CHECKSEQUENCEVERIFY can enforce that a relative amount of time has passed between the block containing the UTXO and the spending transaction. This is the basis for the digital contracts used on the “layer 2” Lightning Network.

OP_RETURN

OP_RETURN is a weird opcode which fails the Script program immediately with a message. It’s a way to embed some data into the blockchain, but it doesn’t burden the UTXO set since nodes can recognize that any script with OP_RETURN is un-spendable and not worth keeping in memory.

P2PKH

A bitcoin script can define many different and clever ways to lock an output. But some patterns are so useful, a lot of the bitcoin ecosystem is tailored around them. One of the popular OG (but not the oldest) patterns is Pay to Public Key Hash.

Pay to Public Key Hash, P2PKH, is a pattern to send bitcoin to a single person. In this case, a person holding the private key of a public/private key pair. If the user wants to spend the bitcoin in this UTXO (for example, send it to another public key hash), they need to use their associated private key to provide a signature which solves this scriptPubKey. The check signature operator, OP_CHECKSIG, is essential for this script.

OP_DUP OP_HASH160 <PubkeyHash> OP_EQUALVERIFY OP_CHECKSIG

public key hash unlock script

The OP_DUP OP_HASH160 <PubkeyHash> OP_EQUALVERIFY makes sure that the OP_CHECKSIG can only be checked with the intended user’s public key. Without this, a user could provide any public key and an associated signature.

OP_CHECKSIG returns true if a signature signs the correct parts of a transaction and matches a provided public key. The really interesting part is which parts of the transaction are being hash’d for this signature. Transactions signal which parts a signature covers through the SIGHASH flag. Whatever parts of the transaction are covered by the signature, it’s safe to assume the owner of the public key is cool with them and that they haven’t been tampered with by a bad party.

So the P2PKH script ensures that only the owner of the public key can use the bitcoin.

A public key is a form of identification and is heavily used in the bitcoin ecosystem, so what’s a “public key hash” and why is it better than just using a plain old public key?

Blockspace is limited and costs money so any way to save a few bytes is useful. Hashing the public key is a good security-to-performance trade-off since it really doesn’t hurt the security aspect at all, but shaves off quite a few bytes. Bitcoin is a bit quirky and takes the pubkey, sends it through SHA256 and then through RIPEMD160 to create the hash. We would have to ask Satoshi why both, but we end up with a value that takes up less space.

addresses

Bitcoin addresses are part of the tooling built on top of popular script patterns to make them easier to use. Generally, bitcoin addresses are short strings used to easily describe virtual spots to send bitcoin to. Bitcoin wallet software knows how to decode these addresses and generate the correct script to create the UTXO the address describes.

For P2PKH, the one bit of data an address encodes is the public key hash. A P2PKH address always starts with a 1 (different prefixes for different patterns) so it’s easy for a human to quickly know what they are dealing with. These addresses also encode a checksum so computers can verify we humans haven’t fat-finger’d a character and just sent some bitcoin to /dev/null.

Base58Check was the original bitcoin address encoding scheme, but since the SegWit softfork, bech32 is the new hotness.
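A sketch of Base58Check for a P2PKH address: a version byte (0x00 on mainnet), the 20-byte public key hash, and a 4-byte checksum taken from a double SHA-256, all written in the Base58 alphabet. The function names are my own. The hash is the one from the first output of the example transaction above.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def checksum(payload: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]

def base58check_encode(version: int, payload: bytes) -> str:
    data = bytes([version]) + payload
    data += checksum(data)
    num = int.from_bytes(data, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = ALPHABET[rem] + encoded
    pad = len(data) - len(data.lstrip(b"\x00"))   # each leading zero byte becomes a '1'
    return "1" * pad + encoded

def base58check_decode(address: str):
    num = 0
    for ch in address:
        num = num * 58 + ALPHABET.index(ch)       # raises ValueError on a bad character
    pad = len(address) - len(address.lstrip("1"))
    data = b"\x00" * pad + num.to_bytes((num.bit_length() + 7) // 8, "big")
    payload, check = data[:-4], data[-4:]
    if checksum(payload) != check:
        raise ValueError("bad checksum, probably a typo")
    return payload[0], payload[1:]

pubkey_hash = bytes.fromhex("ab68025513c3dbd2f7b92a94e0581f5d50f654e7")
address = base58check_encode(0x00, pubkey_hash)
print(address)
```

Decoding the address gives back the version byte and the hash, and a mistyped character will almost always trip the checksum instead of silently producing a different hash.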

fees

The blockchain blocksize limits the number of transactions that can fit in a block. How many transactions? That depends on the size (bytes) it takes to describe a transaction. One can imagine a transaction which uses a bunch of UTXOs to pay a big sum to someone. This transaction requires a lot of unlocking scripts, one for each UTXO. This is going to take more bytes on the blockchain than a transaction which only uses a single UTXO. If a miner has to choose between one big transaction or a few small ones, all other things equal, well then it will grab the small ones and make more on fees. So the big one needs to put up a larger fee. The sats/byte ratio is used to see what it would take to get a transaction into the blockchain given the current market.
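A toy version of the miner's choice: sort the mempool by fee per byte and fill the block greedily. The transactions, sizes, and the tiny 1,000-byte block are made up, and real block building also has to handle dependencies between transactions.

```python
def pick_transactions(mempool, max_block_bytes):
    """Greedy selection by sats/byte; returns (chosen ids, total fees in sats)."""
    chosen, used, fees = [], 0, 0
    by_feerate = sorted(mempool, key=lambda tx: tx["fee"] / tx["size"], reverse=True)
    for tx in by_feerate:
        if used + tx["size"] <= max_block_bytes:
            chosen.append(tx["id"])
            used += tx["size"]
            fees += tx["fee"]
    return chosen, fees

mempool = [
    {"id": "big",     "fee": 3000, "size": 1000},  # 3 sats/byte
    {"id": "small-a", "fee": 2000, "size": 250},   # 8 sats/byte
    {"id": "small-b", "fee": 1500, "size": 250},   # 6 sats/byte
    {"id": "small-c", "fee": 1000, "size": 250},   # 4 sats/byte
    {"id": "small-d", "fee": 1250, "size": 250},   # 5 sats/byte
]

chosen, fees = pick_transactions(mempool, max_block_bytes=1000)
print(chosen, fees)  # ['small-a', 'small-b', 'small-d', 'small-c'] 5750
```

The big transaction pays the largest absolute fee but loses out, since the four small ones fill the same space for 5,750 sats.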

What if you create a transaction and the fee is too small? Wait it out? Luckily, two tools are available to bump the fee of a transaction.

The first is called “replace by fee” (RBF). This is a policy where node operators will replace an existing transaction in the mempool with a new one which spends at least one of the same UTXOs, but pays a higher fee. An important point here is that this is just a policy, not part of the bitcoin protocol. Miners are free to include any transaction in a block that they want, including an old one that a user attempted to bump a fee on (although miners don’t have much incentive to grab a tx with less fees). But because this is just a node policy, there can be many wrinkles to exactly how a node and its operator support RBF. BIP-125 introduced an RBF policy (nowadays called “opt-in” RBF) which leverages the weird sequence number field on a transaction. If any input of a transaction uses a sequence number below fffffffe, then it is signaling that it can be replaced (opting in). The replacement transaction has to pay a higher fee than the original (if following this policy). An alternative policy gaining traction these days is called “full RBF” which allows any transaction to be replaced, no opt-in necessary. Something to note about any RBF policy is that only the original transaction creator, the owner of the UTXO and the secrets to unlock it, can create a new transaction to bump the fee.
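BIP-125 defines the opt-in signal as any input carrying a sequence number below 0xffffffff - 1. A minimal check, with a hypothetical function name:

```python
MAX_SEQUENCE = 0xFFFFFFFF

def signals_opt_in_rbf(input_sequences) -> bool:
    """True if any input's sequence number is below 0xffffffff - 1 (BIP-125)."""
    return any(seq < MAX_SEQUENCE - 1 for seq in input_sequences)

print(signals_opt_in_rbf([4294967295]))              # False, final sequence
print(signals_opt_in_rbf([0xFFFFFFFD]))              # True
print(signals_opt_in_rbf([0xFFFFFFFF, 0xFFFFFFFD]))  # True, one input is enough
```

The example transaction earlier in this manual uses a sequence of 4294967295, so under opt-in policy it does not signal replaceability.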

A second tool to bump a transaction fee is a pattern called “child pays for parent” (CPFP) also known as “ancestor feerate mining”. This is where a second transaction is created which spends an output of the first transaction. A relatively large fee is placed on this second transaction in order to incentivize miners to include both the original stuck transaction and the new one in a block. One nice aspect about CPFP is that usually either the sender or receiver can decide to create a CPFP transaction. The sender would attach it to the original transaction’s change output, which is theirs, whereas the receiver would attach it to the UTXO headed to them. Technically any number of transactions could be chained together (I guess the upper limit being the number of transactions in a block? Gotta think on that.) and as long as the last one has a large enough fee a miner could choose to include them all in a block. However, most nodes have a policy to only re-broadcast transactions with fewer than 25 unconfirmed parents.

Both fee bumping mechanisms are exposed to “pinning” attacks where someone attaches a new transaction to the original which either makes it prohibitively expensive to pay for all the fees or in some way breaks node policies so the new transactions won’t be re-broadcast.

dust

Dust is a UTXO which holds so little value that it is more expensive to pay for a transaction to get on the blockchain than the output’s value. This is a moving target given the market for blockchain space is always changing.

Even in the best case scenario for a transaction trying to spend “dust” UTXOs, where there are tons of cheap (size-wise) inputs and only one output whose cost is amortized, there is a physical minimum size of a UTXO input unlock script. In this best case scenario, this input is for a taproot UTXO and would be around 60 vbytes. The cost of this input alone is determined by the current blockspace fee, let’s say it’s 15 sats/vbyte. That means the UTXO needs to have a value greater than 60 * 15 = 900 sats to be economically viable.
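The arithmetic above as a tiny helper. The 60 vbyte input size is the rough figure from the text, and the function name is made up. This is the back-of-the-envelope economics, not any node's actual dust policy.

```python
TAPROOT_INPUT_VBYTES = 60   # rough figure from the text above

def is_dust(value_sats: int, feerate_sats_per_vbyte: float,
            input_vbytes: int = TAPROOT_INPUT_VBYTES) -> bool:
    """A UTXO is uneconomical if spending it costs at least as much as it holds."""
    return value_sats <= input_vbytes * feerate_sats_per_vbyte

print(is_dust(900, 15))   # True, costs 900 sats to spend 900 sats
print(is_dust(901, 15))   # False
print(is_dust(900, 2))    # False, fine to sweep when fees are low
```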

Easy to see the incentive of doing some house cleaning when fees are low, combining dust UTXOs into one so that the produced UTXO is economically viable when the mempool is full later on.

BIPs

Bitcoin is a technical marvel and might just save us all, but it wasn’t born perfect. Bitcoin Improvement Proposals (BIPs) have been adopted over the years to fix issues and improve the system. No matter what the specific issue a BIP is addressing, all BIPs share some common goals:

  1. Remain backwards compatible (a UTXO today is a UTXO tomorrow).
  2. Make bitcoin more efficient (decentralized).
  3. Make bitcoin more private (useful).

P2SH // BIP16

Pay to Script Hash, P2SH, scripts were standardized in BIP16. They allow transactions to be sent to a script hash (address starting with 3). Why is this useful and such a game changer?

Script is pretty powerful, but it puts the burden on the sender to come up with the appropriate locking script. For P2PKH, this is really easy because there is a standard address to send to a person. But what if a user wants to create a shared UTXO where any one of three people could spend it? They need to get this multi-sig script to the sender and hope they copy/paste it right. P2SH allows the receiver to define the locking script and then just send the hash to the sender instead. Way safer and closer to the P2PKH pattern for users.

Verifying a P2SH is a whole different beast. The P2SH locking script pattern is recognized as a “special” form, so it gets executed slightly differently to “normal” scripts. Two code paths ain’t great, but the benefits outweigh the complexity here.

OP_HASH160 [20-byte-hash-value] OP_EQUAL

every P2SH has to first verify the hash of the redeem script, before the redeem script is then run

The scriptSig used to unlock this is executed in a special way:

  1. The redeem script (part of the scriptSig) is hashed and compared with the hash in OP_HASH160 [20-byte-hash-value] OP_EQUAL
  2. Then, the redeem script is decoded (aka “unpacked” into the running program) and actually verified (assuming running a modern version of code, else just step 1 checks out for backwards compatibility)

<OP_0> <sig A> <sig C> <redeemScript> <OP_HASH160> <redeemScriptHash> <OP_EQUAL>

example unlock for a multisig redeem script: the signatures sit outside of the redeem script since they would throw off the hash, and the OP_0 is there because of the multisig bug

SegWit // BIP141 + BIP143 + BIP144 + BIP145

The SegWit (“segregated witness”) upgrade was a large bundle of changes proposed back around 2015 and activated in 2017.

A goal of segwit was to fix transaction malleability. Transaction malleability allows a transaction to be tweaked so that it stays almost entirely the same, but gets a new ID. One example tweak is to change the signature script with additional instructions that aggregate to nothing (OP_DUP OP_DROP). The new transaction ID means any transactions based on the old ID are now invalid. This makes it really difficult to lock in the digital contracts necessary for things like the Lightning Network.

Another big goal was to fix the quadratic sighash issue. In pre-SegWit bitcoin, the cost to create and verify a transaction’s signatures scales quadratically (O(n^2)) with the number of inputs. This isn’t great because it eats away at bitcoin’s leaderless consensus value prop by bogging down small nodes, pushing towards centralization. It wasn’t obvious to me why this was quadratic. But the crux is that the hash computation depends on the size of the data being hashed (which makes sense, but for some reason I was just thinking constant time there). So pre-segwit, each of the n inputs needs its own signature, and each signature hashes (nearly) the whole transaction, which itself grows with n: n x n. Segwit addressed this by coming up with a new signature scheme where the shared transaction parts are hash’d once and then re-used when signing each input. This way the amount of data hashed per input is constant for real.
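A back-of-the-envelope model of the difference, counting bytes fed to the hash function. Every size here (bytes per input, overhead, per-signature preimage) is an invented round number, and the model ignores plenty of real-world detail. Only the growth rate matters.

```python
def legacy_bytes_hashed(n_inputs, bytes_per_input=150, overhead=50):
    """Each of the n signatures re-hashes roughly the whole transaction."""
    tx_size = overhead + n_inputs * bytes_per_input
    return n_inputs * tx_size

def segwit_bytes_hashed(n_inputs, bytes_per_input=150, overhead=50, per_sig=200):
    """Shared parts are hashed once, then each signature hashes a fixed-size chunk."""
    tx_size = overhead + n_inputs * bytes_per_input
    return tx_size + n_inputs * per_sig

for n in (100, 200, 400):
    print(n, legacy_bytes_hashed(n), segwit_bytes_hashed(n))
# 100 1505000 35050
# 200 6010000 70050
# 400 24020000 140050
```

Doubling the inputs roughly quadruples the legacy cost, but only doubles the segwit cost.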

A third improvement was a change in how bitcoin addresses are encoded from Base58Check to Bech32. There are two big benefits to Bech32. First, it uses a character set that doesn’t mix uppercase and lowercase letters, protecting human users from making mistakes. And second, the checksum validation is much safer, even allowing in some scenarios for mistakes to be corrected not just recognized.

There are other improvements bundled in the big segwit change that I am skipping for now, but the last big goal was to make all the changes backwards compatible (otherwise known as a “soft” fork). This just means that clients running old code wouldn’t break, but it doesn’t require that they get the new benefits. The fact that a change this major went in as a softfork is kinda a minor miracle from an engineering perspective.

So how was this pulled off?

The big change was moving the witness data (the input unlock scripts) into a separate data structure in a block. The transaction data that describes its effects (where bitcoin is coming from and going to) remains in the original location. Kinda clever: a commitment to all of the block’s witness data is stuck in a coinbase output which uses an OP_RETURN code so that old software ignores it (backwards compatible), but new software knows where to look and verify the scripts. And very important, the witness data is still committed to by the block. The coinbase transaction is covered by the merkle root in the block header, so changing the witness data would require a new proof of work for the block.

Before segwit, it was pretty straightforward for miners to calculate the most cost effective transactions to put in a block. A block had (has? depends…) a max size of 1MB. A miner just needs to maximize tx fees / tx bytes. Small transactions (as in script size bytes, not value) with high fees are great! Large transactions with small fees suck! Easy. But how are segwit transactions measured now that part of the transaction is stored somewhere else? There are a lot of things to weigh here.

One of which is that witness data, pre and post segwit, is never in the UTXO set (the set of unspent outputs every node keeps close at hand). This means it puts less of a burden on the bitcoin system than the other parts of the transaction. Perhaps this data should get a relative “discount” to encourage more usage here? Another factor is that the old pre-segwit nodes have a 1MB blocksize limit, and to change this would be a hardfork (these nodes wouldn’t accept a 1MB+ block). While post-segwit witness data is still stored in a block, it is stored in a new spot which is not included in the old 1MB calculation.

Segwit introduced a new calculation to figure out the “weight” of a transaction. These weight units (WU) are more abstract than the straight-forward bytes of old, but not too complex. The new policy for blocks is that they can only be 4M WUs, instead of 1M bytes (1MB). But how are the weight units calculated? 1 byte of base data is 4 WU while 1 byte of witness data is 1 WU. Two really key points here. First, old pre-segwit blocks are made up of 100% base data so the blocksize rule remains the same in their eyes (backwards compatible). Second, the witness data is getting a discount. If a transaction can be re-written to move more of its logic into the witness data, then it will be cheaper to get on the blockchain. This incentivises putting much less burden on the UTXO set.
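The weight rule is simple enough to sketch directly. The byte counts in the examples are invented. Virtual bytes (weight divided by four, rounded up) are the unit fee rates are usually quoted in.

```python
MAX_BLOCK_WEIGHT = 4_000_000

def weight_units(base_bytes: int, witness_bytes: int) -> int:
    """Base data counts 4 WU per byte, witness data 1 WU per byte."""
    return 4 * base_bytes + witness_bytes

def virtual_bytes(weight: int) -> int:
    """Weight divided by 4, rounded up."""
    return -(-weight // 4)

legacy = weight_units(base_bytes=250, witness_bytes=0)
segwit = weight_units(base_bytes=150, witness_bytes=100)

print(legacy, virtual_bytes(legacy))  # 1000 250
print(segwit, virtual_bytes(segwit))  # 700 175

# A block of pure base data hits the weight limit at exactly the old 1MB.
print(weight_units(1_000_000, 0) == MAX_BLOCK_WEIGHT)  # True
```

Both example transactions are 250 bytes on the wire, but the one with 100 bytes in the witness is charged as if it were 175.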

I am still looking for a good reason that the witness data is given a 1:4 discount vs. say 1:5 or 1:10. It appears to be a good spot where inputs and output costs are generally equal, which is nice to keep incentives easy to reason about, but not sure that was the reason it was chosen.

If the signature data is stored somewhere else, what do segwit inputs and outputs look like? A transaction paying to a segwit address has an output which just loads data onto the script stack. The input’s scriptSig (the old spot for the unlock script) is empty (the whole point of all this!). Check out the backwards compatibility though: if an old client validates this transaction the script will end with data on the stack, so the transaction is valid. To them it looks like an “anyone can spend” transaction. Newer clients though know how to recognize this segwit data on the stack and perform further validations. Transaction malleability isn’t possible with segwit transactions since the scriptSig is now empty, its contents moved to the new witness field. And while the txid doesn’t commit to the new field in order to avoid malleability, the new wtxid does. The txid can still be used to chain transactions since it can’t be malleated and it still commits to whatever is happening to the bitcoin. The wtxid is used by miners to commit the whole transaction to a block.

Segwit introduced two new common transaction scripts to match the most used P2PKH and P2SH: P2WPKH and P2WSH. Segwit enabled wallets know how to decode these addresses and piece together the familiar script templates, but pull the data from the new spots. For interop with old clients, it’s even possible to embed a segwit script into the old P2SH.

Taproot // BIP340 + BIP341 + BIP342

Taproot was another big softfork bundle of improvements activated in late 2021. It extends upon segwit which extends upon P2SH.

A quick summary of these big leaps in the bitcoin protocol:

  1. P2SH moved the locking script from the output to the input
  2. segwit moved it out of the transaction
  3. taproot took advantage of both to ease script restrictions

So what did taproot enable?

I think it’s easiest to start with what P2SH addressed. Before P2SH, payers were required to describe locking scripts in the transaction outputs. This makes sense from the payer-defines-the-transaction perspective, but generally, it’s the payee who knows how they want to lock up funds. Having the locking scripts in the outputs also means they end up in the expensive in-memory UTXO pool of every node. Pay To Script Hash flips this and puts the locking script in the input that unlocks the output. The output now just contains a hash of the script. For the output to be unlocked, a user must provide a script that hashes to the output hash and the script must return true.

Segwit improved the performance of the system by moving the unlock scripts out of the transactions, allowing them to be pruned in most parts of the system.

So P2SH and segwit were game-changers, but some weaknesses remained. First, the entire unlock script needs to be posted to the blockchain in order to unlock an output, even if only one of many sub-branches is relevant. Imagine a script with many if statements where only one of them actually unlocks the output, but the rest are still sitting there on the blockchain. This has a negative performance and privacy impact. For performance, there are a lot of wasted bytes carrying around those extra unused logic paths. Nodes have to enforce some “max size” settings to protect against a large script bogging down the network, and this limits scripts even if the code path that ends up getting used is relatively small. For privacy, the extra paths disclose unnecessary information. For example, on a lightning channel close the transaction posted to the blockchain gives away enough information that it’s a safe bet both parties are operating lightning nodes.

Taproot fixes the remaining performance and privacy issues of P2SH and sets the stage for future upgrades. The two keys to the upgrade were Schnorr Signatures (BIP-340) and Merklized Alternative Script Trees (MAST) (BIP-341) which were then codified in script with BIP-342.

Schnorr signatures are pretty much better in every way than bitcoin’s historically used ECDSA signatures. This makes sense because ECDSA was developed just to get around patent issues with Schnorr, so it’s essentially water’d down. The Schnorr patents have since expired so they are now free to use! One key new feature they enable is simple key aggregation (linearity). Public keys can easily be sum’d together to form a new public key which can be used in n-of-n multisignature scenarios. Historically, multisignature scripts required signatures from all n users. Now these would look like simple one signature scripts. This helps both from a performance and privacy perspective.

MASTs are a new data structure that allows a script to only reveal the path which is used to unlock an output. A script which historically would have some if statements can instead be modeled as a tree, with each leaf node being a possible unlock. So a classic lightning payment channel HTLC output, which has a timelock clause to pull back funds on a failed transfer, would only expose this clause if it has to be used. This makes it much less obvious that the transaction is a part of the lightning network. And even better, the unlock script only has to publish the “node” it’s using to unlock, freeing up all the wasted bytes which used to be used to describe the rest of the possible paths.
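A sketch of the idea with a plain SHA-256 merkle tree. Taproot's real construction uses tagged hashes and commits the root into a tweaked public key, all of which is skipped here. The script strings are placeholders and the helper only handles a power-of-two number of leaves. Like taproot, sibling hashes are sorted before hashing, so a proof needs no left/right hints.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def branch(a: bytes, b: bytes) -> bytes:
    """Hash two children in sorted order."""
    return h(min(a, b) + max(a, b))

def merkle_root_and_proofs(scripts):
    """Return (root, proofs); proofs[i] lists the sibling hashes for scripts[i]."""
    level = [h(s) for s in scripts]
    groups = [[i] for i in range(len(scripts))]   # leaf indexes under each node
    proofs = [[] for _ in scripts]
    while len(level) > 1:
        next_level, next_groups = [], []
        for i in range(0, len(level), 2):
            left, right = level[i], level[i + 1]
            for leaf in groups[i]:
                proofs[leaf].append(right)
            for leaf in groups[i + 1]:
                proofs[leaf].append(left)
            next_level.append(branch(left, right))
            next_groups.append(groups[i] + groups[i + 1])
        level, groups = next_level, next_groups
    return level[0], proofs

def verify(script: bytes, proof, root: bytes) -> bool:
    node = h(script)
    for sibling in proof:
        node = branch(node, sibling)
    return node == root

scripts = [b"cooperative close", b"timelock refund", b"penalty path", b"backup key"]
root, proofs = merkle_root_and_proofs(scripts)

# Spending via the refund path reveals one script plus two 32-byte hashes.
print(verify(scripts[1], proofs[1], root), len(proofs[1]))  # True 2
```

The other three scripts never appear on chain. An observer sees only opaque hashes for the branches that were not used.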

With these performance benefits in place, taproot scripts have more relaxed rules. The relaxing of rules led to the fun ordinal explosion, but that is a story for another day.

bitcoind

A manual for running bitcoind on bare metal.

rpcallowip=192.168.1.0/24
rpcbind=0.0.0.0

allow network requests on LAN

The packaged bitcoind.service unit file sets a StateDirectory with a mode, which is used by the exec statement as the data dir. This holds the blockchain so I keep it off on a separate HDD. There is a symlink from /var/lib/bitcoind => /data/bitcoind. The symlink throws off systemd’s start up process though (guessing cause it can’t set the mode). I think the least invasive way to keep most of the supplied logic is to override just those settings.

sudo systemctl edit bitcoind

systemd will auto create an override file and reload the daemon

[Service]
StateDirectory=
StateDirectoryMode=

unset the settings