Part I · Foundations Chapter 1

The Raw Transaction

"The Times 03/Jan/2009 Chancellor on brink of second bailout for banks."—Satoshi Nakamoto
Genesis block coinbase, 2009

On January 9, 2009, Hal Finney—a 52-year-old cryptographer living in Temple City, California—noticed a post on the Cryptography Mailing List announcing something called "Bitcoin."^† Finney had spent decades in the cypherpunk trenches: he'd written key components of PGP, built the first reusable proof-of-work system (RPOW), and maintained a quiet conviction that digital cash would eventually work. He downloaded the software, compiled it, and on January 10 posted to Twitter: "Running bitcoin."^† He was the first person besides Satoshi Nakamoto to run a Bitcoin node.

Satoshi reached out by email. They exchanged bug reports. And on January 12, at 3:30 in the morning UTC, Satoshi broadcast a transaction to a network that consisted of exactly two computers—his and Hal's. He sent 10 bitcoins. Block 170 confirmed it.^† It was the first-ever transfer of Bitcoin from one person to another: a test payment between a pseudonymous inventor and the only person who showed up.

Hal Finney would later be diagnosed with ALS and pass away in 2014.^† His body is cryonically preserved at the Alcor Life Extension Foundation. But the 275 bytes of that test transaction are preserved too—in a more permanent medium. They are stored in Block 170, buried under nearly a million subsequent blocks of proof-of-work, immutable for as long as the Bitcoin network operates.

Here is every one of those bytes.

1.1 The Specimen

275 bytes. That's it. This is the complete transaction—everything that was broadcast to the network and stored in the blockchain forever. Let's parse it.

1.2 The Four Fields

Every Bitcoin transaction (pre-SegWit) has exactly four top-level fields, always in the same order:

Version (4 bytes, offset 0) → Inputs (variable) → Outputs (variable) → Locktime (4 bytes, offset 271)

The outer structure is rigid: version, then inputs, then outputs, then locktime. Always in that order, never rearranged. But within the inputs and outputs, the sizes are variable—they depend on how many there are and what kind of scripts they carry. Variable-length integers (varints) stitch the variable-size sections together, telling the parser how many items to expect and how many bytes each script occupies.

1.3 Version (4 bytes)

Bytes 0–3: Version

01 00 00 00

The first four bytes are the transaction version number, encoded as a 32-bit unsigned integer in little-endian byte order.

Little-Endian: The First Trap

Bitcoin stores multi-byte integers with the least significant byte first. The bytes 01 00 00 00 represent the integer \(1\), not \(16,777,216\) (which is what the same four bytes would mean if read as big-endian).

This is the opposite of how humans write numbers, and it trips up every developer the first time. When you see 01 00 00 00 in raw hex, think: "\(\texttt{0x00000001} = 1\)."

Why little-endian? Bitcoin was written on an x86 machine, and x86 processors use little-endian natively. Satoshi simply used the hardware's native byte order. Every Bitcoin implementation must handle this, regardless of platform.
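As a quick check of the byte-order rule, the following Python sketch decodes the version bytes both ways. It relies only on the standard `int.from_bytes` and `int.to_bytes`; the variable names are illustrative.

```python
# Decode the 4-byte version field from the specimen in both byte orders.
version_bytes = bytes.fromhex("01000000")

little = int.from_bytes(version_bytes, "little")  # how Bitcoin reads it
big = int.from_bytes(version_bytes, "big")        # the tempting wrong reading

print(little)  # 1
print(big)     # 16777216

# Going the other way: serialize version 2 the way a transaction would.
print((2).to_bytes(4, "little").hex())  # 02000000
```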

Version 1 is the original transaction format. Version 2 was introduced by BIP 68^† (2016) to enable relative timelocks via the nSequence field. The version number affects how certain fields are interpreted—we'll see this in Chapter 16 when we cover timelocks.

1.4 Input Count (1 byte)

Byte 4: Input Count (varint)

01

The next field is the number of inputs, encoded as a variable-length integer (varint). This transaction has 1 input.

Varint Encoding

Bitcoin uses a compact encoding for integers that might be small:

First byte	Format	Range
`0x00`–`0xFC`	1 byte, value is the byte itself	0–252
`0xFD`	3 bytes: `FD` + 2-byte LE integer	253–65,535
`0xFE`	5 bytes: `FE` + 4-byte LE integer	65,536–\(2^{32}-1\)
`0xFF`	9 bytes: `FF` + 8-byte LE integer	\(2^{32}\)–\(2^{64}-1\)

Most transactions have fewer than 253 inputs and outputs, so the varint is a single byte. This saves space: a 1-byte count instead of a fixed 4-byte integer saves 3 bytes per count field, and every byte matters when millions of transactions compete for block space.
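The table translates almost line for line into code. The sketch below is one way to write a reader and a writer for this encoding; the function names `read_varint` and `write_varint` are illustrative choices, not names taken from any particular Bitcoin library.

```python
def read_varint(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Return (value, bytes_consumed) for the varint starting at offset."""
    first = data[offset]
    if first < 0xFD:
        return first, 1
    width = {0xFD: 2, 0xFE: 4, 0xFF: 8}[first]
    value = int.from_bytes(data[offset + 1 : offset + 1 + width], "little")
    return value, 1 + width


def write_varint(value: int) -> bytes:
    """Encode a non-negative integer using the shortest form in the table."""
    if value < 0xFD:
        return bytes([value])
    if value <= 0xFFFF:
        return b"\xfd" + value.to_bytes(2, "little")
    if value <= 0xFFFFFFFF:
        return b"\xfe" + value.to_bytes(4, "little")
    return b"\xff" + value.to_bytes(8, "little")


print(read_varint(bytes.fromhex("01")))      # (1, 1)
print(read_varint(bytes.fromhex("fd0001")))  # (256, 3)
print(write_varint(72).hex())                # 48
print(write_varint(253).hex())               # fdfd00
```

Returning the number of bytes consumed alongside the value lets a caller advance its offset without knowing which row of the table applied.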

1.5 Input 0: Spending Satoshi's Coins

Each input identifies a specific previously-created output (a "UTXO"—Unspent Transaction Output) and provides the proof that the spender is authorized to spend it.

1.5.1 Previous Transaction Hash (32 bytes)

Bytes 5–36: Previous Transaction ID

c9 97 a5 e5 6e 10 41 02 fa 20 9c 6a 85 2d d9 06

60 a2 0b 2d 9c 35 24 23 ed ce 25 85 7f cd 37 04

These 32 bytes are the txid of the transaction whose output is being spent—but in reversed byte order (little-endian hash).

The bytes above, reversed, give us the txid:

0437cd7f8525ceed2324359c2d0ba26006d92d856a9c20fa0241106ee5a597c9

This is the coinbase transaction of Block 9^†—mined by Satoshi on January 9, 2009, the day the Bitcoin software was publicly released. The Genesis Block had been mined six days earlier on January 3, but no additional blocks appeared until Satoshi restarted mining on the 9th. Block 9 was one of the first blocks after that restart. The 50 BTC block reward sat unspent for three days before Satoshi sent 10 BTC of it to Hal.

Block 9 and the Coinbase Maturity Rule

Block 9 was mined on January 9, 2009. At the time, Satoshi was the sole miner on the network; Hal Finney wouldn't join until the next day. The 50 BTC coinbase reward from Block 9 couldn't be spent immediately—Bitcoin enforces a coinbase maturity rule requiring 100 confirmations before a coinbase output is spendable.^† By Block 170, Block 9 had 161 confirmations (\(170 - 9 = 161\)), so the coins were available. This rule prevents miners from spending rewards that might vanish if the block is orphaned in a chain reorganization.
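The maturity arithmetic can be written as a one-line predicate. This is a minimal sketch: the function name is hypothetical, and the comparison simply restates the 100-confirmation rule as a height difference.

```python
COINBASE_MATURITY = 100  # confirmations required, per the rule described above


def coinbase_spendable(coinbase_height: int, spend_height: int) -> bool:
    """True if a coinbase output created at coinbase_height may be spent
    by a transaction included in the block at spend_height."""
    return spend_height - coinbase_height >= COINBASE_MATURITY


print(coinbase_spendable(9, 170))  # True  (170 - 9 = 161)
print(coinbase_spendable(9, 50))   # False (50 - 9 = 41)
```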

The Reversed Txid Trap

Transaction IDs are SHA-256d hashes, and hashes are displayed in big-endian by convention (most significant byte first). But in the raw transaction, they are stored in little-endian (least significant byte first). You must reverse the 32 bytes to match what block explorers show.

This catches everyone. Even experienced developers get confused by it. The mismatch arises because Bitcoin treats a 32-byte digest as a 256-bit little-endian integer: the raw transaction holds the bytes in the order SHA-256 emits them, while block explorers and RPC output print that integer as big-endian hex, which reverses the byte order.
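In code the conversion is a single slice. The sketch below reverses the 32 bytes from the specimen's input and checks that the round trip is lossless; the variable names are illustrative.

```python
# The 32 bytes exactly as they appear in the raw input (bytes 5-36).
raw_prev_hash = bytes.fromhex(
    "c997a5e56e104102fa209c6a852dd906"
    "60a20b2d9c352423edce25857fcd3704"
)

# Reverse the byte order to get the form block explorers display.
display_txid = raw_prev_hash[::-1].hex()
print(display_txid)
# 0437cd7f8525ceed2324359c2d0ba26006d92d856a9c20fa0241106ee5a597c9

# And back again, e.g. when building an input from an explorer txid.
assert bytes.fromhex(display_txid)[::-1] == raw_prev_hash
```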

1.5.2 Output Index (4 bytes)

Bytes 37–40: Output Index (vout)

00 00 00 00

This is the index of the specific output being spent within the referenced transaction, as a 32-bit little-endian integer. Value: \(0\)—the first (and only) output of the Block 9 coinbase transaction.

Together, the previous txid and vout form a unique pointer to a specific UTXO:

0437cd7f...a597c9:0

This is the fundamental data structure of Bitcoin: every input points to exactly one previous output. Outputs are created; inputs consume them. An output that has never been consumed is "unspent"—a UTXO.

The UTXO Model: Bitcoin Has No Accounts

Bitcoin does not use accounts or balances. Instead, the entire state of the system is a set of unspent transaction outputs (UTXOs)—coins that have been created but not yet consumed.

Every transaction destroys some UTXOs (by referencing them as inputs) and creates new ones (as outputs). A wallet's "balance" is simply the sum of all UTXOs it can unlock. There is no database row that says "address X has Y bitcoins"—a node must scan the UTXO set and sum the values.

As of 2026, the UTXO set contains roughly 180 million entries. Every full node maintains this set in memory or on fast storage, because validating a new transaction requires proving that its inputs reference UTXOs that actually exist and have not been previously spent.

This model has a profound consequence: spending is destructive. You cannot "partially spend" a UTXO. If you want to spend less than its full value, you must consume it entirely and create a change output back to yourself. We will see exactly this pattern in the outputs below.
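A toy model makes the "consume whole, create change" rule concrete. The sketch below is deliberately simplified: the UTXO set is a plain dictionary, the txids are short placeholder strings rather than real 32-byte identifiers, and no scripts or signatures are checked.

```python
# Toy UTXO set: maps (txid, vout) -> value in satoshis.
# The txids are shortened placeholders, not real 32-byte identifiers.
utxo_set = {("block9-coinbase", 0): 5_000_000_000}


def apply_transaction(utxos, txid, inputs, output_values):
    """Consume every referenced outpoint and create the new outputs.
    Returns the implied fee in satoshis."""
    missing = [op for op in inputs if op not in utxos]
    if missing:
        raise ValueError(f"missing or already spent: {missing}")
    total_in = sum(utxos[op] for op in inputs)
    total_out = sum(output_values)
    if total_out > total_in:
        raise ValueError("outputs exceed inputs")
    for op in inputs:
        del utxos[op]  # spending destroys the whole UTXO
    for vout, value in enumerate(output_values):
        utxos[(txid, vout)] = value
    return total_in - total_out


fee = apply_transaction(
    utxo_set,
    "block170-payment",
    inputs=[("block9-coinbase", 0)],
    output_values=[1_000_000_000, 4_000_000_000],  # 10 BTC + 40 BTC change
)
print(fee)       # 0
print(utxo_set)  # only the two new outputs remain
```

Trying to apply a second transaction that names `("block9-coinbase", 0)` raises `ValueError`, which is the toy version of double-spend rejection.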

1.5.3 ScriptSig Length (1 byte)

Byte 41: ScriptSig Length (varint)

48

The next varint tells us the length of the scriptSig (the unlocking script) in bytes. 0x48 = 72 in decimal. The next 72 bytes are the scriptSig.

1.5.4 ScriptSig (72 bytes)

Bytes 42–113: ScriptSig (unlocking script)

47 30 44 02 20 4e 45 e1 69 32 b8 af 51 49 61 a1 d3

a1 a2 5f df 3f 4f 77 32 e9 d6 24 c6 c6 15 48 ab

5f b8 cd 41 02 20 18 15 22 ec 8e ca 07 de 48 60

a4 ac dd 12 90 9d 83 1c c5 6c bb ac 46 22 08 22

21 a8 76 8d 1d 09 01

This is the unlocking script—the proof that Satoshi is authorized to spend the Block 9 coinbase output. Let's trace every byte:

47: Data push. Push the next 71 bytes (0x47 = 71) onto the stack. This byte is not a named opcode like OP_CHECKSIG; byte values from 0x01 to 0x4b are data-push instructions whose value is simply the number of following bytes to push.

The 71 bytes that get pushed are a DER-encoded ECDSA signature followed by a SIGHASH flag. Let's parse the DER structure byte by byte:

Byte(s)	Hex	Meaning
43	`30`	DER sequence tag—everything that follows is a structured sequence
44	`44`	Sequence length: 68 bytes follow (the entire signature minus SIGHASH)
45	`02`	DER integer tag—the next value is a big-endian integer
46	`20`	Integer length: 32 bytes (\(r\) component)
47–78	`4e45e169...5fb8cd41`	The \(r\) value (32 bytes): one of two integers that make up the ECDSA signature^†
79	`02`	DER integer tag (second integer)
80	`20`	Integer length: 32 bytes (\(s\) component)
81–112	`181522ec...8d1d09`	The \(s\) value (32 bytes): the second signature integer, computed from the private key, the message hash, and the nonce (Chapter 3)
113	`01`	SIGHASH_ALL: this signature commits to all inputs and all outputs

Note that both \(r\) and \(s\) are exactly 32 bytes here (their first bytes, 4e and 18, both have the high bit clear, so no leading zero is needed).
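The byte-by-byte walk in the table can be mirrored by a small parser. This is a sketch that handles only the layout shown here (a single-byte sequence length and two integers); it does not attempt the full strictness rules that consensus code applies to signatures, and the function name is hypothetical.

```python
def parse_der_signature(sig_with_hashtype: bytes) -> dict:
    """Split a DER-encoded ECDSA signature plus trailing SIGHASH byte."""
    der, sighash = sig_with_hashtype[:-1], sig_with_hashtype[-1]
    if der[0] != 0x30 or der[1] != len(der) - 2:
        raise ValueError("bad DER sequence header")
    pos, ints = 2, []
    for _ in range(2):  # r, then s
        if der[pos] != 0x02:
            raise ValueError("expected DER integer tag")
        length = der[pos + 1]
        ints.append(der[pos + 2 : pos + 2 + length])
        pos += 2 + length
    if pos != len(der):
        raise ValueError("trailing bytes after s")
    r, s = (int.from_bytes(b, "big") for b in ints)
    return {"r": r, "s": s, "r_len": len(ints[0]), "s_len": len(ints[1]),
            "sighash": sighash}


# The 71 bytes pushed by the specimen's scriptSig (bytes 43-113).
pushed = bytes.fromhex(
    "3044"
    "0220" "4e45e16932b8af514961a1d3a1a25fdf3f4f7732e9d624c6c61548ab5fb8cd41"
    "0220" "181522ec8eca07de4860a4acdd12909d831cc56cbbac4622082221a8768d1d09"
    "01"
)
sig = parse_der_signature(pushed)
print(len(pushed), sig["r_len"], sig["s_len"], sig["sighash"])  # 71 32 32 1
print(hex(sig["r"]))
```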

DER Encoding: Why Signatures Have Variable Length

ECDSA signatures consist of two 256-bit integers \((r, s)\). In theory, each is exactly 32 bytes. In practice, DER encoding adds structure bytes and sometimes pads with a leading zero (when the high bit is set, to prevent the integer from being interpreted as negative).

A DER-encoded Bitcoin signature is typically 70–72 bytes, or 71–73 bytes once the SIGHASH flag is appended:

1 byte: sequence tag (30)
1 byte: total length
1 byte: integer tag (02)
1 byte: \(r\) length (32 or 33)
32–33 bytes: \(r\) value
1 byte: integer tag (02)
1 byte: \(s\) length (32 or 33)
32–33 bytes: \(s\) value
1 byte: SIGHASH flag

Total: 71–73 bytes including the SIGHASH flag (plus the push opcode). Our specimen sits at the low end: 70 bytes of DER plus 1 SIGHASH byte gives the 71 bytes pushed by 0x47. Taproot's Schnorr signatures (Chapter 12) fixed this: they are always exactly 64 bytes, with no DER encoding.
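The length variation is easy to reproduce with a small encoder. The sketch below builds the DER structure for made-up \(r\) and \(s\) values whose leading byte either has or lacks the high bit; the SIGHASH byte is not included, so add one to each printed length for the form that appears in a scriptSig. The helper names are illustrative.

```python
def der_integer(value: int) -> bytes:
    """DER-encode a non-negative integer: tag 0x02, length, big-endian bytes,
    with a 0x00 prefix when the top bit of the first byte is set."""
    body = value.to_bytes(max(1, (value.bit_length() + 7) // 8), "big")
    if body[0] & 0x80:
        body = b"\x00" + body
    return b"\x02" + bytes([len(body)]) + body


def der_signature(r: int, s: int) -> bytes:
    """Wrap r and s in a DER sequence (no SIGHASH byte appended)."""
    payload = der_integer(r) + der_integer(s)
    return b"\x30" + bytes([len(payload)]) + payload


# Made-up 256-bit values chosen only for their leading byte.
low = 0x4E << 248   # first byte 0x4e: high bit clear, stays 32 bytes
high = 0x8E << 248  # first byte 0x8e: high bit set, needs a 0x00 pad

print(len(der_signature(low, low)))    # 70
print(len(der_signature(high, low)))   # 71
print(len(der_signature(high, high)))  # 72
```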

Note what is not in the scriptSig: there is no public key. In a P2PK transaction, the public key is in the output script (the scriptPubKey of the previous transaction), not in the input. The scriptSig contains only the signature. We'll see this change in P2PKH (Chapter 5), where the public key moves to the scriptSig.

1.5.5 Sequence Number (4 bytes)

Bytes 114–117: Sequence Number

ff ff ff ff

The sequence number, as a 32-bit little-endian integer: 0xFFFFFFFF = \(4,294,967,295\).

Satoshi originally intended this field for a transaction replacement mechanism that was never implemented in its original form. The maximum value 0xFFFFFFFF means "this input is final—do not replace."

Later, this field was repurposed for two mechanisms:

BIP 125 (Replace-By-Fee)^†: If the sequence is less than 0xFFFFFFFE, the transaction signals that it can be replaced by a higher-fee version.
BIP 68 (Relative Timelocks): In version-2 transactions, sequence numbers below 0x80000000 encode a relative timelock.

We'll explore both in Chapter 16.
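Until then, the two thresholds just quoted can be captured in a few comparisons. This sketch classifies a sequence value using only those thresholds; it does not decode the timelock's units or duration, and treating "version 2" as "version 2 or higher" is an assumption made here.

```python
SEQUENCE_FINAL = 0xFFFFFFFF


def describe_sequence(sequence: int, tx_version: int) -> dict:
    """Classify an nSequence value using the thresholds quoted above."""
    return {
        "final": sequence == SEQUENCE_FINAL,
        # BIP 125: anything below 0xFFFFFFFE signals replaceability.
        "signals_rbf": sequence < 0xFFFFFFFE,
        # BIP 68: only meaningful from transaction version 2 (assumed >= 2).
        "relative_timelock": tx_version >= 2 and sequence < 0x80000000,
    }


seq = int.from_bytes(bytes.fromhex("ffffffff"), "little")
print(describe_sequence(seq, tx_version=1))
# {'final': True, 'signals_rbf': False, 'relative_timelock': False}
print(describe_sequence(10, tx_version=2))
# {'final': False, 'signals_rbf': True, 'relative_timelock': True}
```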

1.6 Output Count (1 byte)

Byte 118: Output Count (varint)

02

Two outputs. Satoshi is sending 10 BTC to Hal Finney and returning 40 BTC to himself as change.

1.7 Output 0: 10 BTC to Hal Finney

1.7.1 Value (8 bytes)

Bytes 119–126: Output Value

00 ca 9a 3b 00 00 00 00

The output value in satoshis, as a 64-bit little-endian integer:

0x000000003B9ACA00 = 1,000,000,000 satoshis = 10 BTC

Satoshis: Bitcoin's Atomic Unit

All values in Bitcoin transactions are denominated in satoshis (1 sat = \(10^{-8}\) BTC). There are no floating-point numbers anywhere in the protocol. This eliminates rounding errors entirely—an underappreciated design decision.

The maximum value of a 64-bit unsigned integer is \(2^{64} - 1 \approx 1.8 \times 10^{19}\), which is far larger than the maximum possible supply of \(21,000,000 \times 10^8 = 2.1 \times 10^{15}\) satoshis. The value field will never overflow.
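The decoding and the headroom claim can both be checked with integer arithmetic. In this sketch `Decimal` is used only to format a value for display, mirroring the point that the protocol itself never sees fractions; the helper name is illustrative.

```python
from decimal import Decimal

SATS_PER_BTC = 100_000_000


def sats_to_btc(sats: int) -> Decimal:
    """Exact conversion for display; the protocol itself only sees integers."""
    return Decimal(sats) / SATS_PER_BTC


value_bytes = bytes.fromhex("00ca9a3b00000000")
sats = int.from_bytes(value_bytes, "little")
print(sats)  # 1000000000
print(sats_to_btc(sats))

# The whole supply fits in the 8-byte field with room to spare.
max_supply_sats = 21_000_000 * SATS_PER_BTC
print(max_supply_sats < 2**64 - 1)  # True
```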

1.7.2 ScriptPubKey Length (1 byte)

Byte 127: ScriptPubKey Length

43

0x43 = 67 bytes. The next 67 bytes are the scriptPubKey (the locking script)—the conditions that must be satisfied to spend this output.

1.7.3 ScriptPubKey (67 bytes)

Bytes 128–194: ScriptPubKey (locking script)

41 04 ae 1a 62 fe 09 c5 f5 1b 13 90 5f 07 f0 6b 99

a2 f7 15 9b 22 25 f3 74 cd 37 8d 71 30 2f a2 84

14 e7 aa b3 73 97 f5 54 a7 df 5f 14 2c 21 c1 b7

30 3b 8a 06 26 f1 ba de d5 c7 2a 70 4f 7e 6c d8 4c

ac

This is a Pay-to-Public-Key (P2PK) script—the simplest possible locking mechanism:

OP_PUSHBYTES_65 <Hal Finney's uncompressed public key> OP_CHECKSIG

Breaking it down:

41 (= 65): Push the next 65 bytes onto the stack.
The 65 bytes are Hal Finney's uncompressed public key on the secp256k1 elliptic curve:
- 04 prefix: indicates an uncompressed point (both coordinates included)
- Next 32 bytes: the \(x\)-coordinate of the public key point
- Next 32 bytes: the \(y\)-coordinate of the public key point
ac: OP_CHECKSIG — verify the signature against this public key.
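The breakdown above amounts to a three-part pattern match: a push byte, the key, and a trailing 0xac. The following sketch recognizes that pattern and extracts the key; `parse_p2pk` is a hypothetical helper name, and accepting 33-byte keys alongside 65-byte ones is a convenience added here.

```python
OP_CHECKSIG = 0xAC


def parse_p2pk(script: bytes) -> bytes:
    """Return the public key from a P2PK scriptPubKey, or raise ValueError."""
    push_len = script[0]
    if push_len not in (33, 65):
        raise ValueError("first byte is not a 33- or 65-byte push")
    if len(script) != 1 + push_len + 1 or script[-1] != OP_CHECKSIG:
        raise ValueError("not <push pubkey> OP_CHECKSIG")
    return script[1 : 1 + push_len]


script_pubkey = bytes.fromhex(
    "41"  # push 65 bytes
    "04"  # uncompressed-point prefix
    "ae1a62fe09c5f51b13905f07f06b99a2f7159b2225f374cd378d71302fa28414"  # x
    "e7aab37397f554a7df5f142c21c1b7303b8a0626f1baded5c72a704f7e6cd84c"  # y
    "ac"  # OP_CHECKSIG
)
pubkey = parse_p2pk(script_pubkey)
print(len(script_pubkey), len(pubkey), hex(pubkey[0]))  # 67 65 0x4
```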

Public Keys and the secp256k1 Curve

Bitcoin's cryptography is built on the elliptic curve secp256k1,^† defined by the equation \[ y^2 = x^3 + 7 \pmod{p} \] where \(p = 2^{256} - 2^{32} - 977\), a 256-bit prime. A public key is a point \((x, y)\) on this curve, derived from a private key \(d\) by computing \(Q = dG\), where \(G\) is a fixed generator point.

An uncompressed public key stores both coordinates: the prefix byte 04 followed by 32 bytes for \(x\) and 32 bytes for \(y\), totaling 65 bytes. But since the curve equation determines \(y\) from \(x\) (up to a sign), you can save 32 bytes by storing only \(x\) plus a single bit indicating which of the two possible \(y\) values to use:

Format	Prefix	Size
Uncompressed	`04`	65 bytes (\(1 + 32 + 32\))
Compressed (even \(y\))	`02`	33 bytes (\(1 + 32\))
Compressed (odd \(y\))	`03`	33 bytes (\(1 + 32\))

In 2009, Bitcoin used uncompressed keys exclusively. Compressed keys were introduced later (2012–2013) and save 32 bytes per output—a significant savings at scale. Taproot (Chapter 12) uses 32-byte x-only keys, dropping even the prefix byte.
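To make the curve equation and the compression rule concrete, the sketch below takes the 65-byte key from Output 0, tests it against \(y^2 = x^3 + 7 \pmod{p}\), and re-encodes it in compressed form. The function names are illustrative, and the curve check should print True provided the key bytes above are transcribed correctly.

```python
P = 2**256 - 2**32 - 977  # the secp256k1 field prime given above


def parse_uncompressed(pubkey: bytes) -> tuple[int, int]:
    """Split a 65-byte 04||x||y key into integer coordinates."""
    if len(pubkey) != 65 or pubkey[0] != 0x04:
        raise ValueError("not an uncompressed public key")
    return (int.from_bytes(pubkey[1:33], "big"),
            int.from_bytes(pubkey[33:], "big"))


def is_on_curve(x: int, y: int) -> bool:
    """Check y^2 = x^3 + 7 (mod p)."""
    return (y * y - (x**3 + 7)) % P == 0


def compress(pubkey: bytes) -> bytes:
    """Re-encode as 02/03 || x, choosing the prefix from the parity of y."""
    x, y = parse_uncompressed(pubkey)
    return bytes([0x02 + (y & 1)]) + x.to_bytes(32, "big")


hal = bytes.fromhex(
    "04"
    "ae1a62fe09c5f51b13905f07f06b99a2f7159b2225f374cd378d71302fa28414"  # x
    "e7aab37397f554a7df5f142c21c1b7303b8a0626f1baded5c72a704f7e6cd84c"  # y
)
print(is_on_curve(*parse_uncompressed(hal)))
compressed = compress(hal)
print(len(compressed), compressed.hex()[:10])  # 33 02ae1a62fe
```

The last byte of \(y\) is 0x4c, which is even, so the compressed form takes the 02 prefix.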

Hal Finney's Public Key

The 65-byte uncompressed public key beginning with 04ae1a62... belongs to Hal Finney (1956–2014). This key is permanently recorded in Block 170 of the Bitcoin blockchain—the first public key ever used to receive a person-to-person Bitcoin transfer. Every byte of it is visible to anyone who queries the blockchain.

As we'll see in Chapter 5, P2PKH goes to great lengths to hide the public key behind a hash, revealing it only at the moment of spending. (Taproot, covered in Chapter 12, returns to placing a 32-byte key directly in the output.) But P2PK puts it right in the output, naked and permanent. This 65-byte key is the reason Hal Finney's coins are among the most quantum-vulnerable UTXOs on the network.

1.8 Output 1: 40 BTC Change to Satoshi

1.8.1 Value (8 bytes)

Bytes 195–202: Output Value

00 28 6b ee 00 00 00 00

0x00000000EE6B2800 = 4,000,000,000 satoshis = 40 BTC

Satoshi spent the entire 50 BTC coinbase output from Block 9 and sent 40 BTC back to himself. There is no explicit "change" mechanism in Bitcoin—you simply create a new output paying yourself.

Note: \(50 - 10 - 40 = 0\). The fee is zero. In 2009, there was no fee market—blocks were nearly empty and Satoshi was the miner.
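The fee arithmetic can be reproduced from the two value fields plus one number that is not in the transaction at all. In this sketch the 50 BTC input value is hard-coded as a stand-in for a UTXO lookup; the helper name is illustrative.

```python
def le_value(hex_bytes: str) -> int:
    """Decode an 8-byte little-endian satoshi amount."""
    return int.from_bytes(bytes.fromhex(hex_bytes), "little")


# Looked up from the Block 9 coinbase output; NOT present in the specimen.
input_value = 50 * 100_000_000

output_values = [
    le_value("00ca9a3b00000000"),  # output 0: to Hal
    le_value("00286bee00000000"),  # output 1: change
]

fee = input_value - sum(output_values)
print(output_values)  # [1000000000, 4000000000]
print(fee)            # 0
```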

1.8.2 ScriptPubKey Length (1 byte)

Byte 203: ScriptPubKey Length

43

0x43 = 67 bytes. Same length as Output 0—both are P2PK scripts with uncompressed keys.

1.8.3 ScriptPubKey (67 bytes)

Bytes 204–270: ScriptPubKey (locking script)

41 04 11 db 93 e1 dc db 8a 01 6b 49 84 0f 8c 53

bc 1e b6 8a 38 2e 97 b1 48 2e ca d7 b1 48 a6 90

9a 5c b2 e0 ea dd fb 84 cc f9 74 44 64 f8 2e 16

0b fa 9b 8b 64 f9 d4 c0 3f 99 9b 86 43 f6 56 b4

12 a3 ac

Same structure: OP_PUSHBYTES_65 <pubkey> OP_CHECKSIG. This public key (0411db93e1...) is Satoshi's—the same key that received the Block 9 coinbase reward.

Satoshi's Public Key: The Most Watched Coins in History

The public key 0411db93e1dcdb8a... is one of the most analyzed artifacts in Bitcoin. Researcher Sergio Demian Lerner identified a distinctive pattern in Satoshi's early mining activity—the "Patoshi pattern"—which links approximately 1.1 million BTC across Blocks 1 through 36,000 to a single miner, almost certainly Satoshi.^†

The coins sent back to this key in our transaction—40 BTC—have never moved. Nor have the vast majority of Patoshi coins. As of 2026, they have been dormant for over 17 years, worth billions of dollars at current prices. Every transaction from a Satoshi-era address makes worldwide news, and blockchain analysts monitor these UTXOs continuously.

Whether Satoshi lost the private keys, chose never to spend, or is simply waiting is unknown. The blockchain records the outputs but reveals nothing about intent.

Key Reuse: A Nuance

Satoshi sent the change back to the same public key that received the original coinbase output. This is key reuse—the same 65-byte public key now appears in multiple UTXOs.

In a P2PK transaction, however, this is less damaging than it sounds. The public key is already exposed in the scriptPubKey of every output, whether reused or not. There is no hash layer to lose. Satoshi's key 0411db93e1… was visible the moment the Block 9 coinbase was mined—before he ever spent a satoshi. Reusing it in the Block 170 change output revealed nothing new.

The practical consequences are limited to linkability (trivially connecting Block 9 and Block 170 to the same entity, which is useful to researchers like Lerner but meaningless in a two-person network) and consolidated quantum exposure (if a quantum computer ever solves the discrete logarithm problem for this key, all UTXOs sharing it fall at once, although every P2PK output is equally quantum-vulnerable regardless of reuse).

The real sin of key reuse arrives in Chapter 5 with P2PKH, where the public key is hidden behind RIPEMD160(SHA256(pubkey)) until the moment of spending. In that world, reusing an address means that once you spend any output, the exposed public key strips the hash protection from every remaining unspent output at that address. That is the original sin—losing a security layer you were supposed to have. P2PK never had the layer to lose.

In January 2009, none of this mattered: Satoshi was the only user besides Hal, hierarchical deterministic wallets (BIP 32)^† wouldn't exist for another four years, and there was no address-graph adversary to worry about. But the pattern Satoshi established here—reusing keys by default—persisted into the P2PKH era, where the consequences became real.

1.9 Locktime (4 bytes)

Bytes 271–274: Locktime

00 00 00 00

The locktime field: 0x00000000 = \(0\).

A locktime of zero means the transaction is valid immediately—it can be included in any block. Non-zero values restrict when the transaction can be mined:

Values \(< 500,000,000\): interpreted as a block height. The transaction cannot be mined before that block.
Values \(\geq 500,000,000\): interpreted as a Unix timestamp. The transaction cannot be mined before that time.

The dual interpretation based on a threshold value is unusual—but it works because the block height will not reach 500 million for roughly 9,500 years.
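The threshold rule and the back-of-the-envelope estimate can both be written down directly. This sketch only interprets the field; it does not model the additional conditions under which nodes enforce it. The function name and the sample values 800,000 and 1,700,000,000 are illustrative.

```python
LOCKTIME_THRESHOLD = 500_000_000


def describe_locktime(locktime: int) -> str:
    """Interpret a locktime value as a height or a Unix timestamp."""
    if locktime == 0:
        return "no lock: valid immediately"
    if locktime < LOCKTIME_THRESHOLD:
        return f"not before block height {locktime}"
    return f"not before Unix time {locktime}"


locktime = int.from_bytes(bytes.fromhex("00000000"), "little")
print(describe_locktime(locktime))       # no lock: valid immediately
print(describe_locktime(800_000))        # not before block height 800000
print(describe_locktime(1_700_000_000))  # not before Unix time 1700000000

# Rough check of the "9,500 years" figure at one block per 10 minutes.
years = LOCKTIME_THRESHOLD * 10 / (60 * 24 * 365.25)
print(round(years))  # 9506
```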

1.10 The Complete Map

Complete Transaction Map — Block 170 (275 bytes)
01000000 ← Version (4B): 1
01 ← Input count: 1
INPUT 0
c997a5e5 6e104102 fa209c6a 852dd906
60a20b2d 9c352423 edce2585 7fcd3704 ← Prev txid (32B)
00000000 ← Prev vout (4B): 0
48 ← ScriptSig length: 72B
47304402 204e45e1 6932b8af 514961a1
d3a1a25f df3f4f77 32e9d624 c6c61548
ab5fb8cd 41022018 1522ec8e ca07de48
60a4acdd 12909d83 1cc56cbb ac462208
2221a876 8d1d0901 ← Push 0x47, then DER sig (70B) + sighash 0x01
ffffffff ← Sequence: final
02 ← Output count: 2
OUTPUT 0 — 10 BTC to Hal Finney
00ca9a3b 00000000 ← Value (8B): 1,000,000,000 sats
43 ← ScriptPubKey length: 67B
4104ae1a 62fe09c5 f51b1390 5f07f06b
99a2f715 9b2225f3 74cd378d 71302fa2
8414e7aa b37397f5 54a7df5f 142c21c1
b7303b8a 0626f1ba ded5c72a 704f7e6c
d84c ac ← Pubkey (65B) + OP_CHECKSIG
OUTPUT 1 — 40 BTC change to Satoshi
00286bee 00000000 ← Value (8B): 4,000,000,000 sats
43 ← ScriptPubKey length: 67B
410411db 93e1dcdb 8a016b49 840f8c53
bc1eb68a 382e97b1 482ecad7 b148a690
9a5cb2e0 eaddfb84 ccf97444 64f82e16
0bfa9b8b 64f9d4c0 3f999b86 43f656b4
12a3 ac ← Pubkey (65B) + OP_CHECKSIG
00000000 ← Locktime (4B): 0
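The map can be turned into a parser that walks the bytes in the same order. The sketch below handles only the pre-SegWit layout described in this chapter; `parse_legacy_tx` and the dictionary keys are illustrative names, not the API of any Bitcoin library. The hex literal follows the rows of the map, with the 0x41 push byte included at the start of each scriptPubKey.

```python
import struct
from io import BytesIO

RAW_TX = bytes.fromhex("""
01000000 01
c997a5e5 6e104102 fa209c6a 852dd906 60a20b2d 9c352423 edce2585 7fcd3704
00000000 48
47304402 204e45e1 6932b8af 514961a1 d3a1a25f df3f4f77 32e9d624 c6c61548
ab5fb8cd 41022018 1522ec8e ca07de48 60a4acdd 12909d83 1cc56cbb ac462208
2221a876 8d1d0901
ffffffff 02
00ca9a3b 00000000 43
4104ae1a 62fe09c5 f51b1390 5f07f06b 99a2f715 9b2225f3 74cd378d 71302fa2
8414e7aa b37397f5 54a7df5f 142c21c1 b7303b8a 0626f1ba ded5c72a 704f7e6c
d84cac
00286bee 00000000 43
410411db 93e1dcdb 8a016b49 840f8c53 bc1eb68a 382e97b1 482ecad7 b148a690
9a5cb2e0 eaddfb84 ccf97444 64f82e16 0bfa9b8b 64f9d4c0 3f999b86 43f656b4
12a3ac
00000000
""")


def read_varint(s: BytesIO) -> int:
    """Read one variable-length integer from the stream."""
    first = s.read(1)[0]
    if first < 0xFD:
        return first
    width = {0xFD: 2, 0xFE: 4, 0xFF: 8}[first]
    return int.from_bytes(s.read(width), "little")


def parse_legacy_tx(raw: bytes) -> dict:
    """Parse version, inputs, outputs, locktime from a pre-SegWit transaction."""
    s = BytesIO(raw)
    tx = {"version": struct.unpack("<I", s.read(4))[0],
          "inputs": [], "outputs": []}
    for _ in range(read_varint(s)):
        prev_txid = s.read(32)[::-1].hex()  # reversed for display
        vout = struct.unpack("<I", s.read(4))[0]
        script_sig = s.read(read_varint(s))
        sequence = struct.unpack("<I", s.read(4))[0]
        tx["inputs"].append({"txid": prev_txid, "vout": vout,
                             "script_sig": script_sig, "sequence": sequence})
    for _ in range(read_varint(s)):
        value = struct.unpack("<Q", s.read(8))[0]
        script_pubkey = s.read(read_varint(s))
        tx["outputs"].append({"value": value, "script_pubkey": script_pubkey})
    tx["locktime"] = struct.unpack("<I", s.read(4))[0]
    if s.read(1):
        raise ValueError("trailing bytes after locktime")
    return tx


tx = parse_legacy_tx(RAW_TX)
print(len(RAW_TX))                          # 275
print(tx["version"], tx["locktime"])        # 1 0
print(tx["inputs"][0]["txid"], tx["inputs"][0]["vout"])
print([o["value"] for o in tx["outputs"]])  # [1000000000, 4000000000]
```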

1.11 Computing the Transaction ID

The transaction ID (txid) is the fingerprint of the transaction, computed by double-hashing the entire serialized byte sequence:

\[ \text{txid} = \text{SHA-256}(\text{SHA-256}(\texttt{raw\_tx})) \]

The process has three steps:

Serialize: concatenate all 275 bytes exactly as they appear in the raw hex above—version through locktime, no gaps, no padding.
Double hash: feed the 275 bytes into SHA-256 to get a 32-byte intermediate hash, then feed that intermediate hash into SHA-256 again to get the final 32-byte digest.
Reverse: display the resulting 32 bytes in big-endian (reversed) order, because Bitcoin's internal hash representation is little-endian but human convention is big-endian.

The result for our specimen:

SHA-256 pass 1	`240cf324ec3cf59609733e2a45e1408673306be8dcd4caf3067aa9355a0269e3`
SHA-256 pass 2	`169e1e83e930853391bc6f35f605c6754cfead57cf8387639d3b4096c54f18f4`
Reversed (txid)	`f4184fc596403b9d638783cf57adfe4c75c605f6356fbc91338530e9831e9e16`

This is the permanent, immutable identifier of the first person-to-person Bitcoin transaction. You can verify it yourself: take the 275 raw bytes, run them through any SHA-256 implementation twice, and reverse the result. Every block explorer in the world will return the same hash.
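One way to carry out that verification is with Python's standard `hashlib`. The sketch below assumes the hex literal reproduces the 275 specimen bytes exactly; if it does, the printed value should match the reversed txid in the table above.

```python
import hashlib

RAW_TX = bytes.fromhex("""
01000000 01
c997a5e5 6e104102 fa209c6a 852dd906 60a20b2d 9c352423 edce2585 7fcd3704
00000000 48
47304402 204e45e1 6932b8af 514961a1 d3a1a25f df3f4f77 32e9d624 c6c61548
ab5fb8cd 41022018 1522ec8e ca07de48 60a4acdd 12909d83 1cc56cbb ac462208
2221a876 8d1d0901
ffffffff 02
00ca9a3b 00000000 43
4104ae1a 62fe09c5 f51b1390 5f07f06b 99a2f715 9b2225f3 74cd378d 71302fa2
8414e7aa b37397f5 54a7df5f 142c21c1 b7303b8a 0626f1ba ded5c72a 704f7e6c
d84cac
00286bee 00000000 43
410411db 93e1dcdb 8a016b49 840f8c53 bc1eb68a 382e97b1 482ecad7 b148a690
9a5cb2e0 eaddfb84 ccf97444 64f82e16 0bfa9b8b 64f9d4c0 3f999b86 43f656b4
12a3ac
00000000
""")

# Step 1: the serialized bytes are already in hand.
# Step 2: hash twice with SHA-256.
first_pass = hashlib.sha256(RAW_TX).digest()
second_pass = hashlib.sha256(first_pass).digest()

# Step 3: reverse the digest for display.
txid = second_pass[::-1].hex()

print(len(RAW_TX))
print(second_pass.hex())
print(txid)
```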

The txid can never change because the transaction bytes can never change—they are embedded in Block 170, which is buried under 940,000+ subsequent blocks of proof-of-work.

Why Double SHA-256?

Bitcoin uses SHA-256 applied twice (SHA-256d) rather than once.^† Satoshi never explained why, but the likely reason is defense against length-extension attacks. SHA-256 (like all Merkle–Damgård hash functions) is vulnerable to an attack where knowing \(H(m)\) lets you compute \(H(m \| \text{padding} \| m')\) without knowing \(m\). Applying SHA-256 twice prevents this.

This same construction is used for block hashes, Merkle trees, and the checksum in Base58Check addresses. (The address hash itself is RIPEMD160(SHA256(pubkey)), as Chapter 5 shows.)

1.12 What We Learned

This 275-byte transaction contains:

A version number (4 bytes, little-endian) declaring the serialization format.
An input that references a specific UTXO by txid:vout and provides a signature proving authorization.
Two outputs that lock new UTXOs behind public-key scripts.
A locktime of zero (immediately valid).

The total value of the input (50 BTC from Block 9's coinbase) equals the total value of the outputs (\(10 + 40 = 50\) BTC). The difference (the fee) is zero. In practice, the fee is almost always positive—miners have little reason to include zero-fee transactions, though they remain consensus-valid.

The output scripts are P2PK: raw public keys with OP_CHECKSIG. This is the simplest possible locking mechanism, and we'll see in Chapter 2 how the Script VM executes it. But first, notice what this format tells us about Bitcoin's architecture:

There is no "from" address. Inputs don't specify who is sending—they point to a UTXO and provide a proof. The concept of a "sender" is an abstraction that wallets construct for the user.
There is no "balance" anywhere. A wallet's balance is the sum of all UTXOs it can spend. The protocol knows nothing about balances.
Change is explicit. If you want to spend less than the full UTXO, you must create a change output back to yourself. There is no implicit "remainder."
The fee is implicit. The fee is the difference between total input value and total output value. It is never stated explicitly—it is computed by every validating node.

1.12.1 Looking Ahead: The Script Machine

We have parsed every byte of this transaction, but we have not yet executed it. The scriptSig and scriptPubKey are not just data—they are programs. Bitcoin nodes don't simply check that the fields are well-formed; they run the scripts through a stack-based virtual machine and accept the transaction only if execution succeeds.

In Chapter 2, we will watch the Script VM execute this exact transaction step by step, seeing how OP_CHECKSIG pops a public key and a signature off the stack, performs an ECDSA verification, and pushes TRUE if and only if Satoshi's signature is valid for the public key in the output being spent: Satoshi's own key from the Block 9 coinbase, not Hal Finney's. The stack machine is where Bitcoin's programmable money lives, and it is surprisingly simple.

1.13 Exercises

Litmus

The version field bytes are 01 00 00 00. What integer does this represent, and why isn't it \(16,777,216\)?
The varint byte 48 appears at offset 41. What decimal value does it represent, and what does it count?
The output value bytes 00 28 6b ee 00 00 00 00 encode how many satoshis? How many BTC is that?
True or false: the transaction ID displayed on block explorers has the same byte order as the 32-byte hash stored in the raw transaction.
What is the total fee (in satoshis) for this transaction? How do you compute it from the raw bytes?

Hands-On

Using a block explorer (e.g., mempool.space), look up the Block 9 coinbase transaction (txid 0437cd7f8525ceed2324359c2d0ba26006d92d856a9c20fa0241106ee5a597c9). Identify its output value and scriptPubKey. Verify that the input of our specimen points to output index 0 of this transaction.
Take the raw hex of our specimen and manually locate every varint. List each one with its byte offset, hex value, and the decimal value it encodes. How many total varints are there? (Remember: every count field and every script-length field uses varint encoding.)
The DER-encoded signature in the scriptSig starts at byte 43. Extract the \(r\) and \(s\) components (32 bytes each). Write them in hex. Why might \(r\) or \(s\) sometimes be 33 bytes in other transactions?
Reverse the 32-byte previous transaction hash from the raw input to produce the txid in standard (big-endian) display format. Verify your result matches the Block 9 coinbase txid.

Proofs

Varint efficiency. Consider a transaction with \(n\) inputs and \(m\) outputs, where all counts and script lengths are \(\leq 252\). How many varint fields does the transaction contain? How many bytes does varint encoding save compared to fixed 4-byte integers? Compute the savings for our specimen (\(n = 1, m = 2\)).
Fee uniqueness. Prove that the transaction fee is uniquely determined by the raw bytes together with the referenced UTXO set, with no other external information needed. Then explain why a node cannot verify the fee from the transaction bytes alone.
No sender identity. In a traditional banking system, a wire transfer has a "from" field. Explain precisely why Bitcoin transactions have no "from" field. What information would a node need to reconstruct a "sender," and where would it come from?

Connections

Quantum vulnerability. Hal Finney's uncompressed public key (04ae1a62...) is exposed directly in the scriptPubKey. In the context of The Quantum Threat, explain why P2PK outputs are the most vulnerable to a quantum computer running Shor's algorithm. How does the P2PKH scheme (Chapter 5) add a layer of protection, and what are its limits?
SegWit weight. This pre-SegWit transaction is 275 bytes. If SegWit had existed in 2009, the transaction could have moved the signature to the witness. Using the weight formula \(\text{weight} = \text{base size} \times 3 + \text{total size}\), estimate the weight and virtual size (vsize) of a hypothetical SegWit version of this transaction. (Assume the witness would contain the 72-byte signature; the remaining fields stay in the base.)

Bridge

What does OP_CHECKSIG actually do? We know the scriptPubKey is <pubkey> OP_CHECKSIG and the scriptSig is <signature>. Describe, in words, the sequence of stack operations that must happen to verify this transaction. What gets pushed? What gets popped? What determines "success"? (Chapter 2 will make this precise.)
What gets signed? The scriptSig contains a signature, but a signature must be over something—some message digest. The signature cannot be over the full transaction, because the transaction contains the signature (circular dependency). Speculate: what parts of the transaction does the signature commit to, and how is the circularity resolved? (Chapter 3 gives the full answer.)

1.14 Solutions

Litmus Solutions

L1. The bytes 01 00 00 00 are in little-endian order, meaning the least significant byte comes first. Reading them as a 32-bit integer: \(\texttt{0x00000001} = 1\). The value \(16,777,216 = \texttt{0x01000000}\) would be the big-endian interpretation, which is not how Bitcoin encodes integers.

L2. 0x48 = \(72\) in decimal. It is a varint encoding the length of the scriptSig in bytes. Since \(72 < 253\), the varint is a single byte equal to the value itself.

L3. Little-endian:

00 28 6b ee 00 00 00 00 → 0x00000000EE6B2800 = 4,000,000,000 satoshis = 40 BTC.

L4. False. Transaction IDs on block explorers are displayed in big-endian (most significant byte first), but they are stored in the raw transaction in little-endian (reversed) byte order. You must reverse the 32 bytes to convert between the two formats.

L5. The input spends a 50 BTC coinbase output. The outputs are 10 BTC + 40 BTC = 50 BTC. The fee is \(50 - 50 = 0\) satoshis. You compute it as: (sum of input values) \(-\) (sum of output values). The input values are not in the transaction itself—they come from the referenced UTXOs.

Hands-On Solutions

H1. The Block 9 coinbase transaction has one output: 50 BTC (\(5,000,000,000\) satoshis) locked to scriptPubKey 410411db93...ac—a P2PK script containing Satoshi's public key (0411db93e1dcdb8a…). Our specimen's input references txid 0437cd7f...a597c9 at vout \(= 0\), which matches this output. (Note: this is a different key from the Genesis Block's coinbase, which uses 04678afdb0….)

H2. The varints in this transaction are:

Offset 4: 01 = 1 (input count)
Offset 41: 48 = 72 (scriptSig length)
Offset 118: 02 = 2 (output count)
Offset 127: 43 = 67 (output 0 scriptPubKey length)
Offset 203: 43 = 67 (output 1 scriptPubKey length)

Total: 5 varints. (The output count uses the same varint encoding as input count—both are varints. Every count and every script length field is a varint.)

H3. The DER signature begins at byte 43 (after the push opcode 47 at byte 42):

\(r\) (bytes 47–78): 4e45e16932b8af514961a1d3a1a25fdf 3f4f7732e9d624c6c61548ab5fb8cd41
\(s\) (bytes 81–112): 181522ec8eca07de4860a4acdd12909d 831cc56cbbac4622082221a8768d1d09

An \(r\) or \(s\) value is 33 bytes when its most significant bit is 1 (i.e., the first byte is \(\geq\) 0x80). DER encoding prepends a 00 byte to prevent the integer from being interpreted as negative.

H4. The raw bytes (in transaction order):

c997a5e5 6e104102 fa209c6a 852dd906 60a20b2d 9c352423 edce2585 7fcd3704

Reversed (reading 32 bytes from right to left):

0437cd7f 8525ceed 2324359c 2d0ba260 06d92d85 6a9c20fa 0241106e e5a597c9

This matches the Block 9 coinbase txid.

Proofs Solutions

P1. The varint fields in a transaction with \(n\) inputs and \(m\) outputs are:

1 input count varint
\(n\) scriptSig length varints (one per input)
1 output count varint
\(m\) scriptPubKey length varints (one per output)

Total varint fields: \(n + m + 2\). When all values are \(\leq 252\), each varint uses exactly 1 byte. A fixed 4-byte encoding would use \(4(n + m + 2)\) bytes. Savings: \[ 4(n + m + 2) - (n + m + 2) = 3(n + m + 2) = 3n + 3m + 6\ \text{bytes}. \] For our specimen (\(n = 1, m = 2\)): \(3(1) + 3(2) + 6 = 15\) bytes saved. With 5 varint fields each using 1 byte instead of 4, that's \(5 \times 3 = 15\) bytes. On a 275-byte transaction, those 15 bytes represent a 5.5% reduction—modest, but across millions of transactions, it adds up to megabytes of block space.
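The count can be checked numerically. The sketch below uses hypothetical helper names and compares each varint against a fixed 4-byte field, taking the script lengths of each input and output as its arguments.

```python
def varint_size(value: int) -> int:
    """Number of bytes Bitcoin's varint encoding uses for a value."""
    if value <= 0xFC:            # 0..252 fit in a single byte
        return 1
    if value <= 0xFFFF:
        return 3
    if value <= 0xFFFFFFFF:
        return 5
    return 9

def bytes_saved(script_sig_lens, script_pubkey_lens, fixed_width=4):
    """Bytes saved by varints versus fixed-width count and length fields."""
    fields = [
        len(script_sig_lens), *script_sig_lens,        # input count + n lengths
        len(script_pubkey_lens), *script_pubkey_lens,  # output count + m lengths
    ]
    return sum(fixed_width - varint_size(v) for v in fields)

# The specimen: one 72-byte scriptSig, two 67-byte scriptPubKeys.
saved = bytes_saved([72], [67, 67])
print(saved)
```

Once any value exceeds 252, its varint grows to 3 bytes and the saving for that field drops from 3 bytes to 1, so the closed form \(3n + 3m + 6\) applies only under the stated assumption.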

P2. The fee equals \(V_{in} - V_{out}\). The output values \(V_{out}\) are encoded directly in the transaction bytes. However, the input values \(V_{in}\) are not in the transaction—each input specifies only a (txid, vout) pair pointing to a previous output. To learn \(V_{in}\), a node must look up the referenced UTXO in its UTXO set (or in the referenced transaction). Therefore, the fee is uniquely determined by the raw bytes plus the UTXO set, but a node cannot verify the fee from the transaction bytes alone—it must have access to the UTXOs being spent.
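The dependency on the UTXO set shows up directly in code. In this sketch the UTXO set is modeled as a plain dictionary, which is an assumption for illustration only, and `compute_fee` is a hypothetical name.

```python
COIN = 100_000_000  # satoshis per bitcoin

def compute_fee(inputs, output_values, utxo_set):
    """Fee in satoshis: sum of spent UTXO values minus sum of output values.

    inputs        -- list of (txid, vout) pairs taken from the transaction
    output_values -- output amounts in satoshis, taken from the transaction
    utxo_set      -- dict mapping (txid, vout) to the value of that output
    """
    # A KeyError here means the node does not know the output being spent.
    total_in = sum(utxo_set[outpoint] for outpoint in inputs)
    return total_in - sum(output_values)

block9_coinbase = "0437cd7f8525ceed2324359c2d0ba26006d92d856a9c20fa0241106ee5a597c9"
utxo_set = {(block9_coinbase, 0): 50 * COIN}

inputs = [(block9_coinbase, 0)]
output_values = [10 * COIN, 40 * COIN]
fee = compute_fee(inputs, output_values, utxo_set)
print(fee)
```

Calling `compute_fee` with an empty `utxo_set` fails, which is the point of P2: the transaction bytes alone do not carry enough information.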

P3. A Bitcoin transaction input contains a reference to a UTXO (txid:vout) and a scriptSig that satisfies the UTXO's locking conditions. It does not contain a "from" field—there is no sender address, no account identifier, no identity. The concept of "sender" is a wallet-layer abstraction constructed by:

Looking up the scriptPubKey of the referenced UTXO.
Deriving the address that scriptPubKey encodes (e.g., hashing the public key for P2PKH).
Labeling that address as the "sender."

This is imprecise: the "sender" is really the entity that can produce a valid scriptSig, which may be different from the entity that originally received the UTXO. Bitcoin's protocol knows only UTXOs and proofs, not identities.

Connections Solutions

C1. P2PK outputs store the full public key directly in the scriptPubKey. A quantum computer running Shor's algorithm on a sufficiently large quantum register could extract the private key from this exposed public key, then forge a valid signature to steal the funds. Every P2PK output is vulnerable the moment it is created—the public key is visible to anyone scanning the blockchain.

P2PKH (Chapter 5) replaces the raw public key with its RIPEMD-160(SHA-256) hash. The public key is not revealed until the UTXO is spent (when the spender provides the key in the scriptSig). This means an unspent P2PKH output is protected: a quantum attacker cannot run Shor's algorithm without the public key. However, once the UTXO is spent (or if the address is reused), the public key is exposed, and the protection vanishes. P2PKH buys time, not immunity.

C2. Our transaction is 275 bytes total. If we moved the 72-byte signature to the witness:

Base size (non-witness): \(275 - 72 = 203\) bytes (approximately). The scriptSig length varint stays at 1 byte and now encodes 0.
Total size: \(203 + 72 + 4 = 279\) bytes. The witness is still transmitted, and the SegWit serialization adds marker (1 byte) + flag (1 byte) + witness item count (1 byte) + witness item length (1 byte).
Weight \(= 203 \times 3 + 279 = 609 + 279 = 888\) weight units.
Vsize \(= \lceil 888 / 4 \rceil = 222\) vbytes.

The virtual size (222 vbytes) is smaller than the 275-byte legacy size, which means lower fees at the same fee rate under SegWit's fee calculation. This is the fee incentive that drives SegWit adoption.
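The arithmetic can be sketched as follows. The function name is illustrative, and the numbers follow the simplifying premise used in this solution (72 bytes moved, 4 bytes of SegWit overhead).

```python
def weight_and_vsize(base_size: int, total_size: int) -> tuple[int, int]:
    """Weight = base_size * 3 + total_size; vsize = ceil(weight / 4)."""
    weight = base_size * 3 + total_size
    vsize = (weight + 3) // 4        # integer ceiling division
    return weight, vsize

legacy_size = 275
moved_bytes = 72        # bytes moved from the scriptSig into the witness
segwit_overhead = 4     # marker + flag + witness item count + item length

base_size = legacy_size - moved_bytes
total_size = base_size + moved_bytes + segwit_overhead
weight, vsize = weight_and_vsize(base_size, total_size)
print(base_size, total_size, weight, vsize)
```

For a transaction with no witness data, base size and total size are equal, so the weight is four times the size and the vsize equals the byte size: `weight_and_vsize(275, 275)` gives a vsize of 275.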

Bridge Solutions

B1. The Script VM runs the scriptSig first (pushing the signature), then runs the scriptPubKey against the resulting stack—two phases, never concatenated (see Chapter 2):

Push signature (from scriptSig): The 71-byte signature (a 70-byte DER encoding plus the 1-byte SIGHASH flag) is pushed onto the stack.
Push public key (from scriptPubKey): The 65-byte uncompressed public key is pushed onto the stack. (The push opcode 41 in the scriptPubKey handles this.)
OP_CHECKSIG: Pops the top two items (public key, then signature). Computes the transaction digest (the "message" being signed), performs ECDSA verification of the signature against the public key, and pushes TRUE (1) if valid, FALSE (0) if not.

The transaction is valid if and only if the stack's top element is TRUE after execution. Chapter 2 traces this process with the actual hex values.
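The three steps can be mimicked with a toy stack. This is a sketch, not a Script interpreter: it handles only the P2PK pattern, and `stub_checksig` is a stand-in that checks lengths instead of performing ECDSA verification. All names are hypothetical.

```python
from typing import Callable

def run_p2pk(script_sig_pushes: list[bytes], pubkey: bytes,
             checksig: Callable[[bytes, bytes], bool]) -> bool:
    """Evaluate <sig> against <pubkey> OP_CHECKSIG on a shared stack."""
    stack: list[bytes] = []

    # Phase 1: the scriptSig runs and pushes its data (here, the signature).
    for item in script_sig_pushes:
        stack.append(item)

    # Phase 2: the scriptPubKey runs on the resulting stack.
    stack.append(pubkey)                     # push opcode 41 + 65-byte key
    if len(stack) < 2:
        return False                         # OP_CHECKSIG needs two items
    pk = stack.pop()                         # top item: public key
    sig = stack.pop()                        # next item: signature
    # TRUE is the byte 01; FALSE is the empty byte string (numeric 0).
    stack.append(b"\x01" if checksig(sig, pk) else b"")

    # Simplified truthiness: only the empty string counts as false here.
    return bool(stack) and stack[-1] != b""

def stub_checksig(sig: bytes, pubkey: bytes) -> bool:
    """Placeholder for ECDSA verification: accepts by length only."""
    return len(sig) == 71 and len(pubkey) == 65

valid = run_p2pk([bytes(71)], bytes(65), stub_checksig)
print(valid)
```

Swapping `stub_checksig` for a real verifier would also require the transaction digest, which is the subject of B2.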

B2. The signature cannot cover the entire transaction because the scriptSig contains the signature: a chicken-and-egg problem. The solution: before computing the signature hash, the scriptSig of the input being signed is replaced with the scriptPubKey of the output being spent, and the scriptSigs of any other inputs are emptied. The modified transaction (with the signature "hole" filled by the locking script) is then serialized, the 4-byte SIGHASH type is appended, and the result is double-SHA-256 hashed to produce the message digest that gets signed. The SIGHASH flag controls which inputs and outputs remain in that serialization. This is the sighash algorithm, and Chapter 3 covers it in full detail, including all four SIGHASH flag variants.
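For a one-input transaction signed with SIGHASH_ALL, the substitution can be sketched as below. This is a simplified illustration under stated assumptions: it ignores OP_CODESEPARATOR handling and the other SIGHASH modes, the function names are hypothetical, and the demonstration uses zero-filled placeholder bytes, not the specimen's real scripts.

```python
import hashlib
import struct

def sighash_all_preimage(version: int, prev_txid_le: bytes, vout: int,
                         script_code: bytes, sequence: int,
                         outputs_blob: bytes, locktime: int) -> bytes:
    """Serialize a one-input transaction with the scriptSig slot filled by
    the spent output's scriptPubKey, then append the 4-byte SIGHASH type."""
    assert len(script_code) < 0xFD           # keep the length varint at 1 byte
    return (
        struct.pack("<I", version)
        + b"\x01"                            # input count
        + prev_txid_le + struct.pack("<I", vout)
        + bytes([len(script_code)]) + script_code
        + struct.pack("<I", sequence)
        + outputs_blob                       # output count + outputs, unchanged
        + struct.pack("<I", locktime)
        + struct.pack("<I", 1)               # SIGHASH_ALL as 4 bytes, little-endian
    )

def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

# Placeholder data with the specimen's field lengths.
script_code = bytes(67)                                # spent scriptPubKey
one_output = bytes(8) + b"\x43" + bytes(67)            # value + length + script
outputs_blob = b"\x02" + one_output * 2

preimage = sighash_all_preimage(1, bytes(32), 0, script_code,
                                0xFFFFFFFF, outputs_blob, 0)
digest = double_sha256(preimage)
```

The 32-byte `digest` is what the signer's private key signs and what OP_CHECKSIG recomputes during verification.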
