In the late 1960s, Charles Moore was writing programs to control radio telescopes at the National Radio Astronomy Observatory in Charlottesville, Virginia. The computations were repetitive—convert coordinates, apply corrections, format output—and the existing programming tools were unwieldy. So Moore built his own language, one that eventually became Forth: a stack-based language where every operation pops its arguments from a shared stack and pushes its result back. No variables, no complex syntax, just a stack and a dictionary of words.
Forth was small, fast, and deterministic. It ran on embedded systems, spacecraft, and factory controllers—anywhere you needed reliable execution in tight constraints. Decades later, when Satoshi Nakamoto needed a scripting language for Bitcoin transactions, the design requirements were strikingly similar: it had to be small (every byte costs block space), deterministic (every node must reach the same result), and safe (malicious scripts must not be able to crash the network).
Satoshi's solution was a Forth-like stack machine stripped down to its bare essentials. No loops, no recursion, no floating-point arithmetic, no file I/O. Just a stack, a set of opcodes, and a simple rule: if the script finishes with a non-zero value on top of the stack, the transaction is valid.
In Chapter 1, we parsed every byte of the Satoshi-to-Hal transaction. Now we execute it.
Every Bitcoin input is validated by running two scripts in sequence:
<pubkey> OP_CHECKSIG from Block 9's coinbase.
A common misconception is that Bitcoin concatenates the scriptSig and scriptPubKey into a single program and runs it. That was once true: Satoshi's original implementation did concatenate them. The function VerifySignature() in script.cpp literally joined scriptSig + OP_CODESEPARATOR + scriptPubKey and executed the result as a single program.† This was changed because concatenation created a subtle vulnerability: a malicious scriptSig could manipulate the stack in ways that bypassed the scriptPubKey's checks.
The modern evaluation model (since mid-2010) runs them separately:
The key insight: the scriptSig has no access to the scriptPubKey's code, and the scriptPubKey operates on whatever the scriptSig left behind. The scriptPubKey is a lock; the scriptSig is a key. The lock defines success; the key provides the evidence.
Why concatenation was dangerous: Under concatenation, a scriptSig of OP_1 OP_RETURN would push TRUE onto the stack, then OP_RETURN would halt execution immediately—leaving TRUE on top regardless of what the scriptPubKey contained.† More broadly, a scriptSig could include opcodes like OP_DUP or OP_HASH160 that interfere with the scriptPubKey's stack expectations. Separation eliminates this entire class of attacks: the scriptSig can only leave data for the scriptPubKey, never inject operations into its execution flow.
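The difference between the two evaluation models can be made concrete with a toy interpreter. This is a minimal sketch, not Bitcoin Core's logic: the token names OP_RETURN_2009 and LOCK_CHECK_FAILS are hypothetical labels invented here, the first modelling the original "halt without failing" behaviour described above and the second standing in for any scriptPubKey check the attacker cannot satisfy.

```python
# Toy model: scripts are lists of token strings, stack items are bytes.
# "OP_RETURN_2009" models the ORIGINAL behaviour (stop executing, no failure).
# "LOCK_CHECK_FAILS" is a hypothetical stand-in for a scriptPubKey check
# that the spender cannot satisfy.

def run(script, stack):
    """Execute tokens on `stack`. Return False only on an explicit failure."""
    for op in script:
        if op == "OP_1":
            stack.append(b"\x01")
        elif op == "OP_RETURN_2009":
            return True            # halt immediately, leaving the stack as-is
        elif op == "LOCK_CHECK_FAILS":
            return False
        else:
            raise ValueError(f"unknown token: {op}")
    return True

def top_is_true(stack):
    # Simplified truth test: non-empty top element with any non-zero byte.
    return bool(stack) and any(stack[-1])

def eval_concatenated(script_sig, script_pubkey):
    """Pre-2010 model: one program, one pass."""
    stack = []
    return run(script_sig + script_pubkey, stack) and top_is_true(stack)

def eval_separate(script_sig, script_pubkey):
    """Modern model: two programs; the second inherits only the stack."""
    stack = []
    if not run(script_sig, stack):
        return False
    return run(script_pubkey, stack) and top_is_true(stack)

attack = ["OP_1", "OP_RETURN_2009"]
lock = ["LOCK_CHECK_FAILS"]
print(eval_concatenated(attack, lock))  # True: the lock never ran
print(eval_separate(attack, lock))      # False: the lock ran and failed
```

Under concatenation the halt happens before the lock's code is ever reached. Under separation the halt ends only the scriptSig's own run, and the lock still executes in full.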
The script engine maintains a main stack and an alt stack. Both hold arbitrary byte arrays (not typed values—everything is raw bytes, interpreted according to context). Almost all operations use the main stack; the alt stack is used only by the opcodes OP_TOALTSTACK and OP_FROMALTSTACK for temporary storage.
The main stack is a last-in, first-out (LIFO) data structure:
There are no named variables, no registers, no heap. Every value lives on the stack or nowhere. This radical simplicity is the point: a stack machine is easy to analyze, easy to bound, and hard to exploit.
The alt stack serves as temporary storage: OP_TOALTSTACK moves the top item from the main stack to the alt stack, and OP_FROMALTSTACK moves it back. This is useful when a script needs to rearrange deeply nested stack elements—move the top items to the alt stack, access what's underneath, then move them back. You'll see this pattern in multisig and HTLC scripts.
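As a rough illustration of that park-and-restore pattern, the two stacks can be modelled as Python lists. The item names and the upper-casing step are placeholders chosen for this sketch; any operation on the exposed element would do.

```python
main = [b"bottom", b"middle", b"top"]
alt = []

def op_toaltstack():
    """Move the top of the main stack onto the alt stack."""
    alt.append(main.pop())

def op_fromaltstack():
    """Move the top of the alt stack back onto the main stack."""
    main.append(alt.pop())

# Park the top two items, work on what was underneath, then restore them.
op_toaltstack()                 # main: [bottom, middle]   alt: [top]
op_toaltstack()                 # main: [bottom]           alt: [top, middle]
main[-1] = main[-1].upper()     # stand-in for any operation on the exposed item
op_fromaltstack()               # main: [BOTTOM, middle]   alt: [top]
op_fromaltstack()               # main: [BOTTOM, middle, top]

print(main)
```

Because the alt stack is itself LIFO, items come back in their original order without any extra bookkeeping.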
Stack elements are raw byte arrays, but many opcodes interpret them as integers. Bitcoin Script uses a custom encoding called little-endian sign-magnitude. Understanding this encoding is essential—it explains why certain byte patterns appear throughout Bitcoin transactions and why DER signatures need padding bytes.
Script's number encoding was not designed from first principles—it was inherited from a library dependency. In Bitcoin's original C++ source code (v0.1, January 2009), Satoshi used OpenSSL's BIGNUM type—wrapped in a class called CBigNum†—to handle all integer operations inside the script interpreter.
OpenSSL's BIGNUM library includes a pair of functions, BN_bn2mpi() and BN_mpi2bn(), that convert integers to and from a wire format called MPI (Multi-Precision Integer)†. MPI format stores numbers in big-endian sign-magnitude: the bytes go from most significant to least significant, and the highest bit of the leading byte is the sign flag. When the magnitude's high bit is set, a 0x00 prefix byte is prepended to keep the sign clear—the same reason DER-encoded integers need the same padding.
Satoshi's CBigNum class provided two critical methods, setvch() and getvch(), that bridged between Script's stack elements and OpenSSL's BIGNUM. The code reveals the origin of Script's byte order†:
getvch() calls BN_bn2mpi() to get the big-endian MPI representation, strips the 4-byte length prefix, then reverses the byte array, converting big-endian to little-endian.
setvch() does the inverse: it reverses the input bytes (little-endian \(\to\) big-endian), prepends the 4-byte MPI length header, and calls BN_mpi2bn().
That reverse() call is the entire reason Script numbers are little-endian. The sign-magnitude property was inherited directly from MPI format. Satoshi did not document why he chose to reverse the byte order, but the likely reason is consistency with the rest of Bitcoin's serialization: version numbers, output values, and locktime are all little-endian because x86 processors store multi-byte integers least-significant-byte-first, and Satoshi used the hardware's native byte order throughout.
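The round trip can be imitated in a few lines of Python. This is an emulation written for illustration, not the original C++: bn2mpi and mpi2bn below are stand-ins that mimic the MPI layout described above (4-byte big-endian length, then big-endian sign-magnitude bytes), and getvch and setvch simply add the byte reversal.

```python
def bn2mpi(n: int) -> bytes:
    """Emulated MPI: 4-byte big-endian length + big-endian sign-magnitude body."""
    mag = abs(n)
    body = mag.to_bytes((mag.bit_length() + 7) // 8, "big")
    if body and body[0] & 0x80:
        body = b"\x00" + body                 # pad so the sign bit is free
    if n < 0:
        body = bytes([body[0] | 0x80]) + body[1:]
    return len(body).to_bytes(4, "big") + body

def mpi2bn(mpi: bytes) -> int:
    """Inverse of bn2mpi for the emulated format."""
    size = int.from_bytes(mpi[:4], "big")
    body = mpi[4:4 + size]
    if not body:
        return 0
    negative = bool(body[0] & 0x80)
    mag = int.from_bytes(bytes([body[0] & 0x7F]) + body[1:], "big")
    return -mag if negative else mag

def getvch(n: int) -> bytes:
    """Number -> stack element: drop the length prefix, then reverse."""
    return bn2mpi(n)[4:][::-1]

def setvch(vch: bytes) -> int:
    """Stack element -> number: reverse, re-add the length prefix, parse."""
    big_endian = vch[::-1]
    return mpi2bn(len(big_endian).to_bytes(4, "big") + big_endian)

print(getvch(128).hex())    # 8000
print(getvch(-1).hex())     # 81
print(setvch(bytes.fromhex("0001")))  # 256
```

The only Script-specific step is the `[::-1]` slice; everything else is the MPI format passing through unchanged.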
In May 2014, developer Cory Fields replaced CBigNum with a new self-contained class called CScriptNum that removed the OpenSSL dependency entirely†. The new CScriptNum stores results internally as a 64-bit integer (int64_t) but implements exactly the same byte-level encoding as CBigNum::getvch()/setvch()—little-endian sign-magnitude—preserving perfect consensus compatibility†. The encoding could never change: any alteration would cause a consensus fork.
Every Script integer is stored as a byte array with two properties:
Numbers must use minimal encoding, with no unnecessary zero bytes at the most significant end, except when a zero byte is needed to keep the sign bit clear. This minimal encoding rule was formalized by Pieter Wuille in BIP 62† (Rule 4: "Any time a script opcode consumes a stack value that is interpreted as a number, it must be encoded in its shortest possible form. 'Negative zero' is not allowed") and is enforced in Bitcoin Core via the SCRIPT_VERIFY_MINIMALDATA flag†. Although BIP 62 itself was never activated as a consensus soft fork (it was superseded by SegWit's malleability fixes in BIP 141), the MINIMALDATA check survives in Bitcoin Core as a standardness rule: transactions whose scripts use non-minimal encodings are not relayed by default.
This is the key insight, and it comes down to how bits are numbered inside a byte.
A byte is 8 bits, numbered 0 through 7 from right to left:
Bit 7 | Bit 6 | Bit 5 | Bit 4 | Bit 3 | Bit 2 | Bit 1 | Bit 0
\(2^7 = 128\) | \(2^6 = 64\) | \(2^5 = 32\) | \(2^4 = 16\) | \(2^3 = 8\) | \(2^2 = 4\) | \(2^1 = 2\) | \(2^0 = 1\)
Bit 7 is the highest bit—the one worth 128 in an unsigned byte. In sign-magnitude encoding, Bitcoin Script repurposes this bit: instead of contributing value, it indicates the sign. That leaves bits 0–6 (7 bits) for the magnitude, giving a single byte a range of \(-127\) to \(+127\).
The hex value 0x80 is exactly the byte with only bit 7 set: 0x80 = 1 (bit 7, sign) + 000 0000 (bits 6–0, magnitude = 0)
So 0x80 means "negative zero"—the sign bit is on, but the magnitude is zero. This is why it cannot represent \(+128\): the bit that would encode 128 has been commandeered as the sign flag.
More generally, any hex byte whose first nibble is 8–F has bit 7 set, because the high nibble 8\({}_{16}\) \(=\) 1000\({}_{2}\) already flips that bit:
Hex | Binary | Bit 7?
0x7F | 0111 1111 | Clear (positive)
0x80 | 1000 0000 | Set (negative)
0x9A | 1001 1010 | Set (negative)
0xFF | 1111 1111 | Set (negative)
The boundary is clean: 0x00–0x7F sign bit clear (positive or zero); 0x80–0xFF sign bit set (negative).
With this foundation, we can trace how Script encodes every integer:
Number | Hex bytes | Why
\(0\) | (empty) | Zero is the empty byte array
\(1\) | 01 | Fits in 7 bits, sign bit clear
\(-1\) | 81 | 01 with sign bit set: \(\texttt{0x01} | \texttt{0x80} = \texttt{0x81}\)
\(127\) | 7f | 0111 1111—all 7 value bits used, sign bit still clear
\(128\) | 80 00 | 80 alone \(=\) \(-0\) (sign bit set!), so append 00
\(-128\) | 80 80 | Low byte 80; the high byte 00 gains the sign bit: 00 \(\to\) 80
\(255\) | ff 00 | ff alone \(=\) \(-127\), needs 00 to keep sign clear
\(256\) | 00 01 | Little-endian: \(256 = \texttt{0x0100}\) \(\to\) 00 01
\(-256\) | 00 81 | Same, but sign bit set in last byte: 01 \(\to\) 81
The critical row is \(128\): the byte 0x80 has its high bit set, so Script would read it as negative. The extra 0x00 byte carries a clear sign bit, making the value positive. This is exactly analogous to DER integer encoding (Chapter 1), where a leading 0x00 prevents a positive integer from being misread as negative. The same principle—the same bit, in the same position—drives both encodings.
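The rules above are compact enough to check mechanically. The following is a sketch under the encoding rules stated in this section; the function names are chosen here for illustration and are not Bitcoin Core's. It encodes an integer, decodes it back, and tests whether a byte array is minimally encoded.

```python
def encode_script_num(n: int) -> bytes:
    """Little-endian sign-magnitude, minimal length; zero is the empty array."""
    if n == 0:
        return b""
    mag = abs(n)
    out = bytearray()
    while mag:
        out.append(mag & 0xFF)      # least significant byte first
        mag >>= 8
    if out[-1] & 0x80:
        # Top bit already used by the magnitude: add a byte to carry the sign.
        out.append(0x80 if n < 0 else 0x00)
    elif n < 0:
        out[-1] |= 0x80
    return bytes(out)

def decode_script_num(vch: bytes) -> int:
    if not vch:
        return 0
    mag = int.from_bytes(vch[:-1] + bytes([vch[-1] & 0x7F]), "little")
    return -mag if vch[-1] & 0x80 else mag

def is_minimally_encoded(vch: bytes) -> bool:
    if not vch:
        return True
    if (vch[-1] & 0x7F) == 0:
        # The last byte holds no magnitude bits. That is only legitimate when
        # it exists to carry the sign for a previous byte with bit 7 set.
        if len(vch) == 1 or (vch[-2] & 0x80) == 0:
            return False
    return True

for n in (0, 1, -1, 127, 128, -128, 255, 256, -256):
    print(f"{n:5d} -> {encode_script_num(n).hex() or '(empty)'}")
```

Running the loop reproduces the table row by row. The minimality check rejects both a lone 0x00 and a lone 0x80 (negative zero), while accepting 80 00 and ff 00, where the extra byte is doing real work.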
TRUE and FALSE follow the same rules. FALSE is any byte array that evaluates to zero: the empty array, any array consisting entirely of 0x00 bytes, or a "negative zero" (all 0x00 bytes except a final 0x80). TRUE is anything else. OP_CHECKSIG pushes 0x01 for success and an empty array for failure.
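The truth test is short enough to state as code. This sketch follows the logic of Bitcoin Core's CastToBool as the author of this example understands it, including the one wrinkle that sign-magnitude introduces: a negative zero is still zero, and therefore false.

```python
def cast_to_bool(vch: bytes) -> bool:
    """Return True unless the byte array encodes zero (including negative zero)."""
    for i, byte in enumerate(vch):
        if byte != 0:
            # A lone sign bit in the final byte is negative zero: still false.
            if i == len(vch) - 1 and byte == 0x80:
                return False
            return True
    return False

print(cast_to_bool(b""))          # False
print(cast_to_bool(b"\x00\x00"))  # False
print(cast_to_bool(b"\x00\x80"))  # False (negative zero)
print(cast_to_bool(b"\x01"))      # True
print(cast_to_bool(b"\x80\x00"))  # True (this is +128)
```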
The small-number opcodes OP_1 through OP_16 push single-byte values 0x01 through 0x10. OP_0 (also called OP_FALSE) pushes an empty array. OP_1NEGATE pushes 0x81—which, per the table above, is exactly \(-1\).
The small-number opcodes and TRUE/FALSE semantics are not theoretical curiosities—they appear in virtually every Bitcoin transaction. Here are real specimens you can inspect on mempool.space.
OP_CHECKSIG: TRUE and FALSE on the Stack.
The very first person-to-person Bitcoin transaction—Satoshi to Hal Finney, Block 170 (January 12, 2009)†—used a Pay-to-Public-Key (P2PK) output:
<65-byte pubkey> OP_CHECKSIG
When this output was spent, the script interpreter executed OP_CHECKSIG with the signature and public key on the stack. On success, OP_CHECKSIG pushed the byte 0x01—the number \(1\), which is TRUE—onto the stack. If the signature had been invalid, it would have pushed an empty byte array (FALSE). That single byte decided whether 10 BTC moved or didn't.
OP_0, OP_1, OP_2: Bare Multisig.
One of the earliest bare multisig transactions (Block 164,467, January 2012)† created a 1-of-2 multisig output:
OP_1
Here OP_1 (0x51) pushes the number \(1\) (the threshold—one signature required), and OP_2 (0x52) pushes \(2\) (the total number of public keys). OP_CHECKMULTISIG pops these numbers to know how many keys and signatures to expect. Without the small-number opcodes, there would be no way to parameterize multisig scripts.
OP_0, OP_2, OP_3: The Classic 2-of-3 Multisig Spend.
The workhorse of Bitcoin custody—the 2-of-3 multisig—uses three small-number opcodes. This P2SH spend from Block 205,285 (October 2012)† reveals the redeem script inside the scriptSig:
OP_2
The scriptSig that satisfies it is:
OP_0
The leading OP_0 deserves attention: it pushes an empty byte array (FALSE) as a dummy element consumed by OP_CHECKMULTISIG. This is the famous off-by-one bug in Satoshi's original implementation—OP_CHECKMULTISIG pops one more element than it should. The bug can never be fixed (it would break consensus), so every multisig spend in Bitcoin history includes this phantom OP_0†.
OP_RETURN: Provably Unspendable.
This transaction from Block 308,570 (June 2014)† immortalized a message on the blockchain:
OP_RETURN 636861726c6579206c6f766573206865696469
The hex decodes to the ASCII string "charley loves heidi." OP_RETURN (0x6a) immediately marks the script as invalid—it makes the output provably unspendable. The value field is 0 satoshis. Because nodes know this output can never be spent, they do not store it in the UTXO set, saving memory.
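Decoding the payload takes two lines. The second half of this sketch assembles a scriptPubKey of the same shape (OP_RETURN followed by a direct push); it illustrates the layout and is not a byte-for-byte copy of the on-chain script.

```python
payload = bytes.fromhex("636861726c6579206c6f766573206865696469")
message = payload.decode("ascii")

# OP_RETURN (0x6a), then a direct push whose opcode equals the payload length.
script_pubkey = bytes([0x6A, len(payload)]) + payload

print(message)              # charley loves heidi
print(len(payload))         # 19
print(script_pubkey.hex())
```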
OP_0 and OP_1 as Witness Version Numbers.
Since SegWit (Block 481,824, August 2017)†, the small-number opcodes took on a second life as witness version indicators. A SegWit v0 output (P2WPKH or P2WSH) begins with OP_0:
OP_0 <20-byte-hash> (P2WPKH—witness version 0)
A Taproot output (BIP 341, activated November 2021) begins with OP_1:
OP_1 <32-byte-x-only-pubkey> (P2TR—witness version 1)
The opcodes OP_2 through OP_16 are reserved for future witness versions, a design that allows up to 15 additional SegWit upgrades via soft fork without changing the script language. Every native SegWit output on the network today (P2WPKH, P2WSH, or P2TR) contains one of these small-number opcodes as its first byte.
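Recognizing a witness output is a matter of inspecting the first two bytes. The sketch below assumes the shape BIP 141 gives for a witness program (a version opcode followed by a single direct push of 2 to 40 bytes); the function name is made up for this example.

```python
def witness_program(script_pubkey: bytes):
    """Return (version, program) if the script has the witness-program shape, else None."""
    if not 4 <= len(script_pubkey) <= 42:
        return None
    first, push_len = script_pubkey[0], script_pubkey[1]
    if first == 0x00:
        version = 0                      # OP_0
    elif 0x51 <= first <= 0x60:
        version = first - 0x50           # OP_1..OP_16 -> 1..16
    else:
        return None
    # The rest of the script must be exactly one direct push of 2-40 bytes.
    if push_len != len(script_pubkey) - 2 or not 2 <= push_len <= 40:
        return None
    return version, script_pubkey[2:]

p2wpkh = bytes([0x00, 0x14]) + bytes(20)   # placeholder 20-byte hash
p2tr = bytes([0x51, 0x20]) + bytes(32)     # placeholder 32-byte x-only key
print(witness_program(p2wpkh)[0])  # 0
print(witness_program(p2tr)[0])    # 1
```

The subtraction `first - 0x50` is the whole mapping from opcode byte to version number for OP_1 through OP_16; OP_0 is the special case because its byte value is 0x00 rather than 0x50.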
Numeric Values in Timelocks.
OP_CHECKLOCKTIMEVERIFY (CLTV, BIP 65) and OP_CHECKSEQUENCEVERIFY (CSV, BIP 112) compare a numeric value on the stack against the transaction's nLockTime or nSequence fields. A typical CLTV script locks funds until a specific block height:
<4-byte locktime> OP_CHECKLOCKTIMEVERIFY OP_DROP
OP_DUP OP_HASH160
The locktime value (e.g., block height 500,000 \(=\) 0x07a120, stored on the stack as 20 a1 07 in little-endian sign-magnitude) is a Script integer. A block height in this range needs only three bytes; a Unix-timestamp locktime takes four. It follows the same encoding rules: little-endian, sign bit in the highest bit of the last byte, minimal encoding required. CLTV and CSV are the only standard opcodes that interpret Script numbers as absolute or relative time, which makes the number encoding rules directly consensus-critical for timelock contracts, Lightning Network payment channels, and vault constructions.
You have now seen the sign-bit problem twice: once in DER encoding (Chapter 1, where ECDSA signature integers need a 0x00 pad when the high byte is \(\geq\) 0x80) and again here in Script number encoding. The two systems differ in detail (DER integers are two's complement, Script numbers are sign-magnitude), but both read bit 7 of the most significant byte as the sign indicator: the first byte in big-endian DER, the last byte in little-endian Script. When you encounter a mysterious 0x00 byte in a Bitcoin transaction, the sign bit is almost always the explanation.
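For non-negative values such as locktimes, the encoding reduces to a one-liner around int.to_bytes. This is a sketch for illustration: encode_positive is a name chosen here, and the 500,000,000 threshold is the standard nLockTime boundary between block heights and Unix timestamps.

```python
LOCKTIME_THRESHOLD = 500_000_000  # below: block height; at or above: Unix time

def encode_positive(n: int) -> bytes:
    """Minimal little-endian encoding of a non-negative Script number."""
    if n < 0:
        raise ValueError("this helper handles non-negative values only")
    if n == 0:
        return b""
    # bit_length() // 8 + 1 bytes always leaves bit 7 of the last byte clear.
    return n.to_bytes(n.bit_length() // 8 + 1, "little")

def describe(locktime: int) -> str:
    kind = "block height" if locktime < LOCKTIME_THRESHOLD else "Unix time"
    return f"{locktime} ({kind}) -> {encode_positive(locktime).hex()}"

print(describe(500_000))        # 500000 (block height) -> 20a107
print(describe(1_700_000_000))  # a four-byte timestamp push
```

Height 500,000 is 0x07a120, so the bytes appear on the stack as 20 a1 07: least significant byte first, with the final byte's high bit clear.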
Bitcoin Script is deliberately not Turing-complete. There are no loop constructs (for, while), no jump instructions, and no recursion. Every script executes in a single linear pass through its opcodes, which means:
This was a deliberate security decision. On August 25, 2010, Satoshi disabled 16 opcodes in a single commit titled simply "misc changes"†. Bitcoin chose safety over expressiveness. Ethereum made the opposite choice.
Among the 16 disabled opcodes, OP_CAT has the most interesting history. It concatenates two byte arrays on the stack: given [x1, x2], it produces x1 || x2. Simple enough—but this simplicity concealed an exponential memory bomb.
The exploit. A malicious script could push a single byte onto the stack and then repeat OP_DUP OP_CAT in sequence. Each pair doubles the top element's size:
Step | Element size | Operations
0 | 1 byte | Push 0x01
1 | 2 bytes | OP_DUP OP_CAT
2 | 4 bytes | OP_DUP OP_CAT
⋮ | ⋮ | ⋮
20 | 1 MB | OP_DUP OP_CAT
30 | 1 GB | OP_DUP OP_CAT
40 | 1 TB | OP_DUP OP_CAT
Forty repetitions—80 bytes of script—would demand a stack element exceeding 1 terabyte†. In practice, Bitcoin already had a 5,000-byte maximum stack element size in 2010, which capped the actual damage. But Satoshi evidently concluded that the risk of undiscovered interactions among the string and arithmetic opcodes outweighed their utility, and disabled the entire batch prophylactically. No exploit was ever observed in the wild.
Why bring it back? In the years since, developers realized that OP_CAT is the missing primitive that would unlock an entire class of Bitcoin contracts. Concatenation is the key operation for:
OP_CAT with Schnorr signatures, scripts can inspect and constrain the spending transaction itself (emulating CheckSigFromStack).
OP_CAT is restored. If elliptic curve cryptography is ever broken, Lamport signatures could serve as a fallback.
BIP 347† proposes re-enabling OP_CAT exclusively within Tapscript (BIP 342), by redefining the currently-undefined opcode OP_SUCCESS126, which uses the same byte value (0x7e) as the original OP_CAT. The exponential memory attack is neutralized because Tapscript enforces a hard 520-byte maximum on every stack element. Forty doublings would be stopped cold after 9 iterations (512 bytes), long before any resource exhaustion. The proposal is a soft fork: old nodes see OP_SUCCESS126 and treat the script as automatically valid; new nodes enforce the concatenation rules.
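The arithmetic behind both the attack and the fix is easy to verify. The sketch below tracks only the size of the top stack element through repeated OP_DUP OP_CAT rounds (it never allocates the data), with an optional per-element cap; the function name and signature are invented for this example.

```python
def dup_cat_rounds(max_element_size=None, rounds=40):
    """Return (rounds completed, final element size in bytes)."""
    size = 1                                   # a single pushed byte
    for done in range(rounds):
        if max_element_size is not None and size * 2 > max_element_size:
            return done, size                  # the next OP_CAT would fail
        size *= 2                              # OP_DUP OP_CAT doubles the element
    return rounds, size

print(dup_cat_rounds(rounds=20))               # (20, 1048576): 1 MB
print(dup_cat_rounds(rounds=40))               # (40, 1099511627776): 1 TB
print(dup_cat_rounds(max_element_size=520))    # (9, 512): tenth round rejected
```

With the 520-byte cap, the element reaches 512 bytes after nine rounds and the tenth concatenation (1,024 bytes) is refused, so the growth that made the uncapped version alarming never gets started.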
Bitcoin Script has approximately 100 defined opcodes, organized by function. Many were disabled early in Bitcoin's history. Here are the categories that matter for transaction parsing:
These are the most common opcodes in real transactions. They push raw data onto the stack:
Opcode | Hex | Meaning
OP_0 | 0x00 | Push empty byte array ("false")
0x01–0x4b | 0x01–0x4b | Push the next \(N\) bytes (\(N\) = opcode value)
OP_PUSHDATA1 | 0x4c | Next byte = length \(L\); push next \(L\) bytes
OP_PUSHDATA2 | 0x4d | Next 2 bytes = length \(L\) (LE); push next \(L\) bytes
OP_PUSHDATA4 | 0x4e | Next 4 bytes = length \(L\) (LE); push next \(L\) bytes
OP_1–OP_16 | 0x51–0x60 | Push the number 1–16
OP_1NEGATE | 0x4f | Push the number \(-1\)
The push opcodes 0x01–0x4b are not named—they are implicit. When the script engine encounters the byte 0x47 (= 71), it reads the next 71 bytes and pushes them as a single stack element. This is how signatures and public keys get onto the stack.
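The push rules in the table translate directly into a pair of helper functions: one that chooses the shortest push encoding for a piece of data, and one that reads a push back out of a script. Both are sketches written for this chapter (the names push_data and read_push are not from any library), and read_push handles push opcodes only.

```python
def push_data(data: bytes) -> bytes:
    """Serialize `data` using the shortest push form available."""
    n = len(data)
    if n <= 0x4B:
        return bytes([n]) + data                           # direct push (n == 0 is OP_0)
    if n <= 0xFF:
        return b"\x4c" + bytes([n]) + data                 # OP_PUSHDATA1
    if n <= 0xFFFF:
        return b"\x4d" + n.to_bytes(2, "little") + data    # OP_PUSHDATA2
    return b"\x4e" + n.to_bytes(4, "little") + data        # OP_PUSHDATA4

def read_push(script: bytes, pos: int = 0):
    """Read one push starting at `pos`; return (data, next position)."""
    op = script[pos]
    pos += 1
    if op <= 0x4B:
        n = op
    elif op in (0x4C, 0x4D, 0x4E):
        width = {0x4C: 1, 0x4D: 2, 0x4E: 4}[op]
        n = int.from_bytes(script[pos:pos + width], "little")
        pos += width
    else:
        raise ValueError(f"opcode 0x{op:02x} is not a data push")
    data = script[pos:pos + n]
    if len(data) != n:
        raise ValueError("script ends before the push is complete")
    return data, pos + n

sig = bytes(71)                       # placeholder for a 71-byte signature
print(push_data(sig)[:1].hex())       # 47
print(push_data(bytes(76))[:2].hex()) # 4c4c
```

A 71-byte element is announced by the single byte 0x47, while a 76-byte element is one byte too long for a direct push and needs the two-byte prefix 4c 4c.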
Opcode | Hex | Effect
OP_DUP | 0x76 | Duplicate the top item
OP_DROP | 0x75 | Remove the top item
OP_SWAP | 0x7c | Swap the top two items
OP_IFDUP | 0x73 | Duplicate top if non-zero
OP_DEPTH | 0x74 | Push the stack size
OP_TOALTSTACK | 0x6b | Move top to alt stack
OP_FROMALTSTACK | 0x6c | Move top of alt stack to main
Opcode | Hex | Effect
OP_HASH160 | 0xa9 | Pop top, push RIPEMD-160(SHA-256(top))
OP_HASH256 | 0xaa | Pop top, push SHA-256(SHA-256(top))
OP_SHA256 | 0xa8 | Pop top, push SHA-256(top)
OP_RIPEMD160 | 0xa6 | Pop top, push RIPEMD-160(top)
OP_CHECKSIG | 0xac | Pop pubkey and sig, verify ECDSA; push 1 (true) or 0 (false)
OP_CHECKMULTISIG | 0xae | Pop \(m\) sigs, \(n\) pubkeys, and a dummy element; verify \(m\)-of-\(n\)
OP_CHECKMULTISIG has a famous implementation bug: it pops one more element from the stack than it needs. After popping \(m\) signatures and \(n\) public keys, it pops one additional "dummy" element that it completely ignores. Callers must push an extra OP_0 (or any value) before the signatures.
This is not a security vulnerability—it's just a waste of one byte per multisig input. But it has been enshrined in consensus: every multisig transaction since 2009 must include the dummy element. BIP 147† (2017) tightened the rule to require the dummy be exactly OP_0 (not just any value), preventing a minor malleability vector.
We'll see this in action in Chapter 7 when we parse a real multisig transaction. The wasted byte is a permanent reminder that consensus bugs, once deployed, can never be fixed—only constrained.
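The stack consumption is easier to see in code. This is a simplified model, not the consensus implementation: counts are plain Python integers rather than encoded Script numbers, the function returns a boolean instead of pushing a result, and `verify` is a hypothetical callback standing in for real ECDSA verification.

```python
def op_checkmultisig(stack, verify):
    """Consume n, the pubkeys, m, the signatures, and the dummy; return the verdict."""
    n = stack.pop()
    pubkeys = [stack.pop() for _ in range(n)][::-1]
    m = stack.pop()
    sigs = [stack.pop() for _ in range(m)][::-1]
    dummy = stack.pop()            # the off-by-one: popped and never used
    if dummy != b"":               # BIP 147: the dummy must be the empty array
        return False
    # Signatures must appear in the same order as the keys they match.
    k = 0
    for sig in sigs:
        while k < len(pubkeys) and not verify(sig, pubkeys[k]):
            k += 1
        if k == len(pubkeys):
            return False           # ran out of keys for this signature
        k += 1
    return True

# Hypothetical verifier for the demo: a "signature" is valid if it names its key.
fake_verify = lambda sig, key: sig == b"sig:" + key

stack = [b"", b"sig:A", b"sig:C", 2, b"A", b"B", b"C", 3]   # 2-of-3 spend
print(op_checkmultisig(stack, fake_verify))   # True
print(stack)                                  # []: the dummy was consumed too
```

Dropping the leading `b""` from the stack makes the final pop fail on an empty list, which is exactly why every spend has to supply the extra element.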
Opcode | Hex | Effect
OP_EQUAL | 0x87 | Pop two items, push 1 if identical, 0 otherwise
OP_EQUALVERIFY | 0x88 | Same as OP_EQUAL + OP_VERIFY (fail if not equal)
OP_VERIFY | 0x69 | Pop top; if zero, script fails immediately
OP_RETURN | 0x6a | Immediately marks output as provably unspendable
The *VERIFY variants are compound opcodes: they perform the base operation and then immediately fail the script if the result is false. This is a common pattern—OP_EQUALVERIFY is used in P2PKH scripts (Chapter 5) to verify the public key hash matches before checking the signature.
Opcode | Hex | Effect
OP_IF | 0x63 | Pop top; if non-zero, execute the following block
OP_NOTIF | 0x64 | Pop top; if zero, execute the following block
OP_ELSE | 0x67 | Execute this block if the preceding IF/NOTIF was not taken
OP_ENDIF | 0x68 | End the conditional block
These are the only flow control opcodes in Bitcoin Script—and crucially, they are not loops. OP_IF/OP_ELSE/OP_ENDIF create branching paths, but execution always moves forward. This is how Bitcoin implements spending conditions with multiple paths: "either provide a signature for key A, or wait 100 blocks and provide a signature for key B." We'll see this pattern in HTLC scripts (Chapter 18) and timelocks (Chapter 16).
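One way to see why branching never turns into looping is to write the executor. In this sketch (a toy model with names chosen for illustration) opcodes are strings, data pushes are bytes, and a list of booleans records, for each open conditional, whether its current branch is live. The program counter only ever moves forward.

```python
def run(script):
    stack = []
    branches = []   # one bool per open OP_IF/OP_NOTIF: is the current branch live?
    for op in script:
        executing = all(branches)
        if isinstance(op, bytes):
            if executing:
                stack.append(op)               # data push
        elif op in ("OP_IF", "OP_NOTIF"):
            taken = False
            if executing:
                # Simplified truth test: any non-zero byte counts as true.
                taken = any(stack.pop())
                if op == "OP_NOTIF":
                    taken = not taken
            branches.append(taken)
        elif op == "OP_ELSE":
            if not branches:
                raise ValueError("OP_ELSE without OP_IF")
            branches[-1] = not branches[-1]
        elif op == "OP_ENDIF":
            if not branches:
                raise ValueError("OP_ENDIF without OP_IF")
            branches.pop()
        else:
            raise ValueError(f"unsupported opcode: {op}")
    if branches:
        raise ValueError("unterminated conditional")
    return stack

two_paths = ["OP_IF", b"path A", "OP_ELSE", b"path B", "OP_ENDIF"]
print(run([b"\x01"] + two_paths))   # [b'path A']
print(run([b""] + two_paths))       # [b'path B']
```

The spender picks the path by what the scriptSig leaves on the stack: a true value selects the first branch and an empty array selects the second.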
Bitcoin Script supports limited integer arithmetic on stack elements interpreted as numbers (up to 4 bytes / 32 bits):
Opcode | Hex | Effect
OP_ADD | 0x93 | Pop two, push their sum
OP_SUB | 0x94 | Pop two, push \(b - a\) (second minus top)
OP_1ADD | 0x8b | Pop top, push \(\text{top} + 1\)
OP_1SUB | 0x8c | Pop top, push \(\text{top} - 1\)
OP_LESSTHAN | 0x9f | Pop two; push 1 if \(b < a\), else 0
OP_GREATERTHAN | 0xa0 | Pop two; push 1 if \(b > a\), else 0
OP_WITHIN | 0xa5 | Pop three; push 1 if \(\min \leq x < \max\)
OP_NUMEQUAL | 0x9c | Pop two; push 1 if numerically equal
Note the absence of multiplication, division, and modular arithmetic—those were among the opcodes disabled in 2010. The remaining arithmetic is sufficient for timelocks (comparing block heights and timestamps) but not for general-purpose computation.
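A short model of the surviving arithmetic shows how the 4-byte limit behaves in practice. The sketch assumes the rule as described in this chapter (operands are limited to 4 bytes when read as numbers) and uses helper names invented here. One consequence worth noticing: a result may need 5 bytes, and such a result can sit on the stack but cannot be fed back into another arithmetic opcode.

```python
MAX_NUM_SIZE = 4  # bytes accepted when an opcode reads a stack item as a number

def decode_num(vch: bytes) -> int:
    if len(vch) > MAX_NUM_SIZE:
        raise ValueError("script number overflow")
    if not vch:
        return 0
    mag = int.from_bytes(vch[:-1] + bytes([vch[-1] & 0x7F]), "little")
    return -mag if vch[-1] & 0x80 else mag

def encode_num(n: int) -> bytes:
    if n == 0:
        return b""
    mag = abs(n)
    out = bytearray(mag.to_bytes(mag.bit_length() // 8 + 1, "little"))
    if n < 0:
        out[-1] |= 0x80            # last byte's high bit is free by construction
    return bytes(out)

def op_add(stack):
    top, second = decode_num(stack.pop()), decode_num(stack.pop())
    stack.append(encode_num(second + top))

def op_sub(stack):
    top, second = decode_num(stack.pop()), decode_num(stack.pop())
    stack.append(encode_num(second - top))     # second minus top

def op_within(stack):
    hi, lo, x = (decode_num(stack.pop()) for _ in range(3))
    stack.append(b"\x01" if lo <= x < hi else b"")

stack = [encode_num(2_147_483_647), encode_num(1)]
op_add(stack)
print(stack[-1].hex())    # 0000008000: a 5-byte result
```

Pushing another number and calling `op_add(stack)` again would raise "script number overflow", because the 5-byte element is no longer a valid operand.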
The arithmetic opcodes are a subtler and more instructive story than OP_CAT. They were not disabled because of a single dramatic exploit. They were disabled because they turned every node's consensus logic into a thin wrapper around OpenSSL's arbitrary-precision bignum library—and that library was never designed to be consensus-critical.
The setup. In Satoshi's original script.cpp†, the constant nMaxNumSize was set to 258—meaning Script integers could be up to 258 bytes (2,064 bits). Compare that to the 4-byte (32-bit) limit enforced today. The arithmetic opcodes called directly into OpenSSL:
Opcode | Implementation
OP_MUL | BN_mul(&bn, &bn1, &bn2, pctx)
OP_DIV | BN_div(&bn, NULL, &bn1, &bn2, pctx)
OP_MOD | BN_mod(&bn, &bn1, &bn2, pctx)
OP_LSHIFT | bn = bn1 << bn2.getulong() (up to 2,048 bits)
OP_2MUL | bn <<= 1
OP_2DIV | bn >>= 1
Problem 1: OpenSSL as consensus engine. Bitcoin's consensus rules must produce identical results on every node, on every platform, with every compiler, forever. OpenSSL was not designed for this. Different versions of BN_div could handle edge cases differently—negative dividends, very large operands, platform-specific rounding. If two nodes running different OpenSSL versions disagreed on the result of a single OP_DIV, they would disagree on whether a transaction was valid. That is a consensus fork—the most dangerous failure mode in Bitcoin. This was not a theoretical concern: OpenSSL was updated frequently, and Bitcoin had no control over which version miners and node operators ran.
Problem 2: The OP_LSHIFT crash. A specific bug was discovered in OpenSSL's BN_lshift that could crash any Bitcoin node when triggered by a crafted script†. This was a remote denial-of-service vulnerability: anyone could broadcast a transaction containing the malicious script, and every node that attempted to validate it would crash. A sustained attack could take down the entire network.
Problem 3: Computational cost. Multiplying two 258-byte (2,064-bit) numbers is an \(O(n^2)\) operation in the number of digits. With the 200-opcode limit, an attacker could force roughly 100 arbitrary-precision multiplications per script evaluation. Every node on the network would have to perform this computation for every such transaction—a denial-of-service vector that scaled with the number of malicious transactions broadcast.
Problem 4: Unbounded output growth. The nMaxNumSize check applied only to inputs—the code verified that each operand was \(\leq 258\) bytes before the operation, but never checked the result. Multiplying two 258-byte numbers could produce a 516-byte result. Left-shifting by 2,048 bits could grow a number by 256 bytes. These oversized results could then interact with other opcodes in unpredictable ways.
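Python's built-in integers are arbitrary-precision, so the growth figures can be confirmed directly. This sketch measures magnitudes only and ignores the extra byte a sign bit might require.

```python
def byte_len(n: int) -> int:
    """Bytes needed to hold the magnitude of n."""
    return (abs(n).bit_length() + 7) // 8

operand = (1 << (258 * 8)) - 1        # largest 258-byte magnitude
product = operand * operand           # what an unchecked OP_MUL could yield
shifted = operand << 2048             # what an unchecked OP_LSHIFT could yield

print(byte_len(operand))                        # 258
print(byte_len(product))                        # 516
print(byte_len(shifted) - byte_len(operand))    # 256
```

Two operands that each pass the 258-byte input check produce a result twice that size, and nothing in the original code stopped that result from becoming the input to the next opcode's non-numeric operations.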
The decision. Satoshi's response was characteristically decisive: kill the entire batch. Not just the opcodes with known bugs, but every opcode that delegated consensus-critical computation to OpenSSL's bignum library. The opcodes that survived (OP_ADD, OP_SUB, comparisons) were later reimplemented in the self-contained CScriptNum class with a strict 4-byte input limit, eliminating the OpenSSL dependency entirely.
Today's Script arithmetic is deliberately impoverished: integers are capped at \(2,147,483,647\) (32 bits), there is no multiplication, no division, no modular arithmetic. Bitcoin traded computational expressiveness for the guarantee that every node will agree on every result, forever. For a system that secures hundreds of billions of dollars, that trade-off was correct.
Context. This purge happened in August 2010—the same month as the infamous "value overflow" bug that created 184 billion bitcoins out of thin air.† It was the most dangerous month in Bitcoin's history, and Satoshi's response to both crises was characteristically swift.
The irony. Some of these disabled opcodes—particularly OP_CAT (concatenate two byte strings)—are now considered useful for advanced covenant designs. A proposal to re-enable OP_CAT in Tapscript (BIP 347†) has been under active discussion since 2024, nearly 14 years after it was disabled. The disabled opcodes are ghosts in the instruction set—visible but untouchable, at least for now.
Now we execute the Satoshi-to-Hal transaction from Chapter 1. This is a P2PK (Pay-to-Public-Key) spend, the simplest possible script type.
From Chapter 1, we extracted:
scriptSig: 47 304402204e45e169...8d1d0901
<71-byte signature>
scriptPubKey: 41 0411db93...b412a3 ac
<65-byte pubkey> OP_CHECKSIG
The scriptSig has one operation: push the signature. The scriptPubKey has two: push the public key, then run OP_CHECKSIG.
For reference, the actual hex values that flow through the stack:
304402204e45e169...8d1d0901 (the DER-encoded \((r, s)\) plus SIGHASH byte, as parsed in Chapter 1)
0411db93e1dcdb8a...b412a3 (Satoshi's uncompressed secp256k1 point, from the Block 9 coinbase output being spent; Hal Finney's key, 04ae1a62...6cd84c, appears only in the new output this transaction creates)
That's it. Three steps. The entire validation of the first person-to-person Bitcoin transaction comes down to: push a signature, push a public key, check the signature.
OP_CHECKSIG is the most complex opcode in Bitcoin Script. Behind that single byte 0xac is a multi-step process:
0x01 = SIGHASH_ALL, meaning the signature commits to all inputs and all outputs.
TRUE (the byte 0x01) if verification succeeds, FALSE (empty array) if it fails.
The key insight is step 4b: the scriptSig field, which contains the signature we're verifying, is replaced with the scriptPubKey before hashing. This solves what would otherwise be an impossible circularity: the signature cannot sign itself.
The replacement is specific to each input. When validating input \(i\), the scriptSig of input \(i\) is replaced with the relevant scriptPubKey; the scriptSigs of all other inputs are set to empty. This means each input's signature is independent—it commits to the full transaction structure but not to other inputs' signatures.
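The substitution step can be sketched on a simplified transaction model. Everything here is an illustrative assumption: TxIn is a stripped-down stand-in for a real input, the prevout labels are placeholders, and the function covers only the SIGHASH_ALL script substitution. It does not serialize, append the sighash type, or hash anything, and it ignores OP_CODESEPARATOR.

```python
from dataclasses import dataclass, replace
from typing import List

@dataclass(frozen=True)
class TxIn:
    prevout: str         # placeholder label for "txid:index"
    script_sig: bytes

def inputs_for_signing(inputs: List[TxIn], index: int, script_pubkey: bytes) -> List[TxIn]:
    """Copy the inputs as they appear in the data signed for input `index`.

    The input being signed carries the scriptPubKey it spends;
    every other input's scriptSig is blanked.
    """
    return [
        replace(txin, script_sig=script_pubkey if i == index else b"")
        for i, txin in enumerate(inputs)
    ]

inputs = [TxIn("aaaa:0", b"<sig0>"), TxIn("bbbb:1", b"<sig1>")]
lock_1 = b"<pubkey1> OP_CHECKSIG"      # placeholder for input 1's scriptPubKey

for txin in inputs_for_signing(inputs, 1, lock_1):
    print(txin.prevout, txin.script_sig)
```

Note that neither original scriptSig survives into the signed data, which is why the two signatures can be produced independently and in any order.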
Chapter 3 dissects this process byte by byte, including the four SIGHASH variants that control exactly which inputs and outputs the signature covers.
The mental model that unlocks Bitcoin Script is this: every output is a puzzle, and every input is a solution.
The scriptPubKey defines conditions—"produce a valid signature for this public key" or "provide a hash preimage that matches this digest" or "present \(m\) signatures from this set of \(n\) keys." The scriptSig provides the evidence that satisfies those conditions—a signature, a public key, a preimage, whatever the puzzle demands.
This is why Satoshi called it a "predicate": the scriptPubKey is a Boolean function that returns true or false. The scriptSig provides the inputs to that function. If the combined execution leaves TRUE on the stack, the predicate is satisfied and the coins can move.
This model scales to every transaction type in Bitcoin's history. We have only seen P2PK so far, but here is a preview of the pattern as it evolves across the chapters ahead:
Type | scriptPubKey (puzzle) | scriptSig (solution)
P2PK (Ch. 4) | "Check sig against this key" | Signature
P2PKH (Ch. 5) | "Check sig against the key that hashes to this" | Signature + public key
P2SH (Ch. 6) | "Provide a script that hashes to this, then satisfy it" | Arguments + redeem script
P2WPKH (Ch. 9) | "SegWit v0 keyhash program" | (empty; evidence moves to a new "witness" field)
P2TR (Ch. 12) | "Taproot: tweaked key" | (empty; witness has Schnorr sig or script path)
Don't worry about the unfamiliar names—every row in this table gets its own chapter. The point is that the lock-and-key model never changes. Every chapter from here forward is a variation on this theme.
Beyond the individual opcode semantics, Bitcoin enforces several global rules on script execution:
OP_PUSHDATA2 when a direct 0x05 push would suffice is non-standard (though not consensus-invalid).
These limits are designed to prevent denial-of-service attacks. A script that could create a million stack elements or run ten thousand opcodes could slow validation to a crawl. The limits ensure that any valid script can be executed in bounded time with bounded memory.
Bitcoin has two layers of transaction rules:
Consensus rules are enforced by all nodes. A transaction that violates a consensus rule is invalid and will never be included in a block. If a miner includes it anyway, the block itself is rejected by the network.
Standardness rules are enforced only by default node configurations when relaying unconfirmed transactions. A "non-standard" transaction won't be relayed by most nodes, but if a miner includes it in a block, the block is valid. Standardness rules are policy, not law.
This distinction matters because many Script limitations are standardness rules, not consensus. A miner can include a transaction with a 5,000-opcode script if they choose—other nodes will accept the block. But getting that transaction to the miner is the challenge, since relay nodes will reject it.
The P2PK script we executed is the simplest possible Bitcoin script. It has one virtue—simplicity—and several problems:
These limitations drove the evolution from P2PK to P2PKH ("pay to the hash of the public key"), which introduced human-readable addresses, reduced output sizes, and added a layer of hash protection. The structural difference is visible in the scriptPubKey:
P2PKH replaces the 65-byte key with a 20-byte hash and adds three opcodes—cutting the output from 67 bytes to 25 bytes while hiding the public key behind two layers of hashing. That's Part II.
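The byte counts are quick to confirm by assembling both scripts. The key and hash below are zero-filled placeholders, not real values; only the lengths and opcode bytes matter for this check.

```python
OP_DUP, OP_HASH160, OP_EQUALVERIFY, OP_CHECKSIG = 0x76, 0xA9, 0x88, 0xAC

pubkey = bytes([0x04]) + bytes(64)     # placeholder 65-byte uncompressed key
pubkey_hash = bytes(20)                # placeholder 20-byte HASH160 digest

# P2PK: <push 65> <pubkey> OP_CHECKSIG
p2pk = bytes([len(pubkey)]) + pubkey + bytes([OP_CHECKSIG])

# P2PKH: OP_DUP OP_HASH160 <push 20> <hash> OP_EQUALVERIFY OP_CHECKSIG
p2pkh = (bytes([OP_DUP, OP_HASH160, len(pubkey_hash)])
         + pubkey_hash
         + bytes([OP_EQUALVERIFY, OP_CHECKSIG]))

print(len(p2pk), len(p2pkh))           # 67 25
```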
But before we get there, we need to understand one more foundational piece: how the transaction bytes are serialized for signing, how the sighash is computed, and how the different SIGHASH flags change what a signature commits to. That's Chapter 3.
OP_CHECKSIG executes? What are they, listed from bottom to top?
0x47 is not OP_anything; it is a data push. How many bytes does it push? What is the general rule for opcodes in the range 0x01–0x4b?
OP_CHECKSIG push onto the stack when the signature is invalid?
OP_DUP OP_HASH160 <20-byte hash> OP_EQUALVERIFY OP_CHECKSIG. Trace the execution of this script with a valid scriptSig of <sig> <pubkey>. Show the stack state after each opcode. (You don't need real hex values; use symbolic names.)
0x47) followed by 71 bytes of signature data. Suppose the signature had been 73 bytes instead of 71. What push opcode would have been used? What if the data were 76 bytes? What about 256 bytes?
76 a9 14 89abcdef89abcdef89abcdef89abcdef89abcdef 88 ac. What type of script is this? (Hint: this is not P2PK.)
OP_CAT (BIP 347) change this? What does the covenant debate tell us about Bitcoin's approach to programmability?
OP_CHECKSIG, the scriptSig is replaced with the scriptPubKey before hashing. But what if the spender wants to sign only some of the outputs (to allow others to add more)? What if they want to allow additional inputs? How many meaningful combinations of "which inputs" and "which outputs" exist? (Chapter 3 covers the answer: the four SIGHASH flags.)
L1. Two items: the signature (bottom) and the public key (top). The scriptSig pushed the signature; the scriptPubKey's first instruction pushed the public key on top. OP_CHECKSIG pops both: public key first (it's on top), then the signature.
L2. 0x47 = 71 decimal, so it pushes the next 71 bytes. The general rule: any opcode in the range 0x01–0x4b (1–75) is an implicit push that reads the next \(N\) bytes (where \(N\) is the opcode value) and pushes them as a single stack element. These opcodes have no names—they are pure data pushes.
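The push rule can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: the function name encode_push is made up for this example, and it also covers the OP_PUSHDATA1/2/4 forms (0x4c, 0x4d, 0x4e) that take over once the data no longer fits a direct push. It does not apply the minimal-encoding rules for small numbers.

```python
import struct

def encode_push(data: bytes) -> bytes:
    """Return the script bytes that push `data` as a single stack element.

    Hypothetical helper for illustration only.
    """
    n = len(data)
    if n == 0:
        return b"\x00"                                 # OP_0 pushes an empty element
    if n <= 0x4b:
        return bytes([n]) + data                       # direct push: opcode value is the length
    if n <= 0xff:
        return b"\x4c" + bytes([n]) + data             # OP_PUSHDATA1 + 1-byte length
    if n <= 0xffff:
        return b"\x4d" + struct.pack("<H", n) + data   # OP_PUSHDATA2 + 2-byte little-endian length
    return b"\x4e" + struct.pack("<I", n) + data       # OP_PUSHDATA4 + 4-byte little-endian length

# Show the push header and total encoded size for a few data lengths.
for size in (71, 73, 75, 76, 255, 256):
    encoded = encode_push(bytes(size))
    header = encoded[: len(encoded) - size]
    print(size, header.hex(), len(encoded))
```

A 71-byte signature gets the one-byte header 47, which is exactly the byte seen at the start of the scriptSig.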
L3. False. Since 2010, the scriptSig and scriptPubKey are executed separately. The scriptSig runs first, its resulting stack is copied, and the scriptPubKey runs on the copied stack. They are never concatenated.
L4. 201 non-push opcodes. Data push opcodes (0x00–0x4e, OP_1NEGATE, and OP_1–OP_16) are not counted toward this limit.
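One way to picture the counting rule is the sketch below. It assumes a well-formed script, treats every opcode at or below OP_16 (0x60) as a push, and ignores the extra charge that multisig opcodes add for each public key. The function name count_non_push_opcodes is hypothetical.

```python
OP_PUSHDATA1, OP_PUSHDATA2, OP_PUSHDATA4 = 0x4c, 0x4d, 0x4e
OP_16 = 0x60
MAX_OPS_PER_SCRIPT = 201

def count_non_push_opcodes(script: bytes) -> int:
    """Count opcodes above OP_16, skipping over pushed data bytes.

    Simplified sketch: assumes the script is well formed (no truncated pushes).
    """
    count, pc = 0, 0
    while pc < len(script):
        opcode = script[pc]
        pc += 1
        if 0x01 <= opcode <= 0x4b:
            pc += opcode                                   # direct push: skip N data bytes
        elif opcode == OP_PUSHDATA1:
            pc += 1 + script[pc]                           # 1-byte length, then data
        elif opcode == OP_PUSHDATA2:
            pc += 2 + int.from_bytes(script[pc:pc + 2], "little")
        elif opcode == OP_PUSHDATA4:
            pc += 4 + int.from_bytes(script[pc:pc + 4], "little")
        elif opcode > OP_16:
            count += 1                                     # everything above OP_16 counts
    return count

p2pkh = bytes.fromhex("76a914" + "89abcdef" * 5 + "88ac")
print(count_non_push_opcodes(p2pkh), "of", MAX_OPS_PER_SCRIPT)
```

The 20 hash bytes inside the P2PKH script are skipped as data, so only OP_DUP, OP_HASH160, OP_EQUALVERIFY, and OP_CHECKSIG are counted.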
L5. OP_CHECKSIG pushes an empty byte array (which evaluates to FALSE/zero) when the signature is invalid.
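Why an empty byte array counts as FALSE can be sketched as follows. The function is modelled on the truthiness rule as I understand it from the reference client: an element is false if every byte is zero, with one exception for "negative zero" (a lone sign bit 0x80 in the last byte). Treat it as an illustration, not a consensus-exact implementation.

```python
def cast_to_bool(element: bytes) -> bool:
    """Interpret a stack element as a Boolean (illustrative sketch)."""
    for i, byte in enumerate(element):
        if byte != 0:
            # A sign bit alone in the final byte encodes negative zero, which is false.
            if i == len(element) - 1 and byte == 0x80:
                return False
            return True
    return False  # empty, or all zero bytes

for element in (b"", b"\x00", b"\x80", b"\x01", b"\x00\x01"):
    print(element.hex() or "(empty)", cast_to_bool(element))
```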
H1. P2PKH scriptSig: <sig> <pubkey>. P2PKH scriptPubKey: OP_DUP OP_HASH160 <hash> OP_EQUALVERIFY OP_CHECKSIG.
Execution trace:
Step | Operation | Stack (top at right)
0 | — | (empty)
1 | Push <sig> | [sig]
2 | Push <pubkey> | [sig, pubkey]
3 | OP_DUP | [sig, pubkey, pubkey]
4 | OP_HASH160 | [sig, pubkey, hash(pubkey)]
5 | Push <hash> | [sig, pubkey, hash(pubkey), hash]
6 | OP_EQUALVERIFY | [sig, pubkey] (fails if hashes differ)
7 | OP_CHECKSIG | [TRUE]
Steps 1–2 are the scriptSig; steps 3–7 are the scriptPubKey. The key insight is step 6: OP_EQUALVERIFY checks that the hash of the provided public key matches the hash stored in the output. Only then does OP_CHECKSIG check the signature against that key.
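The same trace can be reproduced with a small symbolic simulation. This is a sketch under stated assumptions: values are plain strings rather than real bytes, hashing is represented by wrapping a name in hash(...), and signature validity is looked up in a hypothetical valid_pairs table instead of being computed. In a valid spend the pushed <hash> equals hash(pubkey), so the simulation uses that string for it.

```python
def trace_p2pkh(sig="sig", pubkey="pubkey", expected_hash="hash(pubkey)",
                valid_pairs=(("sig", "pubkey"),)):
    """Symbolically execute <sig> <pubkey> against a P2PKH scriptPubKey.

    Returns (trace, success); trace is a list of (operation, stack snapshot).
    """
    stack, trace = [], []

    def record(op):
        trace.append((op, list(stack)))

    # scriptSig: two data pushes
    stack.append(sig)
    record("Push <sig>")
    stack.append(pubkey)
    record("Push <pubkey>")

    # scriptPubKey
    stack.append(stack[-1])                      # OP_DUP copies the top element
    record("OP_DUP")
    stack.append(f"hash({stack.pop()})")         # OP_HASH160 replaces the top with its hash
    record("OP_HASH160")
    stack.append(expected_hash)
    record("Push <hash>")
    b, a = stack.pop(), stack.pop()              # OP_EQUALVERIFY pops two and compares
    if a != b:
        record("OP_EQUALVERIFY (failed)")
        return trace, False
    record("OP_EQUALVERIFY")
    key, signature = stack.pop(), stack.pop()    # OP_CHECKSIG pops the key first, then the signature
    stack.append("TRUE" if (signature, key) in valid_pairs else "FALSE")
    record("OP_CHECKSIG")
    return trace, stack[-1] == "TRUE"

trace, ok = trace_p2pkh()
for step, (op, snapshot) in enumerate(trace, start=1):
    print(step, op, snapshot)
print("valid:", ok)
```

Calling trace_p2pkh(pubkey="other") stops at OP_EQUALVERIFY, which mirrors the failure case noted in step 6.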
H2.
73 bytes: 0x49 (= 73). Still in the direct-push range 0x01–0x4b.
76 bytes: 0x4c (= 76) cannot serve as a direct push, because direct pushes stop at 0x4b (= 75) and 0x4c is itself OP_PUSHDATA1. So 76 bytes requires OP_PUSHDATA1 (0x4c) followed by the length byte 0x4c (= 76), then the 76 data bytes. Total: \(1 + 1 + 76 = 78\) bytes.
256 bytes: OP_PUSHDATA1 (0x4c) can handle lengths up to 255. For 256 bytes, you need OP_PUSHDATA2 (0x4d) followed by a 2-byte little-endian length 00 01 (= 256), then the 256 data bytes. Total: \(1 + 2 + 256 = 259\) bytes.
H3. The Block 1 coinbase transaction has a single output with scriptPubKey:
41 04...pubkey...ac
Yes, it is P2PK—the same <65-byte pubkey> OP_CHECKSIG pattern. The public key (0496b538...) is different from Block 9's coinbase key (0411db93...) and from Hal Finney's key. However, Patoshi pattern analysis suggests both Block 1 and Block 9 were mined by the same entity (Satoshi), using different key pairs.
H4. Byte-by-byte decode:
76: OP_DUP (duplicate top of stack)
a9: OP_HASH160 (pop, push RIPEMD160(SHA256(top)))
14: Push next 20 bytes (0x14 = 20)
89abcdef...89abcdef: 20-byte hash value
88: OP_EQUALVERIFY (check equality, fail if not equal)
ac: OP_CHECKSIG (verify signature)
This is a P2PKH (Pay-to-Public-Key-Hash) script. The telltale pattern is 76 a9 14 <20 bytes> 88 ac; we will dissect this in Chapter 5.
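The byte-by-byte decode can be automated with a small disassembler. The sketch below handles only direct pushes (0x01–0x4b) and the handful of opcodes used in this chapter; the names decode_script and is_p2pkh are made up for this example, and anything outside the small opcode table is printed as a raw hex byte.

```python
OPCODE_NAMES = {
    0x76: "OP_DUP",
    0x87: "OP_EQUAL",
    0x88: "OP_EQUALVERIFY",
    0xa9: "OP_HASH160",
    0xac: "OP_CHECKSIG",
}

def decode_script(script: bytes) -> list:
    """Turn raw script bytes into a list of readable tokens (illustrative subset)."""
    tokens, pc = [], 0
    while pc < len(script):
        opcode = script[pc]
        pc += 1
        if 0x01 <= opcode <= 0x4b:
            data = script[pc:pc + opcode]
            if len(data) != opcode:
                raise ValueError("push runs past the end of the script")
            tokens.append(f"<{data.hex()}>")
            pc += opcode
        else:
            tokens.append(OPCODE_NAMES.get(opcode, f"0x{opcode:02x}"))
    return tokens

def is_p2pkh(script: bytes) -> bool:
    """Match the 25-byte template 76 a9 14 <20 bytes> 88 ac."""
    return (len(script) == 25
            and script[:3] == bytes.fromhex("76a914")
            and script[23:] == bytes.fromhex("88ac"))

script = bytes.fromhex("76a914" + "89abcdef" * 5 + "88ac")
print(" ".join(decode_script(script)))
print("P2PKH:", is_p2pkh(script))
```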
P1. The script is a finite byte sequence of length \(n\). Execution proceeds by reading opcodes from left to right, advancing a program counter. Each opcode consumes at least 1 byte: a non-push opcode consumes exactly 1 byte, while a push opcode 0x01–0x4b consumes \(1 + L\) bytes (1 opcode byte + \(L\) data bytes), and OP_PUSHDATA1/2/4 consume \(2 + L\), \(3 + L\), or \(5 + L\) bytes respectively. Since each step advances the program counter by at least 1 byte, and the script is \(n\) bytes long, execution terminates in at most \(n\) steps. (In practice, the 201 non-push opcode limit tightens this further, but the byte-length argument suffices for termination.)
P2. Each step either pushes at most one item (push opcode, OP_DUP) or pops items (OP_DROP, OP_CHECKSIG). Since there are at most \(n\) steps (by P1), the stack can grow by at most 1 per step, giving a bound of \(n\).
Can the bound be tightened? A data push of \(L \geq 1\) bytes consumes \(1 + L \geq 2\) bytes of script but adds only one stack element, so a script made entirely of such pushes reaches a depth of at most \(n/2\). That does not hold in general, because single-byte opcodes such as OP_0 and OP_DUP each add one element: the alternating script OP_0 OP_DUP OP_0 OP_DUP ... reaches a depth of \(n\), so the bound of \(n\) from the counting argument above is attained. In practice, the 1,000-element stack limit is the binding constraint, not the script length.
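Both counting arguments (P1 and P2) can be checked numerically on a restricted opcode subset. The sketch below tracks only the program counter and the stack depth, not the stack contents, and models just direct pushes, OP_0, OP_DUP, and OP_DROP; the function name run_bounds is hypothetical.

```python
OP_0, OP_DROP, OP_DUP = 0x00, 0x75, 0x76

def run_bounds(script: bytes):
    """Return (steps, max_depth) for a script built from the modelled subset."""
    pc = steps = depth = max_depth = 0
    while pc < len(script):
        opcode = script[pc]
        pc += 1                      # every opcode consumes at least one byte
        steps += 1
        if 0x01 <= opcode <= 0x4b:
            pc += opcode             # skip the pushed data bytes
            depth += 1
        elif opcode == OP_0:
            depth += 1               # one byte of script, one new element
        elif opcode == OP_DUP:
            if depth == 0:
                raise ValueError("OP_DUP on an empty stack")
            depth += 1
        elif opcode == OP_DROP:
            if depth == 0:
                raise ValueError("OP_DROP on an empty stack")
            depth -= 1
        else:
            raise ValueError("opcode outside the subset modelled here")
        max_depth = max(max_depth, depth)
    return steps, max_depth

alternating = bytes([OP_0, OP_DUP] * 5)      # n = 10, all single-byte opcodes
one_byte_pushes = bytes([0x01, 0xff] * 5)    # n = 10, five 1-byte data pushes
print(run_bounds(alternating))
print(run_bounds(one_byte_pushes))
```

The alternating script reaches depth 10 in 10 steps, while the script of one-byte data pushes reaches only depth 5, matching the \(n\) and \(n/2\) cases discussed above.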
P3. Under concatenation, the combined script is scriptSig | scriptPubKey, executed as a single program. The scriptSig can include opcodes, not just data, which lets it interfere with the scriptPubKey's logic.
Concrete attack. Consider a scriptPubKey that checks a hash preimage:
OP_HASH160 <hash> OP_EQUAL
The intended scriptSig is <preimage>—only someone who knows the secret can spend.
Under concatenation, a malicious scriptSig can bypass this:
OP_DROP OP_1
The combined program becomes: OP_DROP OP_1 OP_HASH160 <hash> OP_EQUAL. Execution:
OP_DROP runs first, on an empty stack, so this specific attack fails.
A subtler attack: a scriptSig that pushes extra items to manipulate the final stack. Consider scriptSig = <wrong> OP_HASH160 <expected> OP_EQUAL. Under concatenation, this executes:
<wrong> stack: [wrong]
OP_HASH160 stack: [hash(wrong)]
<expected> stack: [hash(wrong), expected]
OP_EQUAL stack: [FALSE]
This still fails: the attacker recomputed the same check. But now the scriptPubKey runs on a stack that already has FALSE on it, and the scriptPubKey's own OP_HASH160 operates on FALSE. The outcome then depends entirely on the specific scriptPubKey logic.
The real vulnerability is simpler: without a clean-stack rule, a scriptSig of <garbage> OP_1 under concatenation pushes [garbage, 1] onto the stack. If the scriptPubKey's execution path doesn't consume those extra elements, the script finishes with [garbage, ..., result]. Validation inspects only the top element: if it is truthy, the transaction is valid, and nothing checks the attacker's leftover items beneath it.
Under separated execution: The scriptSig runs alone. It can push data, but it cannot run opcodes that "set up" the stack in ways that bypass the scriptPubKey. When the scriptPubKey runs on the copied stack, it sees exactly the data items the scriptSig left—nothing more. A scriptSig of OP_1 leaves [1] on the stack. The scriptPubKey's OP_HASH160 then hashes 1, compares it to the expected hash, and fails. The attacker has no way to influence the scriptPubKey's control flow because the two execution phases are completely isolated.
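The two-phase model can be sketched as a toy evaluator. Everything here is an assumption made for illustration: scripts are Python lists in which bytes objects are data pushes and strings are opcodes, only four opcodes are supported, the final truthiness test is simplified to "any nonzero byte", and OP_HASH160 is replaced by a stand-in (SHA-256 truncated to 20 bytes) so the example needs only the standard library.

```python
import hashlib

def hash160_stand_in(data: bytes) -> bytes:
    # Assumption: truncated SHA-256 stands in for RIPEMD160(SHA256(x)).
    return hashlib.sha256(data).digest()[:20]

def run(script, stack) -> bool:
    """Execute one script on `stack`; return False if an opcode fails."""
    for item in script:
        if isinstance(item, bytes):
            stack.append(item)                       # data push
        elif item == "OP_1":
            stack.append(b"\x01")
        elif item == "OP_DROP":
            if not stack:
                return False                         # nothing to drop
            stack.pop()
        elif item == "OP_HASH160":
            if not stack:
                return False
            stack.append(hash160_stand_in(stack.pop()))
        elif item == "OP_EQUAL":
            if len(stack) < 2:
                return False
            b, a = stack.pop(), stack.pop()
            stack.append(b"\x01" if a == b else b"")
        else:
            raise ValueError(f"unsupported opcode: {item}")
    return True

def verify(script_sig, script_pubkey) -> bool:
    """Run the scriptSig alone, copy its stack, then run the scriptPubKey."""
    sig_stack = []
    if not run(script_sig, sig_stack):
        return False
    stack = list(sig_stack)                          # the scriptPubKey sees only a copy
    if not run(script_pubkey, stack):
        return False
    return bool(stack) and any(stack[-1])            # simplified truthiness check

secret = b"correct horse"
lock = ["OP_HASH160", hash160_stand_in(secret), "OP_EQUAL"]
print(verify([secret], lock))                # the intended spend
print(verify(["OP_1"], lock))                # hashes 1, comparison fails
print(verify(["OP_DROP", "OP_1"], lock))     # OP_DROP fails on the empty stack
```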
C1. A covenant restricts how the output of a future spending transaction is structured: the scriptPubKey would need to inspect the spending transaction's outputs and enforce conditions on them. Bitcoin Script cannot do this because no opcode lets a script read the spending transaction's fields. OP_CHECKSIG hashes the transaction internally, but the script only sees the Boolean result (valid/invalid), not the serialized bytes.
OP_CAT would allow scripts to construct byte strings on the stack. Combined with OP_CHECKSIG and the existing hash opcodes, a script could reconstruct the expected sighash preimage, enforce that specific outputs are present, and verify a signature over the constructed data. This is the core of the covenant proposal: use OP_CAT to build the signed message on-stack and verify it matches what OP_CHECKSIG expects.
The debate reflects Bitcoin's philosophical tension: maximizing programmability (enabling vaults, payment pools, trustless bridges) vs. minimizing attack surface (every new opcode is a potential vulnerability). Bitcoin's history suggests it will move slowly if at all.
C2. Bitcoin limits the instruction set: no loops, no unbounded computation, maximum opcode count of 201. This makes analysis trivial—any script can be validated in bounded time. Ethereum limits the execution cost: any program can run, but each operation costs gas, and a transaction includes a gas limit. If gas runs out, execution reverts.
Security trade-offs: Bitcoin's approach makes it impossible to write a denial-of-service script, because the worst-case cost is bounded by the script limits. Ethereum's approach enables more complex contracts but creates a gas-estimation problem (underestimate and the transaction fails, overestimate and you overpay) and has produced many reentrancy attacks (most famously the DAO hack of June 2016†). Such attacks are structurally impossible in Bitcoin Script because there are no external calls.
An example safe in Bitcoin but dangerous in EVM: a simple payment. In Bitcoin, <sig> | <pubkey> OP_CHECKSIG executes in 3 steps, always. In Solidity, a value transfer made with a low-level call (rather than the gas-capped transfer()) can trigger arbitrary code in the recipient contract, which can call back into the sender (reentrancy).
An example possible in EVM but impossible in Bitcoin: an on-chain automated market maker (AMM) that maintains state, performs multiplication and division on token amounts, and updates balances across multiple calls. Bitcoin Script has no state, no multiplication (OP_MUL disabled), and no loops.
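B1. Four SIGHASH flags control what the signature commits to: three base types (SIGHASH_ALL, SIGHASH_NONE, SIGHASH_SINGLE) select which outputs are signed, and one modifier (SIGHASH_ANYONECANPAY) selects whether all inputs or only the current input are signed.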
This gives \(3 \times 2 = 6\) meaningful combinations. SIGHASH_ALL handles standard payments; ANYONECANPAY|ALL enables crowdfunding; NONE creates blank checks; SINGLE supports token-like constructions.
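The six combinations can be enumerated directly from the flag values. The byte values below (0x01, 0x02, 0x03, and the 0x80 modifier bit) are the standard ones; the describe helper and its wording are made up for this sketch.

```python
SIGHASH_ALL, SIGHASH_NONE, SIGHASH_SINGLE = 0x01, 0x02, 0x03
SIGHASH_ANYONECANPAY = 0x80

OUTPUT_SCOPE = {
    SIGHASH_ALL: "all outputs",
    SIGHASH_NONE: "no outputs",
    SIGHASH_SINGLE: "the output at the same index",
}

def describe(flag: int) -> str:
    """Summarize which inputs and outputs a sighash byte commits to."""
    base = flag & 0x1f                                   # low bits select the output mode
    inputs = "only this input" if flag & SIGHASH_ANYONECANPAY else "all inputs"
    return f"0x{flag:02x}: {inputs}, {OUTPUT_SCOPE[base]}"

# Three output modes times two input modes.
combinations = [base | modifier
                for modifier in (0, SIGHASH_ANYONECANPAY)
                for base in (SIGHASH_ALL, SIGHASH_NONE, SIGHASH_SINGLE)]

for flag in combinations:
    print(describe(flag))
```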
B2. P2PKH needs to verify that the spender's public key matches the hash stored in the output. The scriptPubKey becomes:
OP_DUP OP_HASH160 <20-byte hash> OP_EQUALVERIFY OP_CHECKSIG
OP_DUP duplicates the public key (so one copy survives the hash check). OP_HASH160 hashes it. The 20-byte expected hash is pushed. OP_EQUALVERIFY checks they match. Finally, OP_CHECKSIG verifies the signature against the original public key. Predicted stack trace: [sig, pk] → [sig, pk, pk] → [sig, pk, hash(pk)] → [sig, pk, hash(pk), expected] → [sig, pk] → [TRUE].