On November 14, 2021, at 05:15 UTC, F2Pool mined block 709,632—the block at which Taproot activated.† Three blocks later, a user named @bitbug42 broadcast a transaction with an OP_RETURN message that captured the moment:
I like Schnorr sigs and I cannot lie.
That transaction—our specimen for this chapter—was one of the first Taproot spends ever confirmed. Its witness field contained something no Bitcoin transaction had ever carried before: a 64-byte Schnorr signature. No DER encoding. No SIGHASH byte. No variable-length anything. Just 64 bytes of pure mathematical elegance: 32 bytes for \(R_x\) (the \(x\)-coordinate of the nonce point) and 32 bytes for \(s\) (the scalar).
Taproot was specified across three BIPs:†
| BIP | Scope | Purpose |
|---|---|---|
| 340 | Signatures | Schnorr signature scheme for secp256k1. Defines signing, verification, batch verification, and the tagged hash construction. |
| 341 | Spending rules | Taproot output structure: the output key, key path spending, script path spending, the control block, and TapTweak. |
| 342 | Script rules | TapScript: the modified Script rules for Taproot script-path execution, including new opcodes (OP_CHECKSIGADD) and revised limits. |
Together, they represent the most mathematically sophisticated upgrade in Bitcoin's history—and the most elegant. Where SegWit was engineering (reorganize the data structures), Taproot is cryptography (exploit the algebraic structure of elliptic curves to hide complexity behind simplicity).
Our specimen is one of the first Taproot spends ever confirmed, from block 709,635 (November 14, 2021)—just three blocks after activation.
Txid:†
33e794d097969002ee05d336686fc03c9e15a597c1b9827669460fac98799036
| Metric | Value | Notes |
|---|---|---|
| Total size | 220 B | Including witness |
| Stripped size | 152 B | Without marker, flag, witness |
| Weight | 676 WU | \(152 \times 3 + 220\) |
| vsize | 169 vB | \(676/4\) |
| Fee | 21,250 sats | \(88,480 - 0 - 67,230\) |
| Fee rate | 125.7 sat/vB | Celebratory premium |
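The derived rows of this table follow from the two sizes and the amounts. A minimal sketch of that arithmetic, with variable names chosen here for illustration and the input amount taken from the fee row:

```python
import math

total_size = 220       # bytes, including marker, flag, and witness
stripped_size = 152    # bytes, without marker, flag, and witness
input_value = 88_480   # sats, the amount being spent
output_values = [0, 67_230]  # sats: OP_RETURN output, P2TR output

# BIP 141 weight: non-witness bytes count 4x, witness bytes 1x,
# which is the same as stripped_size * 3 + total_size.
weight = stripped_size * 3 + total_size
vsize = math.ceil(weight / 4)
fee = input_value - sum(output_values)
fee_rate = fee / vsize

print(weight, vsize, fee, round(fee_rate, 1))
```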
The overall structure is identical to the SegWit transactions of Part III: version, marker, flag, inputs, outputs, witness, locktime. The revolution is in what sits in the witness and how the output is interpreted.
| Field | Hex (little-endian) | Value |
|---|---|---|
| Version | 01000000 | 1 |
| Marker | 00 | SegWit marker |
| Flag | 01 | SegWit flag |
| Input count | 01 | 1 input |
| Prev txid | d1f1c1f8…054958 | LE; reversed = 5849051c… |
| Prev vout | 01000000 | 1 |
| ScriptSig length | 00 | 0 bytes (empty) |
| Sequence | fdffffff | 0xfffffffd: RBF-enabled |
As with all native SegWit spends, the scriptSig is empty. The spending proof lives entirely in the witness.
| Field | Hex (little-endian) | Value |
|---|---|---|
| Output count | 02 | 2 outputs |
| Value | 0000000000000000 | 0 sats |
| SPK length | 31 | 49 bytes |
| scriptPubKey | 6a2f49206c…3432 | OP_RETURN + 47-byte message |
| Value | 9e06010000000000 | 67,230 sats |
| SPK length | 22 | 34 bytes |
| scriptPubKey | 5120a37c…28f9 | P2TR (OP_1 + 32-byte key) |
| Locktime | ffd30a00 | 709,631 (anti-fee-sniping) |
Output 0 is an OP_RETURN—a provably unspendable data carrier. The 47-byte payload decodes to ASCII: I like Schnorr sigs and I cannot lie. @bitbug42.
Output 1 is a P2TR output. Its 34-byte scriptPubKey is the fingerprint we need to examine.
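The little-endian fields in the tables above can be decoded mechanically. The sketch below uses only the hex strings shown in the tables and rebuilds the OP_RETURN script from the ASCII message rather than from the elided hex; the variable names are illustrative.

```python
import struct

# Little-endian hex strings copied from the field tables.
value_hex = "9e06010000000000"   # 8-byte output amount
locktime_hex = "ffd30a00"        # 4-byte locktime
sequence_hex = "fdffffff"        # 4-byte input sequence

value = struct.unpack("<Q", bytes.fromhex(value_hex))[0]
locktime = struct.unpack("<I", bytes.fromhex(locktime_hex))[0]
sequence = struct.unpack("<I", bytes.fromhex(sequence_hex))[0]

# BIP 125: a sequence below 0xfffffffe signals replaceability.
signals_rbf = sequence < 0xFFFFFFFE

# OP_RETURN (0x6a), a one-byte direct push length, then the payload.
message = b"I like Schnorr sigs and I cannot lie. @bitbug42"
op_return_spk = bytes([0x6A, len(message)]) + message

print(value, locktime, hex(sequence), signals_rbf)
print(len(op_return_spk), op_return_spk.hex()[:10])
```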
Every P2TR scriptPubKey is exactly 34 bytes:
| Hex | Opcode | Meaning |
|---|---|---|
| 51 | OP_1 | Witness version 1 |
| 20 | OP_PUSHBYTES_32 | Push next 32 bytes |
| <32 bytes> | output key | x-only public key \(Q\) |
Our specimen's prevout scriptPubKey: 51 (OP_1) 20 (push 32) 339ce7e1 65e67d93 adb3fef8 8a6d4bee d33f01fa 876f05a2 25242b82 a631abc0 (32-byte output key \(Q\))
Compare this with every previous output type:
| Type | SPK size | Structure |
|---|---|---|
| P2PK (uncompressed) | 67 B | <65B pubkey> CHECKSIG |
| P2PKH | 25 B | DUP HASH160 <20B hash> EQUALVERIFY CHECKSIG |
| P2SH | 23 B | HASH160 <20B hash> EQUAL |
| P2WPKH | 22 B | OP_0 <20B hash> |
| P2WSH | 34 B | OP_0 <32B hash> |
| P2TR | 34 B | OP_1 <32B key> |
P2TR is the same size as P2WSH (34 bytes), but stores a key, not a hash. This is a fundamental design choice: the output commits directly to a public key rather than a hash of a script. The key can encode a simple single-signer case or a complex multi-party arrangement—and no observer can tell which until the output is spent.
OP_1 (0x51) signals witness version 1. Recall that OP_0 (0x00) signals version 0 (P2WPKH and P2WSH from Part III). BIP 141† reserved OP_1 through OP_16 for future witness versions, enabling upgrades via soft fork without changing the basic output structure. Taproot uses version 1; versions 2–16 remain available for future protocols.
Old nodes see OP_1 <32 bytes> and treat it as "anyone can spend" (the stack ends with a truthy value). This is how Taproot maintains backward compatibility as a soft fork: old nodes accept Taproot transactions without understanding them; new nodes enforce the Taproot rules.
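Recognizing a witness program is a short structural check on the scriptPubKey. The function below is a sketch of the BIP 141 pattern (a version opcode followed by a single direct push of 2 to 40 bytes); the function name is hypothetical, and the P2WPKH example uses an all-zero placeholder hash.

```python
def witness_program(spk: bytes):
    """Return (version, program) if spk is a witness program, else None."""
    if not 4 <= len(spk) <= 42:
        return None
    opcode = spk[0]
    if opcode == 0x00:                 # OP_0 -> witness version 0
        version = 0
    elif 0x51 <= opcode <= 0x60:       # OP_1..OP_16 -> versions 1..16
        version = opcode - 0x50
    else:
        return None
    if spk[1] != len(spk) - 2:         # one direct push covering the rest
        return None
    return version, spk[2:]

# The specimen's prevout scriptPubKey: OP_1, push 32, output key Q.
prevout_spk = bytes.fromhex(
    "5120" "339ce7e165e67d93adb3fef88a6d4beed33f01fa876f05a225242b82a631abc0"
)
version, program = witness_program(prevout_spk)
is_p2tr = version == 1 and len(program) == 32
print(version, len(program), is_p2tr)
```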
The 32-byte value in the P2TR scriptPubKey is an x-only public key—a secp256k1 point represented by its \(x\)-coordinate alone, with no prefix byte.
In Parts I–III, public keys were either:
- Uncompressed (65 bytes): 04 + 32-byte \(x\) + 32-byte \(y\).
- Compressed (33 bytes): 02 or 03 + 32-byte \(x\).

Taproot drops the prefix entirely. Since secp256k1 has \(y^2 = x^3 + 7\), every valid \(x\)-coordinate has two valid \(y\) values: one even, one odd. BIP 340 resolves the ambiguity by convention:
The y-coordinate is always even.
If the "natural" \(y\) for a given \(x\) is odd, the signer negates the private key (\(d \to n - d\)) so that the resulting public key \(Q = dG\) has an even \(y\). This saves 1 byte per key on-chain and eliminates the prefix entirely.
Our specimen's output key:
339ce7e1 65e67d93 adb3fef8 8a6d4bee d33f01fa 876f05a2 25242b82 a631abc0
This 32-byte value is the public key. No prefix, no hash, no indirection. The \(y\)-coordinate is recoverable from the curve equation, and the even-\(y\) convention resolves the ambiguity.
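Recovering the full point from an x-only key is a square root modulo the field prime followed by the even-\(y\) choice. The sketch below follows the `lift_x` procedure of BIP 340 in plain Python; the helper `to_xonly` is a hypothetical name, and the generator point is used as the worked example because its coordinates are standard constants.

```python
FIELD = 2**256 - 2**32 - 977  # secp256k1 field prime p

def lift_x(x: int):
    """Return the point (x, y) with even y, or None if x is not on the curve."""
    if x >= FIELD:
        return None
    y_sq = (pow(x, 3, FIELD) + 7) % FIELD
    y = pow(y_sq, (FIELD + 1) // 4, FIELD)  # valid square root since p % 4 == 3
    if y * y % FIELD != y_sq:
        return None                          # x^3 + 7 is not a square
    return (x, y if y % 2 == 0 else FIELD - y)

def to_xonly(compressed: bytes) -> bytes:
    """Drop the 02/03 parity prefix of a 33-byte compressed key."""
    if len(compressed) != 33 or compressed[0] not in (2, 3):
        raise ValueError("not a compressed public key")
    return compressed[1:]

# Generator point G of secp256k1 (its y happens to be even).
GX = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
GY = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8

xonly = to_xonly(bytes([2]) + GX.to_bytes(32, "big"))
point = lift_x(int.from_bytes(xonly, "big"))
print(point == (GX, GY))
```

The same call applied to the specimen's output key would return the full point \(Q\), provided the 32 bytes are transcribed correctly.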
The witness of our specimen contains a single item: a 64-byte Schnorr signature.
a6 0c 38 3f 71 ba c0 ec 91 9b 1d 7d bc 3e b7 2d
d5 6e 7a a9 95 83 61 55 64 f9 f9 9b 8a e4 e8 37
b7 58 77 3a 5b 2e 4c 51 34 88 54 c8 38 9f 00 8e
05 02 9d b7 f4 64 a5 ff 2e 01 d5 e6 e6 26 17 4a
| Component | Size | Value |
|---|---|---|
| \(R_x\) (nonce point \(x\)-coord) | 32 B | a60c383f… e4e837 |
| \(s\) (scalar) | 32 B | b758773a… 26174a |
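Splitting the witness item into its components needs no parser at all. The sketch below slices the specimen's 64 bytes and applies the BIP 341 length rule for the optional hash-type byte; the function name is hypothetical.

```python
SIGHASH_DEFAULT = 0x00

def split_schnorr_sig(sig: bytes):
    """Return (R_x, s, hash_type) for a 64- or 65-byte Taproot signature."""
    if len(sig) == 64:
        hash_type = SIGHASH_DEFAULT      # no suffix byte means SIGHASH_DEFAULT
    elif len(sig) == 65:
        hash_type = sig[64]
        if hash_type == SIGHASH_DEFAULT:
            # BIP 341 rejects an explicit 0x00 suffix.
            raise ValueError("explicit 0x00 hash type is not allowed")
    else:
        raise ValueError("Taproot signatures are 64 or 65 bytes")
    return sig[:32], sig[32:64], hash_type

# The specimen's witness item, copied from the hex dump above.
specimen_sig = bytes.fromhex(
    "a60c383f71bac0ec919b1d7dbc3eb72d"
    "d56e7aa99583615564f9f99b8ae4e837"
    "b758773a5b2e4c51348854c8389f008e"
    "05029db7f464a5ff2e01d5e6e626174a"
)
r_x, s, hash_type = split_schnorr_sig(specimen_sig)
print(r_x.hex(), s.hex(), hash_type)
```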
No DER encoding. No SEQUENCE tags, no INTEGER markers, no padding bytes, no variable lengths. Just two 32-byte integers concatenated. Compare this with the ECDSA signatures from every previous chapter:
| Scheme | Sig size | Encoding |
|---|---|---|
| ECDSA (Chs 1–10) | 71–73 B | DER + SIGHASH byte |
| Schnorr (Ch 11+) | 64 B | Raw \(R_x \| s\) |
The fixed 64-byte format eliminates three problems at once:
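Given the output key \(Q\) (x-only, even \(y\)), the 64-byte signature \((R_x, s)\), and the message \(m\) (the BIP 341 sighash digest), verification proceeds:
1. Compute the challenge \(e = \text{tagged\_hash}(\texttt{"BIP0340/challenge"},\; R_x \| Q_x \| m) \bmod n\).
2. Lift \(R_x\) and \(Q_x\) to the curve points \(R\) and \(Q\) with even \(y\).
3. Check that \(sG = R + eQ\).

If the equation holds, the signature is valid.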
The signer knows private key \(d\) (with \(Q = dG\), even \(y\)) and picks a secret nonce \(k\) (with \(R = kG\), even \(y\)). The signature is: \[ s = k + e d \pmod{n} \] Verification: \[ sG = (k + ed)G = kG + edG = R + eQ \] This is simpler than ECDSA verification (Chapter 3), which required computing \(s^{-1}\) and two scalar multiplications. Schnorr needs no modular inverse: the check is two scalar multiplications (\(sG\) and \(eQ\)) and one point addition, and it is linear in \(s\). This linearity is what enables batch verification and key aggregation.
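The signing and verification equations can be exercised end to end in a few dozen lines. The following is a teaching sketch in pure Python, not a production implementation: the arithmetic is not constant time, and `sign` takes the nonce \(k\) as an argument instead of deriving it as BIP 340 specifies. All function names are chosen for illustration.

```python
import hashlib

FIELD = 2**256 - 2**32 - 977  # field prime p
ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order n
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def tagged_hash(tag, msg):
    t = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(t + t + msg).digest()

def point_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % FIELD == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], -1, FIELD) % FIELD
    else:
        lam = (b[1] - a[1]) * pow((b[0] - a[0]) % FIELD, -1, FIELD) % FIELD
    x = (lam * lam - a[0] - b[0]) % FIELD
    return (x, (lam * (a[0] - x) - a[1]) % FIELD)

def point_mul(k, pt):
    """Double-and-add scalar multiplication."""
    acc = None
    while k:
        if k & 1:
            acc = point_add(acc, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return acc

def lift_x(x):
    y_sq = (pow(x, 3, FIELD) + 7) % FIELD
    y = pow(y_sq, (FIELD + 1) // 4, FIELD)
    if x >= FIELD or y * y % FIELD != y_sq:
        return None
    return (x, y if y % 2 == 0 else FIELD - y)

def challenge(r_bytes, q_bytes, msg):
    digest = tagged_hash("BIP0340/challenge", r_bytes + q_bytes + msg)
    return int.from_bytes(digest, "big") % ORDER

def sign(d, msg, k):
    """s = k + e*d mod n, after flipping d and k so that Q and R have even y."""
    Q, R = point_mul(d, G), point_mul(k, G)
    if Q[1] % 2:
        d = ORDER - d
    if R[1] % 2:
        k = ORDER - k
    r_bytes, q_bytes = R[0].to_bytes(32, "big"), Q[0].to_bytes(32, "big")
    e = challenge(r_bytes, q_bytes, msg)
    return r_bytes + ((k + e * d) % ORDER).to_bytes(32, "big")

def verify(q_bytes, msg, sig):
    """Check sG == R + eQ by computing R' = sG - eQ and comparing with R_x."""
    if len(sig) != 64:
        return False
    Q = lift_x(int.from_bytes(q_bytes, "big"))
    r, s = int.from_bytes(sig[:32], "big"), int.from_bytes(sig[32:], "big")
    if Q is None or r >= FIELD or s >= ORDER:
        return False
    e = challenge(sig[:32], q_bytes, msg)
    R = point_add(point_mul(s, G), point_mul(ORDER - e, Q))
    return R is not None and R[1] % 2 == 0 and R[0] == r

msg = hashlib.sha256(b"example message").digest()
q_bytes = point_mul(3, G)[0].to_bytes(32, "big")   # toy private key d = 3
sig = sign(3, msg, k=0x1234567)                    # fixed toy nonce, never reuse in practice
print(verify(q_bytes, msg, sig))
```

Verifying the specimen itself would additionally require the BIP 341 sighash digest of the transaction, which this sketch does not compute.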
BIP 340 introduces tagged hashes—a domain separation technique that prevents hash collisions between different protocol contexts:
\[ \text{tagged\_hash}(\text{tag}, m) = \text{SHA-256}(\text{SHA-256}(\text{tag}) \| \text{SHA-256}(\text{tag}) \| m) \]
The SHA-256 digest of the tag (e.g., "BIP0340/challenge", "TapTweak", "TapLeaf") is computed once and prepended twice to the message before the outer SHA-256. This ensures that a hash computed for one purpose (signature challenge) cannot collide with a hash computed for another (key tweaking) short of a SHA-256 collision, even if the underlying data is identical.
Taproot uses tagged hashes everywhere: signature challenges, key tweaking, Merkle tree computation, and leaf hashing. Each context uses a unique tag string.
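The construction is a few lines with the standard library. This sketch hashes the same 32 bytes under two of the tags named above to show the domain separation; the input bytes are an arbitrary placeholder.

```python
import hashlib

def tagged_hash(tag: str, msg: bytes) -> bytes:
    """SHA-256(SHA-256(tag) || SHA-256(tag) || msg), as defined in BIP 340."""
    tag_digest = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(tag_digest + tag_digest + msg).digest()

data = bytes(32)  # identical input for both contexts
challenge_hash = tagged_hash("BIP0340/challenge", data)
tweak_hash = tagged_hash("TapTweak", data)

print(challenge_hash.hex())
print(tweak_hash.hex())
print(challenge_hash != tweak_hash)
```

Because the two 32-byte tag digests fill exactly one 64-byte SHA-256 block, an implementation can cache the hash state after that block for each tag.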
Our specimen demonstrates key path spending—the simplest and most common Taproot spend. The entire witness is a single Schnorr signature:
Witness: [<64-byte signature>]
No script. No public key. No redeem script. Just the signature.
The validation rule (BIP 341): if the witness stack contains exactly one element (and that element is 64 or 65 bytes), the node interprets it as a Schnorr signature and verifies it against the output key \(Q\).
This is even simpler than P2WPKH (Chapter 9), where the witness needed both a signature and a public key. In P2TR key path, the public key is already committed in the output; only the signature is needed.
In P2WPKH (Chapter 9), the witness contains both [sig, pubkey]—the node hashes the pubkey and checks it against the 20-byte witness program. In P2TR key path, the public key is the witness program itself. The 32-byte value after OP_1 is not a hash; it is the actual key. No hashing step is needed. The node verifies the Schnorr signature directly against this key.
This eliminates one layer of indirection—and one potential attack vector. There is no hash preimage to brute-force; the key is right there. The quantum implications are the same as P2PK: the key is exposed from the moment the output is created. However, Taproot's key tweaking mechanism (Section 11.7) provides a different kind of protection.
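The witness-size rule that selects the spending path can be sketched as a small classifier. The function name is hypothetical, and the annex handling (a final item starting with 0x50 is set aside when at least two items are present) comes from BIP 341 rather than from the discussion above.

```python
def classify_taproot_witness(stack):
    """Return 'key path' or 'script path' for a P2TR input's witness stack."""
    items = list(stack)
    # BIP 341: with two or more items, a last item beginning with 0x50 is the annex.
    if len(items) >= 2 and items[-1][:1] == b"\x50":
        items = items[:-1]
    if not items:
        raise ValueError("empty witness cannot spend a P2TR output")
    if len(items) == 1:
        if len(items[0]) not in (64, 65):
            raise ValueError("key path signature must be 64 or 65 bytes")
        return "key path"
    # Otherwise: [script inputs..., script, control block]
    return "script path"

specimen_witness = [bytes(64)]  # placeholder with the specimen's shape: one 64-byte item
print(classify_taproot_witness(specimen_witness))
```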
The 32-byte key in a P2TR output is not the signer's "raw" public key. It is a tweaked key:
\[ Q = P + \text{tagged\_hash}(\texttt{"TapTweak"},\; P_x \| c) \cdot G \]
where \(P\) is the internal key (the signer's actual public key) and \(c\) is the Merkle root of a tree of TapScripts (or empty, if there are no scripts).
For a key-path-only output (no scripts), \(c\) is empty and the tweak simplifies to: \[ Q = P + \text{tagged\_hash}(\texttt{"TapTweak"},\; P_x) \cdot G \]
The tweak serves two purposes:
When spending via key path, the signer does not sign with the original private key \(d\) (where \(P = dG\)). Instead, they sign with the tweaked private key: \[ d' = d + \text{tagged\_hash}(\texttt{"TapTweak"},\; P_x \| c) \pmod{n} \] Writing \(t\) for the tweak hash, this ensures \(d'G = dG + tG = P + tG = Q\), so the signature verifies against the output key \(Q\). (If \(P\) has odd \(y\), the signer first negates \(d\), as with any x-only key.)
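The relationship between the tweaked private key and the output key can be checked numerically. The sketch below is a pure-Python illustration with a toy private key; it is not constant time, the function names are illustrative, and `merkle_root` defaults to empty bytes for the key-path-only case.

```python
import hashlib

FIELD = 2**256 - 2**32 - 977  # field prime p
ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order n
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def tagged_hash(tag, msg):
    t = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(t + t + msg).digest()

def point_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % FIELD == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], -1, FIELD) % FIELD
    else:
        lam = (b[1] - a[1]) * pow((b[0] - a[0]) % FIELD, -1, FIELD) % FIELD
    x = (lam * lam - a[0] - b[0]) % FIELD
    return (x, (lam * (a[0] - x) - a[1]) % FIELD)

def point_mul(k, pt):
    acc = None
    while k:
        if k & 1:
            acc = point_add(acc, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return acc

def taproot_tweak(d, merkle_root=b""):
    """Return (tweaked private key d', output point Q) for internal private key d."""
    internal = point_mul(d, G)
    if internal[1] % 2:                       # use the even-y form of the internal key
        d = ORDER - d
        internal = (internal[0], FIELD - internal[1])
    t = int.from_bytes(
        tagged_hash("TapTweak", internal[0].to_bytes(32, "big") + merkle_root), "big")
    if t >= ORDER:
        raise ValueError("tweak out of range")
    output_point = point_add(internal, point_mul(t, G))   # Q = P + tG
    return (d + t) % ORDER, output_point

d_tweaked, Q = taproot_tweak(3)               # toy internal private key
print(point_mul(d_tweaked, G) == Q)           # d'G equals Q
print(Q[0].to_bytes(32, "big").hex())         # x-only output key for the scriptPubKey
```

The tweaked key may still need one more negation before signing if \(Q\) has odd \(y\); a BIP 340 signer handles that step internally.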
P2TR outputs use bech32m encoding (BIP 350†), an updated version of the bech32 encoding from Chapter 9. The addresses start with bc1p (the p encodes witness version 1 in bech32's character set):
bc1pxwww0ct9ue7e8tdnlmug5m2tamfn7q06sahstg39ys4c9f3340qqxrdu9k
This is our specimen's input address. The structure is:
| Component | Value |
|---|---|
| Human-readable part (HRP) | bc (mainnet) |
| Separator | 1 |
| Witness version | p (= 1 in bech32) |
| Data | 32-byte output key, encoded in base-32 |
| Checksum | 6-character BCH code |
The original bech32 (BIP 173†) had a subtle weakness: when an address ended in p, inserting or deleting q characters immediately before that final p could go undetected by the checksum. BIP 350 fixed this by changing the checksum constant from 1 to 0x2bc830a3. The fix applies only to witness version 1 and above; version 0 (P2WPKH, P2WSH) continues to use the original bech32 for backward compatibility.
The practical difference: bc1q addresses use bech32; bc1p addresses use bech32m. Wallets must detect the version and apply the correct checksum algorithm.
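Encoding an output key as a bc1p address takes only the checksum polynomial and a bit regrouping. The sketch below adapts the BIP 173 and BIP 350 reference algorithms; `encode_p2tr` and `checksum_ok` are illustrative names, and the input is the specimen's output key as printed earlier.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
BECH32M_CONST = 0x2BC830A3
GENERATORS = [0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3]

def polymod(values):
    """BCH checksum polynomial from BIP 173."""
    chk = 1
    for v in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ v
        for i, g in enumerate(GENERATORS):
            if (top >> i) & 1:
                chk ^= g
    return chk

def hrp_expand(hrp):
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]

def to_5bit(data: bytes):
    """Regroup 8-bit bytes into 5-bit values, zero-padding the last group."""
    acc, bits, out = 0, 0, []
    for byte in data:
        acc = (acc << 8) | byte
        bits += 8
        while bits >= 5:
            bits -= 5
            out.append((acc >> bits) & 31)
    if bits:
        out.append((acc << (5 - bits)) & 31)
    return out

def encode_p2tr(output_key: bytes, hrp: str = "bc") -> str:
    data = [1] + to_5bit(output_key)          # witness version 1, then the key
    pm = polymod(hrp_expand(hrp) + data + [0] * 6) ^ BECH32M_CONST
    checksum = [(pm >> 5 * (5 - i)) & 31 for i in range(6)]
    return hrp + "1" + "".join(CHARSET[v] for v in data + checksum)

def checksum_ok(address: str) -> bool:
    hrp, _, rest = address.rpartition("1")
    return polymod(hrp_expand(hrp) + [CHARSET.index(c) for c in rest]) == BECH32M_CONST

key = bytes.fromhex("339ce7e165e67d93adb3fef88a6d4beed33f01fa876f05a225242b82a631abc0")
address = encode_p2tr(key)
print(address, checksum_ok(address))
```

If the key and the address above are transcribed correctly, the printed string should match the specimen's input address.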
How does a P2TR key path spend compare to the P2WPKH transactions we studied in Chapter 9?
| Component | P2WPKH | P2TR key path |
|---|---|---|
| scriptPubKey | 22 B | 34 B |
| Witness: | | |
| Item count | 1 B (02) | 1 B (01) |
| Signature | 1 + 71 B | 1 + 64 B |
| Public key | 1 + 33 B | none |
| Total witness | 107 B | 66 B |
| 1-in-2-out transaction: | | |
| Total size | 222 B | 205 B |
| Weight | 561 WU | 616 WU |
| vsize | 141 vB | 154 vB |
P2WPKH: 71-byte typical sig, 33-byte compressed key, both outputs P2WPKH (22 B). P2TR: 64-byte sig, both outputs P2TR (34 B). P2TR outputs are 12 B larger per output.
P2TR key path spending produces larger vsize than P2WPKH for simple single-key transactions (154 vs 141 vbytes for 1-in-2-out). The witness is 41 bytes smaller (no public key, shorter signature), but each P2TR output is 12 bytes larger. Since output bytes are non-witness (weighted at \(4\times\) in the weight formula), the output overhead (\(+96\) WU for two outputs) outweighs the witness savings (\(-41\) WU), yielding a net increase of 55 WU.
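The totals in the comparison can be rebuilt from the field sizes. This is a sketch under the table note's assumptions: a 71-byte ECDSA signature (so a 107-byte P2WPKH witness per input), one input, and two outputs of the same type. The function name is hypothetical.

```python
import math

def tx_sizes(spk_len, witness_len, n_in=1, n_out=2):
    """Return (total size, weight, vsize) for a SegWit transaction."""
    base = (
        4                               # version
        + 1                             # input count
        + n_in * (32 + 4 + 1 + 4)       # prev txid, vout, empty scriptSig length, sequence
        + 1                             # output count
        + n_out * (8 + 1 + spk_len)     # value, scriptPubKey length, scriptPubKey
        + 4                             # locktime
    )
    total = base + 2 + n_in * witness_len   # add marker, flag, and witness data
    weight = base * 3 + total
    return total, weight, math.ceil(weight / 4)

p2wpkh = tx_sizes(spk_len=22, witness_len=1 + (1 + 71) + (1 + 33))
p2tr = tx_sizes(spk_len=34, witness_len=1 + (1 + 64))
print(p2wpkh, p2tr, p2tr[1] - p2wpkh[1])
```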
The real savings come with complex scripts. A 2-of-3 multisig that would require 3 public keys and 2 signatures in P2WSH can, with MuSig2 key aggregation, be reduced to a single key path spend—indistinguishable from a single-signer transaction. The savings are not in bytes per se, but in hiding complexity behind simplicity.
Taproot introduces a new sighash algorithm (distinct from both the legacy algorithm of Chapter 3 and BIP 143† of Part III). The key differences:
- The digest is computed with tagged_hash("TapSighash", ...) to prevent cross-protocol hash collisions.
- The hashed data begins with 0x00, a version byte that allows future sighash upgrades.
- There is a new default hash type, SIGHASH_DEFAULT (0x00), that is functionally equivalent to SIGHASH_ALL but encoded as the absence of a suffix byte on the signature. This is why our specimen's signature is 64 bytes, not 65.

Summary

- A P2TR scriptPubKey is OP_1 + 32-byte x-only output key.
- bech32m addresses (bc1p…) fix a checksum weakness in the original bech32 encoding.

Key path spending is the common case: fast, simple, and private. But what happens when the key path is not available? What if one party loses their key, or a timelock expires, or a hash preimage is revealed? Chapter 12 explores the script path: the hidden tree of alternative spending conditions that Taproot can encode behind a single 32-byte output key, revealed only when needed.
Exercises
H3. The specimen's witness bytes are 0140<64 bytes>. Parse the witness structure: what does the 01 mean? What does the 40 mean? Why is there no public key?

H4. Suppose a P2TR output key is a5b3c1... (32 bytes). What would the bech32m address look like? (You may use a bech32m encoder.)

Solutions
L1. 34 bytes. It contains OP_1 (1 byte, witness version 1), OP_PUSHBYTES_32 (1 byte), and the 32-byte x-only output key \(Q\).
L2. 64 bytes (or 65 with an explicit SIGHASH type). The two components are \(R_x\) (32 bytes, the \(x\)-coordinate of the nonce point \(R\)) and \(s\) (32 bytes, the scalar).
L3. Taproot defines SIGHASH_DEFAULT (hash type 0x00), which is functionally identical to SIGHASH_ALL. When the signer uses SIGHASH_DEFAULT, no suffix byte is appended—the signature is exactly 64 bytes. An explicit SIGHASH type (e.g., SIGHASH_ALL = 0x01) would add a 65th byte. SIGHASH_DEFAULT saves 1 byte per signature in the common case.
L4. Witness version 1. The opcode OP_1 (0x51) signals it.
L5. By the witness stack size. If the witness contains exactly one element (64 or 65 bytes), it is a key path spend—the element is a Schnorr signature verified against the output key. If the witness contains two or more elements, the last element is treated as the control block (which encodes the internal key and Merkle proof), the second-to-last is the script being executed, and the remaining elements are script inputs.
H1. From the raw hex (220 bytes total):
- Marker and flag: 2 bytes (00 01).
- Witness: 01 40 + 64 bytes = 66 bytes.

H2. The OP_RETURN scriptPubKey is 6a2f49206c696b65…3432. After 6a (OP_RETURN) and 2f (OP_PUSHBYTES_47), the 47-byte payload decodes to ASCII:
I like Schnorr sigs and I cannot lie. @bitbug42
H3. The witness bytes 01 40 <64 bytes>:
- 01: witness item count (1 item for this input).
- 40: length of the first (and only) witness item (64 bytes = 0x40).

There is no public key in the witness because in P2TR key path spending, the output key \(Q\) is already in the scriptPubKey. The node verifies the signature directly against \(Q\): no hash check, no key transmission needed.
H4. The P2TR scriptPubKey would be: 51 20 a5b3c1… (34 bytes: OP_1 + OP_PUSHBYTES_32 + the 32-byte key). The bech32m address would start with bc1p followed by the base-32 encoding of the 32-byte key and a 6-character checksum. Use a bech32m library (e.g., the Python reference implementation that accompanies BIP 350) to compute the exact address.
P1. The signer computes \(s = k + e \cdot d \pmod{n}\), where \(R = kG\), \(Q = dG\), and \(e = H(R_x \| Q_x \| m)\).
\[ sG = (k + ed)G = kG + e\,dG = R + eQ \]
The verifier computes \(e\) from the public data (\(R_x\), \(Q_x\), \(m\)), then checks \(sG \stackrel{?}{=} R + eQ\). Since \(sG = R + eQ\) by construction, the equation holds.
P2. For any compressed key \((\texttt{02/03}, x)\), the x-only representation is simply \(x\). The parity prefix (02 or 03) is discarded. To recover the full point from \(x\): compute \(y^2 = x^3 + 7 \pmod{p}\), take the square root, and select the even root. If the original key had odd \(y\), the signer negates their private key (\(d \to n - d\)), which maps \(Q = dG\) to \((n-d)G = -dG\), negating \(y\) and making it even.
Security is preserved because: (a) the mapping from \((d, Q)\) to \((n-d, -Q)\) is a bijection; (b) negating \(d\) does not change the difficulty of the ECDLP; (c) the signer knows which case applies and adjusts accordingly. An attacker gains no advantage from the convention.
P3. Suppose an adversary knows \(Q\) and wants to find \(P'\) and \(t'\) such that \(Q = P' + t'G\) where \(t' = H(P'_x \| c')\). This requires:
\[ P + tG = P' + t'G \quad\Longrightarrow\quad P - P' = (t' - t)G \] The adversary must find \(P'\) such that the discrete logarithm of \(P - P'\) equals \(t' - t = H(P'_x \| c') - H(P_x \| c)\). Since \(H\) is modeled as a random oracle, \(H(P'_x \| c')\) is unpredictable for any \(P' \neq P\). The adversary would need to solve a discrete logarithm to find a \(P'\) satisfying this equation. Without breaking ECDLP, finding such a collision is infeasible.
C1. MuSig2 (a two-round multi-signature protocol) works by:
This works because Schnorr signatures are linear: \(s_1 G + s_2 G = (s_1 + s_2)G\). ECDSA is not linear—the modular inverse \(s^{-1}\) in the verification equation breaks additivity. There is no known way to aggregate ECDSA signatures without interactive protocols that are more complex and less efficient.
C2. An observer seeing a P2TR output on-chain sees only the 32-byte output key \(Q\). They cannot determine whether \(Q\) encodes:
If the spender uses the key path (single 64-byte signature), no information about possible scripts is leaked. The spending transaction looks identical for all three cases.
Compare with P2SH/P2WSH: the spending transaction reveals the full redeem/witness script, exposing the number of signers, the threshold, the key order, and any timelocks. P2TR reveals nothing in the key path case. Even in the script path case (Chapter 12), only the executed leaf is revealed—unused branches remain hidden.
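B1. For a script path spend, the witness must contain (in addition to the script inputs):
- The script being executed (the revealed leaf).
- The control block, which encodes the internal key \(P\) and the Merkle proof.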
The node uses the control block to verify that the script was committed to in the output key \(Q\): it hashes the script into a leaf hash, combines it with the Merkle proof to reconstruct the Merkle root \(c\), computes \(Q' = P + H(P_x \| c) \cdot G\), and checks \(Q' = Q\).
B2. The Ordinals protocol embeds data (images, text, HTML) in Taproot script path witnesses. It constructs a TapScript that contains the data in a "do-nothing" envelope:
OP_FALSE OP_IF … OP_ENDIF
The OP_FALSE OP_IF … OP_ENDIF block is never executed (it's a dead branch), but the data inside it is committed to by the Taproot Merkle tree and stored in the witness. Since witness data is discounted (1 WU per byte vs 4 WU for non-witness), this is cheaper than OP_RETURN for large payloads. BIP 342 removed the legacy 10,000-byte script size limit and the 201 non-push opcode limit, so TapScripts can be much larger, bounded only by the 4,000,000 WU block weight. Since the inscription data is carried in push opcodes (each push still capped at 520 bytes), a single inscription can approach 4 MB under consensus rules; under the default relay policy of 400,000 WU per transaction, the practical ceiling is close to 400 KB.