Part V · Special Transactions Chapter 14

OP_RETURN: Data on the Blockchain

"The OP_RETURN change creates a provably-prunable output, to avoid data storage schemes—some of which were already deployed—that were storing arbitrary data such as images as forever-unspendable TX outputs, bloating bitcoin's UTXO database."—Bitcoin Core 0.9.0 Release Notes, March 2014

The blockchain is a ledger, not a filing cabinet. Every byte stored on-chain is replicated across tens of thousands of nodes, validated by every full node forever, and can never be deleted. Yet from Bitcoin's earliest days, people have found ways to embed arbitrary data alongside financial transactions—love notes, political statements, images, and entire protocols.

The question was never whether people would embed data on-chain. They were already doing it before Bitcoin Core 0.9.0—hiding messages in fake addresses, stuffing data into multisig scripts, and creating forever-unspendable outputs that bloated every node's UTXO set. The question was whether to provide a clean, prunable channel for data, or force it underground where it does more damage.

OP_RETURN was Bitcoin's answer: a designated opcode that makes an output provably unspendable. Nodes can safely prune OP_RETURN outputs from their UTXO set—the data is stored in the blockchain history but doesn't consume memory in the critical set of spendable coins.

Our specimen is a declaration of love, etched permanently into block 308,570 on a summer day in 2014.

14.1 The Specimen

Txid:

8bae12b5f4c088d940733dcd1455efc6a3a69cf9340e17a981286d3778615684

| Field | Value | Notes |
|---|---|---|
| Block | 308,570 | June 30, 2014 |
| Version | 1 | Legacy |
| Inputs | 1 | P2PKH (uncompressed key) |
| Outputs | 2 | OP_RETURN + P2PKH change |
| OP_RETURN data | 19 bytes | "charley loves heidi" |
| Total size | 254 bytes | Legacy format (no witness) |
| Weight | 1,016 WU | \(254 \times 4\) (all non-witness) |
| Fee | 20,000 sats | 78.7 sat/byte |

14.2 The OP_RETURN Output

The heart of this transaction is Output 0—the OP_RETURN:

OP_RETURN Output Anatomy

00 00 00 00 00 00 00 00   15   6a   13   63 68 61 72 6c 65 79 20 6c 6f 76 65 73 20 68 65 69 64 69

| Part | Bytes (hex) | Size | Meaning |
|---|---|---|---|
| Output | 00...00 | 8 bytes | Value: 0 sats |
| Varint | 15 | 1 byte | Script length: 21 bytes |
| ScriptPubKey | 6a | 1 byte | OP_RETURN |
| ScriptPubKey | 13 | 1 byte | OP_PUSHBYTES_19 |
| ScriptPubKey | 63...69 | 19 bytes | "charley loves heidi" (ASCII) |
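The byte layout above can be checked mechanically. The following is a minimal sketch, assuming the output is serialized exactly as shown and that the script length fits in a single-byte varint; `parse_txout` and `SPECIMEN_OUTPUT_0` are illustrative names, not part of any library.

```python
def parse_txout(raw_hex: str) -> dict:
    """Split one serialized output into value, script length, and script."""
    raw = bytes.fromhex(raw_hex)
    value = int.from_bytes(raw[0:8], "little")  # 8-byte little-endian amount
    script_len = raw[8]                         # single-byte varint (only valid below 0xfd)
    if script_len >= 0xfd:
        raise ValueError("multi-byte varints are not handled in this sketch")
    script = raw[9:9 + script_len]
    if len(script) != script_len or len(raw) != 9 + script_len:
        raise ValueError("script length does not match the byte count")
    return {"value": value, "script_len": script_len, "script": script}


SPECIMEN_OUTPUT_0 = (
    "0000000000000000"                        # value: 0 sats
    "15"                                      # script length: 21 bytes
    "6a"                                      # OP_RETURN
    "13"                                      # push 19 bytes
    "636861726c6579206c6f766573206865696469"  # payload
)

out = parse_txout(SPECIMEN_OUTPUT_0)
opcode, push_len, payload = out["script"][0], out["script"][1], out["script"][2:]
print(out["value"], out["script_len"], hex(opcode), push_len, payload.decode("ascii"))
# prints: 0 21 0x6a 19 charley loves heidi
```

The length check in `parse_txout` is the same consistency test a reader would do by hand: 1 opcode byte, 1 push byte, and 19 data bytes must add up to the declared 21.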

14.2.1 The Zero Value

An OP_RETURN output conventionally carries zero satoshis. Any value assigned to it would be permanently destroyed, because the output can never be spent. Bitcoin consensus allows non-zero OP_RETURN values (they're simply burned), so wallets and protocols set the value to zero to avoid wasting bitcoin.

14.2.2 The Opcode

OP_RETURN (0x6a) immediately terminates script execution with failure. When the Script interpreter encounters this opcode, it does not evaluate any further bytes—the script fails unconditionally. This makes the output provably unspendable: no combination of witness data, scriptSig, or future opcodes can ever satisfy a script that begins with OP_RETURN.

Provably Unspendable = Prunable

Because no valid spending transaction can ever reference an OP_RETURN output, full nodes can safely prune these outputs from their UTXO set. The transaction remains in the blockchain history (and is replicated by all archival nodes), but the output never occupies space in the critical in-memory set of spendable coins. This is the key insight: OP_RETURN is cheaper for the network than alternatives like fake addresses, because fake addresses create UTXO entries that can never be cleaned up.
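The pruning rule can be sketched in a few lines. The function names and the dictionary-based UTXO set are illustrative assumptions, not Bitcoin Core's data structures; the 10,000-byte threshold reflects the consensus limit on script size, which makes oversized scripts provably unspendable as well.

```python
OP_RETURN = 0x6a
MAX_SCRIPT_SIZE = 10_000  # scripts longer than this can never execute successfully


def is_unspendable(script_pubkey: bytes) -> bool:
    """True when the script is provably unspendable without any further context."""
    starts_with_op_return = len(script_pubkey) > 0 and script_pubkey[0] == OP_RETURN
    return starts_with_op_return or len(script_pubkey) > MAX_SCRIPT_SIZE


def add_outputs(utxo_set: dict, txid: str, outputs: list) -> None:
    """Add a transaction's outputs to a toy UTXO set, skipping unspendable ones."""
    for vout, (value, script_pubkey) in enumerate(outputs):
        if is_unspendable(script_pubkey):
            continue  # never enters the set, so there is nothing to clean up later
        utxo_set[(txid, vout)] = (value, script_pubkey)


op_return_script = bytes.fromhex("6a13636861726c6579206c6f766573206865696469")
# A P2PKH script whose 20-byte "hash" is really data: it looks spendable to every node.
fake_p2pkh_script = bytes([0x76, 0xa9, 0x14]) + b"this is not a hash!!" + bytes([0x88, 0xac])

utxo = {}
add_outputs(utxo, "aa" * 32, [(0, op_return_script), (546, fake_p2pkh_script)])
print(len(utxo))  # prints: 1
```

Only the fake-address output survives in the set. The node has no way to tell that its hash has no preimage, which is exactly the cost the OP_RETURN design avoids.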

14.2.3 The Data Payload

After OP_RETURN, the remaining bytes are arbitrary data. The Script interpreter never evaluates them, because it has already failed at the OP_RETURN. The push opcode (0x13 = push 19 bytes) is not a consensus requirement: as far as validation is concerned, the bytes after OP_RETURN are opaque. Standard relay policy, however, expects them to parse as data pushes, so well-formed outputs include the push opcode.

Our specimen's 19 bytes decode to ASCII:

636861726c6579206c6f766573206865696469
charley loves heidi

A love note, immortalized across every Bitcoin full node on Earth.

14.3 The Data Limit

OP_RETURN's data capacity has evolved through two stages:

| Version | Date | Limit (scriptPubKey) |
|---|---|---|
| 0.9.0 | March 2014 | 40 bytes (first standardization) |
| 0.11.0+ | July 2015 | 83 bytes (\(\approx\)80 bytes of data) |

The 83-byte limit is a relay policy, not a consensus rule. Miners can include OP_RETURN outputs of any size in their blocks, but Bitcoin Core nodes running default policy will not relay a transaction whose OP_RETURN scriptPubKey exceeds 83 bytes. The limit counts the whole scriptPubKey: the opcode, the push bytes, and the data.
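To see where the 83 bytes go, here is a minimal sketch that builds an OP_RETURN scriptPubKey and checks it against the relay limit. `build_op_return` and `is_standard_size` are hypothetical helper names; the sketch handles a single push of 1 to 255 bytes and ignores every other standardness rule.

```python
OP_RETURN = 0x6a
OP_PUSHDATA1 = 0x4c
MAX_OP_RETURN_RELAY = 83  # counts the whole scriptPubKey, as described above


def build_op_return(data: bytes) -> bytes:
    """Return OP_RETURN followed by a single push of `data`."""
    if not data:
        raise ValueError("this sketch expects at least one byte of data")
    if len(data) <= 75:
        push = bytes([len(data)])                # direct push: the opcode is the length
    elif len(data) <= 255:
        push = bytes([OP_PUSHDATA1, len(data)])  # OP_PUSHDATA1 plus a 1-byte length
    else:
        raise ValueError("payload too large for this sketch")
    return bytes([OP_RETURN]) + push + data


def is_standard_size(script_pubkey: bytes) -> bool:
    """Relay-policy size check only; says nothing about consensus validity."""
    return len(script_pubkey) <= MAX_OP_RETURN_RELAY


print(build_op_return(b"charley loves heidi").hex())
# prints: 6a13636861726c6579206c6f766573206865696469
print(len(build_op_return(bytes(80))), is_standard_size(build_op_return(bytes(80))))
# prints: 83 True
print(len(build_op_return(bytes(81))), is_standard_size(build_op_return(bytes(81))))
# prints: 84 False
```

The 80-byte payload needs OP_PUSHDATA1, so it spends three bytes on overhead (opcode, push opcode, length) and lands exactly on the limit.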

Policy \(\neq\) Consensus

Many features described in this book are relay policies—rules that nodes enforce when deciding which transactions to forward to peers—rather than consensus rules. OP_RETURN's 83-byte limit, the "dust" threshold, and opt-in RBF signaling are all relay policies, not protocol law. The distinction matters: a miner can include a 1,000-byte OP_RETURN in a block, and every node will accept it. They just won't relay it.

14.4 Before OP_RETURN: Creative Abuses

OP_RETURN solved a problem that was already entrenched. Before 2014, data-embedding schemes used two techniques that were far worse for the network:

14.4.1 Fake Addresses

The simplest trick: encode 20 bytes of arbitrary data as a fake Hash160 in a P2PKH output: OP_DUP OP_HASH160 <20 bytes of data> OP_EQUALVERIFY OP_CHECKSIG

This creates a valid-looking P2PKH output that no one can ever spend (no key hashes to that value). But unlike OP_RETURN, nodes cannot prune it—the output appears spendable, so it sits in the UTXO set forever, consuming memory on every full node.
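A short sketch of the fake-address trick, for illustration only. The helper name and the zero-padding of the last chunk are assumptions; real schemes used their own framing.

```python
def fake_p2pkh_scripts(data: bytes) -> list:
    """Split data into 20-byte chunks and wrap each one as a P2PKH scriptPubKey."""
    scripts = []
    for i in range(0, len(data), 20):
        chunk = data[i:i + 20].ljust(20, b"\x00")  # zero-pad the final chunk (assumption)
        scripts.append(
            bytes([0x76, 0xa9, 0x14])              # OP_DUP OP_HASH160 <push 20 bytes>
            + chunk
            + bytes([0x88, 0xac])                  # OP_EQUALVERIFY OP_CHECKSIG
        )
    return scripts


scripts = fake_p2pkh_scripts(b"x" * 50)
print(len(scripts), [len(s) for s in scripts])  # prints: 3 [25, 25, 25]
```

Fifty bytes of data cost three outputs here, each 34 bytes on-chain (8-byte value, 1-byte length, 25-byte script), and each one becomes a permanent UTXO entry.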

14.4.2 Multisig Data Embedding

More sophisticated: encode data in the public key slots of a bare multisig output. A 1-of-3 multisig can carry \(3 \times 33 = 99\) bytes of data disguised as compressed public keys: OP_1 <33B data> <33B data> <33B data> OP_3 OP_CHECKMULTISIG

Again, the output is unspendable (no corresponding private keys), but it pollutes the UTXO set. This was the technique used by early Counterparty transactions.
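The multisig variant can be sketched the same way. This is an illustration under simplifying assumptions: `multisig_data_script` is a hypothetical helper, the payload is zero-padded to 99 bytes, and the sketch ignores the fact that a genuine compressed key begins with 02 or 03.

```python
OP_1, OP_3, OP_CHECKMULTISIG = 0x51, 0x53, 0xae


def multisig_data_script(data: bytes) -> bytes:
    """Pack up to 99 bytes of data into the three key slots of a 1-of-3 multisig."""
    if len(data) > 99:
        raise ValueError("a 1-of-3 script holds at most 3 x 33 = 99 bytes")
    padded = data.ljust(99, b"\x00")
    script = bytes([OP_1])
    for i in range(0, 99, 33):
        script += bytes([33]) + padded[i:i + 33]  # 0x21 pushes one 33-byte "key"
    return script + bytes([OP_3, OP_CHECKMULTISIG])


script = multisig_data_script(b"data disguised as public keys")
print(len(script))  # prints: 105
```

The 105-byte script carries 99 bytes of data, a better ratio than fake addresses, but the output still sits in the UTXO set.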

The Blockchain as Public Record

The impulse to inscribe permanent records on Bitcoin predates OP_RETURN by years. The Bitcoin blockchain contains the WikiLeaks Cablegate archive (embedded in 2011 using transaction metadata), tributes to the deceased, marriage proposals, and the complete text of Satoshi's whitepaper (reassembled from multi-signature outputs). OP_RETURN did not enable this impulse—it merely channeled it into a form less harmful to the network.

14.5 Protocols Built on OP_RETURN

OP_RETURN's 80 bytes of structured data became the foundation for several major protocols:

14.5.1 Omni Layer (USDT)

The Omni Layer (originally Mastercoin, launched 2013) uses OP_RETURN to embed token operations—issuance, transfer, and exchange offers. An Omni OP_RETURN begins with the 4-byte marker 6f6d6e69 ("omni" in ASCII), followed by a structured payload encoding the token ID, amount, and operation type.

For years, USDT (Tether) on Omni was the most common OP_RETURN protocol on Bitcoin—driving significant transaction volume and fee revenue. Each USDT transfer appeared on-chain as a regular Bitcoin transaction with an OP_RETURN carrying the Omni payload, plus a small "dust" output to the recipient's address.
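As a rough illustration of what such a payload looks like, the sketch below assembles a simple-send message. It assumes the field layout is a 16-bit version, a 16-bit transaction type (0 for simple send), a 32-bit property ID, and a 64-bit amount, all big-endian; it also assumes property ID 31 for USDT and eight decimal places. Treat all of these as assumptions to verify against the Omni specification; `omni_simple_send_payload` is a hypothetical helper.

```python
import struct

OMNI_MARKER = bytes.fromhex("6f6d6e69")  # "omni" in ASCII


def omni_simple_send_payload(property_id: int, amount: int) -> bytes:
    """Marker followed by version, type, property ID, and amount (assumed layout)."""
    version, tx_type = 0, 0  # assumed: type 0 means "simple send"
    return OMNI_MARKER + struct.pack(">HHIQ", version, tx_type, property_id, amount)


USDT_PROPERTY_ID = 31          # assumption, used for illustration
amount = 100 * 10**8           # 100 tokens, assuming eight decimal places
payload = omni_simple_send_payload(USDT_PROPERTY_ID, amount)
script_pubkey = bytes([0x6a, len(payload)]) + payload  # OP_RETURN + direct push

print(payload.hex())
# prints: 6f6d6e69000000000000001f00000002540be400
print(len(payload), len(script_pubkey))  # prints: 20 22
```

At 20 bytes the payload fits comfortably inside the relay limit, which is why a token transfer needs nothing more than one small OP_RETURN output.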

14.5.2 OpenTimestamps

OpenTimestamps aggregates thousands of timestamp requests into a single Merkle tree, then commits the root hash to the blockchain via one OP_RETURN output. A single 32-byte Merkle root can commit an unlimited number of documents to a specific point in time.

To verify a timestamp, the service provides a Merkle proof: the sequence of sibling hashes from the document's leaf to the on-chain root. For a tree with \(n\) leaves, anyone can independently verify the proof in \(O(\log n)\) hash operations, an elegant use of OP_RETURN's limited space.
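The aggregation idea can be sketched with a plain SHA-256 binary Merkle tree. This is not the OpenTimestamps proof format; the function names, the single-SHA-256 node hash, and the duplicate-last-node rule for odd levels are assumptions chosen to keep the example short.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves: list, index: int):
    """Return (root, proof); proof is a list of (sibling_hash, sibling_is_left) pairs."""
    level, proof = list(leaves), []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        sibling = index ^ 1          # the neighbour that shares this node's parent
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(leaf: bytes, proof: list, root: bytes) -> bool:
    """Recompute the path from leaf to root using only the sibling hashes."""
    node = leaf
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


leaves = [h(f"document {i}".encode()) for i in range(1000)]
root, proof = merkle_root_and_proof(leaves, 123)
print(len(root), len(proof), verify(leaves[123], proof, root))
# prints: 32 10 True
```

One thousand documents collapse into a single 32-byte root, and each proof needs only 10 sibling hashes, matching \(\lceil \log_2 1000 \rceil = 10\).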

14.5.3 Counterparty

Counterparty (launched 2014) encodes token issuances, transfers, and decentralized exchange orders in OP_RETURN data, identified by the prefix 434e545250525459 ("CNTRPRTY" in ASCII). The result is an asset layer on Bitcoin that requires no consensus changes.

14.5.4 Witness Commitment

As we saw in Chapter 13, every modern coinbase transaction contains a SegWit witness commitment in an OP_RETURN output with the magic prefix aa21a9ed. This is perhaps the most critical use of OP_RETURN in Bitcoin: it binds witness data to the block header, ensuring that stripping witness data invalidates the commitment.
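The commitment output is simple to assemble once the witness Merkle root is known. The sketch below assumes the construction described in Chapter 13: the double-SHA-256 of the witness root concatenated with a 32-byte reserved value, placed after the aa21a9ed prefix. `witness_commitment_script` is a hypothetical helper, and the all-zero inputs are placeholders rather than data from a real block.

```python
import hashlib

WITNESS_COMMITMENT_PREFIX = bytes.fromhex("aa21a9ed")


def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def witness_commitment_script(witness_root: bytes, reserved: bytes = bytes(32)) -> bytes:
    """OP_RETURN, a 36-byte push, the prefix, then the 32-byte commitment hash."""
    commitment = sha256d(witness_root + reserved)
    return bytes([0x6a, 0x24]) + WITNESS_COMMITMENT_PREFIX + commitment


script = witness_commitment_script(bytes(32))  # placeholder all-zero witness root
print(len(script), script[:6].hex())  # prints: 38 6a24aa21a9ed
```

Changing any witness in the block changes the witness root, which changes the commitment hash and therefore the coinbase transaction and the block's Merkle root.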

14.6 OP_RETURN vs. Witness Embedding

OP_RETURN has an 83-byte limit and costs 4 WU per byte (non-witness data). The Taproot upgrade (Part IV) opened a different channel for data: the witness field, where data costs only 1 WU per byte. This witness discount is the foundation of the Ordinals inscription protocol (Chapter 18):

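| | OP_RETURN | Witness embedding |
|---|---|---|
| Maximum data | 80 bytes | About 400 KB per standard transaction (400,000 WU relay limit); close to 4 MB per block (block weight) |
| Cost per byte | 4 WU | 1 WU |
| UTXO impact | None (prunable) | None (witness only) |
| Use case | Protocol tags, hashes | Images, media, code |

Ordinals inscriptions use an OP_FALSE OP_IF … OP_ENDIF envelope in the witness, storing content types and data as push operations inside a dead code branch. Because witness data is discounted 4:1, an inscription pays one-quarter the weight of equivalent non-witness storage. A transaction within Bitcoin Core's 400,000 WU standard weight limit can carry close to 400 KB of witness data, and a transaction mined directly can approach the 4,000,000 WU block limit, roughly 4 MB. Chapter 18 examines this mechanism in detail.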
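The weight arithmetic behind the discount is easy to reproduce. This sketch counts payload bytes only and ignores push opcodes, the envelope, and the rest of the transaction; `data_weight` is an illustrative helper name.

```python
WITNESS_SCALE_FACTOR = 4
MAX_BLOCK_WEIGHT = 4_000_000


def data_weight(n_bytes: int, in_witness: bool) -> int:
    """Weight units consumed by n_bytes of payload, ignoring all overhead."""
    return n_bytes * (1 if in_witness else WITNESS_SCALE_FACTOR)


for size in (80, 10_000, 100_000):
    non_witness = data_weight(size, in_witness=False)
    witness = data_weight(size, in_witness=True)
    share = witness / MAX_BLOCK_WEIGHT
    print(f"{size:>7} bytes: {non_witness:>7} WU non-witness, {witness:>7} WU witness "
          f"({share:.2%} of a block)")
```

For 100,000 bytes the loop reports 400,000 WU as non-witness data against 100,000 WU in the witness, which is 2.5% of a block.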

14.7 Weight Analysis

Our specimen is a legacy transaction (no SegWit marker/flag, no witness). All bytes are non-witness data:

| Component | Bytes | Notes |
|---|---|---|
| Version | 4 B | |
| Input count | 1 B | |
| Input (txid + vout + scriptSig + seq) | 180 B | 32+4+1+139+4 |
| Output count | 1 B | |
| Output 0 (OP_RETURN) | 30 B | 8+1+21 |
| Output 1 (P2PKH change) | 34 B | 8+1+25 |
| Locktime | 4 B | |
| Total | 254 B | All non-witness |
| Weight | 1,016 WU | \(254 \times 4\) |
| vsize | 254 vB | \(= 1,016 / 4\) |

The OP_RETURN output contributes 30 bytes (120 WU) to the transaction. Of those, 21 bytes are the scriptPubKey (OP_RETURN + push opcode + 19 bytes of ASCII data), and the rest is the 8-byte zero value and 1-byte script length.
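The table's arithmetic can be replayed in a few lines. The component sizes are copied from the table; the dictionary keys and variable names are illustrative.

```python
components = {
    "version": 4,
    "input count": 1,
    "input": 32 + 4 + 1 + 139 + 4,    # txid, vout, scriptSig length, scriptSig, sequence
    "output count": 1,
    "output 0 (OP_RETURN)": 8 + 1 + 21,
    "output 1 (P2PKH change)": 8 + 1 + 25,
    "locktime": 4,
}

size = sum(components.values())
weight = size * 4                      # legacy transaction: every byte counts 4 WU
vsize = (weight + 3) // 4              # weight divided by 4, rounded up
fee = 20_000                           # sats

print(size, weight, vsize)                                 # prints: 254 1016 254
print(round(fee / size, 1), round(fee / weight, 1))        # prints: 78.7 19.7
print(f"{components['output 0 (OP_RETURN)'] / size:.1%}")  # prints: 11.8%
```

The last line shows the OP_RETURN output's share of the transaction: just under 12% of the bytes.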

The Uncompressed Key Tax

Our specimen uses a 65-byte uncompressed public key, which was still common in mid-2014. A compressed key (33 bytes) would save 32 bytes in the scriptSig, reducing the total from 254 to 222 bytes. The OP_RETURN output itself (30 bytes) is only 12% of the transaction—the uncompressed key is the real space consumer. Modern wallets exclusively use compressed keys (or Taproot's 32-byte x-only keys), making this inefficiency a relic of early Bitcoin.

14.8 What We Learned

14.8.1 Looking Ahead

OP_RETURN and data embedding are atemporal—they exist across every era of Bitcoin. Chapter 15 introduces a different dimension of transaction anatomy: time. Timelocks make transactions programmable in time, enabling everything from delayed payments to Lightning Network channels.

Exercises

Litmus (L)

  1. What does OP_RETURN do when the Script interpreter encounters it?
  2. Why does an OP_RETURN output always carry zero satoshis?
  3. What is the maximum OP_RETURN data size in bytes? Is this a consensus rule or a relay policy?
  4. Why is OP_RETURN better for the network than embedding data in fake P2PKH addresses?
  5. Decode the ASCII message from the hex bytes: 63 68 61 72 6c 65 79 20 6c 6f 76 65 73 20 68 65 69 64 69.

Hands-On (H)

  1. Parse our specimen's OP_RETURN output byte by byte. Identify the value, script length, opcode, push length, and data. Verify the script length matches the actual byte count.
  2. Compute the fee rate for our specimen. Given 254 bytes, weight 1,016 WU, and fee 20,000 sats, express the fee in both sat/byte and sat/WU.
  3. Construct a raw OP_RETURN scriptPubKey for the message "Hello, Bitcoin!" (15 ASCII characters). Write the complete hex including the opcode and push byte.
  4. Our specimen uses an uncompressed public key (65 bytes). If a compressed key (33 bytes) were used instead, what would be the new scriptSig length, total transaction size, and weight?

Proofs and Reasoning (P)

  1. Explain why embedding data in fake P2PKH addresses is "more expensive" for the network than OP_RETURN, even though the on-chain size may be similar. Consider UTXO set growth, node memory, and initial block download.
  2. The OP_RETURN relay limit is 83 bytes for the entire scriptPubKey. What is the maximum number of ASCII characters that can fit in a single OP_RETURN output? Account for the OP_RETURN byte and the push opcode(s).
  3. Could a miner include an OP_RETURN output with 1,000 bytes of data in a valid block? Would other nodes accept it? Would other nodes relay transactions with such outputs? Explain the distinction.

Connections (C)

  1. Omni Layer. Explain how USDT transfers work on the Omni protocol. What goes in the OP_RETURN? Why does the transaction also need a small "dust" output to the recipient?
  2. OpenTimestamps. How does OpenTimestamps use a single OP_RETURN carrying a 32-byte Merkle root to timestamp thousands of documents? What cryptographic structure makes this possible?

Bridge (B)

  1. Chapter 18 covers Ordinals inscriptions, which embed data in the witness rather than in OP_RETURN. Why does the witness discount make this economically viable for large data (images, media)? What is the approximate maximum inscription size?
  2. The Runes protocol (Chapter 19) uses OP_RETURN with a specific tag (OP_PUSHNUM_13). Why would a new protocol choose OP_RETURN over witness embedding? What advantages does OP_RETURN offer for structured, small-payload protocols?

Solutions

L1. OP_RETURN immediately terminates script execution with failure. The script is unconditionally invalid, making the output provably unspendable. No combination of input data can ever satisfy a script beginning with OP_RETURN.

L2. Any satoshis sent to an OP_RETURN output are permanently destroyed: the output can never be spent, so the value is irrecoverable. Consensus allows non-zero values (they're simply burned), so setting the value to zero is what avoids wasting bitcoin.

L3. The maximum scriptPubKey size for a standard OP_RETURN is 83 bytes (relay policy in Bitcoin Core, set by MAX_OP_RETURN_RELAY). This is not a consensus rule—miners can include larger OP_RETURN outputs in blocks, and all nodes will accept those blocks as valid. However, transactions with OP_RETURN scriptPubKeys exceeding 83 bytes will not be relayed by standard nodes.

L4. OP_RETURN outputs are provably unspendable, so nodes can prune them from the UTXO set. Fake P2PKH addresses create outputs that appear spendable (the node cannot know the Hash160 is fake), so they remain in the UTXO set forever, consuming memory on every full node. OP_RETURN explicitly marks data outputs as non-financial, allowing the network to handle them efficiently.

L5. Reading each hex byte as ASCII: 63=c, 68=h, 61=a, 72=r, 6c=l, 65=e, 79=y, 20=(space), 6c=l, 6f=o, 76=v, 65=e, 73=s, 20=(space), 68=h, 65=e, 69=i, 64=d, 69=i. The message is: "charley loves heidi".

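H1. Output 0:

| Field | Bytes (hex) | Size | Meaning |
|---|---|---|---|
| Value | 00 00 00 00 00 00 00 00 | 8 bytes | 0 sats |
| Script length | 15 | 1 byte | 21 bytes |
| Opcode | 6a | 1 byte | OP_RETURN |
| Push length | 13 | 1 byte | OP_PUSHBYTES_19 |
| Data | 63 68 61 72 6c 65 79 20 6c 6f 76 65 73 20 68 65 69 64 69 | 19 bytes | "charley loves heidi" |

Script length check: \(1 + 1 + 19 = 21\) bytes = 0x15.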

H2. Fee rate: \(20,000 / 254 = 78.7\) sat/byte. In weight units: \(20,000 / 1,016 = 19.7\) sat/WU. Since this is a legacy transaction (no witness), the sat/byte and sat/vB rates are identical, and the sat/WU rate is exactly one-quarter of the sat/byte rate.

H3. "Hello, Bitcoin!" = 15 ASCII characters. Hex encoding:

48 65 6c 6c 6f 2c 20 42 69 74 63 6f 69 6e 21

The complete scriptPubKey: 6a (OP_RETURN) + 0f (push 15 bytes) + the 15 data bytes:

6a 0f 48 65 6c 6c 6f 2c 20 42 69 74 63 6f 69 6e 21

Total: 17 bytes.

H4. With a compressed key: scriptSig = \(1 + 72 + 1 + 33 = 107\) bytes (vs 139 with uncompressed). Savings: 32 bytes. New total: \(254 - 32 = 222\) bytes. New weight: \(222 \times 4 = 888\) WU (vs 1,016). New vsize: 222 vB.

P1. On-chain, both approaches use a similar number of bytes. But the critical difference is in the UTXO set—the in-memory database of all spendable outputs. An OP_RETURN output is provably unspendable, so nodes prune it immediately. A fake P2PKH output is indistinguishable from a real one, so it must be stored in the UTXO set forever (or until a spending transaction proves it's been claimed—which, for a fake address, is never).

Validation is fastest when the UTXO set fits in RAM. As of 2024, it contains roughly 170 million entries totaling about 7 GB. Every fake address adds one entry that can never be removed. Over time, this accumulates into a permanent tax on every full node. During IBD (initial block download), these fake UTXOs are created, indexed, and stored just like real ones, wasting disk I/O and memory that OP_RETURN avoids entirely.

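P2. The 83-byte limit applies to the entire scriptPubKey: OP_RETURN (1 byte) + push opcode(s) + data. There are two cases:

  1. Direct push (opcodes 0x01 to 0x4b): one push byte, but at most 75 bytes of data, giving a scriptPubKey of \(1 + 1 + 75 = 77\) bytes.
  2. OP_PUSHDATA1 (0x4c) followed by a 1-byte length: two push bytes, leaving \(83 - 1 - 2 = 80\) bytes of data.

The maximum is therefore 80 ASCII characters, achieved with OP_PUSHDATA1.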

P3. Yes, a miner can include a 1,000-byte OP_RETURN in a valid block. The OP_RETURN data limit is a relay policy, not a consensus rule. Every node will accept the block as valid because no consensus rule restricts OP_RETURN size (beyond the overall block weight limit of 4,000,000 WU).

However, standard nodes will not relay transactions with OP_RETURN scriptPubKeys exceeding 83 bytes. The miner would have to create the transaction themselves and include it directly in their block template. This is why the relay policy is often called "soft" enforcement—it controls propagation, not validity.

C1. An Omni/USDT transfer is a standard Bitcoin transaction with three components: (1) an input spending the sender's BTC, (2) an OP_RETURN output containing the Omni payload (6f6d6e69 prefix + token ID + amount + operation), and (3) a small "dust" output (546 sats) to the recipient's Bitcoin address.

The dust output is necessary because Omni uses Bitcoin addresses to identify token holders. The OP_RETURN encodes the transfer instruction ("send 100 USDT from address A to address B"), but Omni nodes need to know which Bitcoin output corresponds to the recipient. The convention: the first non-OP_RETURN output is the recipient. The dust amount is the minimum value Bitcoin Core will relay (below the dust threshold, nodes reject the transaction as spam).

C2. OpenTimestamps uses a Merkle tree. Each document submitted for timestamping is hashed, and all hashes become leaves in a binary Merkle tree. The single root hash (32 bytes) is embedded in an OP_RETURN output. With a 32-byte root, a tree of depth \(d\) can timestamp \(2^d\) documents. A tree of depth 20 timestamps over 1 million documents in one OP_RETURN.

To prove a document was timestamped, the service provides a Merkle proof: the sequence of sibling hashes from the document's leaf to the root. Anyone can independently verify the proof against the on-chain root in \(O(\log n)\) hash operations, where \(n\) is the number of leaves. This is the same structure Bitcoin uses for transaction Merkle trees in block headers.

B1. Witness data costs 1 WU per byte, while non-witness data (including OP_RETURN) costs 4 WU per byte. For a 100 KB image: via OP_RETURN, it would cost \(100,000 \times 4 = 400,000\) WU (10% of block weight); via witness, only \(100,000 \times 1 = 100,000\) WU (2.5% of block weight). The 4:1 cost advantage makes witness embedding economically viable for large data.

The consensus bound on inscription size is the block weight limit: \(4,000,000\) WU minus the block header, coinbase, and transaction overhead. Since witness data costs 1 WU/byte, the theoretical maximum approaches 4 MB. Transactions relayed under Bitcoin Core's default policy are capped at 400,000 WU, so an inscription that propagates through the normal network tops out at just under 400 KB; anything larger must be handed to a miner directly.

B2. OP_RETURN is superior for structured, small-payload protocols like Runes because: (1) the data is in the non-witness portion, making it visible in the stripped transaction (no witness needed to read it); (2) the fixed 80-byte limit is sufficient for protocol headers, token IDs, and amounts; (3) OP_RETURN is a well-established standard supported by all node software; (4) it doesn't require Taproot or SegWit—any transaction version can include it. Witness embedding is better for large opaque data (images, media), but for small structured protocol messages, OP_RETURN is simpler, more universal, and more explicit.
