Why Selective Pruning Is Not Possible

1 How Bitcoin Core Stores Blocks

Bitcoin Core writes blocks sequentially into numbered flat files: blk00000.dat, blk00001.dat, etc. Each file holds up to 128 MiB of block data. Blocks are packed contiguously — there are no gaps, no padding, no per-transaction index within the file.

src/node/blockstorage.h: Constants
static const unsigned int MAX_BLOCKFILE_SIZE = 0x8000000;  // 128 MiB

// Each block is preceded by 8 bytes: 4-byte network magic + 4-byte size
static constexpr uint32_t STORAGE_HEADER_BYTES{
    std::tuple_size_v<MessageStartChars> + sizeof(unsigned int)
};  // = 8 bytes

Each block in the file is preceded by an 8-byte header: 4 bytes of network magic (f9beb4d9 for mainnet) and 4 bytes for the block's serialized size. The block data follows immediately — header, transactions, and (post-SegWit) witness data, all serialized contiguously.
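
The framing just described can be sketched in a few lines of C++. This is an illustration under stated assumptions, not Bitcoin Core's code: `AppendRecord`, `ScanRecords`, and `Record` are hypothetical names, the "file" is an in-memory byte vector, and the size field is written little-endian, which is how Core serializes fixed-width integers.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Mainnet network magic, as quoted in the text.
constexpr std::array<uint8_t, 4> kMagic{0xf9, 0xbe, 0xb4, 0xd9};
constexpr size_t kStorageHeaderBytes = 8;  // 4-byte magic + 4-byte size

// Hypothetical helper type: where one block's data sits in the file.
struct Record {
    size_t data_pos;  // offset of the first block byte, just past the 8-byte header
    uint32_t size;    // serialized block size taken from the header
};

// Append one block: magic, little-endian size, then the block bytes, with no padding.
// Returns the offset at which the block data itself begins.
size_t AppendRecord(Bytes& file, const Bytes& block)
{
    assert(block.size() <= UINT32_MAX);  // the size field is only 4 bytes wide
    file.insert(file.end(), kMagic.begin(), kMagic.end());
    const auto size = static_cast<uint32_t>(block.size());
    for (int i = 0; i < 4; ++i) file.push_back(static_cast<uint8_t>(size >> (8 * i)));
    const size_t data_pos = file.size();
    file.insert(file.end(), block.begin(), block.end());
    return data_pos;
}

// Walk the file front to back. Only the headers mark record boundaries, so
// block N is found by hopping over the size of every record before it.
std::vector<Record> ScanRecords(const Bytes& file)
{
    std::vector<Record> out;
    size_t pos = 0;
    while (pos < file.size()) {
        if (file.size() - pos < kStorageHeaderBytes) throw std::runtime_error("truncated header");
        for (size_t i = 0; i < 4; ++i) {
            if (file[pos + i] != kMagic[i]) throw std::runtime_error("bad magic");
        }
        uint32_t size = 0;
        for (size_t i = 0; i < 4; ++i) size |= uint32_t{file[pos + 4 + i]} << (8 * i);
        pos += kStorageHeaderBytes;
        if (file.size() - pos < size) throw std::runtime_error("truncated block");
        out.push_back({pos, size});
        pos += size;
    }
    return out;
}
```

Appending a 100-byte block and then a 50-byte block to an empty buffer places their data at offsets 8 and 116. The second offset depends entirely on the first block's size, which is the property the rest of this section turns on.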

[Figure: blk00428.dat (128 MiB) drawn as three back-to-back block records. Legend: magic + size (8 B), block header (80 B), non-witness tx data, witness (signatures), witness (inscription, 3.9 MB in the middle record).]

2 The Write Path: How a Block Gets to Disk

When a new block is validated, Bitcoin Core calls WriteBlock(). This function finds the next available position in the current blk*.dat file, writes the 8-byte header, then serializes the entire block — including all witness data — in one contiguous write.

src/node/blockstorage.cpp: BlockManager::WriteBlock()
FlatFilePos BlockManager::WriteBlock(const CBlock& block, int nHeight)
{
    const unsigned int block_size{
        static_cast<unsigned int>(GetSerializeSize(TX_WITH_WITNESS(block)))
    }; // entire block size, including all witness data

    // Find next position: appends to current blk*.dat file
    FlatFilePos pos{FindNextBlockPos(block_size + STORAGE_HEADER_BYTES,
                                     nHeight, block.GetBlockTime())}; // position = byte offset in file

    AutoFile file{OpenBlockFile(pos, /*fReadOnly=*/false)};

    {
        BufferedWriter fileout{file};
        fileout << GetParams().MessageStart() << block_size; // 8-byte header
        pos.nPos += STORAGE_HEADER_BYTES;
        fileout << TX_WITH_WITNESS(block); // block + witness serialized as one blob
    }

    return pos; // {file_number, byte_offset} — stored in LevelDB index
}

The critical line is TX_WITH_WITNESS(block). In Bitcoin's serialization format, witness data is interleaved with non-witness data: each transaction's witness stacks sit inside that transaction, after its outputs and before its locktime. Witness data is not stored as a separate appendix at the end of the block that could be cleanly cut off.

3 The Index: How Core Finds a Block

After writing, Core stores the block's position in a LevelDB database (blocks/index/). Each entry maps a block hash to a FlatFilePos — a file number and byte offset:

Block index entry: what LevelDB stores per block
// Each block's disk location is a file number plus byte offsets:
int      nFile;     // Which blk*.dat file (e.g., 428)
uint32_t nDataPos;  // Byte offset within that file (e.g., 83,291,648)
uint32_t nUndoPos;  // Byte offset of undo data in rev*.dat

There is no per-transaction index. Core knows where block 785,002 starts in the file, but not where its individual transactions or witness fields are. To read a single transaction, Core reads the entire block from its start position and deserializes it in memory.
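
A lookup through such an index might be sketched as follows. The names `BlockPos`, `BlockIndex`, and `ReadBlock` are hypothetical, a `std::map` stands in for LevelDB, and each file is an in-memory byte vector. The sketch assumes the stored offset points at the block data just past the 8-byte storage header, which is what `pos.nPos += STORAGE_HEADER_BYTES` in the WriteBlock() excerpt suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Hypothetical stand-in for the per-block position kept in the index.
struct BlockPos {
    int file;           // which blk*.dat file
    uint32_t data_pos;  // byte offset of the block data within that file
};

using BlockIndex = std::map<std::string, BlockPos>;  // block hash (hex) -> position

// Return the complete serialized block at `pos`. The index holds no
// per-transaction offsets, so a caller that wants one transaction still
// gets the whole block and must deserialize it.
Bytes ReadBlock(const std::vector<Bytes>& files, const BlockPos& pos)
{
    assert(pos.file >= 0);
    const Bytes& f = files.at(static_cast<size_t>(pos.file));
    if (pos.data_pos < 4 || pos.data_pos > f.size()) throw std::out_of_range("bad offset");
    // The block size is read from the 4 bytes immediately before the block data.
    uint32_t size = 0;
    for (size_t i = 0; i < 4; ++i) size |= uint32_t{f[pos.data_pos - 4 + i]} << (8 * i);
    if (f.size() - pos.data_pos < size) throw std::out_of_range("truncated block");
    return Bytes(f.begin() + pos.data_pos, f.begin() + pos.data_pos + size);
}
```

The read is only as good as the stored offset: if the bytes in front of a block change length, the same index entry lands on the wrong bytes.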

| Block | File | Offset | Size |
|---|---|---|---|
| 785,001 | blk00428.dat | 0x00000000 | 1.2 MB |
| 785,002 | blk00428.dat | 0x0012C008 | 3.95 MB (inscription block) |
| 785,003 | blk00428.dat | 0x004F8410 | 1.8 MB |

Block 785,003 starts at byte 0x004F8410, immediately after the last byte of block 785,002. If you stripped the roughly 3.9 MB inscription witness from block 785,002, its serialized size would shrink, and every block after it in the file would move forward by the same amount. Every stored offset after that point would be wrong: block 785,003's index entry would point into the middle of unrelated bytes.
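
The offset arithmetic can be checked with a short sketch. `RecordOffsets` and `CountStale` are hypothetical helpers, and the block sizes in the usage note are assumptions: the first two are the byte counts implied by the offsets quoted for blk00428.dat (with 8 bytes of storage header per record), and the third is a round stand-in for "1.8 MB".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint64_t kStorageHeaderBytes = 8;  // magic + size, per record

// Given block sizes in file order, return the offset at which each record
// (storage header + block) starts when records are packed back to back.
std::vector<uint64_t> RecordOffsets(const std::vector<uint64_t>& block_sizes)
{
    std::vector<uint64_t> offsets;
    uint64_t pos = 0;
    for (const uint64_t size : block_sizes) {
        offsets.push_back(pos);
        pos += kStorageHeaderBytes + size;
    }
    return offsets;
}

// Count index entries whose stored offset no longer matches the file layout.
size_t CountStale(const std::vector<uint64_t>& indexed, const std::vector<uint64_t>& actual)
{
    assert(indexed.size() == actual.size());
    size_t stale = 0;
    for (size_t i = 0; i < indexed.size(); ++i) {
        if (indexed[i] != actual[i]) ++stale;
    }
    return stale;
}
```

With sizes {1228800, 3982336, 1800000}, `RecordOffsets` yields 0x00000000, 0x0012C008, and 0x004F8410. Shrinking the middle block by a hypothetical 3,900,000 bytes (to 82,336) moves the third record to byte 1,311,152, so `CountStale` reports one stale entry. In a real file, every later block would be stale as well.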

4 Why Selective Pruning Breaks Everything
The Fundamental Problem

Bitcoin's block serialization format interleaves witness data inside each transaction. Witness fields are not a separate appendix: in the serialized byte stream they appear after a transaction's outputs and before its locktime. You cannot remove witness data without re-serializing the entire transaction, which changes the block's size, which shifts every subsequent byte offset in the file, which invalidates the index entry of every block stored after it.

Here is the serialization order for a single SegWit transaction:

Wire format (BIP 144): SegWit transaction serialization
// A single SegWit transaction on disk:
[version]            // 4 bytes
[marker: 0x00]       // 1 byte  — signals SegWit
[flag:   0x01]       // 1 byte
[input_count]        // varint
[inputs...]          // prevout + scriptSig + sequence
[output_count]       // varint
[outputs...]         // value + scriptPubKey
[witness_0]          // INTERLEAVED HERE: inside the transaction, before locktime
[witness_1]          // one witness stack per input
[witness_n]          // contains the inscription data
[locktime]           // 4 bytes
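
A minimal sketch of what stripping the witness from one serialized transaction involves, following the field order listed above. `StripWitness`, `ReadCompactSize`, and `Skip` are hypothetical helper names, not Core functions. The sketch assumes counts and lengths use Bitcoin's CompactSize encoding and that the input is a single, complete transaction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Read a CompactSize integer at `pos` and advance `pos` past it.
uint64_t ReadCompactSize(const Bytes& b, size_t& pos)
{
    if (pos >= b.size()) throw std::runtime_error("truncated");
    const uint8_t first = b[pos++];
    if (first < 0xfd) return first;
    const size_t width = (first == 0xfd) ? 2 : (first == 0xfe) ? 4 : 8;
    if (b.size() - pos < width) throw std::runtime_error("truncated");
    uint64_t value = 0;
    for (size_t i = 0; i < width; ++i) value |= uint64_t{b[pos + i]} << (8 * i);
    pos += width;
    return value;
}

// Advance `pos` by `n` bytes, refusing to run past the end of the buffer.
void Skip(const Bytes& b, size_t& pos, uint64_t n)
{
    if (n > b.size() - pos) throw std::runtime_error("truncated");
    pos += static_cast<size_t>(n);
}

// Return the transaction without marker, flag, and witness stacks.
Bytes StripWitness(const Bytes& tx)
{
    // No marker/flag pair: already in the non-witness format.
    if (tx.size() < 6 || tx[4] != 0x00 || tx[5] != 0x01) return tx;

    size_t pos = 6;  // past version (4), marker (1), flag (1)
    const uint64_t input_count = ReadCompactSize(tx, pos);
    for (uint64_t i = 0; i < input_count; ++i) {
        Skip(tx, pos, 36);  // prevout: 32-byte txid + 4-byte output index
        const uint64_t script_len = ReadCompactSize(tx, pos);
        Skip(tx, pos, script_len);  // scriptSig
        Skip(tx, pos, 4);           // sequence
    }
    const uint64_t output_count = ReadCompactSize(tx, pos);
    for (uint64_t i = 0; i < output_count; ++i) {
        Skip(tx, pos, 8);  // value
        const uint64_t script_len = ReadCompactSize(tx, pos);
        Skip(tx, pos, script_len);  // scriptPubKey
    }
    const size_t witness_begin = pos;
    for (uint64_t i = 0; i < input_count; ++i) {  // one witness stack per input
        const uint64_t item_count = ReadCompactSize(tx, pos);
        for (uint64_t j = 0; j < item_count; ++j) {
            const uint64_t item_len = ReadCompactSize(tx, pos);
            Skip(tx, pos, item_len);
        }
    }
    const size_t witness_end = pos;
    Skip(tx, pos, 4);  // locktime
    if (pos != tx.size()) throw std::runtime_error("trailing bytes");

    Bytes out(tx.begin(), tx.begin() + 4);                              // version
    out.insert(out.end(), tx.begin() + 6, tx.begin() + witness_begin);  // inputs, outputs
    out.insert(out.end(), tx.begin() + witness_end, tx.end());          // locktime
    assert(out.size() + 2 + (witness_end - witness_begin) == tx.size());
    return out;
}
```

The parser has to walk every input and output before it can even locate the witness, and the result is always shorter than the input. Both facts matter here: the cut cannot be made without deserializing, and the shorter result is exactly what moves every later block in the file.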

To "prune" the witness from this transaction, you would need to:

  1. Remove the marker and flag bytes
  2. Remove all witness stacks
  3. Re-serialize the transaction (it's now a different byte length)
  4. Re-serialize the block (its total size changed)
  5. Update the 4-byte size field in the block's storage header
  6. Rewrite the blk*.dat file from this point forward, shifting every subsequent block
  7. Update every affected block's nDataPos in the LevelDB index

For a single block, this means rewriting the rest of its blk*.dat file (up to 128 MiB) and updating the index entry of every block stored after it in that file. For pruning all inscription witnesses from the entire chain, it means rewriting most of the 650+ GB blockchain.

5 What Would Actually Work

Selective witness pruning is not impossible — it's just not supported by Core's current storage format. Two approaches could work:

Approach A: External Post-Processor

A standalone tool that operates on the blk*.dat files outside of Core:

  1. Read each block from the file
  2. Deserialize, strip witness data from selected transactions
  3. Re-serialize the block with stripped transactions
  4. Write the modified blocks to new blk*.dat files
  5. Rebuild the LevelDB index with corrected offsets

Advantage: No modification to Core's consensus code.
Disadvantage: Requires rewriting the entire blockchain on disk. The node must be stopped during the process. Other nodes requesting the stripped blocks would receive invalid data (the witness commitment in the coinbase would not match).
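
The core loop of such a post-processor might look like the sketch below. `RewriteFile` and `Relocation` are hypothetical names, the file is an in-memory byte vector, and `transform` stands for whatever per-block stripping logic is chosen; none of this is an existing tool or a Core API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Hypothetical record of one index update the rewrite makes necessary.
struct Relocation {
    size_t old_data_pos;  // where the index currently says the block data starts
    size_t new_data_pos;  // where it starts in the rewritten file
};

// Rewrite a whole file of [magic(4) | size(4, little-endian) | block] records,
// passing each block through `transform`. Returns the new file bytes and
// appends to `moves` the offset change for every block in the file.
Bytes RewriteFile(const Bytes& in, const std::function<Bytes(const Bytes&)>& transform,
                  std::vector<Relocation>& moves)
{
    Bytes out;
    size_t pos = 0;
    while (pos < in.size()) {
        if (in.size() - pos < 8) throw std::runtime_error("truncated header");
        uint32_t size = 0;
        for (size_t i = 0; i < 4; ++i) size |= uint32_t{in[pos + 4 + i]} << (8 * i);
        if (in.size() - pos - 8 < size) throw std::runtime_error("truncated block");
        const Bytes block(in.begin() + pos + 8, in.begin() + pos + 8 + size);
        const Bytes replaced = transform(block);
        assert(replaced.size() <= UINT32_MAX);  // must still fit the 4-byte size field

        out.insert(out.end(), in.begin() + pos, in.begin() + pos + 4);  // copy magic
        const auto new_size = static_cast<uint32_t>(replaced.size());
        for (int i = 0; i < 4; ++i) out.push_back(static_cast<uint8_t>(new_size >> (8 * i)));
        moves.push_back({pos + 8, out.size()});
        out.insert(out.end(), replaced.begin(), replaced.end());
        pos += size_t{8} + size;
    }
    return out;
}
```

Every block passes through the loop and is written again, including blocks the transform leaves untouched, because each one's position depends on the sizes of all records before it. The `moves` list is what step 5 would apply to the index.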

Approach B: Separate Witness Storage

Modify Core to store witness data in separate files (e.g., wit00428.dat) rather than interleaved in blk*.dat:

  1. Non-witness transaction data stays in blk*.dat with stable byte offsets
  2. Witness data goes to wit*.dat files that can be independently deleted
  3. The block index tracks both positions

Advantage: Deleting witness data doesn't shift any byte offsets. Simple and clean.
Disadvantage: Requires a hard migration of existing block files, changes to the serialization layer, and every block-serving function needs to recombine data from two files. No Bitcoin Core developer has proposed this.
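
As a rough illustration of the idea, the sketch below keeps two append-only buffers and an index that tracks a position in each. `SplitStore` and `SplitPos` are hypothetical, and the sketch assumes each block arrives with its non-witness and witness bytes already separated, which in practice is the part that requires changing the serialization layer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Hypothetical index entry tracking both halves of one block.
struct SplitPos {
    size_t base_pos, base_len;  // location in the blk-style buffer
    size_t wit_pos, wit_len;    // location in the wit-style buffer
};

class SplitStore {
public:
    // Append one block; returns its slot number in the index.
    size_t Append(const Bytes& base, const Bytes& witness)
    {
        index_.push_back({blk_.size(), base.size(), wit_.size(), witness.size()});
        blk_.insert(blk_.end(), base.begin(), base.end());
        wit_.insert(wit_.end(), witness.begin(), witness.end());
        return index_.size() - 1;
    }

    // Discard all witness bytes. Nothing in blk_ moves, so base offsets stay valid.
    void DropWitnessFile()
    {
        wit_.clear();
        witness_dropped_ = true;
    }

    bool CanServeWitness() const { return !witness_dropped_; }

    Bytes ReadBase(size_t i) const
    {
        assert(i < index_.size());
        const SplitPos& p = index_[i];
        return Bytes(blk_.begin() + p.base_pos, blk_.begin() + p.base_pos + p.base_len);
    }

    // Serving a full block needs both buffers; this fails once witness data is gone.
    Bytes ReadWitness(size_t i) const
    {
        assert(i < index_.size());
        if (witness_dropped_) throw std::runtime_error("witness data pruned");
        const SplitPos& p = index_[i];
        return Bytes(wit_.begin() + p.wit_pos, wit_.begin() + p.wit_pos + p.wit_len);
    }

private:
    Bytes blk_, wit_;
    std::vector<SplitPos> index_;
    bool witness_dropped_{false};
};
```

After `DropWitnessFile()`, `ReadBase` returns the same bytes as before for every block, with no rewriting and no index updates. The price is visible in `ReadWitness`: any code path that serves a complete block now has to consult two places and handle the case where one of them is gone.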

Why the Witness Commitment Makes This Safe

Every block that contains SegWit transactions carries a witness commitment: a hash committing to the witness data of every transaction in the block (through a Merkle root over the wtxids), stored in an OP_RETURN output of the coinbase transaction. This commitment is part of the non-witness block data, so it survives pruning. A node that has pruned witness data still holds the commitment it checked against the full witness data at validation time, and anyone who later obtains the witness data can check it against that commitment. The node just can't serve the witness data to peers requesting full blocks.

As of 2026, no mainline Bitcoin client offers selective witness pruning. The choice remains binary: keep everything (-prune=0, 650+ GB) or prune entire blocks (-prune=550, ~5 GB but unable to serve historical data).