TeachMeBitcoin

Constructing the Merkle Tree: Mathematical Proofs and Transaction Root Generation

From TeachMeBitcoin, the free encyclopedia Reading time: 4 min

Constructing the Merkle Tree: Mathematical Proofs and Transaction Root Generation

A block header cannot hold megabytes of raw transaction data; it is strictly constrained to an 80-byte memory space. To cryptographically bind an arbitrary number of transactions to this tiny header, Bitcoin utilizes a binary Merkle Tree.

This guide details the mathematical pairing equations, leaf replication rules for odd transaction counts, and the logarithmic efficiency of Merkle validation audits.


1. What is a Merkle Tree?

A Merkle Tree is a binary cryptographic tree constructed by recursively pairing and hashing transaction identifiers (TXIDs) until a single 32-byte hash remains: the Merkle Root. This root is saved directly inside the 80-byte block header.

 THE BINARY MERKLE TREE LAYOUT

 [ Merkle Root (H_ABCD) ]
 ▲
 ┌─────────────────────┴─────────────────────┐
 │ │
 [ Hash AB (H_AB) ] [ Hash CD (H_CD) ]
 ▲ ▲
 ┌───────┴───────┐ ┌───────┴───────┐
 │ │ │ │
 [ Tx A ] [ Tx B ] [ Tx C ] [ Tx D ]
 (Coinbase Tx) (TXID) (TXID) (TXID)

2. Mathematical Pairing and Odd Leaf Duplication

Let's define a block containing $N$ transactions, represented by leaf hashes $L_0, L_1, L_2, \dots, L_{N-1}$.

The Pairing Function

Any parent node $H_P$ is computed by concatenating its left child $H_L$ and right child $H_R$ and applying a double-SHA256 hash:

$$H_P = \text{SHA256}\Big(\text{SHA256}\big(H_L \parallel H_R\big)\Big)$$

Where $\parallel$ represents raw byte concatenation.

The Odd Leaf Duplication Rule

If a tree level contains an odd number of nodes, the final node does not have a partner. To resolve this, the odd node is paired with itself:

$$\text{If } H_K \text{ is odd: } \quad H_{\text{Parent}} = \text{SHA256}\Big(\text{SHA256}\big(H_K \parallel H_K\big)\Big)$$

 ODD TRANSACTION COUNT DUPLICATION

 [ Merkle Root ]
 ▲
 ┌─────────────────────┴─────────────────────┐
 │ │
 [ Parent Hash AB ] [ Parent Hash CC ]
 ▲ ▲
 ┌───────┴───────┐ ┌───────┴───────┐
 │ │ │ │
 [ Tx A ] [ Tx B ] [ Tx C ] [ Tx C ]
 (TXID) (TXID) (TXID) (Duplicate Hash)

Note: The transaction $Tx_C$ is not duplicated in the actual block data; only its hash is duplicated in memory during the Merkle root calculation.


3. Logarithmic Audit Efficiency (SPV Clients)

The primary benefit of a Merkle Tree is that it allows Simplified Payment Verification (SPV) light clients to verify that a transaction is included in a block without downloading the entire blockchain or block data.

To verify Transaction $A$, an SPV client only needs:

  1. The 80-byte block header (containing the validated Merkle Root).

  2. The transaction $Tx_A$.

  3. The Merkle Audit Path (the sibling hashes along the path from $Tx_A$ to the root).

The Math:

For a block containing $10,000$ transactions:

☕ Help support TeachMeBitcoin

TeachMeBitcoin is an ad-free, open-source educational repository curated by a passionate team of Bitcoin researchers and educators for public benefit. If you found our articles helpful, please consider supporting our hosting and ongoing content updates with a clean donation:

Ethereum: 0x578417C51783663D8A6A811B3544E1f779D39A85
Bitcoin: bc1q77k9e95rn669kpzyjr8ke9w95zhk7pa5s63qzz
Solana: 4ycT2ayqeMucixj3wS8Ay8Tq9NRDYRPKYbj3UGESyQ4J
Address copied to clipboard!