10. Data Directory Anatomy: Understanding Blocks, Chainstate, and Indexes
The "Data Directory" is the physical manifestation of the Bitcoin network on your hard drive. It is a complex ecosystem of databases, raw binary files, and configuration files. Understanding its anatomy is vital for maintenance, migrations, and troubleshooting. This chapter provides a surgical tour of the folders and files that make up your local copy of the blockchain.
The Root Level: The "Command Center"
When you open your .bitcoin folder, you see several key files:
- bitcoin.conf: The configuration file we discussed in Chapter 9.
- bitcoind.pid: A small file containing the process ID of the running node. Note that a separate .lock file, not the PID file, is what prevents you from accidentally starting two nodes on the same data directory.
- debug.log: The node's "Diary." Connections, new blocks, and errors are recorded here.
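As a rough illustration of how the PID file can be used, the sketch below reads bitcoind.pid and checks whether a process with that ID is alive. The function names are hypothetical helpers, not part of Bitcoin Core, and the check assumes a POSIX system.

```python
import os
from pathlib import Path
from typing import Optional


def read_pid(datadir: Path) -> Optional[int]:
    """Return the PID stored in bitcoind.pid, or None if the file is missing or malformed."""
    try:
        return int((datadir / "bitcoind.pid").read_text().strip())
    except (FileNotFoundError, ValueError):
        return None


def node_is_running(datadir: Path) -> bool:
    """Best-effort check (POSIX only): does the recorded PID belong to a live process?"""
    pid = read_pid(datadir)
    if pid is None:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only tests for existence; nothing is delivered
    except ProcessLookupError:
        return False  # stale PID file: no such process
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

A typical call would be node_is_running(Path.home() / ".bitcoin"). A stale PID can in principle be reused by an unrelated process, so treat the answer as a hint rather than proof.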
The blocks/ Folder: The Eternal Ledger
This is the largest folder in the directory, containing the actual raw data for every block ever mined.
- blk*.dat: These are large binary files of up to 128 MiB each. They contain the raw blocks, transactions included, in the same serialized form in which they were received over the network. They are "Write-Once": once a block is written here, it is never changed.
- rev*.dat: The "Undo" data. If the network has a "Chain Reorganization" (Reorg), these files tell the node how to "roll back" the UTXO set to a previous state.
- index/: A LevelDB database that acts as a table of contents. It tells the node exactly which blk*.dat file, and at what "Offset" (byte position), holds the block with a specific hash.
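To make the layout of a blk*.dat file concrete, here is a minimal sketch that walks one file record by record. It assumes the mainnet format in which each block is preceded by a 4-byte network magic and a 4-byte little-endian length, and it assumes the file is stored as plain bytes. Newer Bitcoin Core releases may XOR-obfuscate block files, in which case this sketch will simply find no blocks. The function name iter_blocks is a hypothetical helper.

```python
import hashlib
import struct
from typing import BinaryIO, Iterator, Tuple

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")  # network magic that prefixes each stored block


def iter_blocks(f: BinaryIO, magic: bytes = MAINNET_MAGIC) -> Iterator[Tuple[int, str, bytes]]:
    """Yield (offset, block_hash_hex, raw_block) for each record in a blk*.dat stream."""
    while True:
        prefix = f.read(8)
        if len(prefix) < 8 or prefix[:4] != magic:
            return  # end of file, or padding that is not a block record
        (size,) = struct.unpack("<I", prefix[4:])
        offset = f.tell()  # byte position where the block data itself begins
        raw = f.read(size)
        if size < 80 or len(raw) < size:
            return  # too small to hold a header, or truncated
        header = raw[:80]  # a block header is always 80 bytes
        digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
        yield offset, digest[::-1].hex(), raw  # hashes are displayed byte-reversed
```

Opening a file with open(path, "rb") and looping over iter_blocks gives the same kind of (hash, offset) pairs that the index/ database stores, although the real index uses its own LevelDB record format.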
The chainstate/ Folder: The "Living" Database
This is the most critical folder in the entire system. It contains the UTXO Set (Unspent Transaction Outputs).
- The Function: When a new transaction arrives, the node doesn't search through the 600GB of blocks to see if the money is real. It searches the much smaller chainstate database.
- LevelDB: It uses Google's LevelDB for high-speed key-value lookups.
- The Risk: If you delete this folder, your node cannot validate anything until it rebuilds it. It must re-process the entire 600GB of blocks (for example, by starting with -reindex-chainstate) to rebuild the list of "who owns what."
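The idea behind the chainstate can be sketched with a toy in-memory model. This is not how Bitcoin Core stores its data. It is a simplified illustration in which the UTXO set is a dictionary keyed by (txid, output index), spending removes entries, and the removed entries are kept as "undo" records in the spirit of rev*.dat. All names here are hypothetical.

```python
from typing import Dict, List, Tuple

OutPoint = Tuple[str, int]  # (txid, output index)
Coin = Tuple[int, str]      # (amount in satoshis, placeholder for the locking script)
Undo = List[Tuple[OutPoint, Coin]]


class ToyUtxoSet:
    """Toy model: one dictionary lookup replaces a search through every block."""

    def __init__(self) -> None:
        self.coins: Dict[OutPoint, Coin] = {}

    def apply_tx(self, txid: str, inputs: List[OutPoint], outputs: List[Coin]) -> Undo:
        """Spend the inputs, create the outputs, and return undo records."""
        if len(set(inputs)) != len(inputs):
            raise ValueError("duplicate input")
        missing = [op for op in inputs if op not in self.coins]
        if missing:
            raise KeyError(f"missing or already spent: {missing}")
        undo = [(op, self.coins.pop(op)) for op in inputs]  # remember what was spent
        for index, coin in enumerate(outputs):
            self.coins[(txid, index)] = coin
        return undo

    def undo_tx(self, txid: str, n_outputs: int, undo: Undo) -> None:
        """Roll back apply_tx: drop the created outputs and restore the spent coins."""
        for index in range(n_outputs):
            del self.coins[(txid, index)]
        for op, coin in undo:
            self.coins[op] = coin
```

A second attempt to spend the same outpoint fails at the dictionary lookup, which is the essence of double-spend detection. Signature and script checks are deliberately left out of this sketch.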
The wallets/ Folder: The Keys to the Kingdom
This is where your keys live. The coins themselves are recorded on the blockchain; the wallet holds the keys that let you spend them.
- wallet.dat: The actual database containing your private keys and transaction history.
- Descriptors vs. Legacy: Modern (descriptor) wallets are stored in subfolders, each with its own dedicated SQLite database. Legacy wallets are a single BerkeleyDB file.
- Always Back Up: This is the only folder that cannot be recreated from the network. If you lose it and have no backup, your money is gone.
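One way to take a backup while the node is running is the backupwallet RPC, called here through bitcoin-cli. The sketch below only assembles and runs that command. The wallet name and destination path are placeholders you would replace, and it assumes bitcoin-cli is on your PATH and the wallet is loaded.

```python
import subprocess
from typing import List


def backup_command(wallet_name: str, destination: str, cli: str = "bitcoin-cli") -> List[str]:
    """Build the bitcoin-cli invocation that asks the node to copy a loaded wallet."""
    return [cli, f"-rpcwallet={wallet_name}", "backupwallet", destination]


def backup_wallet(wallet_name: str, destination: str) -> None:
    """Run the backup; raises CalledProcessError if bitcoin-cli reports a failure."""
    subprocess.run(backup_command(wallet_name, destination), check=True)
```

For example, backup_wallet("savings", "/mnt/usb/savings-backup.dat") would ask the node to write a copy of the hypothetical "savings" wallet to the given path. Going through the RPC avoids copying a database file while the node may still be writing to it.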
The indexes/ Folder (Optional)
If you enable certain options in bitcoin.conf, new indexes are created here:
- txindex/: Created with txindex=1. A map from every transaction ID in history to its location on disk, so the node can look up any transaction by its ID.
- blockfilter/: Created with blockfilterindex=1. Used by "Light Clients" (BIP 157/158) to query your node for relevant transactions without needing to download everything.
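To see which optional indexes a node has actually built, and what they cost in disk space, a small script can list the subfolders of indexes/. This is a sketch with hypothetical helper names. It makes no assumption about which index names exist and simply reports whatever directories it finds.

```python
from pathlib import Path
from typing import Dict


def dir_size(path: Path) -> int:
    """Total size in bytes of all regular files under path."""
    return sum(p.stat().st_size for p in path.rglob("*") if p.is_file())


def optional_index_sizes(datadir: Path) -> Dict[str, int]:
    """Map each subfolder of <datadir>/indexes to its size in bytes."""
    root = datadir / "indexes"
    if not root.is_dir():
        return {}  # no optional index has been enabled
    return {child.name: dir_size(child) for child in sorted(root.iterdir()) if child.is_dir()}
```

Calling optional_index_sizes(Path.home() / ".bitcoin") returns an empty dictionary on a node that has never enabled an optional index.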
Managing Disk Space: The "Pruning" Strategy
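If your data directory is getting too large, you have two options. Pruning (prune=<n> in bitcoin.conf, where n is a target size in MiB) makes the node delete old blk*.dat and rev*.dat files once they have been validated. If you want to keep the full history, you don't have to delete anything: you can move the blocks folder to a cheaper, slower HDD while keeping the chainstate on your fast SSD (using Symlinks, as we will explore in Chapter 11).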
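The move-and-symlink idea can be sketched as follows. This is a minimal illustration, not a tested migration tool. It assumes the node is fully stopped, that you have permission to create symlinks, and that new_parent is a directory on the larger disk. The function name is hypothetical. Bitcoin Core also offers a blocksdir setting that achieves a similar split without a symlink.

```python
import os
import shutil
from pathlib import Path


def relocate_blocks(datadir: Path, new_parent: Path) -> Path:
    """Move <datadir>/blocks under new_parent and leave a symlink in its place.

    Only run this while the node is stopped.
    """
    source = datadir / "blocks"
    target = (new_parent / "blocks").resolve()  # absolute path for the link target
    if source.is_symlink():
        raise RuntimeError("blocks/ is already a symlink")
    if target.exists():
        raise FileExistsError(str(target))
    shutil.move(str(source), str(target))  # copies across filesystems if needed
    os.symlink(target, source, target_is_directory=True)
    return target
```

After the move, the node still opens blocks/ at its usual path while the data lives on the other disk. The chainstate folder is untouched and stays on the fast drive.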
Understanding the anatomy of the data directory allows you to perform "Surgery" on your node. You can move it, back it up, and optimize it with the confidence of an expert system administrator.
TeachMeBitcoin is an ad-free, open-source educational repository curated by a passionate team of Bitcoin researchers and educators for public benefit.