Backup Cryptography: Securing wallet.dat, HD Seeds, and Descriptors
In Bitcoin, loss is permanent. There is no "Forgot Password" button and no customer support desk. That makes backups one of the most important operational duties of anyone who runs a node with a wallet. This chapter traces the cryptographic evolution of Bitcoin backups, from the brittle BerkeleyDB files of the early era to the human-readable output descriptors used today.
The Dark Ages: Legacy BerkeleyDB Backups
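In the early versions of the Bitcoin software, backups were a nightmare. The wallet lived in wallet.dat, a BerkeleyDB (BDB) file in a complex binary format, and until HD wallets arrived in Bitcoin Core 0.13 every key in it was generated independently at random. (The BDB format itself remained the default until descriptor wallets, introduced in 0.21, became the default in 23.0.)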
- The Keypool Quirk: So that a backup would stay valid for a while, the wallet pre-generated 100 private keys in a "keypool." When you clicked "Receive," it gave you the next key from the pool, and change outputs consumed keys from the same pool. The pool was then topped up with fresh random keys.
- The Disaster: A backup only contained the keys that existed when it was made. Once you had used 101 new keys since your last backup, the 101st key was NOT in that backup. If your computer crashed, any money sent to that 101st address was lost forever, even if you had the backup from the day before.
- The Solution: You had to back up the wallet.dat file again and again, always before 100 new keys had been used. Restoring from a stale backup was a well-known way to lose funds.
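The stale-backup failure can be reproduced with a toy model. The sketch below is not Bitcoin Core code: `LegacyWallet` and its methods are hypothetical names, and the only behaviour it borrows from the text is a pool of 100 independent random keys that is topped up as keys are handed out.

```python
import secrets

KEYPOOL_SIZE = 100  # pool size described in the text


class LegacyWallet:
    """Toy pre-HD wallet: every key is independent random data."""

    def __init__(self):
        self.keys = []        # every private key ever generated, in order
        self.next_unused = 0  # index of the next key to hand out
        self._top_up()

    def _top_up(self):
        # Keep KEYPOOL_SIZE unused keys available at all times.
        while len(self.keys) - self.next_unused < KEYPOOL_SIZE:
            self.keys.append(secrets.token_bytes(32))

    def get_new_key(self):
        key = self.keys[self.next_unused]
        self.next_unused += 1
        self._top_up()
        return key

    def backup(self):
        # A backup is a snapshot of the keys that exist right now.
        return set(self.keys)


wallet = LegacyWallet()
snapshot = wallet.backup()                           # yesterday's backup
issued = [wallet.get_new_key() for _ in range(101)]  # 101 new addresses

# 1-based positions of issued keys that the old backup cannot restore.
missing = [n + 1 for n, key in enumerate(issued) if key not in snapshot]
print(missing)
```

The first 100 keys were already in the pool when the snapshot was taken, so only the key created by a later top-up is absent from it.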
The Renaissance: Hierarchical Deterministic (HD) Wallets
The introduction of BIP 32 (and later BIP 39/44) changed everything. Instead of 100 random keys, the wallet now used a single Master Root Seed.
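- The Mathematical Magic: From one seed (BIP 32 allows 128 to 512 bits; 256 bits is typical), you can derive a practically unlimited tree of private keys. Each key can have up to $2^{32}$ children, and each child can have children of its own.
- Child Key Derivation: $CKDpriv((k_{par}, c_{par}), i) \rightarrow (k_i, c_i)$, where every extended key pairs a private key $k$ with a chain code $c$, and $i$ is the child index.
- The Benefit: You only need to back up the seed ONCE. Whether you generate 1 address or 1 million, they are all children of that same root. If you lose your computer, you restore the seed (in wallets that implement BIP 39, by typing in your 12 or 24 words) and the wallet rescans the blockchain to recreate its entire history. Bitcoin Core itself does not use BIP 39 word lists; there, the seed is backed up through the wallet file or its descriptors.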
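The derivation step can be sketched with the Python standard library alone, provided it is limited to hardened children, which need no elliptic-curve arithmetic. This is a minimal illustration of the BIP 32 formulas, not a wallet implementation: the function names are my own, the seed is a fixed demo value, and the rare invalid-key cases that BIP 32 says to skip are not handled.

```python
import hashlib
import hmac

# Order of the secp256k1 group; child keys are reduced modulo this value.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
HARDENED = 0x80000000  # offset that marks an index as hardened


def master_from_seed(seed: bytes):
    """Return (private_key, chain_code) for the master node."""
    digest = hmac.new(b"Bitcoin seed", seed, hashlib.sha512).digest()
    return int.from_bytes(digest[:32], "big"), digest[32:]


def ckd_priv_hardened(k_par: int, c_par: bytes, index: int):
    """Hardened CKDpriv: (k_par, c_par), index -> (k_i, c_i)."""
    i = index | HARDENED
    data = b"\x00" + k_par.to_bytes(32, "big") + i.to_bytes(4, "big")
    digest = hmac.new(c_par, data, hashlib.sha512).digest()
    # Simplification: BIP 32's checks for out-of-range results are omitted.
    k_i = (int.from_bytes(digest[:32], "big") + k_par) % N
    return k_i, digest[32:]


def derive_hardened_path(seed: bytes, path):
    """Walk a list of hardened indices, e.g. [84, 0, 0] for m/84'/0'/0'."""
    k, c = master_from_seed(seed)
    for index in path:
        k, c = ckd_priv_hardened(k, c, index)
    return k, c


demo_seed = bytes.fromhex("000102030405060708090a0b0c0d0e0f")  # demo value only
account_key, account_chain = derive_hardened_path(demo_seed, [84, 0, 0])
print(account_key.to_bytes(32, "big").hex())
```

Running it twice with the same seed prints the same key, which is the whole point: the seed alone is enough to regenerate every key below it.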
The Modern Era: Output Descriptors
As Bitcoin evolved to include SegWit, Multisig, and Taproot, the "Seed Phrase" wasn't enough. The wallet also needed to know how the seed was being used.
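- The Problem: A seed phrase alone doesn't tell a wallet whether to look for legacy addresses (P2PKH) or Taproot addresses (P2TR).
- The Solution: Output Descriptors: A descriptor is a human-readable string that names the script type (for example wpkh for native SegWit), the extended key (an xpub for watch-only use, or an xprv if the backup must be able to spend), the key's origin (master fingerprint plus derivation path), and a trailing checksum. It does not contain the raw seed.
- Example: wpkh([f5923984/84'/0'/0']xpub6ER.../0/*)#82738273
- The Power: A descriptor is a complete, portable "recipe" for your wallet's addresses. It is the gold standard for modern Bitcoin backups. Keep in mind that a descriptor holding only an xpub, like the example, can find your coins but cannot spend them; in Bitcoin Core, listdescriptors true exports the private (xprv) form.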
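To see how the parts of a descriptor fit together, the example string can be taken apart with a regular expression. This is only a sketch for simple single-key descriptors that carry a key origin; `parse_descriptor` is a hypothetical helper, it does not verify the checksum, and real software should rely on the wallet's own descriptor parser.

```python
import re

DESCRIPTOR_RE = re.compile(
    r"^(?P<script>\w+)\("                      # script function, e.g. wpkh
    r"\[(?P<fingerprint>[0-9a-f]{8})"          # master key fingerprint
    r"(?P<origin>(?:/\d+['h]?)*)\]"            # path from the master to the key
    r"(?P<key>[^/)]+)"                         # extended key (xpub or xprv)
    r"(?P<children>(?:/(?:\d+|\*)['h]?)*)\)"   # path applied below the key
    r"(?:#(?P<checksum>[a-z0-9]{8}))?$"        # optional 8-character checksum
)


def parse_descriptor(descriptor: str) -> dict:
    """Split a single-key descriptor into its labelled parts."""
    match = DESCRIPTOR_RE.match(descriptor)
    if match is None:
        raise ValueError("unsupported or malformed descriptor")
    parts = match.groupdict()
    # Drop the leading slash so paths read like 84'/0'/0'.
    parts["origin"] = parts["origin"].lstrip("/")
    parts["children"] = parts["children"].lstrip("/")
    return parts


example = "wpkh([f5923984/84'/0'/0']xpub6ER.../0/*)#82738273"
for name, value in parse_descriptor(example).items():
    print(f"{name}: {value}")
```

The output separates what a bare seed phrase leaves implicit: the script type, where the key sits in the derivation tree, and which child addresses to generate.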
The Cryptography of Wallet Encryption
When you "Encrypt" your wallet in Bitcoin Core, you aren't just putting a password on a file. You are performing a complex cryptographic operation:
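- KDF (Key Derivation Function): Your passphrase and a random salt are hashed with SHA-512 for tens of thousands of rounds (Bitcoin Core uses at least 25,000 and calibrates the count to the machine) to produce a 256-bit key and an IV. Bitcoin Core does not use scrypt for this. The process is intentionally slow to hinder "Brute Force" attacks.
- Master Key Encryption: The derived key encrypts a randomly generated 256-bit "Master Key" using AES-256-CBC. That Master Key, in turn, encrypts the wallet's private keys, including the HD seed, also with AES-256-CBC. Changing your passphrase therefore only re-encrypts the Master Key.
- Unlock on Demand: When you want to send money, you unlock the wallet with walletpassphrase and a timeout in seconds. bitcoind derives the key from your passphrase, decrypts the Master Key, keeps it in memory until the timeout expires or you call walletlock, and then wipes it from RAM.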
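The key-stretching step can be sketched in a few lines. The code below follows my understanding of Bitcoin Core's scheme (SHA-512 over passphrase plus salt, re-hashed for a fixed number of rounds, with the first 32 bytes used as the AES key and the next 16 as the IV); treat those details, the 8-byte salt, and the function name `derive_key_iv` as assumptions rather than a specification. The AES-256-CBC step itself is left out because it needs a third-party library.

```python
import hashlib
import secrets


def derive_key_iv(passphrase: str, salt: bytes, rounds: int = 25_000):
    """Stretch a passphrase into a 32-byte key and a 16-byte IV."""
    digest = hashlib.sha512(passphrase.encode("utf-8") + salt).digest()
    # Each extra round forces an attacker to repeat the same work per guess.
    for _ in range(rounds - 1):
        digest = hashlib.sha512(digest).digest()
    return digest[:32], digest[32:48]


salt = secrets.token_bytes(8)  # stored next to the encrypted master key
key, iv = derive_key_iv("correct horse battery staple", salt)
print(len(key), len(iv))
```

Because the salt is random and stored with the wallet, two wallets with the same passphrase still end up with different keys, and precomputed tables of common passphrases are of no use.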
The Operator's Backup Protocol
To minimise the risk of losing funds, every operator should follow the "3-2-1 Rule":
- 3 Copies: One primary, two backups.
- 2 Media Types: One on an encrypted SSD, one on a physical "Seed Plate" (stainless steel).
- 1 Off-site: If your house burns down, your backup should be in a separate physical location.
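The rule is simple enough to check mechanically against an inventory of your copies. The sketch below is purely illustrative: `BackupCopy`, `check_3_2_1`, and the example inventory are hypothetical, and "off-site" is modelled as any location other than the primary one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    label: str
    medium: str    # e.g. "ssd", "steel plate"
    location: str  # e.g. "home", "bank vault"


def check_3_2_1(copies, primary_location="home"):
    """Return a list of rule violations; an empty list means the plan passes."""
    problems = []
    if len(copies) < 3:
        problems.append("fewer than 3 copies")
    if len({copy.medium for copy in copies}) < 2:
        problems.append("fewer than 2 media types")
    if all(copy.location == primary_location for copy in copies):
        problems.append("no off-site copy")
    return problems


plan = [
    BackupCopy("node wallet", "ssd", "home"),
    BackupCopy("encrypted SSD", "ssd", "home"),
    BackupCopy("seed plate", "steel plate", "bank vault"),
]
print(check_3_2_1(plan) or "plan satisfies 3-2-1")
```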
Pro-Tip: Use backupwallet
Never copy the wallet.dat file while the node is running; the database might be in a "dirty" (partially written) state. Always use the backupwallet RPC, which writes a consistent snapshot to a path on the machine running bitcoind:
bitcoin-cli backupwallet "/mnt/secure_usb/bitcoin_backup_$(date +%s).dat"
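If backups are scripted, it is worth confirming that a file was actually written. The wrapper below is a sketch under two assumptions: bitcoin-cli is on the PATH with working credentials, and the script runs on the same machine as bitcoind, so the destination path is visible to both. The function names are hypothetical.

```python
import subprocess
import time
from pathlib import Path


def backup_command(dest_dir: str, timestamp: int) -> list:
    """Build the bitcoin-cli call with a timestamped destination file."""
    destination = Path(dest_dir) / f"bitcoin_backup_{timestamp}.dat"
    return ["bitcoin-cli", "backupwallet", str(destination)]


def run_backup(dest_dir: str) -> Path:
    """Run the backup and fail loudly if no non-empty file appears."""
    command = backup_command(dest_dir, int(time.time()))
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
    destination = Path(command[-1])
    if not destination.is_file() or destination.stat().st_size == 0:
        raise RuntimeError(f"backup missing or empty: {destination}")
    return destination


if __name__ == "__main__":
    print(run_backup("/mnt/secure_usb"))
```

Passing the command as a list avoids shell quoting problems with unusual directory names.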