TeachMeBitcoin

P2P TCP Stream Demultiplexing


P2P Stream Demultiplexing: Parsing TCP Byte Streams over Sockets

A common mistake among network developers is treating TCP sockets as message-oriented channels. TCP is strictly stream-oriented: it delivers data as a continuous, ordered sequence of bytes with no built-in message boundaries or packet markers.

To isolate discrete protocol messages from this continuous stream, a Bitcoin node needs a framing layer. The design described here is a Sliding Window Stream Parser that searches the incoming bytes for the network magic bytes.


🔀 1. The Stream-to-Message Problem

If a node sends three consecutive messages (e.g., ping, ping, addr), the receiving application does not see them as three neat packets. TCP may coalesce or split them in transit, so they arrive in the socket's receive buffer as one undifferentiated run of bytes.

                      CONTINUOUS SOCKET STREAM AGGREGATION

  Wire TCP Buffer:
 ┌────────────────────────────────────────────────────────────────────────┐
 │ ... \xf9\xbe\xb4\xd9ping...\xf9\xbe\xb4\xd9ping...\xf9\xbe\xb4\xd9addr │
 └────────────────────────────────────────────────────────────────────────┘

If the receiving node reads 100 bytes from the socket, it could receive:

  • A partial message.
  • Exactly one message.
  • Multiple messages combined.
  • The tail of one message followed by the start of the next, so that a single message spans several reads.
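
Because a single read can return fewer bytes than requested, any code that needs a fixed number of bytes has to loop. The following is a minimal sketch of that loop in Python; the helper name `recv_exact` is hypothetical and not taken from any Bitcoin implementation.

```python
import socket


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from a stream socket, or raise if the peer closes early."""
    chunks = []
    remaining = n
    while remaining > 0:
        # recv may return anywhere from 1 to `remaining` bytes
        chunk = sock.recv(remaining)
        if not chunk:
            # An empty result means the peer closed the connection
            raise ConnectionError(
                f"socket closed with {remaining} bytes still expected"
            )
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

The caller decides how many bytes make up a message; the socket itself never reports where one message ends and the next begins.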


🔍 2. The Sliding Window Stream Parser

To align itself with the input stream, the node's reading engine works as a small state machine:

                            THE SLIDING WINDOW PIPELINE

        Input Byte Stream: \x3C \xF9 \xBE \xB4 \xD9 \x76 \x65 \x72 ...

        [Frame A: \x3C \xF9 \xBE \xB4]  ──► Does not match Magic!
         ▲
         ├─► Slide window forward by 1 byte
         ▼
        [Frame B: \xF9 \xBE \xB4 \xD9]  ──► MATCH! Align and halt.
                                         • Parse next 20 bytes as Header Metadata
                                         • Read next S bytes as Payload
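
The window slide in the diagram can be reproduced on the same example bytes. This is an illustrative sketch only; `find_magic` is a hypothetical helper name, and in practice `bytes.find` performs the same search.

```python
MAINNET_MAGIC = bytes.fromhex("f9beb4d9")


def find_magic(stream: bytes, magic: bytes = MAINNET_MAGIC) -> int:
    """Return the offset of the first window equal to magic, or -1 if none."""
    for offset in range(len(stream) - len(magic) + 1):
        # The window is the 4 bytes starting at the current offset
        window = stream[offset:offset + len(magic)]
        if window == magic:
            return offset
    return -1


stream = bytes.fromhex("3c f9 be b4 d9 76 65 72")
print(find_magic(stream))  # 1 (Frame A at offset 0 fails, Frame B at offset 1 matches)
```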

The Alignment Process:

  1. Seeking Mode: The parser reads 4 bytes from the socket buffer.
  2. Magic Check: If the 4 bytes do not match the Mainnet magic bytes F9 BE B4 D9 (in wire order), the parser discards the first byte, shifts the window forward, reads 1 new byte from the socket, and checks again.
  3. Synchronization Mode: Once a matching 4-byte magic sequence is located:
    • The stream is now officially synchronized.
    • The parser immediately reads the next 20 bytes from the socket. These contain the rest of the 24-byte message header: Command (12 bytes), Payload Size (4 bytes, little-endian), and Checksum (4 bytes).
    • The parser extracts the Payload Size (call it $S$).
    • The parser blocks and reads exactly $S$ bytes from the socket to capture the complete payload.
    • It verifies the checksum (the first 4 bytes of the double SHA-256 hash of the payload), dispatches the payload to the command validation engine, and then returns to Seeking Mode to process the next message.
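
The alignment process above can be sketched as a buffer-based parser that accepts chunks of any size. This is a simplified illustration, not the code of any particular node. The names `MessageParser`, `build_message`, and `MAX_PAYLOAD` are hypothetical, the size cap is an assumed sanity limit, and dropping one byte and re-seeking after a failed checksum is an assumption of this sketch.

```python
import hashlib
import struct

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")
HEADER_REST = 20          # command (12) + payload size (4) + checksum (4)
MAX_PAYLOAD = 4_000_000   # assumed sanity cap for this sketch


def checksum(payload: bytes) -> bytes:
    """First 4 bytes of SHA-256(SHA-256(payload))."""
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def build_message(command: str, payload: bytes) -> bytes:
    """Hypothetical helper: frame a payload with a 24-byte header."""
    name = command.encode("ascii").ljust(12, b"\x00")
    return (MAINNET_MAGIC + name + struct.pack("<I", len(payload))
            + checksum(payload) + payload)


class MessageParser:
    def __init__(self, magic: bytes = MAINNET_MAGIC):
        self.magic = magic
        self.buf = bytearray()

    def feed(self, data: bytes) -> list:
        """Append received bytes and return every complete (command, payload)."""
        self.buf += data
        messages = []
        while True:
            # Seeking Mode: locate the magic bytes
            start = self.buf.find(self.magic)
            if start < 0:
                # Keep the last 3 bytes: they may begin a magic split across reads
                keep = len(self.magic) - 1
                if len(self.buf) > keep:
                    del self.buf[:-keep]
                return messages
            del self.buf[:start]  # discard bytes before the magic

            # Synchronization Mode: wait for the remaining 20 header bytes
            header_end = len(self.magic) + HEADER_REST
            if len(self.buf) < header_end:
                return messages
            command, size, expected = struct.unpack_from(
                "<12sI4s", self.buf, len(self.magic))
            if size > MAX_PAYLOAD:
                del self.buf[:1]  # implausible size: treat as a false match
                continue
            if len(self.buf) < header_end + size:
                return messages  # payload not fully received yet

            payload = bytes(self.buf[header_end:header_end + size])
            if checksum(payload) == expected:
                name = command.rstrip(b"\x00").decode("ascii", errors="replace")
                messages.append((name, payload))
                del self.buf[:header_end + size]
            else:
                del self.buf[:1]  # bad checksum: slide forward and seek again
```

Feeding the parser one stray byte followed by three framed messages, in arbitrary 7-byte chunks, yields the three messages in order. Keeping unparsed bytes in `self.buf` between calls is what lets a message span several reads.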

🚀 3. Edge-Case: Peer Desynchronization Recovery

If a peer sends corrupted payload bytes or drops a connection mid-transmission, the stream alignment can be broken.
