Decoding Witness Data from Raw Transactions

What Is Witness Data?

Introduced in BIP141 (Segregated Witness), the witness is a separate data structure attached to each transaction input. It was moved outside the data that is hashed into the txid, which fixes third-party transaction malleability (altering a signature no longer changes the txid) and enables more efficient validation.

The witness is a stack of byte vectors: a list of data items that serve as the unlocking data for SegWit inputs. For non-SegWit inputs in a SegWit transaction, the witness is an empty list, serialized as a single 0x00 item count.

Raw Transaction Format with Witness

SegWit transactions use a new serialization format (identified by the segwit marker and flag bytes):

[version: 4 bytes]
[marker: 0x00]
[flag: 0x01]
[input count: varint]
[inputs]
[output count: varint]
[outputs]
[witness data for each input]
[locktime: 4 bytes]

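The witness section is NOT hashed for txid computation. The txid is the double SHA-256 of the legacy serialization (no marker, flag, or witness data). The wtxid (witness txid) is the double SHA-256 of the full SegWit serialization. Both are conventionally displayed in reversed byte order.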
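The difference between the two identifiers can be sketched as follows. This is a minimal illustration, not a full serializer: the helper names `double_sha256` and `tx_ids` are hypothetical, the input/output and witness sections are assumed to be already serialized, and the transaction bytes are synthetic rather than taken from the chain.

```python
import hashlib

def double_sha256(b: bytes) -> bytes:
    """SHA-256 applied twice, as used for Bitcoin transaction identifiers."""
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def tx_ids(version: bytes, ins_outs: bytes, witness: bytes, locktime: bytes) -> tuple:
    """Return (txid, wtxid) as hex strings in the usual reversed display order.

    ins_outs is the serialized input and output section (counts included);
    witness is the serialized witness section for all inputs.
    """
    legacy = version + ins_outs + locktime                       # no marker, flag, or witness
    full = version + b"\x00\x01" + ins_outs + witness + locktime  # marker + flag + witness
    return double_sha256(legacy)[::-1].hex(), double_sha256(full)[::-1].hex()

# Synthetic one-input, one-output transaction (illustrative bytes only).
version = (2).to_bytes(4, "little")
ins_outs = (
    b"\x01" + bytes(32) + bytes(4) + b"\x00" + b"\xff" * 4  # 1 input, empty scriptSig
    + b"\x01" + bytes(8) + b"\x00"                           # 1 output, empty scriptPubKey
)
locktime = bytes(4)

# Same transaction, two different one-item witnesses.
txid_a, wtxid_a = tx_ids(version, ins_outs, b"\x01\x01\xaa", locktime)
txid_b, wtxid_b = tx_ids(version, ins_outs, b"\x01\x01\xbb", locktime)

print(txid_a == txid_b)    # True: the witness does not affect the txid
print(wtxid_a == wtxid_b)  # False: the wtxid commits to the witness
```

Changing only the witness leaves the txid untouched, which is exactly the property that removes signature malleability from the txid.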

Parsing Witness Data

def read_varint(data: bytes, offset: int) -> tuple:
    """Read a CompactSize integer at offset. Returns (value, new_offset)."""
    prefix = data[offset]
    offset += 1
    if prefix < 0xfd:
        return prefix, offset
    # 0xfd, 0xfe, 0xff announce a 2-, 4-, or 8-byte little-endian integer
    size = {0xfd: 2, 0xfe: 4, 0xff: 8}[prefix]
    return int.from_bytes(data[offset:offset+size], 'little'), offset + size

def parse_witness(data: bytes, offset: int) -> tuple:
    """Parse one input's witness stack starting at offset. Returns (witness_items, new_offset)."""
    item_count, offset = read_varint(data, offset)
    items = []
    for _ in range(item_count):
        item_len, offset = read_varint(data, offset)
        item = data[offset:offset+item_len].hex()
        items.append(item)
        offset += item_len
    return items, offset

def decode_segwit_tx(raw_hex: str):
    data = bytes.fromhex(raw_hex)
    offset = 0

    version = int.from_bytes(data[offset:offset+4], 'little')
    offset += 4

    # In a legacy transaction this position holds a non-zero input count,
    # so 0x00 followed by 0x01 signals the SegWit serialization.
    is_segwit = (data[offset] == 0x00 and data[offset+1] == 0x01)
    if is_segwit:
        offset += 2  # skip marker and flag

    # Parse inputs
    input_count, offset = read_varint(data, offset)
    inputs = []
    for _ in range(input_count):
        txid = data[offset:offset+32][::-1].hex(); offset += 32
        vout = int.from_bytes(data[offset:offset+4], 'little'); offset += 4
        scriptsig_len, offset = read_varint(data, offset)
        scriptsig = data[offset:offset+scriptsig_len].hex(); offset += scriptsig_len
        sequence = int.from_bytes(data[offset:offset+4], 'little'); offset += 4
        inputs.append({"txid": txid, "vout": vout, "scriptsig": scriptsig, "sequence": sequence})

    # Parse outputs (simplified: values and scripts are skipped, not returned)
    output_count, offset = read_varint(data, offset)
    for _ in range(output_count):
        value = int.from_bytes(data[offset:offset+8], 'little'); offset += 8
        spk_len, offset = read_varint(data, offset)
        offset += spk_len

    # Parse witness: one stack per input, in input order
    witnesses = []
    if is_segwit:
        for _ in range(input_count):
            witness_items, offset = parse_witness(data, offset)
            witnesses.append(witness_items)

    locktime = int.from_bytes(data[offset:offset+4], 'little')

    return {"version": version, "inputs": inputs, "witnesses": witnesses, "locktime": locktime}

P2WPKH Witness Structure

For a P2WPKH input, the witness stack has exactly two items:

witness[0]: <DER signature + sighash byte>
witness[1]: <compressed public key (33 bytes)>
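A quick shape check for this two-item layout might look like the sketch below. It assumes the witness stack is a list of hex strings, and the function names `looks_like_p2wpkh_witness` and `sighash_type` are hypothetical. The check is only a heuristic: the authoritative input type comes from the scriptPubKey of the output being spent, not from the witness alone. The sample signature and key are synthetic placeholders, not valid cryptographic values.

```python
def looks_like_p2wpkh_witness(items: list) -> bool:
    """Heuristic shape check on a witness stack given as hex strings."""
    if len(items) != 2:
        return False
    sig, pubkey = bytes.fromhex(items[0]), bytes.fromhex(items[1])
    # DER signature (starts with 0x30) followed by one sighash byte:
    # 9 to 73 bytes in total.
    sig_ok = 9 <= len(sig) <= 73 and sig[0] == 0x30
    # Compressed public key: 33 bytes with prefix 0x02 or 0x03.
    key_ok = len(pubkey) == 33 and pubkey[0] in (0x02, 0x03)
    return sig_ok and key_ok

def sighash_type(items: list) -> int:
    """Return the sighash byte appended to the signature (0x01 is SIGHASH_ALL)."""
    return bytes.fromhex(items[0])[-1]

# Synthetic stack: minimal DER structure + sighash byte, and a placeholder key.
sample = ["3006020101020101" + "01", "02" + "11" * 32]
print(looks_like_p2wpkh_witness(sample))  # True
print(sighash_type(sample))               # 1
```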

P2WSH Witness Structure

For a P2WSH input spending a 2-of-3 multisig:

witness[0]: ""   (empty dummy item, required by the same OP_CHECKMULTISIG off-by-one bug as bare multisig)
witness[1]: <sig1>
witness[2]: <sig2>
witness[3]: <witness script (the serialized 2-of-3 multisig script)>
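The last witness item is what ties the spend to the output: the P2WSH scriptPubKey is OP_0 followed by the SHA-256 of the witness script. The sketch below recomputes that commitment and reads m and n from a multisig script. The function names are hypothetical, and `multisig_params` only handles the simple form OP_m <compressed keys> OP_n OP_CHECKMULTISIG with m and n between 1 and 16. The keys in the usage are synthetic placeholders.

```python
import hashlib

OP_CHECKMULTISIG = 0xae

def p2wsh_script_pubkey(witness_script: bytes) -> bytes:
    """scriptPubKey committing to witness_script: OP_0 <32-byte SHA-256>."""
    return b"\x00\x20" + hashlib.sha256(witness_script).digest()

def multisig_params(witness_script: bytes) -> tuple:
    """Return (m, n) for OP_m <33-byte keys...> OP_n OP_CHECKMULTISIG.

    Raises ValueError for anything outside that simple form.
    """
    if len(witness_script) < 3 or witness_script[-1] != OP_CHECKMULTISIG:
        raise ValueError("not a multisig script")
    m = witness_script[0] - 0x50    # OP_1..OP_16 are 0x51..0x60
    n = witness_script[-2] - 0x50
    if not (1 <= m <= n <= 16):
        raise ValueError("unexpected m or n")
    # Each compressed key is a 0x21 push opcode followed by 33 bytes.
    body = witness_script[1:-2]
    if len(body) != n * 34 or any(body[i] != 0x21 for i in range(0, len(body), 34)):
        raise ValueError("unexpected key encoding")
    return m, n

# Synthetic 2-of-3 script built from placeholder keys.
keys = b"".join(b"\x21\x02" + bytes([i]) * 32 for i in (1, 2, 3))
script = b"\x52" + keys + b"\x53" + bytes([OP_CHECKMULTISIG])

print(multisig_params(script))                 # (2, 3)
print(len(p2wsh_script_pubkey(script)))        # 34
```

When decoding a real transaction, comparing `p2wsh_script_pubkey(witness[-1])` against the spent output's scriptPubKey confirms that the witness script is the one the output committed to.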

Taproot Witness Structures

Key-path spend:

witness[0]: <Schnorr signature: 64 bytes, or 65 bytes with an explicit sighash byte>

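Script-path spend (n witness items, not counting the optional annex):

witness[0..n-3]: <zero or more data items for script execution>
witness[n-2]: <tapscript>
witness[n-1]: <control block: 1 byte (leaf version + parity bit) + 32-byte internal key + merkle path of 32-byte hashes>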

The control block encodes the path through the Taproot merkle tree (MAST structure) proving that the tapscript is one of the committed spending conditions.
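The two Taproot layouts and the control block fields can be sketched as follows, based on the BIP341 rules. The function names are hypothetical; `classify_taproot_witness` assumes a witness stack given as hex strings, while `parse_control_block` takes raw bytes. The control block in the usage is synthetic.

```python
def classify_taproot_witness(items: list) -> str:
    """Return 'key-path' or 'script-path' for a Taproot witness (hex strings)."""
    if not items:
        raise ValueError("empty witness")
    # BIP341: with at least two items, a last item starting with 0x50 is the
    # annex and is set aside before classifying the spend.
    if len(items) >= 2 and items[-1][:2] == "50":
        items = items[:-1]
    return "key-path" if len(items) == 1 else "script-path"

def parse_control_block(control_block: bytes) -> dict:
    """Split a control block into leaf version, parity, internal key, and path."""
    if len(control_block) < 33 or (len(control_block) - 33) % 32 != 0:
        raise ValueError("control block length must be 33 + 32*m bytes")
    path = control_block[33:]
    if len(path) // 32 > 128:
        raise ValueError("merkle path longer than 128 nodes")
    first = control_block[0]
    return {
        "leaf_version": first & 0xfe,   # upper seven bits of the first byte
        "parity": first & 0x01,         # parity of the output key's Y coordinate
        "internal_key": control_block[1:33].hex(),
        "merkle_path": [path[i:i+32].hex() for i in range(0, len(path), 32)],
    }

# Synthetic control block: leaf version 0xc0, parity 1, one merkle path node.
cb = bytes([0xc1]) + bytes([0x11]) * 32 + bytes([0x22]) * 32
fields = parse_control_block(cb)

print(hex(fields["leaf_version"]), fields["parity"], len(fields["merkle_path"]))
print(classify_taproot_witness(["aa" * 64]))                      # key-path
print(classify_taproot_witness(["01", "51", cb.hex()]))           # script-path
```

A path of m nodes means the tapscript sits at depth m in the tree; a control block of exactly 33 bytes corresponds to a tree with a single leaf.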

Pro Tip

When debugging scripts, always start with a high-level disassembly before diving into the stack trace. Tools like bitcoin-cli decodescript are your first line of defense in identifying standard script patterns.
