5. OP_SUBSTR — Disabled Substring Extraction
Overview
OP_SUBSTR (opcode 0x7F) was designed to extract a substring (a contiguous range of bytes) from a byte string on the stack. Like OP_CAT, it was disabled in August 2010. In legacy and SegWit v0 scripts, its mere presence now causes validation to fail, even inside a branch that is never executed.
Opcode value: 0x7F
Current status: Disabled
Original Intended Behavior
OP_SUBSTR took three arguments from the stack: the source string, a beginning index, and a length, and returned the specified byte range.
Stack before: [ <string> <begin> <size> ]   (top of stack is the rightmost item, <size>)
OP_SUBSTR:
Pops <size> (number of bytes to extract)
Pops <begin> (starting byte index, 0-based)
Pops <string> (source byte array)
Extracts: result = string[begin : begin + size]
Pushes result
Stack after: [ <substring> ]
Example:
string = 0xDEADBEEFCAFE (6 bytes: DE AD BE EF CA FE)
begin = 2 (start at index 2)
size = 3 (extract 3 bytes)
Result = 0xBEEFCA (bytes at indices 2, 3, 4)
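The pop order and the worked example above can be sketched in Python. This is a minimal model, not the historical implementation: the stack is a plain Python list, the numeric arguments are ordinary ints rather than Script-encoded numbers, and the function name `op_substr` and its strict bounds check are assumptions made for illustration.

```python
def op_substr(stack):
    """Pop size, begin and string (in that order), push string[begin:begin+size]."""
    if len(stack) < 3:
        raise ValueError("OP_SUBSTR needs three stack items")
    size = stack.pop()    # top of stack: number of bytes to extract
    begin = stack.pop()   # 0-based starting index
    data = stack.pop()    # source byte string
    # Assumed policy: any out-of-range request fails the script.
    if begin < 0 or size < 0 or begin + size > len(data):
        raise ValueError("requested range is out of bounds")
    stack.append(data[begin:begin + size])
    return stack


stack = [bytes.fromhex("DEADBEEFCAFE"), 2, 3]
op_substr(stack)
print(stack[0].hex())  # beefca
```

The explicit bounds check matters in this sketch because Python slicing on its own would silently truncate an over-long range instead of failing.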
Use Cases and Significance
OP_SUBSTR would have enabled fine-grained inspection and decomposition of serialized data inside Script. Some envisioned use cases:
1. Extracting fields from serialized structures:
Suppose a 36-byte item on the stack encodes:
Bytes 0-3: version (4 bytes)
Bytes 4-35: payload (32 bytes)
Extracting the version:
<serialized_data> OP_0 OP_4 OP_SUBSTR → <version_bytes>
Extracting the payload (small-number opcodes stop at OP_16, so 32 is pushed as a one-byte number):
<serialized_data> OP_4 <32> OP_SUBSTR → <payload_bytes>
2. Transaction introspection (theoretical):
If a transaction's serialized form were available on the stack,
OP_SUBSTR could extract specific fields (version, locktime, output amounts),
enabling native covenants and transaction-constraining scripts.
3. Script-level parsing:
Scripts could parse variable-length encoded data by extracting
length prefixes and then using those to determine how many bytes to extract.
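The first and third use cases can be illustrated with a short Python sketch. Everything here is hypothetical: `substr` stands in for the opcode, the item layout (a 4-byte version followed by a 32-byte payload) mirrors the example above, and the one-byte length prefix in `read_prefixed` is an assumed encoding chosen for simplicity.

```python
def substr(data: bytes, begin: int, size: int) -> bytes:
    """Stand-in for OP_SUBSTR with strict bounds checking (an assumption)."""
    if begin < 0 or size < 0 or begin + size > len(data):
        raise ValueError("requested range is out of bounds")
    return data[begin:begin + size]


# Use case 1: fixed-offset fields (4-byte version, then 32-byte payload).
item = (2).to_bytes(4, "little") + bytes(range(32))
version = substr(item, 0, 4)
payload = substr(item, 4, 32)


# Use case 3: length-prefixed fields (assumed one-byte length prefix).
def read_prefixed(data: bytes, offset: int):
    """Return (field, next_offset) for the field starting at offset."""
    length = substr(data, offset, 1)[0]       # read the prefix byte
    field = substr(data, offset + 1, length)  # then that many bytes
    return field, offset + 1 + length


field, nxt = read_prefixed(b"\x03abc\x01z", 0)
print(version.hex(), len(payload), field, nxt)
```

In real Script the second step would need the extracted prefix on the stack as a number, which is exactly the kind of parsing that is unavailable while the opcode is disabled.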
Why It Was Disabled
Like OP_CAT, OP_SUBSTR was seen as a resource-exhaustion risk when combined with other opcodes, and Satoshi disabled the whole family of string manipulation opcodes in one sweep as a safety measure. OP_SUBSTR also has several edge cases that any implementation has to define:
Error conditions:
begin < 0 → undefined / error
begin > len(string) → error
begin + size > len(string) → error (out of bounds)
size < 0 → error
size = 0 → returns empty string 0x (arguably valid but weird)
The handling of these edge cases in the original implementation was not robust, adding another reason for disabling.
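The edge cases listed above can be made explicit in code. The sketch below follows this article's list, treating every out-of-range request as an error; it is not a reconstruction of the original client's behavior, and the function name `substr_checked` is hypothetical.

```python
def substr_checked(data: bytes, begin: int, size: int) -> bytes:
    """Extract data[begin:begin+size], rejecting every edge case listed above."""
    if begin < 0:
        raise ValueError("begin is negative")
    if size < 0:
        raise ValueError("size is negative")
    if begin > len(data):
        raise ValueError("begin is past the end of the string")
    if begin + size > len(data):
        raise ValueError("range extends past the end of the string")
    # size == 0 passes all checks and yields the empty byte string.
    return data[begin:begin + size]


for begin, size in [(1, 0), (-1, 1), (0, -1), (4, 0), (2, 2)]:
    try:
        print(begin, size, substr_checked(b"abc", begin, size))
    except ValueError as err:
        print(begin, size, "error:", err)
```

The checks are written out one by one because a bare Python slice would hide these cases: slices clamp out-of-range bounds and interpret negative indices as offsets from the end.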
Relationship to OP_LEFT and OP_RIGHT
OP_SUBSTR is strictly more general than OP_LEFT and OP_RIGHT (discussed next). Both OP_LEFT and OP_RIGHT are special cases of OP_SUBSTR:
OP_LEFT(string, n) ≡ OP_SUBSTR(string, 0, n)
OP_RIGHT(string, n) ≡ OP_SUBSTR(string, len(string)-n, n)
If OP_SUBSTR were re-enabled with proper bounds checking and size limits, OP_LEFT and OP_RIGHT would be redundant.
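The two equivalences can be checked with a small Python sketch. The helper names `substr`, `left` and `right` are illustrative stand-ins for the opcodes, and strict bounds checking is assumed.

```python
def substr(data: bytes, begin: int, size: int) -> bytes:
    """Stand-in for OP_SUBSTR with strict bounds checking (an assumption)."""
    if begin < 0 or size < 0 or begin + size > len(data):
        raise ValueError("requested range is out of bounds")
    return data[begin:begin + size]


def left(data: bytes, n: int) -> bytes:
    """OP_LEFT expressed as OP_SUBSTR(string, 0, n)."""
    return substr(data, 0, n)


def right(data: bytes, n: int) -> bytes:
    """OP_RIGHT expressed as OP_SUBSTR(string, len(string) - n, n)."""
    return substr(data, len(data) - n, n)


data = bytes.fromhex("DEADBEEFCAFE")
print(left(data, 2).hex())   # dead
print(right(data, 2).hex())  # cafe
```

Note that the OP_RIGHT form needs the length of the string, so a script would have to obtain it first (for example with OP_SIZE) before computing the begin index.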
Summary
OP_SUBSTR represents the ability to decompose on-stack byte arrays at arbitrary positions — a fundamental capability for any sophisticated data parsing within Script. Its absence, combined with the lack of OP_CAT, makes it impossible to both construct and deconstruct complex data structures natively in Bitcoin Script.