TIP-854: Canonicalize calldata for signature-verification precompiles #854

@yanghang8612

tip: 854
title: Canonicalize calldata for signature-verification precompiles
author: yanghang8612@163.com
discussions-to: https://github.com/tronprotocol/tips/issues/854
status: Draft
type: Standards Track
category: VM
created: 2026-04-12

Simple Summary

Canonicalize the total calldata length of the two signature-verification precompiles (batchValidateSign at 0x...09, validateMultiSign at 0x...0a): reject calldata whose byte length does not match the positive-tail shape that the per-call energy cost already assumes. On rejection, the precompile's execute returns false with empty output. The invoking call frame, reachable through any of CALL / CALLTOKEN / STATICCALL / DELEGATECALL / CALLCODE, consumes its pre-allocated energy; the stack receives 0, memory receives no return data, and the outer transaction continues with its remaining budget intact.

Motivation

These two precompiles charge energy under a fixed positive-tail total-length assumption: the per-call energy cost is derived from a formula that treats the calldata as a static head followed by exactly N >= 1 equally sized tail slots. The current execution path does not enforce the same total-length predicate before decoding. Instead, the decoder follows whatever offsets the calldata supplies and zero-pads any missing bytes through Arrays.copyOfRange. The set of byte lengths the precompile currently accepts is therefore a superset of the lengths the pricing formula assumes, and many distinct byte strings can represent the same logical call. Three cases illustrate this: non-word-aligned calldata has its trailing sub-word bytes silently dropped at parse time; calldata no longer than the static head is zero-padded or treated as empty arrays; and calldata whose tail length does not decompose into an integer number of items still flows through the decoder.
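The gap can be illustrated with a minimal sketch. It assumes the priced item count is computed as (data.length / 32 - H) / I using Java's truncating integer division, with H = 5 and I = 6 for batchValidateSign as listed in the Specification. The class and method names are hypothetical and are not taken from the implementation. The second half shows the documented zero-padding behaviour of Arrays.copyOfRange when the requested range runs past the end of the source array.

```java
import java.util.Arrays;

final class PricingGapSketch {
    static final int W = 32; // word size in bytes

    // Hypothetical stand-in for the priced item count, (words - H) / I,
    // ASSUMING words = dataLength / W with truncating integer division.
    static int pricedItems(int dataLength, int h, int i) {
        return (dataLength / W - h) / i;
    }

    public static void main(String[] args) {
        // batchValidateSign: H = 5, I = 6. Only 352 and 544 have the exact
        // head-plus-N-items shape; the other lengths still price as one item.
        for (int len : new int[] {352, 353, 384, 543, 544}) {
            System.out.println(len + " bytes -> " + pricedItems(len, 5, 6) + " priced item(s)");
        }

        // Arrays.copyOfRange pads with zeros when 'to' exceeds the source length.
        byte[] shortInput = {0x11, 0x22};
        byte[] word = Arrays.copyOfRange(shortInput, 0, W);
        System.out.println(word.length + " bytes, last byte = " + word[W - 1]);
    }
}
```

Under that assumption every length from 352 through 543 bytes prices as a single item, even though only 352 bytes is exactly a head followed by one item.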

The effect is that the set of inputs these precompiles accept is larger than the documented interface suggests, which makes them harder to reason about for wallets, SDKs, indexers, audits, and formal specifications. This TIP closes the total-length gap by rejecting calldata whose byte length is outside the positive-tail predicate.

These two precompiles happen to use Solidity ABI encoding, but this TIP does not take general Solidity ABI canonicalisation as its reference. The reference is specifically the positive-tail total-length predicate that the existing energy-cost formula already assumes. Validation of inner dynamic offsets and full abi.encode conformance is intentionally out of scope.

Specification

Let W = 32. For each precompile, let H be the number of static head words it declares, and I the number of words consumed per priced tail item. H and I are exactly the offset and divisor already present in the per-call energy cost as (words - H) / I:

Precompile         H  I
validateMultiSign  5  5
batchValidateSign  5  6

After activation, at the top of each precompile's execute entry:

  • If data == null, or data.length % W != 0, or data.length <= H * W, or (data.length - H * W) % (I * W) != 0, execute returns false with empty output, without invoking the decoder and without performing any ecrecover. From the caller's perspective, the invoking call frame consumes its pre-allocated energy, the stack receives 0, memory receives no return data, and the outer transaction continues.
  • Otherwise, behaviour is identical to the current implementation: the existing decoder runs and follows the offsets supplied by calldata.
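The rejection rule above can be sketched as a single predicate. This is a minimal illustration rather than the reference implementation: the class name, the method name, and the idea of passing H and I as parameters are all assumptions made for the example.

```java
final class CalldataLengthCheck {
    static final int W = 32; // word size in bytes

    // Hypothetical helper: true when data has the positive-tail shape
    // H*W + N*I*W for some N >= 1, false for every rejected case in the rule.
    static boolean hasCanonicalLength(byte[] data, int h, int i) {
        if (data == null) {
            return false;                       // no calldata at all
        }
        int len = data.length;
        if (len % W != 0) {
            return false;                       // not word-aligned
        }
        if (len <= h * W) {
            return false;                       // header only or shorter
        }
        return (len - h * W) % (i * W) == 0;    // tail must be whole items
    }

    public static void main(String[] args) {
        // validateMultiSign uses H = 5, I = 5; batchValidateSign uses H = 5, I = 6.
        System.out.println(hasCanonicalLength(new byte[320], 5, 5)); // head + 1 item
        System.out.println(hasCanonicalLength(new byte[160], 5, 5)); // header only
        System.out.println(hasCanonicalLength(new byte[352], 5, 6)); // head + 1 item
        System.out.println(hasCanonicalLength(new byte[321], 5, 5)); // unaligned
    }
}
```

In this sketch a precompile's execute would return false with empty output whenever the helper returns false, and would otherwise fall through to the existing decoder unchanged.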

The per-call energy cost itself is unchanged, and its value on rejected calldata is not observable to the caller: the rejection path is a failed-execution return, so the runtime never reaches the success-branch refund, where the cost would otherwise be subtracted from the call's pre-allocated energy.

Rationale

The pricing formula has always assumed calldata is a static head followed by an integer number of equally-sized positive-tail slots. What this TIP fixes is that execute did not previously enforce the same total-length predicate at runtime. The check is deliberately restricted to exactly what that formula implies:

  • Non-multiple-of-32 length: the word-level parser silently drops trailing sub-word bytes, so such input cannot represent an integer number of header-plus-items words for any N.
  • H words or fewer: the static head is not followed by any priced tail item. This includes ABI-encoded empty arrays, which are intentionally rejected because they are the zero-cost boundary that can otherwise still drive decoder work through attacker-controlled offsets.
  • (data.length - H*W) mod (I*W) != 0: the tail cannot be priced as exactly N >= 1 items of size I words.

Validation of inner dynamic offsets, array lengths, per-element offsets, full-shape abi.encode conformance, and any further decoder hardening are out of scope for this TIP. They can be addressed in a follow-up if the community wants stronger containment.

As a side observation, the length check also closes one specific path through the decoder — inputs shorter than or equal to the static head — that can otherwise be zero-padded or interpreted as empty arrays. Length-valid inputs with malformed inner offsets can still cause the decoder to read non-canonical locations or dereference past the end of the parsed word array and are not addressed by this TIP.

Compatibility

This feature is gated behind a hardfork flag and constitutes a hard fork. Pre-activation behaviour, including the per-call energy cost, is byte-for-byte unchanged.

For any calldata whose byte length already satisfies data.length == H*W + I*W*N for some positive N (the positive-tail shape pricing has been assuming), the new rule is a no-op and execution proceeds into the existing decoder exactly as before. The inputs whose observable behaviour changes are those whose byte length is incompatible with that formula, plus header-only / empty-array calldata at data.length == H*W.
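For SDK or wallet authors who want to check which lengths remain accepted, the formula data.length == H*W + I*W*N can be evaluated directly. The sketch below is illustrative only; the class and method names are hypothetical.

```java
final class ConformingLengths {
    static final int W = 32; // word size in bytes

    // Byte length of calldata carrying exactly n priced items: H*W + I*W*n.
    static int conformingLength(int h, int i, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be positive");
        }
        return h * W + i * W * n;
    }

    public static void main(String[] args) {
        // H and I come from the table in the Specification.
        for (int n = 1; n <= 3; n++) {
            System.out.println("N=" + n
                + " validateMultiSign=" + conformingLength(5, 5, n)
                + " batchValidateSign=" + conformingLength(5, 6, n));
        }
    }
}
```

With the H and I values from the Specification, the first three conforming lengths are 320, 480, and 640 bytes for validateMultiSign and 352, 544, and 736 bytes for batchValidateSign. A 160-byte header-only input falls outside the set for both.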

Header-only / empty-array calldata changes from the legacy soft verification-failure path (execute succeeds with a zero word) to the new malformed-calldata path (execute returns false with empty output; the caller sees the inner call push 0, no return data copied, and the call's pre-allocated energy consumed). This is intentional because such input sits at the zero-cost boundary while still allowing decoder-controlled work in the legacy implementation.
