Skip to content

Perf: Branchless compare-and-swap in ScalarEmitter source generator #32

@jonathanpeppers

Description

@jonathanpeppers

Summary

Update ScalarEmitter.EmitSortMethod in the source generator to emit branchless Math.Min/Math.Max compare-and-swap patterns instead of the current branching if (a > b) swap pattern for supported numeric types.

Current Generated Code

The ScalarEmitter (line 42) currently emits:

if (e1 > e26) { byte temp = e1; e1 = e26; e26 = temp; }

This compiles to cmp + conditional branch (jg/jle). For random data, each branch has ~50% probability, which is worst case for branch predictors.

Proposed Generated Code

For types with Math.Min/Math.Max overloads, emit:

{ int t0 = System.Math.Min(e1, e26); int t1 = System.Math.Max(e1, e26); e1 = t0; e26 = t1; }

The .NET JIT lowers Math.Min/Math.Max to branchless cmov instructions on x64.

Scope of Change

File: SortingNetworks.Generators/ScalarEmitter.cs

Types that get branchless Min/Max: byte, sbyte, short, ushort, int, uint, long, ulong, float, double

Types that keep branching: char, nint, nuint (no direct Math.Min/Math.Max overloads), string, and custom IComparable<T> types

Not changed:

  • ApplyNetworkWithComparer (loop-based IComparer path) - unchanged
  • SIMD emitters (SimdX86Emitter, SimdArmEmitter) - already branchless
  • Network data fields - unchanged

Important Caveats - Benchmark Required

CMOV is not universally faster than branches for sorting networks:

  1. Data dependency chains: CMOV creates a serial dependency. Branches allow speculative execution, which can be faster when the predictor is accurate.
  2. Predictable inputs: For sorted/reversed/near-sorted data, the branch predictor quickly learns the pattern and branches win.
  3. CPU microarchitecture matters: CMOV latency varies (1-2 cycles on modern Intel/AMD), but branch misprediction penalty is ~12-20 cycles.

References

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions