
feat: ATen-style ops dispatch, compiler, and full test suite #5

Open
booth-algo wants to merge 2 commits into kev/aten-testbench from kev/aten-ops

booth-algo commented Mar 31, 2026

Summary

Depends on: #4 (kev/aten-testbench)

ATen Operator Dispatch (plena/ops/)

  • OpRegistry with CPU (golden reference) and PLENA (ISA generation) backends
  • native_ops.yaml declarative op definitions
  • Registered ops: softmax · linear · rms_norm · layer_norm · ffn · flash_attention · conv2d · embedding_add · rope
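A two-backend registry like the one described above could be sketched roughly as follows. This is an illustration only, not the code in plena/ops/: the `register`/`dispatch` method names, the `"cpu"`/`"plena"` backend keys, and the emitted mnemonic are all assumptions made for the example.

```python
import math
from collections import defaultdict
from typing import Callable, Dict, List


class OpRegistry:
    """Toy registry mapping (op name, backend) to an implementation."""

    def __init__(self) -> None:
        self._impls: Dict[str, Dict[str, Callable]] = defaultdict(dict)

    def register(self, op: str, backend: str) -> Callable:
        """Decorator that records `fn` as the `backend` implementation of `op`."""
        def decorator(fn: Callable) -> Callable:
            self._impls[op][backend] = fn
            return fn
        return decorator

    def dispatch(self, op: str, backend: str, *args, **kwargs):
        """Look up and call the implementation for (op, backend)."""
        try:
            impl = self._impls[op][backend]
        except KeyError:
            raise NotImplementedError(
                f"no {backend!r} implementation for op {op!r}"
            ) from None
        return impl(*args, **kwargs)


registry = OpRegistry()


@registry.register("softmax", backend="cpu")
def softmax_cpu(row: List[float]) -> List[float]:
    # Golden reference: numerically stable softmax over one row.
    shift = max(row)
    exps = [math.exp(x - shift) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


@registry.register("softmax", backend="plena")
def softmax_plena(row_len: int) -> List[str]:
    # Placeholder text only; these are NOT real PLENA ISA mnemonics.
    return [f"; softmax over {row_len} elements", "HYPOTHETICAL_SOFTMAX v0, v0"]


reference = registry.dispatch("softmax", "cpu", [0.0, 0.0])
isa_lines = registry.dispatch("softmax", "plena", 4)
```

Keeping both backends behind the same op name is what lets a test compare the emulator's output for the generated ISA against the CPU result for the same op.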

ATen Compiler (plena/compiler/aten_compiler.py)

  • Traces nn.Module via torch.export, walks ATen graph
  • FFN fusion detection, residual save/restore pre-pass
  • Dispatches to PLENA ops backends to produce ISA code
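As a rough picture of what a fusion-detection pass over a traced graph involves, the sketch below runs on a toy graph in pure Python rather than on a real `torch.export` graph. The `Node` type, the linear → gelu → linear pattern, and the single-user check are assumptions for illustration; the PR does not show which pattern the compiler actually matches.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Node:
    name: str
    op: str
    inputs: Tuple[str, ...] = ()  # names of producer nodes (data inputs only)


def count_users(graph: List[Node]) -> Dict[str, int]:
    """Count how many nodes consume each node's output."""
    users = {n.name: 0 for n in graph}
    for n in graph:
        for src in n.inputs:
            if src in users:
                users[src] += 1
    return users


def fuse_ffn(graph: List[Node]) -> List[Node]:
    """Collapse linear -> gelu -> linear chains into a single 'ffn' node.

    `graph` must be in topological order. A chain is fused only when the
    intermediate results have exactly one user, so nothing else needs them.
    """
    by_name = {n.name: n for n in graph}
    users = count_users(graph)
    fused_away = set()
    replacements: Dict[str, Node] = {}
    for down in graph:
        if down.op != "linear" or not down.inputs:
            continue
        act = by_name.get(down.inputs[0])
        if act is None or act.op != "gelu" or not act.inputs or users[act.name] != 1:
            continue
        up = by_name.get(act.inputs[0])
        if up is None or up.op != "linear" or users[up.name] != 1:
            continue
        if up.name in replacements:
            continue  # already the tail of another fused chain
        fused_away.update({up.name, act.name})
        replacements[down.name] = Node(down.name, "ffn", up.inputs)
    return [replacements.get(n.name, n) for n in graph if n.name not in fused_away]


toy_graph = [
    Node("x", "placeholder"),
    Node("norm", "rms_norm", ("x",)),
    Node("up", "linear", ("norm",)),
    Node("act", "gelu", ("up",)),
    Node("down", "linear", ("act",)),
    Node("out", "add", ("x", "down")),
]
fused = fuse_ffn(toy_graph)
fused_ops = [n.op for n in fused]
```

After the pass, the three-node chain appears as one `ffn` node fed by `norm`, which a dispatcher could then hand to a fused backend op instead of three separate ones.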

Full Test Suite (18 recipes, all passing)

| Test | Match Rate |
|---|---|
| softmax, rms_norm, layer_norm, ffn, flash_attention, embedding_add | 100% |
| aten-compiler-{linear,rms_norm,ffn,layer_norm} | 100% |
| aten-compiler-decoder | 99% |
| linear, bmm | 93% |
| conv2d | 95% |
| conv2d-tiled | 93% |
| conv2d-siglip | 91% |
| model-builder | 8/8 |
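The PR does not say how the match rate is computed. One plausible reading, sketched here under that assumption, is the fraction of output elements that agree with the golden reference within a tolerance; the `match_rate` name and the `atol`/`rtol` defaults are hypothetical.

```python
from typing import Sequence


def match_rate(
    actual: Sequence[float],
    golden: Sequence[float],
    atol: float = 1e-2,
    rtol: float = 1e-2,
) -> float:
    """Fraction of elements with |actual - golden| <= atol + rtol * |golden|."""
    if len(actual) != len(golden):
        raise ValueError("actual and golden must have the same length")
    if not golden:
        return 1.0  # nothing to compare, so nothing mismatches
    hits = sum(
        1 for a, g in zip(actual, golden) if abs(a - g) <= atol + rtol * abs(g)
    )
    return hits / len(golden)


golden_out = [1.0, 2.0, 3.0, 4.0]
emulator_out = [1.0, 2.001, 3.5, 4.0]  # one element is clearly off
rate = match_rate(emulator_out, golden_out)
```

Under this definition, a rate below 100% for quantized ops such as linear or conv2d would reflect elements that fall outside the chosen tolerance rather than outright failures.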

Real-Model Tests

  • SmolLM2-135M decoder & FFN
  • CLM-60M FFN
  • LLaDA-8B decoder
  • SmolVLM2 vision encoder

Test plan

⚠️ The nix-build CI check fails because of a pre-existing DNS issue that is unrelated to this PR.

booth-algo and others added 2 commits March 31, 2026 14:42
…ler, SubMatrixManager)

3-layer compilation stack for generating PLENA ISA from high-level ops:
- PLENAProgram: tensor proxy API, HBM auto-allocation, scoped naming
- DeveloperCompiler: register allocation, ISA string emission, ASM templates
- SubMatrixManager: VRAM/MRAM/HBM memory layout, sub-block addressing
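To make "sub-block addressing" concrete, here is a minimal sketch that assumes a row-major matrix stored contiguously and split into square tiles. The `MatrixLayout` class, its fields, and the element-granular addresses are hypothetical; the actual SubMatrixManager layout for VRAM/MRAM/HBM is not described in this PR.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MatrixLayout:
    base: int  # address of element (0, 0), counted in elements
    rows: int
    cols: int
    tile: int  # edge length of one square sub-block

    def __post_init__(self) -> None:
        if self.rows % self.tile or self.cols % self.tile:
            raise ValueError("rows and cols must be multiples of tile")

    def sub_block_base(self, block_row: int, block_col: int) -> int:
        """Address of the top-left element of sub-block (block_row, block_col)."""
        if not (0 <= block_row < self.rows // self.tile):
            raise IndexError("block_row out of range")
        if not (0 <= block_col < self.cols // self.tile):
            raise IndexError("block_col out of range")
        # Skip whole tile-rows first, then whole tiles within the row.
        return self.base + block_row * self.tile * self.cols + block_col * self.tile

    def sub_block_row_addrs(self, block_row: int, block_col: int) -> List[int]:
        """Start address of each row inside the sub-block (stride = cols)."""
        start = self.sub_block_base(block_row, block_col)
        return [start + r * self.cols for r in range(self.tile)]


layout = MatrixLayout(base=1024, rows=128, cols=256, tile=64)
block_base = layout.sub_block_base(1, 2)
row_addrs = layout.sub_block_row_addrs(1, 2)
```

With a layout object like this, a code generator only needs the block indices to emit the load or store addresses for one tile.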

Supporting infrastructure:
- symbol_table, config_utils, emulator_runner, check_mem
- Rust emulator: new opcodes (V_SHFT_V, H_STORE_V_PART)
- justfile: test recipes for all operator tests
- CLAUDE.md: project context + CI check instructions

Co-Authored-By: Ziqian Gao <zg1223@ic.ac.uk>

ATen operator dispatch system (plena/ops/):
- OpRegistry with CPU (golden reference) and PLENA (ISA generation) backends
- native_ops.yaml declarative op definitions
- Registered ops: softmax, linear, rms_norm, layer_norm, ffn,
  flash_attention, conv2d, embedding_add, rope

ATen compiler (plena/compiler/aten_compiler.py):
- Traces nn.Module via torch.export, walks ATen graph
- FFN fusion detection, residual save/restore pre-pass
- Dispatches to PLENA ops backends to produce ISA code

Test suite (18 test recipes, all passing):
- Primitive op tests, ATen compiler e2e tests
- Real-model: SmolLM2-135M, CLM-60M, LLaDA-8B, SmolVLM2
- Model layer test builder with HF loading + MXFP8 quantization