-
✓ RSR template with full CI/CD (17 workflows)
-
✓ Rust CLI with subcommands (init, validate, generate, build, run, info)
-
✓ Manifest parser (`halideiser.toml`)
-
✓ Codegen stubs
-
✓ Idris2 ABI module stubs (Types, Layout, Foreign)
-
✓ Zig FFI bridge stubs
-
✓ README with architecture
-
❏ Define `halideiser.toml` schema for pipeline stages (blur, sharpen, resize, convolve, etc.)
-
❏ Parse buffer dimensions (width, height, channels, frames)
-
❏ Parse data types (uint8, uint16, float32, float64)
-
❏ Validate stage connectivity — output dimensions match next stage’s input
-
❏ Parse hardware target declarations (cpu, gpu, wasm)
-
❏ Parse scheduling hints (tile sizes, parallelism, vectorisation width)
-
❏ Error diagnostics with source spans pointing into TOML
-
❏ Emit `Func` definitions from pipeline stages
-
❏ Emit `Var` bindings (x, y, c for spatial + channel dimensions)
-
❏ Generate `Expr` trees from stage operations (clamp, cast, select, lerp)
-
❏ Map common operations: Gaussian blur, box filter, Sobel, resize (bilinear/bicubic)
-
❏ Support multi-stage pipelines with `Func` chaining
-
❏ Generate `Buffer<>` declarations matching input/output dimensions
-
❏ Emit `BoundaryConditions` (repeat_edge, constant_exterior, mirror)
-
❏ C++ output for Halide AOT compilation
-
❏ Default schedule heuristics per operation type
-
❏ `tile(x, y, xi, yi, tx, ty)` with configurable tile sizes
-
❏ `vectorize(xi, width)` for SIMD targets (SSE4=4, AVX2=8, AVX-512=16, NEON=4)
-
❏ `parallelize(y)` for multi-core CPU targets
-
❏ `compute_at`/`store_at` fusion for multi-stage pipelines
-
❏ `reorder` loop variables for cache-optimal traversal
-
❏ `gpu_blocks`/`gpu_threads` mapping for CUDA/OpenCL targets
-
❏ Auto-tuning loop: measure, perturb schedule, measure again
-
❏ Beam search or genetic algorithm over schedule space
-
❏ Cache tuning results per (pipeline, hardware) pair
-
❏ x86 SSE4.2 + AVX2 backend via Halide `Target`
-
❏ x86 AVX-512 backend
-
❏ ARM NEON backend (mobile, Raspberry Pi)
-
❏ ARM SVE backend (server ARM)
-
❏ CUDA backend (NVIDIA GPU)
-
❏ OpenCL backend (cross-vendor GPU)
-
❏ WebAssembly backend (browser deployment)
-
❏ Metal backend (Apple GPU)
-
❏ Cross-compilation from any host to any target
-
❏ Fat binary generation (multiple targets in one artifact)
-
❏ Prove buffer dimension compatibility between stages
-
❏ Prove output buffer bounds from input dimensions + operations
-
❏ Prove schedule preserves algorithm semantics (tiling does not change results)
-
❏ Prove vectorisation width divides tile dimension
-
❏ Prove `compute_at`/`store_at` do not introduce data races
-
❏ Prove boundary conditions handle all edge pixels
-
❏ Dependent types for buffer_t layout (stride, extent, min)
-
❏ PanLL panel: pipeline visualisation and schedule explorer
-
❏ BoJ-server cartridge for remote pipeline compilation
-
❏ VeriSimDB backing store for tuning results
-
❏ Zig FFI bridge: call compiled pipelines from any language via C ABI
-
❏ Example gallery: common image/video pipelines with benchmarks
-
❏ Publish to crates.io
-
❏ Integration with iseriser meta-framework
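
Two of the checks listed above (stage connectivity, and the vectorisation width dividing the tile dimension) could be sketched in Rust roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `Dims`, `Stage`, `check_connectivity`, and `vector_width_divides_tile` are hypothetical names, and the fields do not reflect a finalised `halideiser.toml` schema. These are runtime checks only; the roadmap's "Prove" items would need a separate formal treatment.

```rust
/// Hypothetical buffer shape; field names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Dims {
    pub width: u32,
    pub height: u32,
    pub channels: u32,
}

/// Hypothetical pipeline stage with declared input and output shapes.
#[derive(Debug, Clone)]
pub struct Stage {
    pub name: String,
    pub input: Dims,
    pub output: Dims,
}

/// Walks adjacent stage pairs and reports the first pair whose
/// output shape differs from the next stage's input shape.
pub fn check_connectivity(stages: &[Stage]) -> Result<(), String> {
    for pair in stages.windows(2) {
        let (prev, next) = (&pair[0], &pair[1]);
        if prev.output != next.input {
            return Err(format!(
                "stage '{}' outputs {:?} but stage '{}' expects {:?}",
                prev.name, prev.output, next.name, next.input
            ));
        }
    }
    Ok(())
}

/// True when a non-zero vector width evenly divides the tile extent.
pub fn vector_width_divides_tile(tile: u32, width: u32) -> bool {
    width != 0 && tile % width == 0
}
```

An empty or single-stage pipeline passes the connectivity check trivially, since there are no adjacent pairs to compare. Returning a `String` error keeps the sketch short; diagnostics with TOML source spans would likely need a richer error type.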