OMC/ROADMAP.json at master · RandomCoder-lab/OMC · GitHub

{
  "$schema_version": "3.0",
  "project": "OMNIcode",
  "generated": "2026-05-15",
  "supersedes": "ROADMAP.json v2.0",
  "north_star": "A transformerless LLM whose attention, positional encoding, OOD gating, and (where the experiments support it) primary computation are built from harmonic primitives instead of softmax + sinusoidal PE + MLPs.",
  "description": "v3 reorganizes OMC's roadmap around the 6-layer architecture defined in README: substrate -> HBit -> JIT -> harmonic libs -> hybrid LLM experiments -> infrastructure. Each layer's next-steps are ranked by leverage toward the transformerless LLM north star. Items that don't move that needle are deferred or deleted. Today's session shipped 14 commits across Sessions A-H + Path A-D + a README reframe; the v3 backlog reflects the gap between current state and the LLM goal.",

  "session_summary_2026_05_15_part_2": {
    "shipped_this_session": [
      "Session A: omnimcode-codegen crate scaffolded; LLVM 18 JIT foundation",
      "Session B: locals + branches + loops + recursion in scalar lowerer",
      "Session C: <2 x i64> dual-band representation; matched-band parity tests",
      "Session D: Interpreter::set_jit_dispatch hook; jit_module dispatching",
      "Session D.5: omnimcode-cli extracted from core; OMC_HBIT_JIT=1 env var",
      "Session E: omc-bench harness; 272x measured speedup on factorial(12)",
      "Session F: phi_shadow(x) intrinsic for divergent beta",
      "Session G: omc_harmony() extern + harmony() builtin in JIT",
      "Session H: cross-fn calls in dual-band lowerer (3-phase jit_module)",
      "Path A.1: harmony-gated branch elision benched (95.2% reduction, 5-8% break-even)",
      "Path A.2: f64 support in scalar + dual-band lowerers",
      "Path A.3: bytecode VM bench (VM 2.1x over tree-walk; JIT 119x over VM)",
      "Path A.4: array reads (NewArray + ArrayLen + ArrayIndex)",
      "Path D: array writes (ArrSetNamed + ArrayIndexAssign)",
      "Path B: real-world JIT measurement on NSL-KDD (honest negative)",
      "Path C: README reframed around transformerless LLM goal"
    ],
    "tests_status": {
      "codegen_tests": "41/41 passing",
      "core_unit_tests": "149/149 passing",
      "omc_harmonic_lib_tests": "18/18 passing",
      "engine_parity": "44/45 byte-identical (benchmarks.omc has timing-only diff)"
    },
    "honest_findings": [
      "JIT works structurally but harmonic libs as currently written use dicts + string-keyed freq tables -> only 1/4 user fns JIT'd on NSL-KDD; wall-clock unchanged. Path forward: extend codegen with dict + string support, OR rewrite libs to use array-of-hashed-int.",
      "Cross-fn float passing requires explicit conversion at the fn boundary (compiler-side limitation: an untyped Op::Add applied to a float bit-pattern gives a wrong result).",
      "AVX-512 widening is blocked because there are no array-processing OMC fns to fill the wider lanes."
    ]
  },

  "layer_1_substrate": {
    "description": "log_phi_pi_fibonacci as THE base algorithm. 40-entry FIBONACCI table, nearest_attractor_with_dist as canonical lookup. Already shipped + audited.",
    "status": "MATURE",
    "next_steps": [
      {
        "id": "S1",
        "title": "Substrate-coherence metric for arbitrary float operations",
        "effort": "small",
        "priority": "low",
        "rationale": "Currently the substrate gates int operations cleanly. For floats, we have phi_fold but no built-in coherence reading on arbitrary float values. A `phi_coherence(f) -> [0,1000]` builtin would let user code reason about how on-grid a computed float is, useful for the LLM detector layers.",
        "prerequisites": [],
        "moves_toward_llm": "indirect — primitive useful for OOD gating in the model"
      }
    ]
  },

  "layer_2_hbit": {
    "description": "Dual-band executable form of substrate. alpha = classical, beta = phi-shadow. JIT-wired with phi_shadow + harmony intrinsics. Branch-elision via @predict pattern.",
    "status": "FUNCTIONAL — first end-to-end shipping; needs broader op coverage in dual-band mode",
    "next_steps": [
      {
        "id": "H1",
        "title": "Float ops in dual-band: AddFloat/SubFloat/MulFloat parallel-lane versions",
        "effort": "small",
        "priority": "medium",
        "rationale": "Already shipped in Path A.2. Sets the pattern for HBit on numerical workloads. Beta tracks alpha through float math via <2 x f64> bitcast.",
        "prerequisites": [],
        "status": "shipped this session"
      },
      {
        "id": "H2",
        "title": "Harmony reading on float values",
        "effort": "small",
        "priority": "medium",
        "rationale": "Today's harmony() reads alpha/beta as i64 (substrate-attractor metric). For float values, harmony should read float-coherence (phi_fold proximity in float space). New extern omc_harmony_float(alpha: f64, beta: f64) -> i64.",
        "prerequisites": ["S1"],
        "moves_toward_llm": "direct — float harmony is the OOD signal for activations in the model"
      },
      {
        "id": "H3",
        "title": "Predictive correction primitive: when harmony low, snap to nearest attractor",
        "effort": "medium",
        "priority": "medium",
        "rationale": "@predict in SL's HBit demos was framed as 'auto-corrects'. Today's @predict only branches on harmony; a `predict_correct(x)` builtin would, in JIT'd code, return either x (high harmony) or fold(x) (low harmony, snap to grid). This is the architectural primitive for the LLM's activation regularization.",
        "prerequisites": [],
        "moves_toward_llm": "direct — auto-correction of off-grid activations IS the regularization story"
      }
    ]
  },

  "layer_3_jit": {
    "description": "LLVM-backed dual-band JIT. 41 codegen tests passing; 272x measured speedup on the factorial(12) microbenchmark; 119x over the bytecode VM. Pure-int, array, and float code paths are covered.",
    "status": "FUNCTIONAL — coverage gaps identified by Path B",
    "next_steps": [
      {
        "id": "J1",
        "title": "Dict support in codegen",
        "effort": "large",
        "priority": "high",
        "rationale": "The single largest gap between 'JIT works' and 'JIT useful for shipped libraries'. harmonic_anomaly's hot path uses dict_set/dict_get; without dict codegen, the hot path falls back to tree-walk. Hash table representation in LLVM, key hashing via extern Rust call, bucket arrays + collision handling. Big task but unblocks every existing harmonic library.",
        "prerequisites": ["A.4 array reads", "A.2 floats"],
        "moves_toward_llm": "indirect — unblocks the harmonic libraries that prove substrate utility, which are the empirical warm-up to the LLM",
        "honest_estimate": "2-3 sessions"
      },
      {
        "id": "J2",
        "title": "String support in codegen",
        "effort": "large",
        "priority": "medium",
        "rationale": "Same story as J1, smaller scope. Needs heap allocation + pointer-based representation. String concat for dict keys. Could share infrastructure with arrays.",
        "prerequisites": [],
        "moves_toward_llm": "indirect — needed for the harmonic libraries' string-keyed dicts; LLM training itself is mostly numerical",
        "honest_estimate": "1-2 sessions"
      },
      {
        "id": "J3",
        "title": "Fallback-to-tree-walk for one builtin within an otherwise-JIT'd fn",
        "effort": "medium",
        "priority": "high",
        "rationale": "Currently if a fn uses ONE unsupported op (e.g., csv_parse, py_call), the whole fn falls back to tree-walk. A 'JIT body but call back into tree-walk for the one unsupported op' mechanism would let many real fns benefit even if they touch one builtin we don't yet cover.",
        "prerequisites": [],
        "moves_toward_llm": "indirect — same payoff axis as J1/J2"
      },
      {
        "id": "J4",
        "title": "Float-typed Div + comparison ops in bytecode compiler",
        "effort": "small",
        "priority": "high",
        "rationale": "Compiler-side limitation flagged in Path A.2. The compiler always emits plain Op::Div, so the JIT treats the float bit-pattern as an int and produces garbage. The compiler needs to emit DivFloat/EqFloat/etc. when both operands are statically typed as float (var_types tracking is already there).",
        "prerequisites": [],
        "moves_toward_llm": "direct — most LLM math is float division and float comparisons"
      },
      {
        "id": "J5",
        "title": "AVX-512 widening: <2 x i64> -> <8 x i64> for array-processing fns",
        "effort": "large",
        "priority": "low",
        "rationale": "Real SIMD payoff requires fns whose loop bodies process arrays in lockstep. With Path A.4 array reads shipped, a vectorized inner-loop pass could emit <8 x i64> ops. Useful but not on the critical path to the LLM (model compute is float-heavy, not int-heavy).",
        "prerequisites": ["array writes (Path D, shipped)"],
        "moves_toward_llm": "indirect"
      },
      {
        "id": "J6",
        "title": "@hbit pragma plumbing: opt-in JIT instead of auto-try-everything",
        "effort": "small",
        "priority": "low",
        "rationale": "Today JIT auto-tries every user fn. Wiring @hbit as the explicit opt-in matches SL's design and gives users a way to debug-mode-disable JIT for specific fns without OMC_HBIT_JIT=0 globally. Mostly a parser pragma + jit_module filter.",
        "prerequisites": [],
        "moves_toward_llm": "minor — code-organization concern"
      }
    ]
  },

  "layer_4_harmonic_libraries": {
    "description": "Substrate-aligned ML libraries. Beat sklearn 10/10 vs 7/10 on credential stuffing. Mixed/honest results on volumetric NSL-KDD. The empirical proof that substrate has utility on real data.",
    "status": "FUNCTIONAL — 3 libraries published; need refactor for JIT compatibility",
    "next_steps": [
      {
        "id": "L1",
        "title": "Rewrite harmonic_anomaly to use array-of-hashed-int instead of dict-of-string-key",
        "effort": "medium",
        "priority": "high",
        "rationale": "The Option-2 fix from docs/jit_real_world.md. Replaces freq_dict[concat_many('', bucket)] = count with parallel arrays: freq_keys[i] = hash(bucket), freq_counts[i] = count. About half a session of library refactoring. Expected reward: harmonic_anomaly's hot path JITs end-to-end, for a projected ~250x speedup on NSL-KDD wall-clock (not yet measured).",
        "prerequisites": ["array writes (Path D, shipped)"],
        "moves_toward_llm": "indirect — proves the JIT is real on a real workload, sets the pattern for future substrate-aligned libs",
        "alternative": "Implement J1 (dict codegen) instead and keep the library unchanged"
      },
      {
        "id": "L2",
        "title": "Time-aware harmonic for NAB (revisit)",
        "effort": "medium",
        "priority": "low",
        "rationale": "v2 ROADMAP item, still relevant. The current NAB result is 7/19, tied with IF (naive-baseline tier). Beating IF requires CUSUM/seasonality/HMM-style components. This is not a substrate question; the substrate works fine on the sample time-series data.",
        "prerequisites": [],
        "moves_toward_llm": "minor"
      }
    ]
  },

  "layer_5_hybrid_llm_experiments": {
    "description": "10 experiments in experiments/hybrid_llm/ measuring where harmonic primitives win and lose vs transformer components. Headline wins: HBit cross-cutting tension AUROC 1.0 (exp 5), compression-gate 34x compression (exp 6). Headline losses: OmniWeight loses to softmax on perturbed-query recovery (exp 1), multi-channel harmonic PE loses to sinusoidal PE at L>=16 (exp 3).",
    "status": "FOUNDATIONAL — empirical record exists; gap to LLM is finding harmonic primitives that win on PRIMARY computation paths",
    "next_steps": [
      {
        "id": "E1",
        "title": "Look for harmonic-attention primitives that beat softmax on a real task",
        "effort": "very_large",
        "priority": "highest",
        "rationale": "This is THE blocker for a transformerless LLM. Experiment 1 showed that simple OmniWeight loses to softmax on perturbed-query recovery. The negative finding doesn't say that no harmonic attention works; it says THAT one doesn't. Possible directions: (a) attention with per-head harmonic gating on top of softmax (auxiliary, per the 'detector' read), (b) attention with a phi-fold-aware similarity that is NOT just |q-k| (e.g., harmonic-distance), (c) hybrid attention that uses softmax for the hot-path scoring + HBit tension as a residual gate.",
        "prerequisites": [],
        "moves_toward_llm": "DIRECT and CRITICAL — this is the primary gap"
      },
      {
        "id": "E2",
        "title": "Look for harmonic positional encoding that wins at L>=16",
        "effort": "large",
        "priority": "highest",
        "rationale": "Same architectural pattern as E1. Experiment 3 showed that multi-channel phi-fold PE saturates at 22 unique vectors by L=64, while sinusoidal PE stays distinct up to L=64. The harmonic side needs more dimensions of distinctness or a smarter encoding (e.g., (phi-fold mod p1, phi-fold mod p2, ...) for relatively prime p1, p2, in a harmonic CRT style).",
        "prerequisites": [],
        "moves_toward_llm": "DIRECT — without distinct PE, no sequence model"
      },
      {
        "id": "E3",
        "title": "Train a small hybrid model end-to-end (transformer + HBit OOD gate)",
        "effort": "very_large",
        "priority": "high",
        "rationale": "The 'detector' framing the experiments converged on. Use embedded Python (torch.omc) + HBit tension gate. Training loop in OMC, model parameters in PyTorch. Endpoint: a model that USES HBit for runtime OOD detection during inference. Even before E1/E2 land, this proves the substrate has a place in real LLMs.",
        "prerequisites": ["torch.omc improvements"],
        "moves_toward_llm": "DIRECT — first real model that has substrate-routed components"
      },
      {
        "id": "E4",
        "title": "Compression-gate model at scale (extension of exp 6)",
        "effort": "large",
        "priority": "medium",
        "rationale": "Exp 6 showed 34x compression on a toy. The same library + chain-of-keys structure should compress 9 orders of magnitude at LLM scale (extrapolated). Test on a real medium-vocabulary task: tokenizer-output -> chain-of-keys -> fine-grained predictions. If compression+death-tolerance hold at scale, this is a real architectural primitive.",
        "prerequisites": [],
        "moves_toward_llm": "DIRECT — this might be the substrate's actual home in LLM architecture"
      },
      {
        "id": "E5",
        "title": "Re-validate experiments 0-9 under the substrate-fill",
        "effort": "small",
        "priority": "medium",
        "rationale": "The experiments were authored on the post-substrate-refactor branch. After the substrate fill-in (libraries routed through log_phi_pi_fibonacci end-to-end), some experiment numbers may shift. Worth checking if any negative findings flip.",
        "prerequisites": [],
        "moves_toward_llm": "minor — sanity check"
      }
    ]
  },

  "layer_6_infrastructure": {
    "description": "Self-hosting compiler, package manager, embedded CPython, two-engine parity, LSP, WASM. Plumbing.",
    "status": "MATURE",
    "next_steps": [
      {
        "id": "I1",
        "title": "VM hot-path optimization: re-investigate Op::ArrayIndex inlining + vm_fast_dispatch",
        "effort": "medium",
        "priority": "low",
        "rationale": "Path A.3 measured VM at 2.1x over tree-walk. Memory note says vm_call_builtin's synthetic-arg shim was the prior bottleneck and vm_fast_dispatch + Op::ArrayIndex inlining is the fix. Worth revisiting now that JIT is in. VM still has a niche (no LLVM dependency, lighter binary).",
        "prerequisites": [],
        "moves_toward_llm": "minor"
      },
      {
        "id": "I2",
        "title": "LSP improvements: hover for harmonic primitives showing attractor lookup",
        "effort": "small",
        "priority": "low",
        "rationale": "When the user hovers over a numeric literal in their editor, show 'value=89, on attractor (89), resonance=1.0' or 'value=100, nearest=89 dist 11, resonance=0.083'. Makes the substrate visible while editing.",
        "prerequisites": [],
        "moves_toward_llm": "minor — author productivity"
      },
      {
        "id": "I3",
        "title": "CI: build + test matrix for --features llvm-jit (currently manual)",
        "effort": "small",
        "priority": "medium",
        "rationale": "GitHub Actions config that builds with and without llvm-jit, runs the test suite in both modes. Catches regressions in either path. Bonus: also build the WASM and LSP targets.",
        "prerequisites": [],
        "moves_toward_llm": "minor"
      }
    ]
  },

  "deprecated_or_deleted_from_v2": [
    "Items framed around 'OMC as ML library' standalone — superseded by the transformerless LLM north star. The harmonic libraries remain valuable as evidence the substrate works, but they're not the project's purpose.",
    "Items framed around competing with sklearn/scipy generally — OMC competes by being the substrate for a different kind of model, not a faster sklearn."
  ],

  "principles_v3": [
    "Harmonic primitives need to win on PRIMARY computation paths to enable a transformerless LLM. Detector-only wins (HBit AUROC 1.0) are necessary but not sufficient.",
    "JIT speed is real (272x measured) but doesn't apply to libraries that use dicts/strings. Either fix the libs or fix the JIT.",
    "Substrate purity > benchmark numbers. Phase 2 of substrate fill-in cost us the K=500 NSL-KDD win for architectural completeness; that was the right trade.",
    "Honest negative results compound. Experiment 1 (OmniWeight loses to softmax) didn't kill the project — it told us where to look next.",
    "Build the model in pieces, each empirically validated. Don't aim for a complete transformerless LLM in one shot."
  ],

  "anti_goals_v3": [
    "Don't make the harmonic libraries the product. They're proof points.",
    "Don't compete with PyTorch on transformer training. Use it for the parts that work; build harmonic for the parts where the experiments validate substitution.",
    "Don't pretend the JIT is the architecture. The JIT is performance plumbing for the architecture.",
    "Don't scope-creep into general programming language features. OMC is a substrate for a model; if a feature doesn't move that needle, defer it.",
    "Don't ship a 'transformerless LLM' that is secretly just a hybrid. If the model uses softmax attention, say so."
  ]
}