pyAnalyzeLaurel: classify UserError diagnostics as known limitations#1118
pyAnalyzeLaurel: classify UserError diagnostics as known limitations#1118julesmt wants to merge 2 commits intostrata-org:mainfrom
Conversation
When Laurel-to-Core translation fails, pyAnalyzeLaurel previously reported every failure as 'RESULT: Internal error' (exit 3), including legitimate user-facing limitations such as 'calls to procedures are not supported in functions or contracts'. These are UserError-level diagnostics; reporting them as internal errors conflates genuine compiler bugs with 'not yet supported' constructs. Classify by diagnostic type: only StrataBug triggers 'Internal error'. UserError and NotYetImplemented trigger 'Known limitation' (exit 4). If the batch contains both a StrataBug and a UserError, the Strata bug takes priority so real compiler issues are not hidden.
tautschnig
left a comment
There was a problem hiding this comment.
The direction is right — lumping user-facing limitations under Internal error / exit 3 was genuinely confusing, and downgrading NotYetImplemented to Known limitation / exit 4 is an improvement. No prior review threads or issue comments on this PR. The change is small, self-contained, and doesn't touch any other paths. My concerns are all about the classification design and test coverage; I'll call them out below.
1. UserError is collapsed into Known limitation, but the project's exit-code semantics say otherwise. The header comment at StrataMain.lean:55–60 is explicit:
Codes 1–2 are user-actionable (fix the input or the code under analysis).
Codes 3–4 are tool-side (report as a bug or wait for support).
So by the module's own rule:
| Diagnostic type | Intended exit class | This PR |
|---|---|---|
StrataBug |
tool-side / internalError (3) |
✅ internalError |
NotYetImplemented |
tool-side / knownLimitation (4) |
✅ knownLimitation |
UserError |
user-actionable / userError (1) |
knownLimitation (4) |
Warning |
n/a (shouldn't cause failure) | knownLimitation (4) |
The PR's framing argues that existing diagnostics labelled UserError are often "not supported" messages (e.g. PrimitiveOp {op} with {args.length} args is not supported at LaurelToCoreTranslator.lean:224, which is DiagnosticType.UserError). That's true — the diagnostic labelling in the translator is fuzzy. But the right place to fix that fuzziness is the diagnostic labels, not the classifier here. Otherwise a genuine user error (like "calls to procedures are not supported in functions or contracts" that the PR body cites) will land with exit 4 ("tool-side, wait for support") when it should land with exit 1 ("fix your code").
Concrete suggestion: priority-order the severities, matching the Python → Laurel path's tri-fold classification a few lines above (637–640):
let severity :=
if laurelTranslateErrors.any (·.type == .StrataBug) then .StrataBug
else if laurelTranslateErrors.any (·.type == .UserError) then .UserError
else if laurelTranslateErrors.any (·.type == .NotYetImplemented) then .NotYetImplemented
else .Warning -- only reachable if translation returned `none` with warnings-only, which is itself a bug
match severity with
| .StrataBug => exitPyAnalyzeInternalError msg
| .UserError => exitPyAnalyzeUserError msg
| .NotYetImplemented => exitPyAnalyzeKnownLimitation msg
| .Warning => exitPyAnalyzeInternalError msg -- failure with no real diagnosticTakes the same worst-wins shape as the current hasStrataBug check but extends it to the full enum. Addresses the conflation without changing the spirit of the PR.
If the diagnostic labels in the translator really are inaccurate (e.g. .UserError used where .NotYetImplemented would be more honest), that's a separate, prior bug; fixing the labels at the call sites is the right scope. Happy to suggest a follow-up PR that relabels the specific "not supported yet" sites.
2. No tests. This is a behavioural change to exit codes and the RESULT: line — exactly the class of change that shell-based regression tests catch. The repo has precedent for these: StrataTest/Languages/Python/expected_laurel/*.expected snapshots pin down the DETAIL:/RESULT: lines for cases that hit exitPyAnalyzeUserError, and adding one more snapshot file plus a tiny Python fixture that reproducibly triggers DiagnosticType.UserError in Laurel-to-Core would nail down the new behaviour. Two more fixtures:
- A
StrataBugfixture (e.g. a Laurel program that triggers.prim .realtranslation atLaurelToCoreTranslator.lean:107which emitsDiagnosticType.StrataBug) — expectRESULT: Internal error, exit 3. This is the "regression guard": if the classifier ever flips to lump StrataBug into limitations, this test catches it. - A
NotYetImplementedfixture — expectRESULT: Known limitation, exit 4. - A mixed fixture (same run emits both
StrataBugandUserError) — expectRESULT: Internal error, exit 3 (StrataBug wins). This is the only test that actually exercises thehasStrataBugpriority logic.
Without those three, a future "simplify" refactor could silently flip the classification.
3. Edge cases currently unhandled.
- Empty
laurelTranslateErrorswithcoreProgramOption = none. This shouldn't happen (how does translation fail with no diagnostics?) — but if it does,any (· == .StrataBug)isfalse, so the PR classifies it asKnown limitation. That's the wrong classification for a "shouldn't happen" situation; it should beInternal errorsince anone-without-diagnostics is itself a Strata bug. Trivial to add:if laurelTranslateErrors.isEmpty || hasStrataBug then internal else limitation. - Warnings-only failure. Same concern: a list of only
.Warningdiagnostics withcoreProgramOption = noneimplies translation failed silently.
The severity-match suggestion above handles both.
4. Message formatting. s!"Laurel to Core translation failed: {laurelTranslateErrors}" dumps the full List DiagnosticModel via ToString. In an Internal error context that's fine (we want raw diagnostics for debugging). In a Known limitation context, a user-facing DETAIL: line should probably render the diagnostics through DiagnosticModel.format against the known fileMap, the way Python → Laurel user errors at line 637 do. Not blocking; worth a follow-up.
5. Refactoring. The Python → Laurel block at 637–640 and this new block both classify errors; the logic isn't quite the same shape (one matches a PipelineError sum, the other walks a diagnostic list), but a shared helper classifyAndExit : List DiagnosticModel → String → IO α would let both sites (and any future third site) stay in lockstep. Again, follow-up-able.
Co-authored-by: Michael Tautschnig <mt@debian.org>
When Laurel-to-Core translation fails, pyAnalyzeLaurel previously reported every failure as 'RESULT: Internal error' (exit 3), including legitimate user-facing limitations such as 'calls to procedures are not supported in functions or contracts'. These are UserError-level diagnostics; reporting them as internal errors conflates genuine compiler bugs with 'not yet supported' constructs.
Classify by diagnostic type: only StrataBug triggers 'Internal error'. UserError and NotYetImplemented trigger 'Known limitation' (exit 4). If the batch contains both a StrataBug and a UserError, the Strata bug takes priority so real compiler issues are not hidden.
Issue #, if available:
Description of changes:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.