feat: Add Split-Solve-Reconcile workflow for cloud-based SMT solving#1096
feat: Add Split-Solve-Reconcile workflow for cloud-based SMT solving#1096olivier-aws wants to merge 5 commits intomainfrom
Conversation
Add a split-solve-reconcile (SSR) workflow that decouples VC generation from SMT solving, enabling parallel or cloud-based solver execution. New files: - Manifest.lean: manifest generation capturing obligation metadata - Reconcile.lean: reconciliation of solver results with manifest - StrataMain reconcileCommand: CLI entry point for reconciliation - Scripts/ssr_py.sh: helper script for the SSR workflow - docs/design/SplitSolveReconcile.md: design document Modified files: - Verifier.lean: add VerifyCallbacks, buildVCResult, thread callbacks through getObligationResult/verifySingleEnv/verify - SimpleAPI.lean: install manifest-emitting callbacks when --no-solve - StrataMain.lean: add reconcile command, use verifyWithManifest - README.md: document the SSR workflow
Shell-based test that validates the Split-Solve-Reconcile workflow end-to-end for Python test files. For each test_*.py with expected output, it: 1. Parses Python to Ion 2. Runs pyAnalyzeLaurel --no-solve to generate manifest + .smt2 files 3. Solves each .smt2 with cvc5 4. Runs strata reconcile to produce the final report 5. Compares DETAIL/RESULT summary lines against expected output Results: 202 passed, 0 failed, 14 skipped (known limitations and tests requiring non-default check modes).
|
@keyboardDrummer-bot Please update the PR description and the design doc according to the latest commits in this PR |
Rewrite SplitSolveReconcile.md to match the current implementation which embeds obligation metadata directly in .smt2 files via set-info directives instead of using a separate manifest.json file. Removes references to Manifest.lean, ManifestTest.lean, ReconcileTest.lean, and the JSON manifest format. Documents the set-info directives, the reconcile algorithm, buildVCResult as shared classification logic, and the rationale for the manifest-free approach.
|
Updated the PR description and design doc to reflect the current implementation. Key changes:
|
robin-aws
left a comment
There was a problem hiding this comment.
Overall I like this approach to make --no-solve a complete feature. Added lots of high-level feedback, will take a closer look at the implementation once addressed.
| parallelism — all queries can be dispatched simultaneously — but requires | ||
| decoupling the pipeline into three phases: |
There was a problem hiding this comment.
I like this phrasing of the motivation, but to nit-pick a bit, cloud-based solving could also be far more than just three phases. It could easily extend to a more general dynamically-constructed proof tree, with more iteration on proof goals.
I would just tweak this to point out that you need at least these three types of tasks, even if they aren't necessarily organized into three distinct phases (although they will be after this change)
|
|
||
| 1. **Generate** — run the Strata pipeline up to SMT file creation | ||
| 2. **Solve** — dispatch SMT queries to cloud solvers (external to Strata) | ||
| 3. **Reconcile** — read solver results and produce the final verification report |
There was a problem hiding this comment.
I don't love the term "reconcile" as it implies the guaranteed presence of some kind of conflicts. I would consider "report", "gather", "reduce" (as in Map/Reduce), "aggregate", etc.
| format, keeps each `.smt2` file self-contained, and avoids synchronization issues | ||
| between manifest and SMT files. | ||
|
|
||
| ### Phase 1: Generate (`--no-solve`) |
There was a problem hiding this comment.
Some maintainers are understandably concerned that supporting this option (and this new command that really makes it a complete end-to-end feature) will be limiting in the future, both because it limits the cloud-based solving to a simple three-phase approach (see also my first comment) and because it will be more and more challenging to support new features, if they have to be able to capture all necessary state in set-info commands.
Can we mark both --no-solve and this new command as experimental, and/or something we might remove/completely rework in the future?
| The user runs each `.smt2` file through a solver (locally, in the cloud, etc.) | ||
| and captures the solver's stdout into a corresponding `.result` file: |
There was a problem hiding this comment.
For files with resolved-sat and resolved-val directives, is it legal to not run them and not create a .result file? Or are they expected to create the equivalent .result file? Or do they HAVE to call the solver anyway?
| 2. For each `.smt2` file: | ||
| - Parses `set-info` directives to extract obligation metadata (`SMT2Meta`) | ||
| - Determines which checks were requested (satisfiability, validity) | ||
| - If evaluator-resolved: uses the stored verdict directly |
There was a problem hiding this comment.
Ah here's the answer. Should be mentioned in the description of Phase 2 directly.
This isn't a strict either-or is it? Isn't it possible to resolve the sat check but not the validity check, or vice-versa?
There was a problem hiding this comment.
Why doesn't this call ssr_py.sh?
| if grep -qE 'Known limitation|User error|Internal error' "$expected_file"; then | ||
| echo "SKIP (expected error): $base_name" |
There was a problem hiding this comment.
Not how we should handle such errors. We need to record specifically what's expected for specific test inputs, not just silently swallow errors.
| # ./run_py_ssr_test.sh [--filter <pattern>] [--solver <cmd>] | ||
| # ------------------------------------------------------------------------------ | ||
|
|
||
| set -u |
There was a problem hiding this comment.
Off-topic issue: I'm concerned with how many shell scripts we're adding to this repository. LLMs love spitting them out, but they are also very sloppy about setting the right safety flags. We're also giving up on a great deal of safety by dropping from Lean to bash, even if we don't prove much about this kind of logic.
There was a problem hiding this comment.
See comment below about bash scripts in general - this could easily be a StrataMain.lean command as well, even if it would need to shell-out for each phase.
Summary
Add a Split-Solve-Reconcile (SSR) workflow that decouples VC generation from SMT solving, enabling parallel or cloud-based solver execution.
Architecture
The key design decision is manifest-free reconciliation: instead of emitting a separate
manifest.json, all obligation metadata is embedded directly in the.smt2files via SMT-LIBset-infodirectives. The reconcile phase parses these directives to reconstruct obligation metadata, then pairs each.smt2with its corresponding.resultfile.Three-phase workflow
strata verify --no-solve --vc-directory ./vcs/runs the full pipeline (parse, transform, symbolic eval, SMT encoding) and writes.smt2files with embedded metadata..smt2file through a solver (locally, in parallel, or in the cloud) and captures stdout into a.resultfile.strata reconcile --vc-directory ./vcs/reads.smt2files for obligation metadata, pairs them with.resultfiles, and produces the final verification report.Embedded metadata (
set-infodirectives)Each
.smt2file now includes:(set-info :file "...")— source file path(set-info :start N)/(set-info :stop N)— source location range(set-info :final-message "...")— obligation label/message(set-info :property "...")— property type (assert,cover,divisionByZero,arithmeticOverflow)(set-info :resolved-sat "...")/(set-info :resolved-val "...")— evaluator-resolved results (when the evaluator already decided a check before the solver ran)(set-info :sat-message "...")/(set-info :unsat-message "...")— presence indicates which checks were requestedChanges
New files
Strata/Languages/Core/Reconcile.lean— Reconcile module: parsesset-infometadata from.smt2files, parses.resultfiles, buildsVCResults using the sharedbuildVCResultfunction, and merges results viamergeByAssertion.Scripts/ssr_py.sh— End-to-end SSR helper script for Python analysis (generate → solve → reconcile).StrataTest/Languages/Python/run_py_ssr_test.sh— Integration test validating the SSR workflow against all Python test files.docs/design/SplitSolveReconcile.md— Design document.Modified files
Strata/Languages/Core/Verifier.lean:buildVCResultfunction: single source of truth for turning(satResult, valResult)into a classifiedVCResult, used by both the integrated verifier and the reconcile path.encodeCore/dischargeObligation/getObligationResult: extended withproperty,resolvedSat,resolvedValparameters to emitset-infodirectives in.smt2files.verdictString/propertyString: convert solver results and property types to short strings forset-infoembedding.buildSolverLog,typedVarToSMTFn: made public for reuse by reconcile.getObligationResultrefactored to usebuildVCResult.verifySingleEnv: threads evaluator-resolved results (peSatResult?,peValResult?) through togetObligationResult.Strata/SimpleAPI.lean— Minor refactor ofCore.verifyProgram(do-notation).StrataMain.lean— NewreconcileCommandCLI entry point; added to the Core command group. Namespace fix forverify→Strata.verify.StrataTest/Languages/Core/Tests/SMTEncoderTests.lean— Updated expected output to include the new(set-info :property "assert")directive.README.md— Documents the SSR workflow with usage examples.Testing
set-infooutput.run_py_ssr_test.sh) validates the full workflow end-to-end for Python test files: 202 passed, 0 failed, 14 skipped.By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.