Skip to content

feat: Add Split-Solve-Reconcile workflow for cloud-based SMT solving#1096

Draft
olivier-aws wants to merge 5 commits intomainfrom
SSR_workflow
Draft

feat: Add Split-Solve-Reconcile workflow for cloud-based SMT solving#1096
olivier-aws wants to merge 5 commits intomainfrom
SSR_workflow

Conversation

@olivier-aws
Copy link
Copy Markdown
Contributor

@olivier-aws olivier-aws commented May 1, 2026

Summary

Add a Split-Solve-Reconcile (SSR) workflow that decouples VC generation from SMT solving, enabling parallel or cloud-based solver execution.

Architecture

The key design decision is manifest-free reconciliation: instead of emitting a separate manifest.json, all obligation metadata is embedded directly in the .smt2 files via SMT-LIB set-info directives. The reconcile phase parses these directives to reconstruct obligation metadata, then pairs each .smt2 with its corresponding .result file.

Three-phase workflow

  1. Generatestrata verify --no-solve --vc-directory ./vcs/ runs the full pipeline (parse, transform, symbolic eval, SMT encoding) and writes .smt2 files with embedded metadata.
  2. Solve — The user runs each .smt2 file through a solver (locally, in parallel, or in the cloud) and captures stdout into a .result file.
  3. Reconcilestrata reconcile --vc-directory ./vcs/ reads .smt2 files for obligation metadata, pairs them with .result files, and produces the final verification report.

Embedded metadata (set-info directives)

Each .smt2 file now includes:

  • (set-info :file "...") — source file path
  • (set-info :start N) / (set-info :stop N) — source location range
  • (set-info :final-message "...") — obligation label/message
  • (set-info :property "...") — property type (assert, cover, divisionByZero, arithmeticOverflow)
  • (set-info :resolved-sat "...") / (set-info :resolved-val "...") — evaluator-resolved results (when the evaluator already decided a check before the solver ran)
  • (set-info :sat-message "...") / (set-info :unsat-message "...") — presence indicates which checks were requested

Changes

New files

  • Strata/Languages/Core/Reconcile.lean — Reconcile module: parses set-info metadata from .smt2 files, parses .result files, builds VCResults using the shared buildVCResult function, and merges results via mergeByAssertion.
  • Scripts/ssr_py.sh — End-to-end SSR helper script for Python analysis (generate → solve → reconcile).
  • StrataTest/Languages/Python/run_py_ssr_test.sh — Integration test validating the SSR workflow against all Python test files.
  • docs/design/SplitSolveReconcile.md — Design document.

Modified files

  • Strata/Languages/Core/Verifier.lean:
    • New buildVCResult function: single source of truth for turning (satResult, valResult) into a classified VCResult, used by both the integrated verifier and the reconcile path.
    • encodeCore / dischargeObligation / getObligationResult: extended with property, resolvedSat, resolvedVal parameters to emit set-info directives in .smt2 files.
    • verdictString / propertyString: convert solver results and property types to short strings for set-info embedding.
    • buildSolverLog, typedVarToSMTFn: made public for reuse by reconcile.
    • getObligationResult refactored to use buildVCResult.
    • verifySingleEnv: threads evaluator-resolved results (peSatResult?, peValResult?) through to getObligationResult.
  • Strata/SimpleAPI.lean — Minor refactor of Core.verifyProgram (do-notation).
  • StrataMain.lean — New reconcileCommand CLI entry point; added to the Core command group. Namespace fix for verifyStrata.verify.
  • StrataTest/Languages/Core/Tests/SMTEncoderTests.lean — Updated expected output to include the new (set-info :property "assert") directive.
  • README.md — Documents the SSR workflow with usage examples.

Testing

  • Existing SMT encoder tests updated for new set-info output.
  • Shell-based SSR integration test (run_py_ssr_test.sh) validates the full workflow end-to-end for Python test files: 202 passed, 0 failed, 14 skipped.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Add a split-solve-reconcile (SSR) workflow that decouples VC generation
from SMT solving, enabling parallel or cloud-based solver execution.

New files:
- Manifest.lean: manifest generation capturing obligation metadata
- Reconcile.lean: reconciliation of solver results with manifest
- StrataMain reconcileCommand: CLI entry point for reconciliation
- Scripts/ssr_py.sh: helper script for the SSR workflow
- docs/design/SplitSolveReconcile.md: design document

Modified files:
- Verifier.lean: add VerifyCallbacks, buildVCResult, thread callbacks
  through getObligationResult/verifySingleEnv/verify
- SimpleAPI.lean: install manifest-emitting callbacks when --no-solve
- StrataMain.lean: add reconcile command, use verifyWithManifest
- README.md: document the SSR workflow
@github-actions github-actions Bot added the Core label May 1, 2026
Shell-based test that validates the Split-Solve-Reconcile workflow
end-to-end for Python test files. For each test_*.py with expected
output, it:
1. Parses Python to Ion
2. Runs pyAnalyzeLaurel --no-solve to generate manifest + .smt2 files
3. Solves each .smt2 with cvc5
4. Runs strata reconcile to produce the final report
5. Compares DETAIL/RESULT summary lines against expected output

Results: 202 passed, 0 failed, 14 skipped (known limitations and
tests requiring non-default check modes).
@github-actions github-actions Bot added the Python label May 2, 2026
@olivier-aws
Copy link
Copy Markdown
Contributor Author

@keyboardDrummer-bot Please update the PR description and the design doc according to the latest commits in this PR

Rewrite SplitSolveReconcile.md to match the current implementation which
embeds obligation metadata directly in .smt2 files via set-info directives
instead of using a separate manifest.json file.

Removes references to Manifest.lean, ManifestTest.lean, ReconcileTest.lean,
and the JSON manifest format. Documents the set-info directives, the
reconcile algorithm, buildVCResult as shared classification logic, and
the rationale for the manifest-free approach.
@keyboardDrummer-bot
Copy link
Copy Markdown
Collaborator

Updated the PR description and design doc to reflect the current implementation.

Key changes:

  • PR description: Rewritten to describe the manifest-free architecture. Documents the set-info directives embedded in .smt2 files, the buildVCResult shared classification function, and the actual file-level changes (no more Manifest.lean, ManifestTest.lean, or ReconcileTest.lean).

  • Design doc (docs/design/SplitSolveReconcile.md): Rewritten from scratch to match the current implementation:

    • Removed the entire manifest JSON format section and all references to manifest.json
    • Documented the set-info directive approach with a table of all directives
    • Updated the reconcile algorithm to show how metadata is parsed from .smt2 files
    • Updated implementation section to describe Reconcile.lean, buildVCResult, and the CLI command
    • Updated design decisions to explain why set-info embedding was chosen over a separate manifest
    • Updated risks/mitigations and future extensions accordingly

Copy link
Copy Markdown
Contributor

@robin-aws robin-aws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I like this approach to make --no-solve a complete feature. Added lots of high-level feedback, will take a closer look at the implementation once addressed.

Comment on lines +8 to +9
parallelism — all queries can be dispatched simultaneously — but requires
decoupling the pipeline into three phases:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this phrasing of the motivation, but to nit-pick a bit, cloud-based solving could also be far more than just three phases. It could easily extend to a more general dynamically-constructed proof tree, with more iteration on proof goals.

I would just tweak this to point out that you need at least these three types of tasks, even if they aren't necessarily organized into three distinct phases (although they will be after this change)


1. **Generate** — run the Strata pipeline up to SMT file creation
2. **Solve** — dispatch SMT queries to cloud solvers (external to Strata)
3. **Reconcile** — read solver results and produce the final verification report
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love the term "reconcile" as it implies the guaranteed presence of some kind of conflicts. I would consider "report", "gather", "reduce" (as in Map/Reduce), "aggregate", etc.

format, keeps each `.smt2` file self-contained, and avoids synchronization issues
between manifest and SMT files.

### Phase 1: Generate (`--no-solve`)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some maintainers are understandably concerned that supporting this option (and this new command that really makes it a complete end-to-end feature) will be limiting in the future, both because it limits the cloud-based solving to a simple three-phase approach (see also my first comment) and because it will be more and more challenging to support new features, if they have to be able to capture all necessary state in set-info commands.

Can we mark both --no-solve and this new command as experimental, and/or something we might remove/completely rework in the future?

Comment on lines +66 to +67
The user runs each `.smt2` file through a solver (locally, in the cloud, etc.)
and captures the solver's stdout into a corresponding `.result` file:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For files with resolved-sat and resolved-val directives, is it legal to not run them and not create a .result file? Or are they expected to create the equivalent .result file? Or do they HAVE to call the solver anyway?

2. For each `.smt2` file:
- Parses `set-info` directives to extract obligation metadata (`SMT2Meta`)
- Determines which checks were requested (satisfiability, validity)
- If evaluator-resolved: uses the stored verdict directly
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah here's the answer. Should be mentioned in the description of Phase 2 directly.

This isn't a strict either-or is it? Isn't it possible to resolve the sat check but not the validity check, or vice-versa?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why doesn't this call ssr_py.sh?

Comment thread Strata/SimpleAPI.lean
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pointless change?

Comment on lines +99 to +100
if grep -qE 'Known limitation|User error|Internal error' "$expected_file"; then
echo "SKIP (expected error): $base_name"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not how we should handle such errors. We need to record specifically what's expected for specific test inputs, not just silently swallow errors.

# ./run_py_ssr_test.sh [--filter <pattern>] [--solver <cmd>]
# ------------------------------------------------------------------------------

set -u
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Off-topic issue: I'm concerned with how many shell scripts we're adding to this repository. LLMs love spitting them out, but they are also very sloppy about setting the right safety flags. We're also giving up on a great deal of safety by dropping from Lean to bash, even if we don't prove much about this kind of logic.

Comment thread Scripts/ssr_py.sh
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment below about bash scripts in general - this could easily be a StrataMain.lean command as well, even if it would need to shell-out for each phase.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants