Add getIonDeserializer% elaborator for generic Ion deserialization #7
keyboardDrummer-bot wants to merge 7 commits into main
Implements a term-level elaborator that inspects Lean inductive and structure types at compile time and generates a ByteArray → Except Std.Format α deserializer.
Encoding conventions:
- Structures → Ion structs with field names as keys
- Single-constructor inductives → Ion structs with _0, _1, … keys
- Multi-constructor inductives → Ion sexps (CtorName arg1 arg2 …)
- Supported leaf types: Nat, Int, String, Bool
Closes #5
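To make the encoding conventions concrete, here is a small sketch of three Lean types with the Ion text form each value would be expected to take. The types and field names are illustrative, and it is an assumption that the sexp head is the constructor's short name rather than its fully qualified name.

```lean
-- Structure: encoded as an Ion struct keyed by field name.
structure Point where
  x : Nat
  y : Nat
-- Point.mk 1 2          would correspond to   {x: 1, y: 2}

-- Single-constructor inductive: encoded as an Ion struct with positional keys.
inductive Wrapper where
  | mk (count : Nat) (label : String)
-- Wrapper.mk 7 "hi"     would correspond to   {_0: 7, _1: "hi"}

-- Multi-constructor inductive: encoded as an Ion sexp headed by the constructor name.
inductive Shape where
  | circle (radius : Nat)
  | rect (width : Nat) (height : Nat)
-- Shape.rect 3 4        would correspond to   (rect 3 4)   -- assumed short name
```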
Float should be
Yes. Feel free to mark things as partial for now, so you don't need to provide termination proofs.
No, this is fine. |
|
I've pushed a commit addressing both requests: Float support: Added Nested and recursive types: Refactored the elaborator to:
For recursive types (e.g., New tests added:
All tests compile and type-check. The full |
- Add readFloat runtime helper (accepts Ion float and int values)
- Add Float as a supported leaf type in mkFieldRead and mkIndexRead
- Support nested types: fields that are structures/inductives generate readers via let rec bindings in dependency order
- Support recursive types: self-referencing types work when the enclosing definition is marked partial
- Add tests for Float (Measurement), nested (Line with Point fields), and recursive (Tree) types
11ed932 to 26a4c13
|
Thanks for the feedback! I've pushed changes to address both requests:
Float support: Added
Nested/recursive types: The elaborator now handles fields whose types are themselves structures or inductives by recursively generating readers for them. For recursive types like

    -- Nested types work automatically
    structure Line where
      start : Point
      stop : Point

    def deserializeLine : ByteArray → Except Std.Format Line :=
      getIonDeserializer% Line

    -- Recursive types require `partial` on the definition
    inductive Tree where
      | leaf (val : Nat)
      | node (left : Tree) (right : Tree)

    partial def deserializeTree : ByteArray → Except Std.Format Tree :=
      getIonDeserializer% Tree

All existing and new tests build successfully.
|
26a4c13 to 29bfcc7
|
I've pushed the requested changes to this branch. Here's a summary:
Changes
Float leaf type support
Nested type support
Recursive type support
Tests added
All existing and new tests build cleanly with no warnings. |
|
Great work. Now an additional task. There is code in this repository (here https://github.com/strata-org/Strata/blob/main/Strata/DDM/Integration/Java/Gen.lean) to generate a Java Ion serializer based on a DDM file. I would like to replace that code so that it takes a Lean type instead of a DDM file, and it should use the same Ion format as the deserializer that we've already implemented. There are existing tests for the Java serializer here: https://github.com/strata-org/Strata/tree/main/StrataTest/DDM/Integration/Java You will need to download a file using |
|
I've analyzed the existing code and have a few questions before proceeding: Understanding the task: You want to replace
Questions:
My current best guess is that you want a |
yes, rewrite the test as much as is necessary
The full Java type hierarchy and serialization
Regen the test files
Match the simpler format
Good guess! |
- Rewrite Strata/DDM/Integration/Java/Gen.lean to generate Java source
files from Lean types instead of DDM Dialect values
- New getIonSerializer% term elaborator inspects Lean inductive/structure
types at compile time and generates:
- Sealed interfaces for multi-constructor inductives
- Records for structures and single-constructor inductives
- Ion serialization matching getIonDeserializer% format:
- Structures → Ion struct with field name keys
- Single-ctor inductives → Ion struct with _0, _1, ... keys
- Multi-ctor inductives → Ion sexp (CtorName arg1 arg2 ...)
- Supported leaf types: Nat, Int, Float, String, Bool
- Nested and recursive types supported automatically
- Remove old javaGen CLI command from StrataMain.lean
- Rewrite tests to use Lean types (Point, Color, Shape, Person, Line, Tree)
- Add Java compilation test and Ion roundtrip test
(Java serializes → Lean deserializes → verify match)
|
I've pushed the implementation. Here's what changed: New
|
| Lean type | Java output |
|---|---|
| Structure | public record Point(long x, long y) with toIon method |
| Single-ctor inductive | public record Wrapper(...) with toIon using _0, _1 struct keys |
| Multi-ctor inductive | sealed interface Shape with nested records per constructor |
Ion format matches getIonDeserializer% exactly:
- Structures → Ion struct with field name keys
- Single-ctor inductives → Ion struct with _0, _1, … keys
- Multi-ctor inductives → Ion sexp (ConstructorName arg₁ arg₂ …)
- Leaf types: Nat/Int → ion.newInt(), Float → ion.newFloat(), String → ion.newString(), Bool → ion.newBool()
Nested and recursive types are handled automatically (BFS discovers all reachable compound types).
Tests rewritten
All 10 tests in StrataTest/DDM/Integration/Java/TestGen.lean now use Lean types:
- Structure (Point): record with field-name-keyed Ion struct
- Multi-ctor enum (Color): sealed interface with zero-field records
- Multi-ctor with fields (Shape): sealed interface with field records
- Mixed leaf types (Person): String, Nat, Bool mapping
- Nested structure (Line containing Point): generates files for both types
- Recursive type (Tree): sealed interface with self-referencing records
- Package name: correct package declaration
- File I/O: writeJavaFiles creates correct directory structure
- Java compilation: generated code compiles with javac + ion-java
- Roundtrip: Java serializes a Point to Ion binary, Lean deserializes it back, values match
Other changes
- Removed the javaGen CLI command from StrataMain.lean (no longer applicable; generation is now compile-time via the elaborator)
- Updated regenerate-testdata.sh and testdata/README.md
- Full Strata library and StrataTest suite build cleanly
|
It still needs to be possible to generate Java source files for serialization by invoking the Strata CLI. I think removing the javaGen command also removed this capability. Could you check and reintroduce it if needed? We will use the command to generate Java sources in a Java repository. |
|
You're right — the The challenge is that A concrete approach:
However, since the package name is a parameter to
The simplest working approach: make Question: Which Lean types do you need the CLI to support? Is it a fixed set (e.g., Laurel AST types), or should it be extensible? And is the package name always known at build time, or does it need to be a runtime parameter? |
It is a fixed set. It can just be the Laurel AST types for now. The package name should be a runtime parameter. |
|
Thanks for the clarification. Before I implement this, I need to flag a compatibility issue. The Laurel AST types (
The current elaborator only supports Options:
Which approach do you prefer? Option 1 seems like the right long-term answer, but I want to confirm before investing in it. If option 1, should I add |
|
@keyboardDrummer-bot can you merge the changes from main? Regarding your previous question, we should support the types:
These two types should no longer be supported, since they no longer occur in the Laurel AST:
|
# Conflicts:
#   StrataMain.lean
#   StrataTestExtra/DDM/Integration/Java/TestGen.lean
#   StrataTestExtra/DDM/Integration/Java/regenerate-testdata.sh
…rializer
- Add readDecimal, readList, readOption runtime helpers to IonDeserializer
- Extend getIonDeserializer% to handle List α, Option α, and Strata.Decimal
- Extend getIonSerializer% to generate Java code for List (java.util.List), Option (nullable), and Decimal (java.math.BigDecimal)
- Move old DDM-based Java generator to GenDDM.lean to preserve javaGen CLI
- Remove support for MetaData, MetaDataElem, Core.Expression, and Array (no longer in Laurel AST)
- Add tests for new types in StrataTest/Util/IonDeserializer.lean
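As an illustration of the List α and Option α support this commit describes, a type with container fields might be deserialized as sketched below. The Order structure is hypothetical, and it is an assumption that the module at Strata/Util/IonDeserializer.lean is imported under the name shown and that no extra namespace needs to be opened.

```lean
import Strata.Util.IonDeserializer  -- assumed import path for the elaborator

-- Hypothetical type mixing leaf fields with List and Option fields.
structure Order where
  id    : Nat
  items : List Nat
  note  : Option String

-- The elaborator is expected to generate readers for the container fields too.
def deserializeOrder : ByteArray → Except Std.Format Order :=
  getIonDeserializer% Order
```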
|
I've merged main and added support for the requested types. Here's a summary:
Merge from main
Resolved conflicts in
New type support in
|
|
@keyboardDrummer-bot can you investigate the build failures? |
The javac invocation was inside the HashMap iteration loop, causing it to compile partial sets of files. With non-deterministic HashMap ordering, this could fail when a file referencing another type was compiled before that type's source file was written. Also added the output directory to -cp so javac can find compiled classes from other files in the same package, and added cleanup at the end.
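The shape of this fix can be sketched in Lean as two phases: write every generated file first, then invoke javac once over the full set with the output directory on the classpath. The helper name, its parameters, and the jar path argument are hypothetical and not taken from the actual test code.

```lean
/-- Hypothetical helper: `files` pairs a file name (e.g. "Point.java") with its source text. -/
def writeAndCompile (outDir : System.FilePath) (ionJar : String)
    (files : List (String × String)) : IO Unit := do
  IO.FS.createDirAll outDir
  -- Phase 1: write every source file before compiling anything.
  let mut paths : Array String := #[]
  for (name, src) in files do
    let path := outDir / name
    IO.FS.writeFile path src
    paths := paths.push path.toString
  -- Phase 2: a single javac call over the complete set of files.
  -- The output directory is on the classpath; ":" is the Unix path separator.
  let out ← IO.Process.output {
    cmd := "javac",
    args := #["-cp", s!"{ionJar}:{outDir.toString}", "-d", outDir.toString] ++ paths
  }
  if out.exitCode != 0 then
    throw (IO.userError s!"javac failed:\n{out.stderr}")
```

Because compilation happens only after the loop, the iteration order of the underlying map no longer affects whether a referenced type is available.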
|
I investigated both CI failures: "Build and test Lean (stable)" — caused by a bug in "Run internal benchmarks of Strata" — this job uses AWS CodeBuild with secrets from the upstream |
The CheckImports linter requires all modules under Strata/ to be transitively imported by Strata.lean. The new IonDeserializer module was missing from the import list, causing the lint step to fail in CI.
|
The only failing check is "Run internal benchmarks of Strata". This is the same infrastructure issue as before — the benchmark job triggers AWS CodeBuild against All code-related checks pass:
I also verified locally: |
Can you create a PR against upstream Strata that updates CI configuration so that this check is skipped or always passes for branches outside of the repository? @keyboardDrummer-bot |
|
Created strata-org#1094 against upstream. It adds a single |
|
@keyboardDrummer-bot Please cut this PR in the Strata upstream repository, with a branch in that repo as well, and reference strata-org/jverify#405 in the description under testing to indicate that this change will work for JVerify |
|
Created upstream PR: strata-org#1095 The branch |
Closes #5
Summary
Adds a term-level elaborator getIonDeserializer% that uses Lean metaprogramming to generate an Ion deserializer for arbitrary Lean inductive and structure types at compile time. The generated function has type ByteArray → Except Std.Format α.
Design
Since Type is erased at runtime in Lean, the deserializer is generated at elaboration time by inspecting the type's constructors and fields in the Lean environment. The elaborator produces syntax that is then type-checked normally.
Ion encoding conventions
- Structures → Ion struct with field names as keys
- Single-constructor inductives → Ion struct with _0, _1, … keys
- Multi-constructor inductives → Ion sexp (ConstructorName arg₁ arg₂ …)
Supported leaf types
Nat, Int, Float, String, Bool
Nested and recursive types
Fields whose types are themselves structures or inductives are deserialized recursively. For recursive types (e.g., Tree), the generated code uses let rec bindings; the enclosing definition must be marked partial.
Usage
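A minimal usage sketch, assuming the file Strata/Util/IonDeserializer.lean is imported under the module name below and that Point is a simple two-field structure (both are assumptions; describePoint is a hypothetical caller):

```lean
import Strata.Util.IonDeserializer  -- assumed import path for the elaborator

structure Point where
  x : Nat
  y : Nat

-- The elaborator expands to a function of type ByteArray → Except Std.Format Point.
def deserializePoint : ByteArray → Except Std.Format Point :=
  getIonDeserializer% Point

-- Hypothetical caller: render the result or the error message.
def describePoint (bytes : ByteArray) : String :=
  match deserializePoint bytes with
  | .ok p    => s!"Point({p.x}, {p.y})"
  | .error e => s!"deserialization failed: {e.pretty}"
```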
Files
- Strata/Util/IonDeserializer.lean: Runtime helpers and the getIonDeserializer% elaborator
- StrataTest/Util/IonDeserializer.lean: Tests verifying the elaborator produces well-typed deserializers for structures, multi-constructor inductives, various leaf types, nested structures, and recursive types