Conversation
| /-- | ||
| The result type of values created by Verso's {lit}`#doc` and {lit}`#docs` commands. A value of type | ||
| {name}`DocThunk` represents a partially-serialized document of type {lean}`Part` that can be turned | ||
| into a value by invoking the `DocThunk.force` method. The internal structure of a {name}`DocThunk` |
There was a problem hiding this comment.
Why not make the constructors private instead of relying on a documentation admonishment?
| | serialized | ||
| (construct : DocReconstruction → Part genre) | ||
| (docReconstructionData : String := "{}") | ||
| (replacementMetadata? : Option (Option genre.PartMetadata)) |
There was a problem hiding this comment.
A comment on this field that indicates why it exists would be useful.
There was a problem hiding this comment.
I originally had one but don't actually know how to put a comment on a field of a branch of a datatype. Or should the comment go on serialized?
| -/ | ||
| | serialized | ||
| (construct : DocReconstruction → Part genre) | ||
| (docReconstructionData : String := "{}") |
There was a problem hiding this comment.
The docstring for this constructor should say a little bit about what this string is expected to contain.
If we're saving lots of strings with serialized JSON in them, perhaps we can make things easier by defining a wrapper structure? Something like:
structure SerializedDocReconstruction where
json : String
def SerializedDocStructure.empty : SerializedDocReconstruction where
json := "{}"
This should make the type signatures easier to read, and may prevent some confusion.
| /-- | ||
| Replace the top-level metadata in a {name}`DocThunk` without forcing the thunk. | ||
| -/ | ||
| public def DocThunk.withMetadata (metadata? : Option genre.PartMetadata) : DocThunk genre → DocThunk genre |
There was a problem hiding this comment.
I wonder if this wouldn't be more compositional with a constructor that replaces metadata instead of a field on serialized. Do we expect there to be lots of calls that might lead to a measurable overhead?
| into a value by invoking the `DocThunk.force` method. The internal structure of a {name}`DocThunk` | ||
| should not be relied upon. | ||
| -/ | ||
| public inductive DocThunk (genre : Genre) where |
There was a problem hiding this comment.
One way in which this is very much unlike Thunk is that repeated calls to force will do repeated work. I think this name is better than VersoDoc, which says basically nothing, but I wonder whether DocThunk is the right name given that we don't cache the result.
|
|
||
| namespace Verso.Doc | ||
| open Lean | ||
|
|
There was a problem hiding this comment.
I think a "why" here as a module docstring would be valuable. Something like:
| /-! | |
| In {given}`g : Genre`, the Verso elaborator does not directly emit a {lean}`Part g`. This is because included | |
| Lean code samples tend to be highly repetitive, especially in long proofs with growing contexts. This repetition | |
| can lead to performance problems. | |
| Instead, the elaborator deduplicates the included code samples via hash-consing. The hash-cons table is | |
| saved as a string literal containing a JSON serialization, because large Lean constant data structures generate large C code compared to the string, and the overhead of deserializing it at run-time is negligible compared to the overhead of compiling very large C files (most Verso document code is run only once per compilation). | |
| -/ |
| namespace Verso.Doc | ||
| open Lean | ||
|
|
||
| public structure DocReconstruction where |
There was a problem hiding this comment.
A docstring here would be nice:
| public structure DocReconstruction where | |
| /-- | |
| A deduplication table for code samples embedded in a document, so repeated code (e.g. hypotheses in proof | |
| states) does not lead to code bloat. | |
| -/ | |
| public structure DocReconstruction where |
Should the constructor of this structure be private?
Co-authored-by: David Thrane Christiansen <david@lean-fro.org>
Co-authored-by: David Thrane Christiansen <david@lean-fro.org>
This PR is a refactoring of some of the intermediate data structure created in #594.
VersoDocwas an extremely generic name, andDocThunk, with a correspondingDocThunk.force : DocThunk genre -> Part genrefunction, better communicates what is happening here: computation of a part is getting deferred during the metaprogramming phase until compiled-program-execution time.The structure of a
DocThunkis also changed somewhat, to better represent and support the operation of replacing metadata without evaluating the thunk, and to recognize that some applications buildPartdata directly and so don't have a meaningful serialization step.