Fix a bug in measuring running times in a few places, more fine-grained running time report #1124
aqjune-aws wants to merge 9 commits into main from
Conversation
| @@ -301,29 +302,37 @@ When two-sided checking is enabled, the generated SMT file will contain two | |||
| and the return value includes both decisions. | |||
Could you update the docstring to discuss the HashMap addition (maybe use an alias as well)?
| /-- Measure the wall-clock nanoseconds taken by a pure expression. | ||
| Takes a thunk to ensure the expression is not evaluated before timing starts. | ||
| The `t1 == 0` trick prevents hoisting `result` above t1. | ||
| Writing `result` to an `IO.Ref` forces its evaluation before t2 is read. -/ | ||
| public def measureNanos [Inhabited α] (expr : Unit → α) : BaseIO (α × Nat) := do | ||
| let ref ← IO.mkRef (default : α) | ||
| let t1 ← IO.monoNanosNow | ||
| let result := if t1 == 0 then default else expr () | ||
| ref.set result | ||
| let t2 ← IO.monoNanosNow | ||
| pure (result, t2 - t1) |
This looks really hacky. How about something annotated with noinline?
/-- Measure the wall-clock nanoseconds taken by an IO action.
Writing `result` to an `IO.Ref` forces its evaluation before t2 is read. -/
@[noinline] def measureNanosM (act : BaseIO α) : BaseIO (α × Nat) := do
let t1 ← IO.monoNanosNow
let result ← act
let t2 ← IO.monoNanosNow
pure (result, t2 - t1)
Then measureNanos would become:
/-- Measure the wall-clock nanoseconds taken by a pure expression.
Takes a thunk to ensure the expression is not evaluated before timing starts.
The `t1 == 0` trick prevents hoisting `result` above t1.
Writing `result` to an `IO.Ref` forces its evaluation before t2 is read. -/
@[noinline]
public def measureNanos (expr : Unit → α) : BaseIO (α × Nat) :=
measureNanosM (pure (expr ()))
The second `@[noinline]` is necessary so that `expr` doesn't get pre-evaluated.
It seems this version still has the problem with measuring running times. `lake exe strata verify --profile Examples/HeapReasoning.core.st` prints 0ms on the Core transformations.
I tried another version of `measureNanos` which uses `@[noinline]` but also relies on `IO.mkRef`. I think this looks better than the previous one because it establishes an ordering between `IO.monoNanosNow` and `ref.set`.
|
|
||
| /-- Run an action and return its result along with the elapsed nanoseconds. -/ | ||
| @[inline] public def recordNanos {m α} [Monad m] [MonadLiftT BaseIO m] | ||
| (key : String) (timing : Std.HashMap String Nat) (action : m α) : m (α × Std.HashMap String Nat) := do |
I think using an abbreviation with a name (NanoTimings?) instead of a hashmap directly would clarify code.
Sounds great, I named it `TimingInfo`. Like the collected stats, I added a simple inductive data structure that represents the items and created `ToString` instances using the existing `#derive_prefixed_toString`.
… 0' trick, rename HashMAp to TimingsInfo
|
Diffs on GitHub will look better if ignoring whitespace changes is toggled on.
MikaelMayer
left a comment
🤖 Good PR — the reordering bug is real and the fix via measureNanos with @[noinline] + IO.Ref is sound. The fine-grained timing breakdown is useful.
A few observations below.
| let result ← action | ||
| let t1 ← IO.monoNanosNow | ||
| pure (result, timing.insert key (t1 - t0)) | ||
|
|
🤖 recordNanos is marked @[inline] and measures a monadic action using the same t0/t1 bracketing pattern that the PR description says Lean can reorder for pure code. Since action here is already monadic (sequenced by >>=), the reordering issue shouldn't apply — but the @[inline] means the compiler could inline this and then reorder the timestamp reads relative to the inlined body in theory. Is there a reason this isn't also @[noinline] like measureNanos? If the monadic bind is sufficient to prevent reordering, a brief comment explaining why would help future readers understand the asymmetry.
|
|
||
| -- If evaluator resolved both checks, we're done, unless we always want to generate SMT queries | ||
| if not options.alwaysGenerateSMT then | ||
| if let (some peSat, some peVal) := (peSatResult?, peValResult?) then |
🤖 The pattern `timingNs.insert key (timingNs.getD key 0 + val)` is repeated 4 times in this function. Consider a small helper like:
def TimingInfo.add (t : TimingInfo) (key : String) (ns : Nat) : TimingInfo :=
  t.insert key (t.getD key 0 + ns)
This would reduce noise and make the accumulation intent clearer.
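A quick usage sketch of the proposed helper (assuming `TimingInfo` is an alias for `Std.HashMap String Nat`, as proposed elsewhere in this review; the file layout is illustrative only):

```lean
import Std.Data.HashMap

-- Hypothetical alias, matching the review's suggestion.
abbrev TimingInfo := Std.HashMap String Nat

def TimingInfo.add (t : TimingInfo) (key : String) (ns : Nat) : TimingInfo :=
  t.insert key (t.getD key 0 + ns)

-- Repeated additions under the same key accumulate rather than overwrite.
#eval (({} : TimingInfo).add "encodeSMT" 5 |>.add "encodeSMT" 7).getD "encodeSMT" 0
-- 12
```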
| let mut state : Transform.CoreTransformState := { Transform.CoreTransformState.emp with factory := some factory } | ||
| let mut step := 0 | ||
| have : Inhabited (Except Transform.Err Program × Transform.CoreTransformState) := | ||
| ⟨(.error default, Transform.CoreTransformState.emp)⟩ |
🤖 This have : Inhabited ... instance is needed because measureNanos requires [Inhabited α]. Since Transform.runWith is presumably a pure function being timed, this makes sense — but the Inhabited instance uses .error default which means if the IO.Ref default value somehow leaks through, you'd get a silent error result rather than a crash. Worth a brief comment explaining this is only needed to satisfy measureNanos's constraint and the default is never observed.
| | preprocessObligations | ||
| | smtEncoding | ||
| | solverFileWriting | ||
|
|
🤖 Nit: The key "verifySingleEnv.solverFileWriting" is misleading — it measures the entire getObligationResult call which includes both SMT file writing and solver execution. The sub-timings (dischargeObligation.encodeSMT, dischargeObligation.runSolver) break it down correctly, but the parent key name suggests only file I/O. Consider renaming to something like solverTotal or dischargeTotal.
This is intentional since once switched to FFI there won't be file writing :)
MikaelMayer
left a comment
🔍 LGTM — all comments addressed. The TimingInfo.add helper, the Inhabited comment, and the @[inline] removal from recordNanos all look good. The solverFileWriting naming nit was minor and I'm fine leaving it as-is since the sub-timings break it down correctly.
MikaelMayer
left a comment
🔍 LGTM — all comments addressed. The TimingInfo.add helper is used in all 4 spots, @[inline] removed from recordNanos, and the Inhabited comment clarifies intent.
| /-- Measure the wall-clock nanoseconds taken by a pure expression. | ||
| The `@[noinline]` prevents the compiler from hoisting `expr ()` out of | ||
| this function. The `IO.Ref.set` forces evaluation of `expr ()` between | ||
| the two timestamp reads (IO actions are sequenced). -/ | ||
| @[noinline] public def measureNanos [Inhabited α] (expr : Unit → α) : BaseIO (α × Nat) := do | ||
| let ref ← IO.mkRef (default : α) | ||
| let t1 ← IO.monoNanosNow | ||
| ref.set (expr ()) | ||
| let t2 ← IO.monoNanosNow | ||
| let result ← ref.get | ||
| pure (result, t2 - t1) |
The docstring could be more precise about what evaluation is actually forced. `IO.Ref.set` stores whatever value is passed to it; Lean's evaluation strategy will reduce `expr ()` to WHNF before the `set` call runs, but anything lazy inside the returned value (e.g., unevaluated thunks in a `List`, an `Array` of unforced elements, a lazy closure field) will not be touched by this function. For simple types like `Nat` or small ADTs in the leaf position this is fine, but callers measuring e.g. the cost of building a large structure need to either ensure the structure is strict or arrange deep evaluation themselves. Worth a one-line caveat:
| /-- Measure the wall-clock nanoseconds taken by a pure expression. | |
| The `@[noinline]` prevents the compiler from hoisting `expr ()` out of | |
| this function. The `IO.Ref.set` forces evaluation of `expr ()` between | |
| the two timestamp reads (IO actions are sequenced). -/ | |
| @[noinline] public def measureNanos [Inhabited α] (expr : Unit → α) : BaseIO (α × Nat) := do | |
| let ref ← IO.mkRef (default : α) | |
| let t1 ← IO.monoNanosNow | |
| ref.set (expr ()) | |
| let t2 ← IO.monoNanosNow | |
| let result ← ref.get | |
| pure (result, t2 - t1) | |
| /-- Measure the wall-clock nanoseconds taken by a pure expression. | |
| The `@[noinline]` prevents the compiler from hoisting `expr ()` out of | |
| this function. The `IO.Ref.set` forces evaluation of `expr ()` between | |
| the two timestamp reads (IO actions are sequenced). | |
| Note: only WHNF of `expr ()` is forced. Lazy sub-structures of the returned | |
| value are not evaluated and their cost is not counted. If you need to time | |
| deep evaluation, make the returned value strict (or force it yourself before | |
| the second timestamp read). -/ | |
| @[noinline] public def measureNanos [Inhabited α] (expr : Unit → α) : BaseIO (α × Nat) := do | |
| let ref ← IO.mkRef (default : α) | |
| let t1 ← IO.monoNanosNow | |
| ref.set (expr ()) | |
| let t2 ← IO.monoNanosNow | |
| let result ← ref.get | |
| pure (result, t2 - t1) |
| /-- Run an action and record its elapsed nanoseconds into a `TimingInfo` under the given key. -/ | ||
| public def recordNanos {m α} [Monad m] [MonadLiftT BaseIO m] | ||
| (key : String) (timing : TimingInfo) (action : m α) : m (α × TimingInfo) := do | ||
| let t0 ← IO.monoNanosNow | ||
| let result ← action | ||
| let t1 ← IO.monoNanosNow | ||
| pure (result, timing.insert key (t1 - t0)) |
Two concerns here, both about the contract:

- Reordering safety comment still owed. @MikaelMayer asked in the previous round why this one doesn't need `@[noinline]` while `measureNanos` does. The annotation was removed but the explanation wasn't added. A short comment would help future readers understand the asymmetry — the answer, I believe, is that because `action : m α` is a monadic computation, the IO primitives it wraps are sequenced by `>>=`, so reordering cannot move `IO.monoNanosNow` across `action`. But that's a non-obvious property of the compiler.
- Latent pure-thunk hazard, symmetric with the `measureNanos` bug. If a caller writes `recordNanos "foo" t (pure expensivePureComputation)` then `expensivePureComputation` is a pure thunk inside `pure`; the bind sequences the IO part but doesn't force the thunk. The recorded time will be near-zero even though the actual work hasn't happened yet. All current call sites are fine (they wrap IO-heavy work), but this is exactly the trap this PR exists to close elsewhere, so documenting it — or, better, forcing via `IO.Ref` with `[Inhabited α]` like `measureNanos` does — would prevent future regressions.

Minor: `recordNanos` uses `timing.insert` (overwrite), but the parallel merge loop in `verifySingleEnv` uses `TimingInfo.add` (accumulate). If the same key ever gets measured twice in the same timing scope (currently not the case, but easy to introduce), `recordNanos` would silently throw away the earlier reading. Prefer `TimingInfo.add` here too; for a first write `.getD k 0 = 0`, so semantics are preserved, and repeated-key usage becomes safe.
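A sketch of the stricter, forcing variant mentioned above (hypothetical; the name `recordNanosStrict` is mine, and this is not code from the PR):

```lean
/-- Hypothetical strict variant of `recordNanos`: route the result through an
`IO.Ref` so its (WHNF) evaluation is ordered before the second timestamp read,
mirroring `measureNanos`. Requires `[Inhabited α]` for the ref's initial
value. -/
@[noinline] public def recordNanosStrict {m α} [Monad m] [MonadLiftT BaseIO m]
    [Inhabited α]
    (key : String) (timing : TimingInfo) (action : m α) : m (α × TimingInfo) := do
  let ref ← IO.mkRef (default : α)
  let t0 ← IO.monoNanosNow
  let result ← action
  -- The write creates a data dependency: `result` must exist before t1 is read.
  ref.set result
  let t1 ← IO.monoNanosNow
  pure (result, timing.insert key (t1 - t0))
```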
| /-- Run an action and record its elapsed nanoseconds into a `TimingInfo` under the given key. -/ | |
| public def recordNanos {m α} [Monad m] [MonadLiftT BaseIO m] | |
| (key : String) (timing : TimingInfo) (action : m α) : m (α × TimingInfo) := do | |
| let t0 ← IO.monoNanosNow | |
| let result ← action | |
| let t1 ← IO.monoNanosNow | |
| pure (result, timing.insert key (t1 - t0)) | |
| /-- Run an action and record its elapsed nanoseconds into a `TimingInfo` under | |
| the given key. The measurement is accumulated via `TimingInfo.add`, so | |
| repeated calls with the same key sum their durations. | |
| Unlike `measureNanos`, this does not need `@[noinline]`: `action` is a | |
| monadic computation, so its IO primitives are sequenced by `>>=` and cannot | |
| be reordered across `IO.monoNanosNow`. | |
| Caveat: if the `α` returned by `action` is a lazy pure value (e.g. | |
| `pure (expensivePureComputation)`), the thunk is not forced here, and the | |
| recorded time will reflect only the IO sequencing, not the pure work. For | |
| timing pure computations, use `measureNanos` (which forces via `IO.Ref`). -/ | |
| public def recordNanos {m α} [Monad m] [MonadLiftT BaseIO m] | |
| (key : String) (timing : TimingInfo) (action : m α) : m (α × TimingInfo) := do | |
| let t0 ← IO.monoNanosNow | |
| let result ← action | |
| let t1 ← IO.monoNanosNow | |
| pure (result, timing.add key (t1 - t0)) |
| /-- Accumulate nanoseconds into an existing key (defaulting to 0). -/ | ||
| @[inline] public def TimingInfo.add (t : TimingInfo) (key : String) (ns : Nat) : TimingInfo := | ||
| t.insert key (t.getD key 0 + ns) |
Two adjacent requests:
(a) Add a `mergeAdd` helper. The codebase now has two places merging sub-timings into a parent `TimingInfo`, using different shapes:

- `Strata/DL/Imperative/SMTUtils.lean:336` — `encTiming.fold (init := timing) fun acc k v => acc.insert k v` (overwrites on collision)
- `Strata/Languages/Core/Verifier.lean:1160` — `for (key, val) in timing do timingNs := timingNs.add key val` (accumulates on collision)

These two behaviours are not equivalent (the first loses information when a key repeats across obligations). The Verifier.lean version is the semantically correct one for an accumulator spanning multiple calls. A shared helper would make the intent explicit and eliminate the asymmetry:
/-- Merge `b` into `a` by summing values for overlapping keys. -/
@[inline] public def TimingInfo.mergeAdd (a b : TimingInfo) : TimingInfo :=
  b.fold (init := a) fun acc k v => acc.add k v

(b) Pin down `add`'s semantics with a couple of tiny theorems. Since `TimingInfo.add` is load-bearing for every per-phase total reported to the user, it's worth stating — and, ideally, proving — its key lemmas. Rough shape (names and exact `Std.HashMap` lemma references will need adjustment):
@[simp] theorem TimingInfo.add_getD_self (t : TimingInfo) (k : String) (v : Nat) :
(t.add k v).getD k 0 = t.getD k 0 + v := by
simp [TimingInfo.add, Std.HashMap.getD_insert_self]
@[simp] theorem TimingInfo.add_getD_of_ne (t : TimingInfo) (k k' : String) (v : Nat)
(h : k ≠ k') : (t.add k v).getD k' 0 = t.getD k' 0 := by
simp [TimingInfo.add, Std.HashMap.getD_insert_of_ne h]
theorem TimingInfo.mergeAdd_getD (a b : TimingInfo) (k : String) :
(TimingInfo.mergeAdd a b).getD k 0 = a.getD k 0 + b.getD k 0 := by
-- induction on the fold; uses the two lemmas above
  sorry
Even just the first two `@[simp]` lemmas would catch future typos in a rewrite of `add` and are a natural place to put `#guard`-style sanity tests in `StrataTest/Util/`.
| ref.set (expr ()) | ||
| let t2 ← IO.monoNanosNow | ||
| let result ← ref.get | ||
| pure (result, t2 - t1) |
Missing regression test for the PR's headline fix. The whole point of this PR is that the previous naïve pattern silently reported 0 ms for heavyweight pure work. Nothing in StrataTest/ would catch if a future edit to measureNanos (e.g. dropping the @[noinline] or the IO.Ref indirection) reintroduced the bug.
Proposed test in a new StrataTest/Util/Profile.lean:
import Strata.Util.Profile
/-- Regression: `measureNanos` must actually observe time for CPU-bound pure
work. If this reports 0 ns, the `@[noinline]` / `IO.Ref.set` trick has
regressed and timing will silently be lost again. -/
def measureNanos_nonzero : BaseIO Bool := do
let (_, ns) ← measureNanos fun () =>
-- Large enough to comfortably exceed monoNanosNow resolution on any host.
(List.range 200000).foldl (· + ·) 0
pure (ns > 0)
/-- `#guard` an `IO Bool` by running it. -/
#eval show IO Unit from do
unless (← measureNanos_nonzero.toIO) do
    throw <| IO.userError "measureNanos regressed to 0 ns"
And a small guard-level test for `TimingInfo.add`:
#guard
let t := ({} : TimingInfo).add "a" 5 |>.add "b" 7 |>.add "a" 3
  t.getD "a" 0 == 8 && t.getD "b" 0 == 7 && t.getD "missing" 0 == 0
| recordNanos (toString Timing.encodeSMT) timing do | ||
| encodeSMT.run solver | ||
| -- Merge encodeCore's internal timing into ours | ||
| let timing := encTiming.fold (init := timing) fun acc k v => acc.insert k v |
If the TimingInfo.mergeAdd helper I suggested on Profile.lean lands, this becomes:
| let timing := encTiming.fold (init := timing) fun acc k v => acc.insert k v | |
| -- Merge encodeCore's internal timing into ours | |
| let timing := TimingInfo.mergeAdd timing encTiming |
Semantic note: using `insert` here (overwrite) happens to be OK today because each `encodeCore` key is set exactly once per call and the outer `timing` doesn't yet contain any `encodeCore.*` key. But it's the wrong operation for a generic merge — it silently loses the parent value on any key collision — and `mergeAdd` would stay correct if the invariant ever changes.
| | preprocessObligations | ||
| | smtEncoding | ||
| | solverFileWriting | ||
|
|
Revisiting the earlier thread on this name: I understand the forward-looking argument (post-FFI there is no file writing), but the current code makes this key cover a t4…t5 window around getObligationResult, which encompasses SMT encoding and solver execution and file I/O. A reader of the profile output today sees Solver/file writing: 1605ms next to runSolver: 1313ms and reasonably wonders how file writing alone could take ~300ms.
Concrete suggestion: pick a name that is already accurate today and stays accurate post-FFI, e.g. dischargeTotal or obligationTotal. The print label can be changed independently later. This is a small rename; doesn't need to block the PR, but leaving it makes the output confusing in the meantime.
| timingNs := timingNs.add (toString Verifier.Timing.solverFileWriting) (t5 - t4) | ||
|
|
||
| for (key, val) in timing do | ||
| timingNs := timingNs.add key val |
With a TimingInfo.mergeAdd helper (see comment on Profile.lean), this becomes a one-liner and matches the pattern used in dischargeObligation:
| timingNs := timingNs.add (toString Verifier.Timing.solverFileWriting) (t5 - t4) | |
| for (key, val) in timing do | |
| timingNs := timingNs.add key val | |
| timingNs := timingNs.add (toString Verifier.Timing.solverFileWriting) (t5 - t4) | |
| timingNs := TimingInfo.mergeAdd timingNs timing |
| let _ ← (IO.println s!"[profile] Preprocess obligations: {nsToMs (timingNs.getD (toString Verifier.Timing.preprocessObligations) 0)}ms" |>.toBaseIO) | ||
| let _ ← (IO.println s!"[profile] SMT encoding: {nsToMs (timingNs.getD (toString Verifier.Timing.smtEncoding) 0)}ms" |>.toBaseIO) | ||
| -- Print encodeCore.* sub-timings | ||
| let encodeCoreEntries := timingNs.toList.filter (·.1.startsWith "encodeCore.") | ||
| |>.mergeSort (fun a b => a.1 < b.1) | ||
| for (key, val) in encodeCoreEntries do | ||
| let suffix := key.drop "encodeCore.".length | ||
| let _ ← (IO.println s!"[profile] {suffix}: {nsToMs val}ms" |>.toBaseIO) | ||
| -- Print dischargeObligation-level timings (encodeCore total, runSolver) | ||
| let _ ← (IO.println s!"[profile] encodeCore: {nsToMs (timingNs.getD (toString Imperative.SMT.Timing.encodeSMT) 0)}ms" |>.toBaseIO) | ||
| let _ ← (IO.println s!"[profile] runSolver: {nsToMs (timingNs.getD (toString Imperative.SMT.Timing.runSolver) 0)}ms" |>.toBaseIO) | ||
| let _ ← (IO.println s!"[profile] Solver/file writing: {nsToMs (timingNs.getD (toString Verifier.Timing.solverFileWriting) 0)}ms" |>.toBaseIO) | ||
| let _ ← (IO.println s!"[profile] Obligations: {obligations.size} total, {peResolvedCount} resolved by evaluator" |>.toBaseIO) |
The printed hierarchy is confusing. On HeapReasoning.core.st this produces:
[profile] SMT encoding: 300ms
[profile] defineFunTerms: 152ms
[profile] emitDatatypes: 0ms
[profile] encodeAssumptions: 2ms
...
[profile] encodeCore: 267ms
[profile] runSolver: 1313ms
[profile] Solver/file writing: 1605ms
The encodeCore.* sub-entries appear under — and indented deeper than — SMT encoding, but they actually belong to encodeCore (which belongs to Solver/file writing). SMT encoding (= ProofObligation.toSMTTerms, building Term objects in memory) is a sibling, not a parent, of encodeCore (= emitting SMT-LIB from those Term objects). The current order is also children-before-parent, which is unusual.
Suggested reordering, parent-first with children indented consistently beneath:
| let _ ← (IO.println s!"[profile] Preprocess obligations: {nsToMs (timingNs.getD (toString Verifier.Timing.preprocessObligations) 0)}ms" |>.toBaseIO) | |
| let _ ← (IO.println s!"[profile] SMT encoding: {nsToMs (timingNs.getD (toString Verifier.Timing.smtEncoding) 0)}ms" |>.toBaseIO) | |
| -- Print encodeCore.* sub-timings | |
| let encodeCoreEntries := timingNs.toList.filter (·.1.startsWith "encodeCore.") | |
| |>.mergeSort (fun a b => a.1 < b.1) | |
| for (key, val) in encodeCoreEntries do | |
| let suffix := key.drop "encodeCore.".length | |
| let _ ← (IO.println s!"[profile] {suffix}: {nsToMs val}ms" |>.toBaseIO) | |
| -- Print dischargeObligation-level timings (encodeCore total, runSolver) | |
| let _ ← (IO.println s!"[profile] encodeCore: {nsToMs (timingNs.getD (toString Imperative.SMT.Timing.encodeSMT) 0)}ms" |>.toBaseIO) | |
| let _ ← (IO.println s!"[profile] runSolver: {nsToMs (timingNs.getD (toString Imperative.SMT.Timing.runSolver) 0)}ms" |>.toBaseIO) | |
| let _ ← (IO.println s!"[profile] Solver/file writing: {nsToMs (timingNs.getD (toString Verifier.Timing.solverFileWriting) 0)}ms" |>.toBaseIO) | |
| let _ ← (IO.println s!"[profile] Obligations: {obligations.size} total, {peResolvedCount} resolved by evaluator" |>.toBaseIO) | |
| if profile then | |
| let _ ← (IO.println s!"[profile] Preprocess obligations: {nsToMs (timingNs.getD (toString Verifier.Timing.preprocessObligations) 0)}ms" |>.toBaseIO) | |
| let _ ← (IO.println s!"[profile] SMT term construction: {nsToMs (timingNs.getD (toString Verifier.Timing.smtEncoding) 0)}ms" |>.toBaseIO) | |
| let _ ← (IO.println s!"[profile] Solver/file writing: {nsToMs (timingNs.getD (toString Verifier.Timing.solverFileWriting) 0)}ms" |>.toBaseIO) | |
| let _ ← (IO.println s!"[profile] encodeCore: {nsToMs (timingNs.getD (toString Imperative.SMT.Timing.encodeSMT) 0)}ms" |>.toBaseIO) | |
| let encodeCorePrefix := "encodeCore." | |
| let encodeCoreEntries := timingNs.toList.filter (·.1.startsWith encodeCorePrefix) | |
| |>.mergeSort (fun a b => a.1 < b.1) | |
| for (key, val) in encodeCoreEntries do | |
| let suffix := key.drop encodeCorePrefix.length | |
| let _ ← (IO.println s!"[profile] {suffix}: {nsToMs val}ms" |>.toBaseIO) | |
| let _ ← (IO.println s!"[profile] runSolver: {nsToMs (timingNs.getD (toString Imperative.SMT.Timing.runSolver) 0)}ms" |>.toBaseIO) | |
| let _ ← (IO.println s!"[profile] Obligations: {obligations.size} total, {peResolvedCount} resolved by evaluator" |>.toBaseIO) | |
| return (results, stats) |
I also renamed the user-visible label from SMT encoding to SMT term construction to disambiguate it from encodeCore (both phases are forms of "SMT encoding"). And I lifted "encodeCore." into a local to avoid the literal being duplicated on two lines.
| let _ ← (IO.println s!"[profile] Preprocess obligations: {nsToMs (timingNs.getD (toString Verifier.Timing.preprocessObligations) 0)}ms" |>.toBaseIO) | ||
| let _ ← (IO.println s!"[profile] SMT encoding: {nsToMs (timingNs.getD (toString Verifier.Timing.smtEncoding) 0)}ms" |>.toBaseIO) | ||
| -- Print encodeCore.* sub-timings | ||
| let encodeCoreEntries := timingNs.toList.filter (·.1.startsWith "encodeCore.") |
The string `"encodeCore."` is also the prefix argument to `#derive_prefixed_toString Timing "encodeCore"` at line 47. If the enum prefix is ever renamed (say, to `smtEncode`), the profiler silently drops its sub-timings and no one notices until someone eyeballs the output. Either:

- factor out a `def Timing.prefix : String := "encodeCore."` next to the `inductive Timing`, or
- derive the prefix from a representative constructor's `toString` at runtime.

The `#derive_prefixed_toString` elaborator could also emit such a constant alongside the `ToString` instance — worth considering if any other Stats enum in the codebase develops the same coupling.
| inductive Timing where | ||
| | encodeSMT | ||
| | runSolver | ||
|
|
Naming nit: this local Timing enum lives in the same file-local scope as the other SMTUtils helpers and shares a plain name with Strata.SMT.Encoder.Timing (at Verifier.lean:34) and Strata.Languages.Core.Verifier.Timing (at Verifier.lean:1053). All three are disambiguated by namespace and by their #derive_prefixed_toString prefix, so there's no correctness issue, but callers end up writing Imperative.SMT.Timing.encodeSMT / Verifier.Timing.smtEncoding / bare Timing.prelude depending on context, which is noisy. If you want to collapse the noise, a single module Strata.Util.ProfileKeys collecting all profile enums (or at least more specific per-use names like DischargeTiming / EncodeCoreTiming / VerifyTiming) would help.
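A minimal sketch of that consolidation (the module name `Strata.Util.ProfileKeys`, the enum names, and their layout are all assumptions of mine, not existing code):

```lean
-- Hypothetical ProfileKeys module: one home for every profiling enum, each
-- with a distinct, self-describing name instead of three colliding `Timing`s.
namespace Strata.Util.ProfileKeys

/-- Per-obligation discharge phases (currently `Imperative.SMT.Timing`). -/
inductive DischargeTiming where
  | encodeSMT
  | runSolver

/-- Top-level verification phases (currently `Verifier.Timing`). -/
inductive VerifyTiming where
  | preprocessObligations
  | smtEncoding
  | solverFileWriting

-- The prefixed ToString instances would be derived here with
-- #derive_prefixed_toString, keeping every profile key in one file.

end Strata.Util.ProfileKeys
```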
Lean sometimes reorders the invocation of the current-time function with other code: the pure expression being measured can be hoisted above the first `IO.monoNanosNow` read, so the two timestamps end up bracketing no work at all.
And this made the SMT encoding time (the ProofObligation object -> SMT Term, Definition, etc.) appear as zero ms, when it was actually taking around 20% of the whole VC discharge process.
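The problematic pattern can be sketched as follows (illustrative reconstruction, not the exact code from this PR):

```lean
-- Naive timing of a pure expression. Because `expr ()` has no data or effect
-- dependency on `t1`, the compiler is free to evaluate it before the first
-- timestamp read (or after the second), making `t2 - t1` near zero even for
-- expensive computations. The fix discussed in this PR adds `@[noinline]`
-- and routes the result through an `IO.Ref` to pin the evaluation order.
def measureNanosNaive (expr : Unit → α) : BaseIO (α × Nat) := do
  let t1 ← IO.monoNanosNow
  let result := expr ()      -- may be hoisted above t1 by the compiler
  let t2 ← IO.monoNanosNow
  pure (result, t2 - t1)
```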
Also, this pull request adds much more fine-grained running-time reports: it now reports (1) the time of each Core transform and (2) the time of each SMT query generation.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.