Skip to content

Abstract MutationState storage behind a trait with multiple implementations #42

@CodeEnPlace

Description

@CodeEnPlace

Issue: Abstract MutationState storage behind a trait

Context

Currently Session uses FacetDiskStore<MutationHash, State> directly — a single concrete implementation that reads/writes JSON files to disk on every operation. No caching, no batching, no alternative backends. Additionally, resolve_mutation lives on Session and re-walks the entire source tree to resolve an UnvalidatedHash prefix to a Mutation, which is wasteful when the store already contains all known mutations.

Proposal

Extract a MutationStateStore trait with multiple implementations sharing identical semantics but different performance characteristics. The store should support:

  1. Lookup by MutationHash — the primary key
  2. Lookup by compound key(Twig, Span, MutantKind, String) i.e. (path, subst_span, kind, substitution) — the natural unique properties of a mutation
  3. Hash prefix resolution — absorb resolve_mutation logic (currently on Session, using UnvalidatedHash::validate) into the store, since the store already knows all hashes

Trait

MutationHash should only be constructable via the store — callers obtain one through resolve_hash (from a user-provided prefix string) or iteration. Since a MutationHash is proof that the state exists, get doesn't return Option.

pub trait MutationStateStore {
    fn get(&self, key: &MutationHash) -> &State;
    fn get_by_mutation(&self, twig: &Twig, span: &Span, kind: &MutantKind, subst: &str) -> &State;
    fn resolve_hash(&self, unvalidated: UnvalidatedHash) -> Result<MutationHash, HashError>;
    fn set(&mut self, key: &MutationHash, val: State) -> Result<(), std::io::Error>;
    fn remove(&mut self, key: MutationHash) -> State;
    fn keys(&self) -> Vec<MutationHash>;
}

Multi-key indexing with iddqd

Use iddqd::BiHashMap (or TriHashMap if we want hash + compound-key + something else) for the in-memory implementations. BiHashMap provides a bijective map with two keys — lookup by either MutationHash or the compound key tuple in O(1). The BiHashItem trait is implemented on the value type, extracting both keys from it.

This replaces the current pattern where Session builds throwaway Vec<Mutation> + Vec<MutationHash> just to resolve a prefix.

Implementations

Name Reads Writes Use case
NaiveStore Disk every time Disk every time Current behavior, wrapped FacetDiskStore. Baseline for correctness.
MemoryStore BiHashMap BiHashMap (no disk) Tests, benchmarks. Fast, ephemeral.
EagerStore BiHashMap (loaded at startup) Disk + BiHashMap Hot reads, durable writes. Good for bough run where state is read repeatedly.
LazyStore Disk on first access, then cached in BiHashMap Disk + cache Cold-start friendly. Only loads entries actually accessed.

Property-based tests

Use proptest to generate random sequences of operations, execute against all implementations in parallel, assert identical results at every step:

for each generated operation sequence:
    let mut naive = NaiveStore::new(dir1);
    let mut memory = MemoryStore::new();
    let mut eager = EagerStore::new(dir2);
    let mut lazy = LazyStore::new(dir3);

    for op in operations:
        apply op to all four
        assert get/get_by_mutation/keys/resolve_hash produce identical results

Deprecations

This new system supersedes:

  • FacetDiskStore — generic key-value disk store. Only ever used for MutationHash -> State. The generic Key: Display + FromStr / Val: Facet machinery is unnecessary; the new store is purpose-built for mutation state.
  • Much of bough-typed-hash — the generic TypedHash/TypedHashable/HashInto derive machinery was over-engineered for what turned out to be a single use case. MutationHash can be a simpler newtype with hashing logic colocated in the store, rather than spread across a separate crate with proc macros.

Both should be removed once the migration is complete.

Migration path

  1. Define trait + MemoryStore + NaiveStore as fresh implementations (not wrapping FacetDiskStore)
  2. PBT suite comparing MemoryStore vs NaiveStore
  3. Add EagerStore + LazyStore, same PBT suite
  4. PBT suite covering invariant violations (double-vend, use-after-remove, etc.)
  5. Once thoroughly tested: replace FacetDiskStore + Session::resolve_mutation with new store
  6. Remove FacetDiskStore, trim bough-typed-hash

Design invariant

MutationHash is a proof-of-existence token. It can only be obtained from the store (via resolve_hash or keys), never constructed externally. This means get and remove are infallible — if you have a MutationHash, the entry exists.

Double-vend problem

Two calls to resolve_hash with different prefixes can produce the same MutationHash. If caller A removes the entry, caller B's hash becomes a dangling proof. resolve_hash should track which hashes have been vended and return HashError::AlreadyVended if the same hash is resolved again. This ensures at most one live MutationHash exists per entry, preserving the proof-of-existence guarantee through remove.

Panics as contracts

Store implementations are permitted to panic!() on invariant violations (e.g. get with an invalid hash, remove with an already-consumed hash). These are programmer errors, not runtime conditions. The PBT suite should cover these cases — generating sequences that attempt double-vend, use-after-remove, etc. and asserting the expected panics.

Resolved decisions

  • EagerStore writes synchronously — simpler, matches NaiveStore semantics for PBT equivalence, write volume is low.
  • get_by_mutation takes all four components (twig, span, kind, subst) — kind is cheap to include and makes the lookup explicit.

Blocking: get return type

Should get return &State or owned State? Returning references is efficient for in-memory stores but forces NaiveStore to cache internally (defeating its purpose). Returning owned values is uniform but clones unnecessarily for in-memory stores. Needs resolution before implementation.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions