diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md index 1b226b2..9926469 100644 --- a/.claude/CLAUDE.md +++ b/.claude/CLAUDE.md @@ -2,244 +2,63 @@ ## Mission -OpenGALib is a reusable Lean 4 mathematical software library stack — Algebraic, Tensor, Riemannian, GeometricMeasureTheory — engineered for long-term software value. Paper-specific formalization sub-projects live in separate repos and consume this stack via `require OpenGALib from ".."`. - -The Lean code is the primary mathematical artifact. Cited papers (Petersen, Pitts, Simon, Allard, Wickramasekera) supply ground truth; prose presentations are renderings of the Lean development, not its source. When code and prose disagree, the code is authoritative; refactoring code does not require co-editing prose. +OpenGALib is a reusable Lean 4 mathematical software library. Paper-specific formalization projects consume it via `require OpenGALib from ".."`. The Lean code is authoritative — when code and prose disagree, the code wins. ## Architecture -Single Lake package `OpenGALib`, organised under directory tree: +Single Lake package `OpenGALib`. Math layers (dependency order, `≺` reads "is depended on by"): ``` -OpenGALib/ -├── Algebraic/ ← BilinearForm, Riesz, RatVector -│ ↑ -├── Tensor/ ← DifferentialForm, MultilinearSection, Alternating -│ ↑ -├── Riemannian/ ← Connection, Curvature, Metric, SecondFundamentalForm -│ ↑ -└── GeometricMeasureTheory/ ← Variation, HasNormal, Stable, Varifold +Algebraic ≺ Tensor ≺ Riemannian ≺ { Comparison, GeometricMeasureTheory } + MetricGeometry (Layer-1 parallel; metric-only stack) ``` -`OpenGALib` is a package name, never a namespace prefix. Content lives in concept-level namespaces matching Mathlib idiom (`namespace Riemannian`, `namespace BilinearForm`, `namespace DifferentialForm`). Mathlib-extension lemmas live in the Mathlib namespace they extend (`namespace ContinuousLinearMap`, etc.) so dot-notation works. - -Layer separation: each layer must not reference paper-specific concepts. Each layer is a candidate for spin-out as a standalone Lean library. - -Namespaces and package names are concept-level, not person-level. No `Wickramasekera`, no `AlmgrenPitts` as top-level. People appear in citations and docstrings, not in namespace structure. - -### Folder organization - -Within each layer, every math concept gets its own folder. Concept folders use **content-named anchor files** (`Connection/LeviCivita.lean`, `Curvature/RiemannCurvature.lean`, `Metric/RiemannianMetric.lean`), never role-named (no `Basic.lean`, `Defs.lean`, `Foundation.lean`). Folder name carries domain meaning ("screaming architecture", Robert C. Martin); anchor name carries the specific math concept. - -Standalone concepts (single file, no sub-modules) live inside the layer's relevant folder: `Riemannian/Operators/Gradient.lean`, `Riemannian/Instances/EuclideanSpace.lean`. No top-level scattered files alongside concept folders. - -There is one role-named folder, `Util/`. Two-tier layout: top-level `OpenGALib/Util/` for cross-layer Eng (notation, attributes, Mathlib-extension lemmas); per-layer `OpenGALib//Util/` for layer-scoped Eng (`Riemannian/Util/MusicalIso.lean`, `Riemannian/Util/MetricInnerSmoothness.lean`). Files inside `Util/` are content-named — the folder carries the role. - -### Software-engineering principles applied +Infrastructure folders: `Util/` (engineering helpers, depended on by all, depends on none), `Bridges/` (one-directional adapters between OpenGA and Mathlib; currently `RiemannianToLength`), `Tests/` (linter `#guard_msgs` regression). -Folder organization, anchor purity, and the Math/Eng/Mixed split stand on well-known software-design principles. References for further reading when redesign questions arise: +`OpenGALib` is a package name, never a namespace prefix. Use concept-level namespaces (`namespace Riemannian`, `namespace BilinearForm`); Mathlib-extension lemmas live in the Mathlib namespace they extend. Anchor files are content-named (`Connection/LeviCivita.lean`, `Curvature/RiemannCurvature.lean`) — never role-named (no `Basic.lean`, `Defs.lean`, `Foundation.lean`). -- **Information hiding** (Parnas 1972, *On the Criteria to Be Used in Decomposing Systems into Modules*) — each module hides a single design decision. `Util/` sub-modules hide "Mathlib API form mismatches", "chart-pullback glue", "simp normal-form bridges". -- **Package by feature, not layer** — folders correspond to math concepts (Connection, Curvature), not framework roles (Defs, Lemmas, Helpers). Avoids "shotgun surgery" — one math change touching many role folders. -- **Screaming architecture** (Robert C. Martin) — folder names shout the domain (Riemannian geometry), not the framework (Lean 4 / Mathlib). -- **Common Closure Principle** — files that coevolve live in the same folder. Koszul, RieszExtraction, LeviCivita ⇒ `Connection/`. -- **Deep modules** (Ousterhout 2018, *A Philosophy of Software Design*) — anchor files have simple Math interfaces hiding complex proofs; `Util/` files hide engineering tax. -- **Stable Dependencies Principle** — `Algebraic ≺ Tensor ≺ Riemannian ≺ GMT`. `Util/` sits below everything in its layer (most stable, most depended-on). -- **Miller 1956 (7 ± 2)** — keep folder fan-out ≤ ~9 entries per level. +`Util/` is the only role-named folder. Two tiers: top-level `OpenGALib/Util/` for cross-layer engineering helpers; per-layer `OpenGALib//Util/` for layer-scoped helpers. Files inside `Util/` are content-named. -These principles motivate, but do not override, project-specific conventions. When they conflict with explicit OpenGALib rules (e.g. `Util/` is role-named, against package-by-feature purity), the explicit rule wins. - -### Layer autonomy - -Each Layer is OpenGA-owned: types, classes, headline statements, and conventions are chosen for OpenGA's purposes. Layers are not constrained to match upstream Mathlib formulations even where Mathlib has a similar primitive. Mathlib tools are consumed directly when available, but always through an OpenGA-namespace wrapper that fixes the OpenGA-side signature. - -The policy: - -- **Use Mathlib primitives in proof bodies** — `eVariationOn`, `hausdorffMeasure`, `IsRiemannianManifold`, `riemannianEDist`, `Manifold.pathELength`, etc. are invoked directly. We do not re-implement what already exists. -- **Wrap them under OpenGA names at the API surface** — `OpenGA.pathLength` wraps `eVariationOn` even though the two are definitionally equal. The OpenGA-side name is what the rest of OpenGA references; the Mathlib name is an implementation detail. -- **No re-export aliases, no copying of infrastructure** — adding a `def` whose only purpose is to expose a Mathlib symbol under a different name is rejected. The wrapper exists because the OpenGA signature differs in arguments / hypotheses / target conventions, not merely in name. -- **Bridges connect, they do not subsume** — when a layer needs a Mathlib-side concept (Riemannian, GMT), the connection lives in `OpenGALib/Bridges/To.lean` as a `def` or `instance`; the bridge is one-directional and does not absorb Mathlib's API into OpenGA's namespace. - -The first concrete instance is **Layer 1**: `OpenGALib/MetricGeometry/` — `MetricMeasureSpace`, `LengthSpace`, `GeodesicSpace`. Each is OpenGA-defined, with Mathlib's `eVariationOn` / `MeasureTheory.Measure` / `IsRiemannianManifold` reached through wrappers (`pathLength`) or bridges (`Bridges/RiemannianToLength`). Subsequent layers (Layer 2 Geometric Conditions, Layer 3a Riemannian extensions, Layer 3c GMT) follow the same pattern. - -Ground truth: manifesto §5.1 (relation to Mathlib) and §5.4 (metric measure space as foundational type). +Mathlib primitives are used directly in proof bodies but wrapped at the API surface under OpenGA names; no re-export `def`s for renaming. ## Working stance -### Self-build is the default - -When Mathlib lacks a primitive, exposes a non-firing scoped instance, has a typeclass diamond (lean4#13063), or surfaces an API in a non-applicable form, the response is framework self-build. Self-built primitives are first-class library content, not workarounds. The framework does not synchronize to Mathlib's PR cycle. Mathlib catch-up, when it happens, is a deprecation opportunity for selective subsets — not a milestone planned around. - -Canonical pattern: locate the API surface (grep + read source), build framework analog with explicit instances + helpers + bridge typeclasses, drop Mathlib dependency from the cascade. `InnerProductBridge` (Phase 1.6) and `RiemannianMetric` (Phase 4.7) are template examples. +**Self-build, don't retreat.** When Mathlib lacks a primitive or surfaces an API in a non-applicable form, build the framework analog — self-built primitives are first-class library content, not workarounds. Mechanical build errors (unknown identifier, missing import, typeclass propagation) keep fixing; real blockers report and ask. Do not propose fallback, revert, or simplification mid-task; the user decides scope changes. Treat as a smell: "blocked by Mathlib upstream", "strategic decision needed", "exceeds atomic-commit scope", "specialized expertise required", or any time-/LOC-/session-budget framing. -Bridge investment for cross-cutting blockers: when a single typeclass gap, scoped-instance non-firing, or API mismatch blocks multiple primitives, framework builds the explicit instance / accessor / typeclass cascade once and unblocks everything downstream. - -### Continue, do not retreat - -For build errors mid-task: -- Mechanical errors (unknown identifier, missing `open`, typeclass propagation): continue fixing -- Genuine blockers: report current state, ask for direction - -Do not propose fallback, revert, or simplification mid-task. User decides scope changes. - -### Retreat triggers - -Real retreat triggers (all rare): -- Mathlib lemma genuinely missing AND framework self-build hits Lean kernel impossibility (universe inconsistency, unfixable def-eq diamond after 5+ documented workarounds, type-level constraint that no spike resolves) -- Self-build requires a Lean kernel primitive that does not exist - -Rejected framings (all surfaced by Claude Code in the past, all wrong): -- "This is a Mathlib infrastructure issue, only fixable via PR" -- "Blocked by Mathlib upstream catch-up" / "Deferred until Mathlib's X matures" -- "Wait for Mathlib chart-machinery to evolve" -- "Specialized chart-machinery / vector bundle / typeclass-cascade expertise required" -- "Bundle-section conversion exceeds atomic-commit scope" -- "Hand off to expert session" / "Multi-session work needed" / "Dedicated focused session" -- "Strategic decision needed before proceeding" -- "Mathlib API in wrong form, can't directly apply" -- LOC count, component count, expected spike iteration count -- Time framing of any unit (今晚 / 几小时 / 几天 / multi-week / "session budget") - -When any of these surface, the correct response is depth-audit Mathlib + framework self-build, not user escalation. Tasks are sized by mathematical content and architectural correctness, not by LOC, time, or session count. - -### Atomic commits - -Commit once at task end, or fail-and-report without committing. Working directory broken state mid-task is normal and does not affect origin/main. Do not commit mid-refactor. +**Atomic commits.** Commit once at task end, or fail-and-report without committing. Broken working directory mid-task is normal. No mid-refactor commits. ## Code quality -A passing build is day zero, not done. The criteria below apply to every declaration written or refactored. Framework self-imposes Mathlib conventions (simp normal form, ext lemmas, docstring style, typeclass conventions) as a quality bar, independent of any PR intent. - -### Math / Eng / Mixed tagging - -Tag every declaration at the start of its docstring: +**Math / Eng / Mixed tagging.** Every declaration's docstring begins with one of: ```lean /-- **Math.** Paper-side definition of [concept]. ... -/ /-- **Eng.** Type-theoretic glue, no paper analogue. ... -/ -/-- **Mixed.** Math: [statement]. Eng: [glue carried in the proof]. ... -/ +/-- **Mixed.** Math: [statement]. Eng: [glue in the proof]. ... -/ ``` -Criteria: -- **Math** — directly corresponds to a paper / textbook concept; rename map is `s/_/ /` reading test -- **Eng** — basepoint mismatches, index translations, chart helpers, bound-carrying boilerplate, simp-normal-form bridges -- **Mixed** — math statement on top of an engineering proof body - -Without explicit tagging, decisions about which helpers to extract, inline, or rename lack any criterion. Engineering tax becomes invisible to casual reading. Tags are greppable: `Grep "\*\*Eng\.\*\*"` lists the engineering surface. - -### Natural-language reading test - -Names must pass: replace underscores with spaces, read aloud, result should be a clear mathematical statement. Failures signal an unclear role (too specific, too abstract, placeholder word, unexpanded abbreviation) and trigger rename, not silent acceptance. - -### Design is subtraction - -- No bridge wrappers from old API to new -- No `@[deprecated]` aliases retained for compatibility -- No double APIs for the same concept -- Hard migrations across all call sites in a single atomic change - -Bridges and aliases are technical debt by default. The Phase 4.7 cascade (dropping `RiemannianBundle` end-to-end with no bridge, retiring `InnerProductBridge.lean`) is the template. - -### Same object, multiple views - -When a mathematical object has multiple natural type-theoretic presentations (bundle section / function-form / chart pullback; matrix / digraph / active set), connect them by bridge lemmas and keep all views first-class. Do not collapse to one "canonical" view forcing others through it. - -### Engineering tax encapsulation - -Anchor files expose only `**Math.**`-tagged declarations (paper-side definitions, theorems, notations). All `**Eng.**` and `**Mixed.**` declarations — the "technical infrastructure" supporting proofs — live in `Util/` sub-modules, never inline in the anchor. Mathematical reading of the anchor stays uncluttered; the Eng surface is searchable via the `Util/` folder. - -Engineering tax (bound-carrying boilerplate, index translations, chart-pullback wrappers, basepoint-mismatch wrappers, simp-normal-form bridges, `@[simp]` def-unfolds) is unavoidable, but its location is chosen. Push it out of the math-anchor file into a `Util/` sub-module. - -Two-tier `Util/` layout, following Mathlib idiom: - -- **Top-level** `OpenGALib/Util/` — Eng helpers shared across multiple layers (Mathlib-extension `mfderiv` lemmas, notation, tactics, attributes). -- **Per-layer** `OpenGALib//Util/` — Eng helpers scoped to a single layer (e.g. `Riemannian/Util/CotangentFunctional.lean`, `Riemannian/Util/MusicalIso.lean`). Per-layer `Util/` builds on top-level `Util/`. - -Files inside `Util/` are content-named, never role-named: `MusicalIso.lean`, `CotangentFunctional.lean`, `ChartJacobianSmooth.lean`, `ConnectionLaplacianSimp.lean`. The folder name `Util/` carries the role; individual files describe their content. No `Helpers`, `Base`, `Foundation` suffixes inside `Util/`. - -### Signature-reads-as-paper criterion - -Headline lemmas (`bochner_weitzenboeck`, `leviCivitaConnection_exists`, `firstVariation_*`) are stable when their Lean signature reads as the textbook sentence with no engineering tax exposed. If side-condition predicates and index translations sit inline alongside the mathematical content in the signature, the structural pass is incomplete. UX optimizations (`@[simp]` / `@[ext]` / `@[simps]` / `abbrev` / naming polish) apply only past this point — premature polish on evolving interfaces gets discarded by the next refactor. - -## Fitness functions - -Architectural rules are enforced by Lean-native linters in `OpenGALib/Util/Linter/`. They fire during elaboration (LSP shows inline warnings; `lake build` emits them) and gate every push / pull request via GitHub Actions. - -Background: Neal Ford et al., *Building Evolutionary Architectures* (2017) — coined "fitness functions" for executable architectural tests. OpenGALib adapts the pattern to Lean. - -Current linters (`OpenGALib/Util/Linter/`): - -- **`MathTag.lean`** (`linter.openGA.mathTag`, default `true`, baseline `0`) — every declaration's docstring must begin with `**Math.**`, `**Eng.**`, or `**Mixed.**`. -- **`AnchorPurity.lean`** (`linter.openGA.anchorPurity`, default `true`, baseline `0`) — `**Eng.**` / `**Mixed.**` declarations forbidden outside `Util/` directories. Two principled exemptions reflect the rule's true scope (anchor's *exposed* math API, not internal/synthesis plumbing): typeclass `instance` declarations (Lean synthesis requires co-location with the type) and `private` declarations (not part of the anchor's exposed API). -- **`Naming.lean`** (`linter.openGA.naming`, default `true`, baseline `0`) — forbid bare initialisms `CLM`, `NACG`, `IPS` in declaration names; require Mathlib-style full names (`ContinuousLinearMap`, `NormedAddCommGroup`, `InnerProductSpace`). - -All three bundled into the linter set `linter.openGA` (defined in `OpenGALib/Util/Linter.lean`). `set_option linter.openGA false` silences the whole set; individual options toggle each linter in isolation. - -Unit tests live under `OpenGALib/Tests/Linter/` and use `#guard_msgs (warning) in` to capture the expected warning verbatim — if a linter ever stops firing on its intended trigger, the test build fails. Tests live outside `Util/` so the AnchorPurity path-check actually triggers. - -CI implementation (`.github/workflows/ci.yml`): the build job greps `lake build` output for each linter's warning prefix, fails if count exceeds the hardcoded baseline. Baselines only ever decrease; never grow without explicit justification (same discipline as the sorry count gate). - -Adding a new linter: drop `OpenGALib/Util/Linter/.lean` (template: `MathTag.lean`), add the option to the `register_linter_set` in `Util/Linter.lean`, add a `#guard_msgs` test in `Tests/Linter/.lean`, add the baseline check in `ci.yml`. The `Util/Linter/README.md` walks the full template. - -## Unused-import hygiene (`shake`) - -`lake exe shake OpenGALib --no-downstream` (from the `batteries` package) detects unused imports and missing transitive-relied-on imports. False positives (tactic / notation / instance side-effect modules) are filtered via `scripts/noshake.json` — Mathlib's universal `ignoreImport` baseline plus OpenGALib-specific side-effect modules. - -CI baselines the count (currently `35`); growth fails the build. Cleanup is gradual: each PR either holds the count or reduces it. To reduce, apply shake's suggestion (`remove [old]`, `add [new]`) to the flagged file; if the full build then breaks downstream, add the right explicit imports to the broken consumer file (the Mathlib-standard "make every file declare its actual dependencies" pattern) and re-verify. - -Shake's `--fix` flag is known to over-apply (removes more than reported) on our codebase, so manual / scripted apply with full-build verification is the supported workflow. - -## Sorry discipline - -Every sorry / opaque / placeholder is categorized: -- PRE-PAPER (Mathlib gap, framework self-build owns repair) -- CITED-BLACK-BOX (theorem quoted as given, body never proven) -- PAPER-INTERNAL (paper proof obligation) -- CONJECTURAL (open mathematics) - -Each carries a repair plan in its docstring: missing API or framework primitive, repair trigger, repair owner. Generic "blocked by Mathlib" annotations decay into permanent technical debt and are rejected. - -GMT primitives align with Pitts 1981 / Simon 1983 / Allard 1972 and cite source via `**Ground truth**: ...`. Cited theorem statements (Wic14, CLS22, DLT, CL03, Pitts, Allard) are strict-aligned with paper §X verbatim. - -Substantive chain proofs (headline theorems and the bridge lemmas they compose through) remain 0-sorry, non-circular. Refactors preserve this invariant. - -Never silently weaken a statement to remove a sorry. Either prove, leave sorry'd, or document blocker with repair plan. - -## Refactor protocol +Math names must pass the natural-language reading test: `s/_/ /` reads as a clear mathematical statement. -Refactor is strategic re-audit triggered by accumulated architectural debt or new mathematical insight, not implementation work. +**Anchor purity.** Anchor files expose only `**Math.**` declarations; `**Eng.**` and `**Mixed.**` live in `Util/` sub-modules. Exemptions: `instance` (synthesis needs co-location) and `private` (not exposed API). -1. Strategic question batch first (before any code change): hierarchy correct? Concept boundaries placed correctly between layers? Sub-namespace divisions clean? Dependency graph cycle-free? -2. Plan from first principles, not from current state. Current state triggered the refactor; planning anchored on it misses the architectural fix. -3. Execute in atomic chunks with build verify + 0-sorry preservation per commit. -4. Allow strategy adjustment mid-execution. Implementation surfaces details invisible during planning. Do not push through with stale plan. -5. Audit again after refactor. Refactor is recurring ritual. +**Hard migrations.** Rewrite all call sites in one atomic change. No bridge wrappers, no `@[deprecated]` aliases, no double APIs for the same concept. -Playbook detail in `docs/REFACTOR_PLAYBOOK.md`. +**Multiple views.** An object's natural presentations (bundle / function / chart-pullback; matrix / digraph) are connected by bridge lemmas and stay first-class — don't force a canonical view. -## Phase plan +**Signature reads as paper.** Headline lemmas are stable when the Lean signature reads as the textbook sentence with no engineering tax inline. UX polish (`@[simp]`, `@[ext]`, naming) applies only past this point. -Done: -- Phase 0–1.6: architecture lock, Layer A+B real grounding, Riemannian primitives (9 of 9 real, zero existence axioms) -- Phase 4 / 4.5 / 4.7: Levi-Civita Koszul construction, framework typeclass redesign, `RiemannianMetric` cascade end-to-end, `RiemannianBundle` dropped -- Bochner stack (commit `520b9e6`, 2026-05-13): `bochner_weitzenboeck` + `leviCivitaConnection_exists` (all 3 conjuncts) + `manifoldGradient_smooth_of_smooth` unconditional. Riemannian sorry count = 0. New file `MusicalIso.lean` (~1100 LOC) houses the chart-Gram-matrix machinery. +## Discipline -In flight: -- Engineering optimization pass anchored on Bochner. Apply Math/Eng/Mixed tagging, NL reading test, and signature-reads-as-paper criterion to the Bochner stack first; propagate outward to Connection.lean and the Riemannian package. +**Sorries** are categorized (PRE-PAPER / CITED-BLACK-BOX / PAPER-INTERNAL / CONJECTURAL), each with a repair plan in its docstring. Substantive chain proofs stay 0-sorry, non-circular. Never silently weaken a statement to remove a sorry. CI baselines the count against `docs/SORRY_CATALOG.md`. -Remaining: -- Phase 2: Round 5 cited theorem strict alignment Items 4–9 (DLT13, `exists_minmaxLimit`, `isStationary_of_minmaxLimit`, `locallyStable_of_oneSidedMinimizing`, `interpolation_lemma`, `isRectifiable_of_isStationary_of_density_pos`) -- Phase 3: Isoperimetric production-grade lib (parallel to Riemannian) -- Phase 5 (event-triggered): UX optimization once interfaces stabilize per signature-reads-as-paper -- Phase 6: final pre-release polish (CI, doc-gen4, README, references.bib) +**Shake** (`lake exe shake OpenGALib --no-downstream`) catches unused imports; CI baselines the count, PRs hold or reduce, manual apply (`--fix` over-applies). -## Identity +**Refactor protocol** — `docs/REFACTOR_PLAYBOOK.md`. Plan from first principles, atomic chunks, build-verify per commit. -Xinze Li (Moqian), 5th-year math PhD, University of Toronto, advisor Yevgeny Liokumovich. Communication in Chinese. Avoids em-dashes and AI-style phrasing in chat. +All architectural rules baseline-enforced by linters in `OpenGALib/Util/Linter/`; adding a new linter → `OpenGALib/Util/Linter/README.md`. -Role division: -- Moqian (总指挥): direction, scope, refactor triggers -- Claude chat (参谋): translates direction into executable Claude Code prompts -- Claude Code (executor): mechanical execution, build verification +## Role division -Strategic decisions belong to Moqian. Translation belongs to Claude chat. Mechanical work belongs to Claude Code. Claude Code does not escalate scope decisions: when the closure path is mechanical self-build, the response is depth-audit + iterate, not "strategic decision needed". +- **Moqian** (总指挥) — direction, scope, refactor triggers +- **Claude chat** (参谋) — translates direction into executable Claude Code prompts +- **Claude Code** (executor) — mechanical execution, build verification