feat(corpus-v3): Wave 7 — web projection layer (1886 /id/cid######/ pages)#271

Merged
ThorFuchs merged 1 commit into main from feat/corpus-v3-projection-layer on May 23, 2026

Conversation

@ThorFuchs
Collaborator

Summary

The first wave that surfaces Corpus v3 to the public site. Adds /id/cid######/ landing pages for all 1886 public Corpus Items currently in the corpus-v3 repo, each with relation panels, identifier boxes, and JSON-LD.

The v2 corpus surfaces (Registry, /publications/, etc.) remain live and untouched until cutover (Wave 9 of the Corpus v3 build-out roadmap).

What ships

| Layer | File | Purpose |
|---|---|---|
| Sync | scripts/sync_corpus_v3_to_site.py | Mirrors corpus-v3/manifests/ → site/_data/corpus_v3/. Atomic write + validation gate. |
| Generate | scripts/generate_corpus_v3_pages.py | Emits one Markdown collection document per public item under _corpus_v3_items/cid######.md. Idempotent. |
| Layout | _layouts/corpus-v3-item.html | Conditional sections for atomic / artifact / formal / result / proof / tombstone modes. |
| Identifier box | _includes/corpus-v3/identifier-box.html | CID + primary alias + legacy aliases + external identifiers (DOI/Wikidata/GitHub/arXiv) |
| Relation panels | _includes/corpus-v3/relation-panels.html | Upstream / Formalized by / Appears in / Supports / Downstream, chip-style with /id/{cid}/ links |
| JSON-LD | _includes/corpus-v3/jsonld.html | schema.org annotation per item; type-mapped (Book/Chapter/ScholarlyArticle/CreativeWork/DefinedTerm) |
| Config | _config.yml | New corpus_v3_items collection with permalink: /id/:name/ |
| Data | _data/corpus_v3/ (7 files) | Synced from corpus-v3/manifests/ |
| Pages | _corpus_v3_items/ (1886 files) | Generated Markdown collection documents |

Scale

  • 1922 Corpus Items in the corpus-v3 repo at this commit
  • 1886 public pages generated (36 private/internal items correctly skipped)
  • All 6 payload classes covered (atomic · artifact · formal · result · proof · page)
  • All 6 proof modes represented (prose · semi_formal · lean_formalized · conditional · sketch · external_reference)
  • 2992+ aliases preserved across items — every v2 legacy ID still findable

End-to-end flow

corpus-v3 (canonical YAML)
  ↓ (scripts/sync_corpus_v3_to_site.py)
site/_data/corpus_v3/*.yaml  (synced manifests)
  ↓ (scripts/generate_corpus_v3_pages.py)
site/_corpus_v3_items/cid######.md  (Jekyll collection docs)
  ↓ (Jekyll build)
public/id/cid######/  (rendered HTML with relation panels + JSON-LD)

Per-item page structure (per web addendum §4)

  1. Hero card — title, CID, primary alias, status, version
  2. Tombstone banner (deprecated_public only)
  3. Payload section — markdown/latex/artifact-card/page
  4. Proof block (proof items only)
  5. Result block (result items only)
  6. Formalization block (formal_* items + proofs with Lean backing)
  7. Identifier box — full provenance
  8. Relation panels — 5 panels: upstream / formalized_by / appears_in / supports / downstream
  9. Version + history meta-list
  10. Status disclaimer + JSON-LD <script> embed

Sample pages this PR will surface

  • /id/cid000001/ — Book II monograph
  • /id/cid000010/ — Master Constant Calibration theorem (THM0001) with formalized_by → cid000020 (the TauLib formal theorem)
  • /id/cid000022/ — Proof of master constant calibration (semi-formal with structured steps using prrp://)
  • /id/cid000003/ — Construction Spine dossier
  • /id/cid000004/ — Categorical AI paper
  • /id/cid000021/ — RSL0001 dark-sector readout result
  • /id/cid001234/ — any imported registry theorem from Wave 4a
  • /id/cid005078/ — any imported TauLib formal_module from Wave 5
  • /id/cid006108/ — PRF0005 (lean_formalized proof of FTA on τ-Idx)

Deferred to Wave 7b (follow-up PR)

  • Result page filtered views (/results/physics/, /results/life/, etc.) — Jekyll generator from corpus_v3_items collection with layer/domain filters
  • Pagefind search integration extension to index corpus_v3_items
  • /corpus/graph/ public landing page summarizing the graph
  • More elaborate tombstone styling

Test plan

  • CI green (build + Atlas integrity unaffected — no atlas counter rules touch corpus_v3 collection)
  • /id/cid000010/ renders with theorem layout + relation panels showing cid000007 (DEF0001) → cid000008 (LEM0001) → cid000009 (PRP0001) as dependencies + cid000020 (FTH0001) as formalized_by
  • /id/cid000022/ (PRF0001) renders with proof block + structured steps + chip-row showing prrp://def0001@v1 etc.
  • /id/cid000023/ (Book I monograph BOK0002) renders with artifact card
  • Identifier box on each shows all legacy aliases (II.T25, S024, H3, etc. on THM0001)
  • JSON-LD validates as schema.org structured data

🤖 Generated with Claude Code

… Items

Adds the projection layer that makes the Corpus v3 graph (1922 items in
the corpus-v3 repo) visible on the public site as /id/cid######/ pages
with relation panels, identifier boxes, and JSON-LD.

This is the first wave that surfaces v3 to the public site. The v2
corpus surfaces (Registry, /publications/, etc.) remain live and
untouched until cutover (Wave 9).

What lands

  scripts/sync_corpus_v3_to_site.py
    Mirrors corpus-v3/manifests/ → site/_data/corpus_v3/. Atomic write
    with validation gate (parses, checks expected top-level key). Reads
    from PRRP_CORPUS_V3_ROOT env or defaults to ../corpus-v3.
    7 files synced: alias-index, cid-index, transition, item-types,
    relation-vocabulary, visibility-values, lifecycle-statuses.
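
The atomic write plus validation gate described above might look roughly like the following. This is a minimal sketch, not the script's actual code: the function names and the example key are hypothetical, and the stdlib-only key check stands in for real YAML parsing. Only the PRRP_CORPUS_V3_ROOT variable and the ../corpus-v3 default come from this PR.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def resolve_corpus_root(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the corpus-v3 checkout from PRRP_CORPUS_V3_ROOT, else ../corpus-v3."""
    env = os.environ if env is None else env
    return Path(env.get("PRRP_CORPUS_V3_ROOT", "../corpus-v3"))


def has_top_level_key(text: str, key: str) -> bool:
    """Cheap validation gate: an unindented 'key:' line must exist.

    Assumption: a real gate would parse the YAML; this is a stand-in.
    """
    return any(line.startswith(f"{key}:") for line in text.splitlines())


def sync_manifest(src: Path, dest: Path, expected_key: str) -> None:
    """Copy src to dest atomically, refusing content that fails the gate."""
    text = src.read_text(encoding="utf-8")
    if not has_top_level_key(text, expected_key):
        raise ValueError(f"{src.name}: missing top-level key {expected_key!r}")
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename over the target,
    # so a reader never observes a half-written manifest.
    fd, tmp_name = tempfile.mkstemp(dir=dest.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        os.replace(tmp_name, dest)
    except BaseException:
        Path(tmp_name).unlink(missing_ok=True)
        raise
```

Validating before the temp file is created means a rejected manifest leaves the destination directory untouched.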

  scripts/generate_corpus_v3_pages.py
    Reads corpus-v3/items/ and emits one Markdown collection document
    per public item under _corpus_v3_items/cid######.md with the full
    item record inlined as frontmatter. Atomic write, idempotent. Wave
    7a default emits the 1886 visibility-public + deprecated-public
    items (36 private/internal items skipped).
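
The visibility filter and the idempotent emit step could be sketched as below. The item dict shape, the function names, and the `layout:` frontmatter line are assumptions for illustration; the PR only states that public and deprecated_public items are emitted with the item record inlined as frontmatter. The atomic write is omitted here for brevity.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterable, List

# Visibility values named in the PR text.
PUBLIC_VISIBILITIES = {"public", "deprecated_public"}


def render_collection_doc(item: Dict[str, Any]) -> str:
    """Render one collection document with the item record as frontmatter.

    Each value is emitted as JSON, which YAML parsers accept as flow syntax;
    sorted keys keep the output byte-stable between runs.
    """
    lines = ["---", "layout: corpus-v3-item"]  # layout key is an assumption
    for key in sorted(item):
        lines.append(f"{key}: {json.dumps(item[key], sort_keys=True)}")
    lines.append("---")
    return "\n".join(lines) + "\n"


def generate_pages(items: Iterable[Dict[str, Any]], out_dir: Path) -> List[Path]:
    """Write <cid>.md for each public item and return the files that changed."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for item in items:
        if item.get("visibility") not in PUBLIC_VISIBILITIES:
            continue  # private/internal items never reach the site
        target = out_dir / f"{item['cid']}.md"
        body = render_collection_doc(item)
        if target.exists() and target.read_text(encoding="utf-8") == body:
            continue  # idempotent: unchanged content is not rewritten
        target.write_text(body, encoding="utf-8")
        written.append(target)
    return written
```

Running the generator twice over the same input returns an empty list the second time, which is one simple way to check idempotence in CI.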

  _layouts/corpus-v3-item.html
    Single layout, conditional sections for atomic / artifact / formal /
    result / proof / tombstone modes. Renders:
      - hero card with title, CID, primary alias, status, version
      - tombstone banner (when visibility deprecated_public)
      - payload section (markdown / latex / artifact card / page)
      - proof block (when type: proof) — chips for mode/status/
        formality/version-pinning, conditional warning, structured
        steps with prrp:// usage badges
      - result block (when type: result) — class/strength chips,
        commentary (short/technical/scope_limits), inspection route,
        falsification surface
      - formalization block (when formalization metadata present) —
        module, declaration, commit, lean_toolchain, axiom audit
      - identifier box (see include)
      - relation panels (see include)
      - version + history meta-list
      - status disclaimer + JSON-LD embed

  _includes/corpus-v3/identifier-box.html
    Compact identifier card. CID + primary alias + type + status +
    visibility + version + all legacy aliases + external identifiers
    (DOI/Wikidata/GitHub/arXiv with hyperlinks) + release lines.
    Web addendum §4 + doctrine §20.

  _includes/corpus-v3/relation-panels.html
    Five panels per web addendum §6: Upstream dependencies / Formalized
    by / Appears in / Supports & Tested by / Downstream uses (declared).
    Aggregates from canonical relations[] + convenience-field shortcuts
    (depends_on, formalized_by, appears_in). Each CID is a chip link to
    /id/{cid}/. Plus a Sources meta-list.
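
The include itself is a Jekyll template, but the aggregation it performs can be illustrated in Python. In this sketch the relation record shape (`predicate`, `target`) and the panel keys are assumptions; only the three convenience fields and the /id/{cid}/ link pattern come from the PR.

```python
from typing import Any, Dict, List

# Predicate -> panel key for the three convenience fields named in the PR.
PANEL_FOR_PREDICATE = {
    "depends_on": "upstream",
    "formalized_by": "formalized_by",
    "appears_in": "appears_in",
}


def build_panels(item: Dict[str, Any]) -> Dict[str, List[str]]:
    """Merge relations[] and convenience fields into per-panel CID lists."""
    panels: Dict[str, List[str]] = {p: [] for p in PANEL_FOR_PREDICATE.values()}

    def add(predicate: str, target: str) -> None:
        panel = PANEL_FOR_PREDICATE.get(predicate)
        # Skip predicates without a panel and targets already listed.
        if panel is not None and target not in panels[panel]:
            panels[panel].append(target)

    for rel in item.get("relations", []):
        add(rel["predicate"], rel["target"])
    for field in PANEL_FOR_PREDICATE:
        for target in item.get(field, []):
            add(field, target)
    return panels


def chip_href(cid: str) -> str:
    """Each chip links to the item's permanent page."""
    return f"/id/{cid}/"
```

Deduplicating on insert means an edge declared both in relations[] and in a convenience field renders as one chip rather than two.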

  _includes/corpus-v3/jsonld.html
    schema.org JSON-LD embed for every item. Type mapping:
      monograph → schema:Book
      chapter → schema:Chapter
      paper/note/dossier → schema:ScholarlyArticle
      release_packet/code/proof_package → schema:CreativeWork
      page items → schema:CreativeWork
      everything else → schema:DefinedTerm
    Custom pr: namespace for Corpus-specific predicates (layer, domain,
    formalizedBy, dependsOn, status, visibility, currentVersion,
    releaseLines).
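
The type mapping above can be restated as a lookup with a DefinedTerm fallback. The sketch below is illustrative only: the `pr:` namespace IRI, the base URL parameter, the item field names, and the use of "page" as a type key are assumptions, and the real include is a Jekyll template rather than Python.

```python
import json
from typing import Any, Dict

SCHEMA_TYPE_BY_ITEM_TYPE = {
    "monograph": "Book",
    "chapter": "Chapter",
    "paper": "ScholarlyArticle",
    "note": "ScholarlyArticle",
    "dossier": "ScholarlyArticle",
    "release_packet": "CreativeWork",
    "code": "CreativeWork",
    "proof_package": "CreativeWork",
    "page": "CreativeWork",
}


def schema_type(item_type: str) -> str:
    """Everything not listed explicitly falls back to DefinedTerm."""
    return SCHEMA_TYPE_BY_ITEM_TYPE.get(item_type, "DefinedTerm")


def to_jsonld(item: Dict[str, Any], base_url: str) -> str:
    """Serialize a minimal JSON-LD document for one item."""
    doc: Dict[str, Any] = {
        # The pr: IRI below is a placeholder, not the site's real namespace.
        "@context": {"@vocab": "https://schema.org/", "pr": f"{base_url}/ns#"},
        "@type": schema_type(item["type"]),
        "@id": f"{base_url}/id/{item['cid']}/",
        "name": item["title"],
        "identifier": item["cid"],
    }
    if item.get("depends_on"):
        doc["pr:dependsOn"] = [
            {"@id": f"{base_url}/id/{cid}/"} for cid in item["depends_on"]
        ]
    return json.dumps(doc, indent=2, sort_keys=True)
```

Pointing `pr:dependsOn` at `@id` nodes rather than bare strings lets a consumer follow the edge to the dependency's own page.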

  _config.yml
    Adds corpus_v3_items collection with permalink: /id/:name/.

  _data/corpus_v3/
    Synced manifests (7 YAML files, including the transition manifest).

  _corpus_v3_items/
    1886 generated Markdown collection documents — one per public
    Corpus Item. Each becomes /id/cid######/ in the built site.

Doctrinal milestones reached

  - Every public Corpus v3 item now has a permanent /id/cid######/ URL
  - All 2992+ aliases tracked across items remain discoverable
  - prrp:// references in proof items resolve to internal CIDs which
    in turn route to /id/cid######/
  - JSON-LD makes Corpus Items machine-readable for search engines
    and downstream consumers
  - The relation panel pattern + identifier box pattern are now reusable
    for the Result lane refactor in a later wave
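
The prrp:// routing named in the milestones can be sketched as a two-step lookup: parse the reference, then map the alias to a CID. The regular expression covers only the form shown in the test plan (prrp://def0001@v1); the alias index shape (uppercase alias to CID) and the function names are assumptions.

```python
import re
from typing import Mapping, Optional, Tuple

# Alias, then an optional @vN version pin. Other forms may exist in the corpus.
PRRP_REF = re.compile(r"prrp://(?P<alias>[a-z0-9]+)(?:@(?P<version>v\d+))?")


def parse_prrp(ref: str) -> Tuple[str, Optional[str]]:
    """Split a prrp:// reference into (alias, version or None)."""
    match = PRRP_REF.fullmatch(ref)
    if match is None:
        raise ValueError(f"not a recognized prrp:// reference: {ref!r}")
    return match.group("alias"), match.group("version")


def resolve_to_url(ref: str, alias_index: Mapping[str, str]) -> str:
    """Resolve a prrp:// reference to its /id/{cid}/ page via the alias index."""
    alias, _version = parse_prrp(ref)
    cid = alias_index[alias.upper()]  # raises KeyError for unknown aliases
    return f"/id/{cid}/"
```

For example, with an index mapping DEF0001 to cid000007, `resolve_to_url("prrp://def0001@v1", index)` yields `/id/cid000007/`.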

Wave 7b (deferred to follow-up): Result page filtered views (e.g.,
/results/physics filtered by layers contains E1), Pagefind search
integration extension to index corpus_v3 collection, /corpus/graph/
public landing page, more elaborate tombstone polish.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ThorFuchs requested a review from AnSoFuchs as a code owner on May 23, 2026 20:08

import argparse
import os
import shutil
import sys
from pathlib import Path
from typing import Any
ThorFuchs merged commit 7557891 into main on May 23, 2026
5 checks passed
ThorFuchs deleted the feat/corpus-v3-projection-layer branch on May 23, 2026 20:18
ThorFuchs added a commit that referenced this pull request May 23, 2026
…rects + /corpus/graph/ stats + Pagefind facets (#272)

Builds on Wave 7 (PR #271) with four discoverability and navigability
enhancements that surface the Corpus v3 graph more deeply on the public
site.

What lands

  scripts/generate_corpus_v3_pages.py
    Extended with build_reverse_index() — computes the incoming-edge
    graph by walking every item's relations[] + convenience fields
    (depends_on, contains, formalized_by, appears_in, part_of, proves,
    supports, formalizes). Result: 1417 items have downstream links,
    5681 total reverse edges tracked. Each page now carries the
    downstream_uses[] frontmatter for the relation panel.
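
One way to compute the incoming-edge graph is shown below. Only the function name and the list of convenience fields come from this commit; the body, the item dict shape, and the edge record shape are assumptions, not the script's actual code.

```python
from collections import defaultdict
from typing import Any, Dict, List, Set, Tuple

CONVENIENCE_FIELDS = (
    "depends_on", "contains", "formalized_by", "appears_in",
    "part_of", "proves", "supports", "formalizes",
)


def build_reverse_index(
    items: Dict[str, Dict[str, Any]],
) -> Dict[str, List[Dict[str, str]]]:
    """Map each target CID to the (source, predicate) edges pointing at it."""
    incoming: Dict[str, List[Dict[str, str]]] = defaultdict(list)
    seen: Set[Tuple[str, str, str]] = set()

    def add(source: str, predicate: str, target: str) -> None:
        edge = (source, predicate, target)
        if edge in seen:
            return  # same edge declared in relations[] and a convenience field
        seen.add(edge)
        incoming[target].append({"source": source, "predicate": predicate})

    for cid, item in items.items():
        for rel in item.get("relations", []):
            add(cid, rel["predicate"], rel["target"])
        for field in CONVENIENCE_FIELDS:
            for target in item.get(field, []):
                add(cid, field, target)
    return dict(incoming)
```

The number of keys in the result corresponds to "items with downstream links", and the summed list lengths to "total reverse edges", under the assumption that duplicate declarations count once.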

  _includes/corpus-v3/relation-panels.html
    Adds a "Downstream uses (computed)" panel rendering the reverse
    index. Each chip shows source primary alias + source type, with
    full tooltip (title + predicate) and 25-item cap with overflow note.
    This is critical for Wikipedia-style discoverability per web addendum §6.

  scripts/generate_corpus_v3_alias_redirects.py
    New generator emitting one Jekyll redirect page per public typed
    alias under _corpus_v3_alias_redirects/{alias-lower}.md → /id/{cid}/.
    Skips aliases that collide with reserved top-level paths (verify,
    publications, etc.). 1893 redirects generated, 0 collisions.
    Doctrine §14.3 — alias short routes for citation continuity.
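
The alias-to-redirect planning step, including the reserved-path check, might be sketched as follows. The reserved set lists only the two paths this commit names, and the alias index shape is assumed. The `redirect_to` frontmatter key assumes the jekyll-redirect-from plugin; the commit does not say which redirect mechanism is actually used.

```python
from typing import Dict, FrozenSet, List, Mapping, Tuple

# Only these two reserved paths are named in the commit; the full list is not given.
RESERVED_PATHS: FrozenSet[str] = frozenset({"verify", "publications"})


def plan_alias_redirects(
    alias_index: Mapping[str, str],
    reserved: FrozenSet[str] = RESERVED_PATHS,
) -> Tuple[Dict[str, str], List[str]]:
    """Return (slug -> /id/{cid}/ redirects, aliases skipped as collisions)."""
    redirects: Dict[str, str] = {}
    collisions: List[str] = []
    for alias, cid in sorted(alias_index.items()):
        slug = alias.lower()
        if slug in reserved:
            collisions.append(alias)  # would shadow an existing top-level page
            continue
        redirects[slug] = f"/id/{cid}/"
    return redirects, collisions


def render_redirect_doc(target: str) -> str:
    """Frontmatter-only document; assumes jekyll-redirect-from's redirect_to."""
    return f"---\nredirect_to: {target}\n---\n"
```

Returning the collisions alongside the redirects lets the generator report a count such as "0 collisions" instead of skipping silently.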

  _config.yml
    Adds corpus_v3_alias_redirects collection with permalink: /:name/.

  corpus/graph/index.md
    Adds "Corpus v3 — live graph stats" section reading from
    site.data.corpus_v3.cid-index. Renders total Corpus Items + total
    aliases + total v2→v3 transitions as a totals chip. Type
    distribution as 2-column ul. Six representative sample item pages
    for orientation. Discipline pointer back to the Charter.

  _layouts/corpus-v3-item.html
    Adds data-pagefind-filter (lane, type, status) and data-pagefind-meta
    (cid, primary alias) on the article element. Enables Pagefind UI to
    filter search results by item type/status and surface CID + primary
    alias in search hit cards.

Sample reverse-index links surfaced

  THM0001 (master constant calibration)
    BOK0001 monograph (has_part) — Book II contains this theorem
    CHP0001 chapter (has_part) — Chapter 5 derives it
    NOT0001 research note (summarizes)
    PRF0001 proof (proves) + PRF0002 (uses)
    FTH0001 formal theorem (formalizes)
    ...etc

  DEF0001 (earned boundary constants)
    BOK0001 + CHP0001 (introduces / has_part)
    LEM0001 (depends_on) + PRP0001 (depends_on)
    THM0001 (depends_on)
    PRF0001-PRF0006 (uses) — every proof item uses this
    ...etc

Total: 1893 alias redirects ready to ship (e.g., /thm0001/, /def0001/,
/pap0001/, /dos0001/, /mod0001/, /fth0001/ — every typed alias).

Net diff: ~1900 files modified (mostly downstream_uses[] frontmatter
additions to existing _corpus_v3_items/*.md), plus the new alias
redirects script, ~1893 new redirect MD files, the _config.yml addition,
the relation-panels.html update, and the /corpus/graph/ append.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>