OEA: Structured Recursive Calibration for Generative Stability

Author: Tristen Pierson, BitConcepts Research

What This Is

An empirical study of whether recursive generative stability depends more on directional calibration and epistemic filtering than on retrieval augmentation or generic decoding constraints.

The OEA (Ontology, Epistemic, Agentic) framework is a three-layer generation-time protocol tested across 4 language models (82M to 1.5B parameters) and 3 architecture families (GPT-2, GPT-Neo, Qwen). Key result: inverting the calibration signal lowers mean log-probability by 0.55 to 1.37 nats, while correct calibration raises it by 0.62 to 1.63 nats.

Read the paper on Academia.edu

Quick Start

Prerequisites: Python 3.11+ and pip.

pip install -r requirements-lock.txt

Run all bigram experiments (about 2 minutes, no GPU needed):

bash scripts/run_all_experiments.sh

Run real LLM experiments (GPU recommended, 10-30 min per model):

python experiments/real_lm_experiment.py --model distilgpt2
python experiments/real_lm_experiment.py --model gpt2
python experiments/real_lm_experiment.py --model EleutherAI/gpt-neo-125M
python experiments/real_lm_experiment.py --model Qwen/Qwen2.5-1.5B

CPU is supported with a reduced configuration: append --n-seeds 3 --n-iterations 5 --gen-tokens 40 to any of the commands above.

Verify result integrity:

python experiments/verify_manifest.py
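
For readers unfamiliar with this kind of check, here is a minimal sketch of SHA-256 manifest verification. It is not the code in experiments/verify_manifest.py. The manifest layout (a flat JSON object mapping relative paths to hex digests) and the function names sha256_of and check_manifest are assumptions made for illustration; the real manifest.json may be structured differently.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_manifest(manifest_path: Path, root: Path) -> list[str]:
    """Return a list of problems; an empty list means every file matched.

    Assumed (hypothetical) manifest layout: {"relative/path": "<hex digest>"}.
    """
    expected = json.loads(manifest_path.read_text(encoding="utf-8"))
    problems = []
    for rel_path, want in sorted(expected.items()):
        target = root / rel_path
        if not target.is_file():
            problems.append(f"missing: {rel_path}")
        elif sha256_of(target) != want:
            problems.append(f"hash mismatch: {rel_path}")
    return problems
```

Calling check_manifest(Path("experiments/manifest.json"), Path(".")) would then list any missing or altered artifacts, provided the manifest matches the assumed layout.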

Build the manuscript PDF (requires MiKTeX or TeX Live):

scripts/build_pdf.cmd

See REPRODUCE.md for the full step-by-step guide.

GPU Support

The experiment harness auto-detects the best available device (cuda > rocm > xpu > mps > cpu). Use --device <backend> to override.
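
The priority order above can be illustrated with a short sketch. This is not the harness's actual implementation: the names PRIORITY, pick_device, and probe_backends are hypothetical, and the probing logic is one plausible way to map PyTorch's availability checks onto the five backend names.

```python
# Priority order quoted from this README: cuda > rocm > xpu > mps > cpu.
PRIORITY = ("cuda", "rocm", "xpu", "mps", "cpu")


def pick_device(available, override=None):
    """Choose a backend name from the set of available ones.

    `override` mimics a --device flag: it must be a known backend
    and must actually be available, otherwise an error is raised.
    """
    if override is not None:
        if override not in PRIORITY:
            raise ValueError(f"unknown backend: {override!r}")
        if override not in available:
            raise RuntimeError(f"requested backend {override!r} is not available")
        return override
    for name in PRIORITY:
        if name in available:
            return name
    return "cpu"  # CPU is always a valid fallback


def probe_backends():
    """Best-effort probe of PyTorch backends; returns {"cpu"} if torch is missing."""
    found = {"cpu"}
    try:
        import torch
    except ImportError:
        return found
    if torch.cuda.is_available():
        # ROCm builds of PyTorch report through torch.cuda and set torch.version.hip.
        found.add("rocm" if getattr(torch.version, "hip", None) else "cuda")
    xpu = getattr(torch, "xpu", None)  # absent in older PyTorch releases
    if xpu is not None and xpu.is_available():
        found.add("xpu")
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        found.add("mps")
    return found
```

For example, pick_device(probe_backends()) returns the highest-priority backend present on the machine. Note that on ROCm builds the PyTorch device string is still "cuda", so a real harness would need to translate the "rocm" label before creating a torch.device.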

Hardware	Install command	Test status
NVIDIA CUDA 12.1	`pip install torch==2.3.1+cu121 --index-url https://download.pytorch.org/whl/cu121`	✅ Verified (RTX 4070 SUPER, Win 11)
NVIDIA CUDA 12.4+	`pip install torch --index-url https://download.pytorch.org/whl/cu124`	✅ Verified
CPU only	`pip install torch --index-url https://download.pytorch.org/whl/cpu`	✅ Verified
AMD ROCm 6.x	`pip install torch --index-url https://download.pytorch.org/whl/rocm6.3`	⚠️ Community-tested
Intel Arc / Xe XPU	`pip install torch --index-url https://download.pytorch.org/whl/xpu`	⚠️ Community-tested
Apple Silicon (MPS)	`pip install torch` (macOS 13+, auto-detected)	⚠️ Community-tested

CI note: GPU paths are not tested in CI because GitHub-hosted runners have no GPU hardware. Only the CPU-based unit tests and the LaTeX compile run automatically. If you run on ROCm, XPU, or MPS, please report your result (pass or fail) using the Hardware Compatibility template.

Docker

Image	GPU	Status	Build command
`Dockerfile`	CPU only	✅ Verified	`docker build -t oea-framework .`
`Dockerfile.cuda`	NVIDIA CUDA 12.1	✅ Verified	`docker build -f Dockerfile.cuda -t oea-framework-cuda .`
`Dockerfile.rocm`	AMD ROCm 6.x	⚠️ Community-tested	`docker build -f Dockerfile.rocm -t oea-framework-rocm .`
`Dockerfile.xpu`	Intel Arc / Xe XPU	⚠️ Community-tested	`docker build -f Dockerfile.xpu -t oea-framework-xpu .`
(none)	Apple Silicon (MPS)	❌ Not Docker-compatible	Not applicable; use native install

ROCm requires --device /dev/kfd --device /dev/dri --group-add render --group-add video at runtime (Linux only). XPU requires --device /dev/dri at runtime (Linux only). For Apple Silicon, install natively, because MPS is not accessible from inside Docker containers.

Report ROCm/XPU/MPS results via the Hardware Compatibility template.

Repository Structure

arxiv/
  main.tex              LaTeX manuscript (14 pages)
  references.bib        13 verified citations
  figures/              3 publication figures

experiments/
  credibility_suite.py       Bigram-proxy ablation harness (12 variants)
  real_lm_experiment.py      Real LLM recursive stability experiment
  baseline_competition.py    OEA vs 5 non-OEA controls
  recursive_memory_drift.py  30-step recursive memory benchmark
  generate_figures.py        Generates all publication figures
  verify_manifest.py         SHA-256 artifact integrity checker
  manifest.json              Hashes for all committed results
  data/                      Public-domain corpora

results/                     Committed experiment artifacts
scripts/                     Setup, build, and run scripts
tests/                       12 unit tests (pytest)
REPRODUCE.md                 Step-by-step reproduction guide
Dockerfile                   CPU reproducibility container
Dockerfile.cuda              NVIDIA CUDA 12.1 GPU container (verified)
Dockerfile.rocm              AMD ROCm 6.x GPU container (community-tested)
Dockerfile.xpu               Intel Arc / Xe XPU container (community-tested)

Experiments

Experiment	What it tests	Runtime
Credibility suite	12-variant ablation, 648 runs each	~90s (CPU)
Real LLM validation	4 models, 4 variants, 10 seeds x 10 iterations	~10-30 min/model (GPU)
Memory drift	30-step recursive summarization, 20 seeds	~5s (CPU)
Baseline competition	OEA vs temperature, top-k, entropy, repetition, RAG-only	~5s (CPU)

Metrics

Log-probability — mean per-token log-prob under frozen reference model (primary metric)
ROUGE-L recall — seed-corpus content preservation (independent of log-prob)
JSD — Jensen-Shannon divergence from seed distribution
TRR / FRR — true/false rejection rates for out-of-vocabulary token detection
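
As a reference point for two of the metrics above, the following sketch computes Jensen-Shannon divergence (in nats) and ROUGE-L recall from their standard definitions. It is not taken from the repository. In particular, the README does not say which distribution the JSD is computed over, so the unigram_distribution helper is an assumption, as are all function names.

```python
import math
from collections import Counter


def jsd(p, q):
    """Jensen-Shannon divergence in nats between two discrete distributions.

    p and q map outcomes to probabilities; missing keys count as zero.
    The result lies between 0 and ln(2).
    """
    total = 0.0
    for key in set(p) | set(q):
        pk, qk = p.get(key, 0.0), q.get(key, 0.0)
        mk = 0.5 * (pk + qk)  # mixture distribution M = (P + Q) / 2
        if pk > 0:
            total += 0.5 * pk * math.log(pk / mk)
        if qk > 0:
            total += 0.5 * qk * math.log(qk / mk)
    return total


def unigram_distribution(tokens):
    """Relative token frequencies (assumed choice of distribution)."""
    counts = Counter(tokens)
    n = sum(counts.values())
    return {tok: c / n for tok, c in counts.items()}


def rouge_l_recall(reference, candidate):
    """Longest-common-subsequence length divided by the reference length."""
    if not reference:
        return 0.0
    prev = [0] * (len(candidate) + 1)
    for ref_tok in reference:
        curr = [0]
        for j, cand_tok in enumerate(candidate, start=1):
            if ref_tok == cand_tok:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1] / len(reference)
```

With the seed text as the reference and a generated text as the candidate, rouge_l_recall measures how much of the seed's token sequence survives in order, while jsd(unigram_distribution(seed), unigram_distribution(generated)) measures how far the token frequencies have drifted.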

Citation

@misc{pierson2026oea,
  title={OEA: Structured Recursive Calibration for Generative Stability},
  author={Pierson, Tristen},
  year={2026},
  howpublished={https://github.com/BitConcepts/oea-framework-paper}
}

License

Code: MIT | Paper: CC BY 4.0
