Run CV experiments reproducibly across Colab, Kaggle, browser workflows, and GPU VMs, then decide whether a result is strong enough to promote.
CV Repro Lab Skills packages two public OpenClaw/Codex skills:
- data-science-cv-repro-lab: run experiments, capture browser and notebook evidence, and bundle results for review
- sota-agent: define the benchmark, rank candidates, and decide whether a claimed gain is real
Use both together when you want one planning lane and one execution lane:
- sota-agent freezes the benchmark, candidate list, rerun policy, and claim rules before more compute gets spent
- data-science-cv-repro-lab executes runs across Colab, Kaggle, browser-heavy workflows, or VMs and captures the evidence needed to promote or reject the result
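As a rough sketch of that handoff (illustrative only — the function and field names here are assumptions, not the skills' actual schema), the planning lane can freeze a benchmark spec and the execution lane can refuse any run recorded against a different spec:

```python
import hashlib
import json

def freeze_benchmark(spec: dict) -> dict:
    """Planning lane: pin the benchmark and return it with a content hash."""
    digest = hashlib.sha256(json.dumps(spec, sort_keys=True).encode()).hexdigest()
    return {"spec": spec, "hash": digest}

def record_run(frozen: dict, run: dict) -> dict:
    """Execution lane: only accept runs that reference the frozen benchmark hash."""
    if run.get("benchmark_hash") != frozen["hash"]:
        raise ValueError("run was produced against a different benchmark spec")
    return {**run, "accepted": True}

frozen = freeze_benchmark({"task": "demo", "metric": "score", "split": "val"})
run = record_run(frozen, {"candidate": "baseline", "score": 0.71,
                          "benchmark_hash": frozen["hash"]})
```

The point of the hash is that "more compute" spent after the freeze is always attributable to exactly one benchmark definition.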
Use data-science-cv-repro-lab when you need:
- Colab, Kaggle, or VM execution discipline
- browser evidence and validation scorecards
- dataset manifests, run cards, and promotion bundles
- reproducible artifact capture for a real training or export lane
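For intuition, a minimal run card might look like the following (the field names are illustrative assumptions; the skill defines its own record format):

```python
import hashlib
import json
import pathlib
import tempfile

def make_run_card(command: str, dataset_files: dict, metrics: dict) -> dict:
    """Bundle what was run, on which data (by content hash), with what result."""
    manifest = {name: hashlib.sha256(blob).hexdigest()
                for name, blob in dataset_files.items()}
    return {"command": command, "dataset_manifest": manifest, "metrics": metrics}

card = make_run_card(
    command="python train.py --epochs 10",
    dataset_files={"train.csv": b"a,b\n1,2\n"},
    metrics={"score": 0.71},
)
out = pathlib.Path(tempfile.mkdtemp()) / "run-card.json"
out.write_text(json.dumps(card, indent=2))
```

Hashing the dataset files, rather than just naming them, is what lets a reviewer later verify the run used the data it claims.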
Use sota-agent when you need:
- a fixed benchmark before spending more compute
- literature triage and candidate ranking
- ablation discipline and rerun policy for small deltas
- an honest claim decision instead of benchmark theater
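The rerun discipline for small deltas can be sketched as a simple gate (the thresholds and names here are illustrative assumptions, not the skill's actual policy):

```python
from statistics import mean

def claim_decision(baseline: float, candidate_runs: list,
                   min_delta: float = 0.005, min_reruns: int = 3) -> str:
    """Promote only when the mean gain clears the threshold across enough
    reruns; small or single-run deltas must be rerun before any claim."""
    delta = mean(candidate_runs) - baseline
    if delta <= 0:
        return "reject"
    if delta < min_delta or len(candidate_runs) < min_reruns:
        return "rerun"
    return "promote"
```

A single lucky run never promotes under this gate: a positive but under-threshold or under-sampled delta routes back to "rerun" rather than becoming a claim.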
Install from ClawHub or copy the skill folders into $CODEX_HOME/skills/.
mkdir -p "$CODEX_HOME/skills"
rsync -a skill/data-science-cv-repro-lab/ "$CODEX_HOME/skills/data-science-cv-repro-lab/"
rsync -a skill/sota-agent/ "$CODEX_HOME/skills/sota-agent/"

- CV Repro Lab on ClawHub (v1.9.1)
- SOTA Agent on ClawHub (v1.4.1)
- Portfolio entry
- added an explicit improvement harness for plateau recovery and score-improvement work
- added a review-dashboard manifest for synced QA runs, benchmark panels, runtime sweeps, and audit surfaces
- expanded run cards and candidate/program records with reruns, slices, agent threads, and auth policy
- added explicit dashboard, source-audit, and leakage-audit references to the SOTA claim surface
- added redacted public summary rendering for the richer machine-readable records
- made OAuth-backed ChatGPT/Codex paths the default public story instead of API-key-first tooling
These skills are strongest when the user already has a real CV or DS workflow and wants a drop-in research harness around it. Good fits include:
- derm or segmentation plateau recovery
- browser-heavy notebook workflows
- benchmark campaigns that need stronger promotion gates
- public or reusable experiment-management patterns across repos
ClawHub is public. Keep the published skill bundles free of:
- absolute local paths
- browser profile names
- private notebook URLs
- secrets, tokens, and customer identifiers
- internal hostnames or VM labels
Private specializations should stay in local override skills, not in the public package.
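A pre-publish scan along these lines can catch the obvious leaks before a bundle goes to ClawHub (the patterns are illustrative assumptions — extend them for your own environment and hostnames):

```python
import re

LEAK_PATTERNS = {
    "absolute local path": re.compile(r"/(?:home|Users)/\w+"),
    "API token": re.compile(r"(?:sk-|ghp_|xoxb-)[A-Za-z0-9_-]{8,}"),
    "private notebook URL": re.compile(
        r"https://colab\.research\.google\.com/drive/\S+"),
}

def scan_text(text: str) -> list:
    """Return the leak categories found in a skill file's contents."""
    return [name for name, pat in LEAK_PATTERNS.items() if pat.search(text)]

# Before publishing, run scan_text over every file in the skill folder
# and fail the release if any category is reported.
```

This is a coarse filter, not a guarantee; it complements, rather than replaces, a manual review of the bundle.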
Report security issues privately using the contact path in SECURITY.md instead of filing a public issue with sensitive details.
python3 -m py_compile \
skill/data-science-cv-repro-lab/scripts/*.py \
skill/sota-agent/scripts/*.py
python3 skill/data-science-cv-repro-lab/scripts/init_cv_improvement_harness.py \
--out /tmp/cv-harness.json \
--task-id demo \
--candidate-family baseline-recovery
python3 skill/data-science-cv-repro-lab/scripts/init_cv_review_dashboard_manifest.py \
--out /tmp/cv-dashboard.json \
--dashboard-id demo-dashboard \
--title "Demo review dashboard"
python3 skill/sota-agent/scripts/init_sota_program.py \
--out /tmp/sota-program.json \
--campaign-id demo \
--task demo \
--metric score

The public bundle is informed by: