SentinelOps-CI/dataset-safety-specs

Dataset Safety Specifications

Formal methods and runtime enforcement for safe data and model pipelines.

dataset-safety-specs combines Lean specifications with generated runtime guards, compliance tooling, and strict engineering gates so safety guarantees remain auditable from proof to production.

At a Glance

| Area | What it provides |
|---|---|
| Formal specifications | Lean modules for lineage, policy, optimizer invariants, and shape safety |
| Guard generation | Lean-first extraction to Rust and Python guard libraries |
| Runtime enforcement | Python safety kernel, compliance bundle generation, operational endpoints |
| Delivery controls | Strict CI, proof gating, dependency audits, and security scanning |

Why This Exists

Most safety and compliance systems break down at boundaries: the specification differs from the implementation, CI tolerates regressions, and runtime behavior drifts from intent. This repository addresses that by keeping a single formal source of truth in Lean, generating the Rust and Python guards from it, and gating delivery on proofs, tests, and security checks.

Quick Start

git clone https://github.com/SentinelOps-CI/dataset-safety-specs.git
cd dataset-safety-specs

# Install dev toolchain
pip install -r requirements-dev.txt

# Build formal project
lake build

# Run local quality gates
pre-commit run --all-files
lake exe test_suite
lake exe benchmark_suite
python python/regression_tests.py
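
The local gates above can also be chained so that the first failure stops the run. The following is a minimal sketch, not a script shipped with the repository: the `run_gates` helper and the `GATES` list are hypothetical names, and the commands are copied from the listing above.

```python
import subprocess

# The local quality gates from the Quick Start, in the same order.
GATES = [
    ["pre-commit", "run", "--all-files"],
    ["lake", "exe", "test_suite"],
    ["lake", "exe", "benchmark_suite"],
    ["python", "python/regression_tests.py"],
]


def run_gates(gates, run=subprocess.call):
    """Run each command in order and stop at the first failure.

    Returns the failing command, or None if every gate exits with 0.
    `run` is injectable so the logic can be exercised without the toolchain.
    """
    for command in gates:
        if run(command) != 0:
            return command
    return None
```

Calling `run_gates(GATES)` from the repository root and exiting non-zero when the result is not `None` would mirror the fail-hard behavior described for CI.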

Common Workflows

Generate Guards (Canonical)

lake exe extract_guard

Outputs:

  • extracted/rust for Rust guard artifacts
  • extracted/python for Python guard artifacts
  • generated runtime server entrypoint (ds_guard.server) for operational health endpoints

Verify Model Shape Safety

lake exe shapesafe_verify model.onnx
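
This README does not describe how `shapesafe_verify` works internally. As an illustration of the general idea behind shape safety only, the sketch below checks that each layer's expected input dimension matches the previous layer's output. The `check_chain` function and the `(in_dim, out_dim)` layer encoding are assumptions made for this example and are not the tool's API.

```python
def check_chain(input_dim, layers):
    """Check a chain of (in_dim, out_dim) layers against a starting dimension.

    Returns the final output dimension, or raises ValueError naming the
    first layer whose expected input does not match what it receives.
    """
    dim = input_dim
    for index, (in_dim, out_dim) in enumerate(layers):
        if in_dim != dim:
            raise ValueError(
                f"layer {index}: expected input {in_dim}, got {dim}"
            )
        dim = out_dim
    return dim
```

For example, `check_chain(784, [(784, 128), (128, 10)])` returns `10`, while swapping the second layer for `(64, 10)` raises a `ValueError` for layer 1.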

Build Distribution Bundles

./bundle.sh bundle

Repository Layout

| Path | Purpose |
|---|---|
| src/DatasetSafetySpecs | Lean source-of-truth specifications and extraction logic |
| python | Runtime integrations, parsers, safety kernel, and compliance tooling |
| extracted | Generated artifacts and deployment packaging |
| .github/workflows | CI, formal verification, security, and release workflows |

Runtime Operations

The generated runtime server (ds_guard.server) exposes:

  • /health
  • /ready
  • /metrics

These endpoints are aligned with Docker/Kubernetes deployment manifests and monitoring configuration.
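
To make the endpoint contract concrete, here is a minimal standard-library sketch of a server answering the same three paths. It is not the generated `ds_guard.server`: the response bodies, the `STATE` dictionary, the `requests_total` counter, and the default port are all assumptions chosen for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical process state; a real server would flip "ready" once guards load.
STATE = {"ready": False, "requests_total": 0}


def route(path, state):
    """Return (status_code, content_type, body) for a GET request path."""
    state["requests_total"] += 1
    if path == "/health":
        # Liveness: the process is up and able to answer.
        return 200, "application/json", json.dumps({"status": "ok"})
    if path == "/ready":
        # Readiness: 503 until the service declares itself ready.
        code = 200 if state["ready"] else 503
        return code, "application/json", json.dumps({"ready": state["ready"]})
    if path == "/metrics":
        # Plain-text counter line, one metric per line.
        return 200, "text/plain", f"requests_total {state['requests_total']}\n"
    return 404, "application/json", json.dumps({"error": "not found"})


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, content_type, body = route(self.path, STATE)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Mark the process ready and serve until interrupted."""
    STATE["ready"] = True
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Separating `/health` from `/ready` lets an orchestrator restart a dead process while only withholding traffic from one that is still starting up.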

Quality and Security Standards

  • Required CI checks fail hard (no permissive pass-through behavior).
  • Production proof-gated modules must not contain unresolved placeholder proofs.
  • Security checks include static analysis, dependency audits, and secret scanning.
  • Dependency drift is managed with automated update PRs.
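
The placeholder-proof rule can be approximated with a simple text scan. The sketch below assumes that "placeholder proof" means Lean's `sorry` keyword, which this README does not state explicitly; the helper names are hypothetical, and this is not the repository's actual gate. It ignores `--` line comments but does not handle block comments or string literals.

```python
import re
from pathlib import Path

# `sorry` is Lean's placeholder for an unfinished proof.
SORRY = re.compile(r"\bsorry\b")


def strip_line_comment(line):
    """Drop a trailing `--` line comment (block comments are not handled)."""
    return line.split("--", 1)[0]


def find_placeholders(text):
    """Return 1-based line numbers whose code part contains `sorry`."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if SORRY.search(strip_line_comment(line))
    ]


def scan_tree(root):
    """Map each .lean file under `root` to its placeholder line numbers."""
    hits = {}
    for path in sorted(Path(root).rglob("*.lean")):
        lines = find_placeholders(path.read_text(encoding="utf-8"))
        if lines:
            hits[str(path)] = lines
    return hits
```

A CI step could call `scan_tree("src/DatasetSafetySpecs")` and fail the job whenever the returned mapping is non-empty.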

Documentation

  • Module/API reference: docs/index.md
  • Contribution guide: CONTRIBUTING.md
  • Risk register: RISK_REGISTER.md
  • Security policy: SECURITY.md
  • Release history: CHANGELOG.md

License

Licensed under MIT. See LICENSE.

About

Formal verification framework for dataset lineage, policy compliance, and training-time safety guarantees.
