15 changes: 15 additions & 0 deletions config/fixops.overlay.yml
@@ -42,6 +42,10 @@
"enforce_ticket_sync": false,
"capture_feedback": false
},
"signing": {
"provider": "env",
"rotation_sla_days": 45
},
"modules": {
"guardrails": {"enabled": true},
"context_engine": {"enabled": true},
@@ -282,6 +286,17 @@
}
}
},
"policy_engine": {
"opa": {
"enabled": false,
"url": "https://opa.fixops.local:8181",
"policy_package": "fixops",
"health_path": "/health?bundles",
"bundle_status_path": "/v1/bundles/fixops/status",
"auth_token_env": "FIXOPS_OPA_TOKEN",
"request_timeout_seconds": 5
}
},
"analytics": {
"baseline": {
"findings_per_interval": 120,
76 changes: 14 additions & 62 deletions docs/decisionfactory_alignment.md
@@ -1,78 +1,30 @@
# DecisionFactory.ai Alignment Status

This document tracks the implementation status of the DecisionFactory.ai requirements against the current FixOps blended enterprise codebase. Each section references the authoritative source files that were reviewed to determine coverage.
This document tracks the implementation status of the DecisionFactory.ai requirements across the FixOps codebase. To reduce cognitive load, the alignment work is now split into three parts that can be reviewed independently:

## 1. Evidence must be RSA-SHA256 signed (non-repudiation)
- **Status:** ✅ Implemented
- **Notes:** `EvidenceLake.store_evidence` now applies `rsa_sign` from `src/utils/crypto.py` and persists the resulting Base64 signature, algorithm metadata, and key fingerprint alongside the existing SHA-256 checksum. Retrieval verifies both the checksum and signature before returning evidence records.
- References: `src/services/evidence_lake.py`, `src/utils/crypto.py`
- **Part 1 – Implemented capabilities:** Everything DecisionFactory.ai already gets out-of-the-box.
- **Part 2 – Partially implemented capabilities:** Workstreams that are in motion but still have visible gaps.
- **Part 3 – Missing capabilities:** Features that have not yet been started.

## 2. OPA/Rego policy-as-code runtime (demo+enterprise)
- **Status:** ❌ Missing
- **Notes:** The policy engine still uses a hand-rolled evaluator (`_evaluate_rego_rule`) and never instantiates the production OPA adapter in `src/services/real_opa_engine.py`. No policy input payload is sent to an OPA instance, and there are no automated tests covering Rego bundles.
- References: `src/services/policy_engine.py`, `src/services/real_opa_engine.py`

## 3. Explainability with SHAP/LIME alongside LLM narratives
- **Status:** ❌ Missing
- **Notes:** Processing relies on LLM-driven explanations without any SHAP/LIME feature attribution artefacts. There is no `xai_shap.py` module and no `/processing/explain` endpoint that emits attribution vectors.
- References: `src/services/processing_layer.py`, `src/api/v1/processing_layer.py`

## 4. RL/MDP learning loop for actions (defer/patch/accept)
- **Status:** ❌ Missing
- **Notes:** `enhanced_decision_engine` lacks any reinforcement-learning policy hooks. There is no `rl_policy.py`, experience logging, or `FEATURE_RL` toggle.
- References: `src/services/enhanced_decision_engine.py`

## 5. VEX ingestion (SPDX/CycloneDX) to suppress `not_affected`
- **Status:** ❌ Missing
- **Notes:** SBOM parsing ignores VEX data and there is no `vex_parser`. Findings with vendor `NOT_AFFECTED` assertions remain untouched during triage.
- References: `src/services/sbom_parser.py`

## 6. EPSS/KEV should influence SSVC/Markov transitions
- **Status:** ⚠️ Partial
- **Notes:** Feed ingestion captures counts, but the processing layer does not adjust SSVC priors or Markov transition probabilities based on EPSS percentiles or KEV membership.
- References: `src/services/feeds_service.py`, `src/services/processing_layer.py`
---

## 7. Policy gate must BLOCK any KEV finding unless waived
- **Status:** ⚠️ Partial
- **Notes:** `/policy/evaluate` blocks when KEV findings coincide with high/critical severity, yet it lacks waiver handling and does not enforce a hard block for all KEV detections as required.
- References: `src/api/v1/policy.py`
## Part 1 – Implemented capabilities ✅

## 8. Evidence export: signed JSON + printable PDF bundle
- **Status:** ❌ Missing
- **Notes:** There is no exporter that assembles a signed JSON + PDF package or a `/evidence/{id}/download` route. Evidence storage ends with database persistence only.
- References: `src/services/evidence_lake.py`
See [`Part 1 – Implemented capabilities`](decisionfactory_alignment/part-1-implemented.md) for the full breakdown of production-ready features.

## 9. Key management: KMS/HSM integration and rotation policy
- **Status:** ⚠️ Partial
- **Notes:** `EnvKeyProvider` implements RSA keys and stubs exist for AWS/Azure, but rotation routines, provider configuration flags, and operational documentation remain incomplete relative to the design brief.
- References: `src/utils/crypto.py`, `src/config/settings.py`, `docs/SECURITY.md`
---

## 10. Multi-tenant RBAC (owner, approver, auditor, integrator)
- **Status:** ❌ Missing
- **Notes:** User models do not reference tenants, nor are role checks enforced on policy/evidence/feed APIs as described.
- References: `src/models/user.py`, `src/api/v1/auth.py`
## Part 2 – Partially implemented capabilities ⚠️

## 11. Observability: Prometheus metrics for hot path
- **Status:** ⚠️ Partial
- **Notes:** Health endpoints exist, yet there is no Prometheus exporter capturing the enumerated latency/counter metrics or a bundled Grafana dashboard.
- References: `src/api/v1/monitoring.py`, `src/services/metrics.py`
See [`Part 2 – Partially implemented capabilities`](decisionfactory_alignment/part-2-partial.md) for the detailed list of in-flight workstreams and the remaining gaps to close.

## 12. CLI demo/enterprise overlays
- **Status:** ⚠️ Partial
- **Notes:** CLI overlays toggle demo vs enterprise modes, but flags for signing provider, RL, SHAP, and OPA URL are absent.
- References: `fixops/cli.py`, `config/*.overlay.yml`
---

## 13. CI/CD adapters & Postman collections kept in sync
- **Status:** ⚠️ Partial
- **Notes:** Collections exist but do not include KEV hard-block, SHAP evidence, or signed download test cases. Negative signature validation scenarios are missing.
- References: `src/api/v1/cicd.py`, `postman/FixOps-CICD-Tests.postman_collection.json`
## Part 3 – Missing capabilities ❌

## 14. Kubernetes manifests reflect new env vars and readiness
- **Status:** ⚠️ Partial
- **Notes:** Manifests do not surface `SIGNING_PROVIDER`, `KEY_ID`, `OPA_URL`, or `FEATURE_RL` environment variables. Probe configuration remains unchanged.
- References: `kubernetes/*.yaml`
See [`Part 3 – Missing capabilities`](decisionfactory_alignment/part-3-missing.md) for the six DecisionFactory.ai requirements that still need to be built from scratch.

---

### Summary
The RSA signing pathway has been implemented, but the remaining DecisionFactory.ai alignment items—OPA/Rego integration, SHAP explainability, RL policy learning, VEX ingestion, enriched policy gating, and operational overlays—are still outstanding.
RSA signing is fully aligned today. The remaining work concentrates on production OPA/Rego enforcement, net-new explainability and RL automation, VEX ingestion, richer evidence exports, and operational surface area (policy gating, EPSS/KEV-aware scoring hardening, key management backends, observability, CLI/Kubernetes configurability, and CI/CD test coverage).
9 changes: 9 additions & 0 deletions docs/decisionfactory_alignment/part-1-implemented.md
@@ -0,0 +1,9 @@
# Part 1 – Implemented capabilities ✅

> These requirements are fulfilled in production builds today. Each entry highlights the runtime behaviour and where to find the supporting code.

### 1. Evidence must be RSA-SHA256 signed (non-repudiation)
- **Status:** ✅ Implemented
- **Notes:** Evidence records are serialized in a canonical order, signed with RSA-SHA256, and stored with the Base64 signature, signing algorithm, and public-key fingerprint. Retrieval verifies both the hash and the signature before returning the record to callers.
- References: `fixops-blended-enterprise/src/services/evidence_lake.py`, `fixops-blended-enterprise/src/utils/crypto.py`
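
The canonical-ordering step described above can be sketched with the standard library alone. This is a minimal illustration, not the code in `evidence_lake.py`: the function names and the sample record are hypothetical, and the RSA-SHA256 signing step is deliberately left out because it needs a cryptography library. The point is only that two records with the same content must serialize to identical bytes before they are hashed or signed.

```python
import hashlib
import json
from typing import Any, Mapping


def canonical_bytes(record: Mapping[str, Any]) -> bytes:
    """Serialize deterministically: sorted keys, no insignificant whitespace."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def sha256_checksum(record: Mapping[str, Any]) -> str:
    """Hex SHA-256 of the canonical form; a signer would sign these same bytes."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


# Hypothetical evidence records that differ only in key order.
record_a = {"finding": "example", "verdict": "block"}
record_b = {"verdict": "block", "finding": "example"}
same_digest = sha256_checksum(record_a) == sha256_checksum(record_b)
```

Because key order no longer affects the bytes, a verifier that re-serializes the stored record the same way can recompute the checksum and check the signature against it.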

66 changes: 66 additions & 0 deletions docs/decisionfactory_alignment/part-2-partial.md
@@ -0,0 +1,66 @@
# Part 2 – Partially implemented capabilities ⚠️

> These are the "in-flight" items: some coverage exists, but the DecisionFactory.ai specification still calls for additional functionality.

### 6. EPSS/KEV should influence SSVC/Markov transitions
- **Status:** ⚠️ Partial
- **Current coverage:** Feed refresh jobs persist EPSS/KEV snapshots, and the processing layer adjusts Markov transitions and exploitation priors when the data is present, so the probabilistic core is wired for real signals.
- **Missing work:**
- Guarantee that EPSS/KEV inputs reach every decision path (REST + batch) with regression tests covering the hand-off.
- Add validation that proves fallback heuristics engage when scientific libraries (pgmpy, pomegranate, mchmm) are unavailable.
- Publish operator runbooks documenting how to enable and monitor EPSS/KEV ingestion in production.
- **References:** `fixops-blended-enterprise/src/services/feeds_service.py`, `fixops-blended-enterprise/src/services/processing_layer.py`, `fixops-blended-enterprise/src/services/decision_engine.py`
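
One way EPSS and KEV signals could adjust a Markov transition row is sketched below. This is an assumption-heavy illustration and not the logic in `processing_layer.py`: the state names, the `(1 + epss)` scaling, and the `kev_boost` factor are all hypothetical choices. It shows the general shape of the technique, which is to scale the probability of the exploitation transition and then renormalize so the row still sums to 1.

```python
from typing import Dict


def adjust_transition_row(
    row: Dict[str, float],
    epss: float,
    in_kev: bool,
    target: str = "exploited",  # hypothetical state name
    kev_boost: float = 2.0,     # hypothetical multiplier for KEV membership
) -> Dict[str, float]:
    """Scale the probability of moving to `target`, then renormalize the row."""
    if not 0.0 <= epss <= 1.0:
        raise ValueError("epss must be in [0, 1]")
    factor = (1.0 + epss) * (kev_boost if in_kev else 1.0)
    weights = {s: p * (factor if s == target else 1.0) for s, p in row.items()}
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("row must contain positive probability mass")
    return {s: w / total for s, w in weights.items()}


baseline = {"exploited": 0.2, "patched": 0.8}
adjusted = adjust_transition_row(baseline, epss=0.5, in_kev=True)
```

A regression test for the hand-off mentioned above could assert that the adjusted row differs from the baseline whenever feed data is present and matches it when the feed is missing.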

### 7. Policy gate must BLOCK any KEV finding unless waived
- **Status:** ⚠️ Partial
- **Current coverage:** `/policy/evaluate` escalates KEV-tagged findings to hard blocks when they also carry high or critical severities, so the enforcement logic is wired into the runtime path.
- **Missing work:**
- Implement a waiver object (API + persistence) so platform security can temporarily suppress a KEV block with auditable approval metadata.
- Promote KEV detections to hard blocks regardless of severity unless an approved waiver exists.
- Extend regression suites to prove the deny-by-default behaviour and successful waiver usage.
- **References:** `fixops-blended-enterprise/src/api/v1/policy.py`
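
The target behaviour (block every KEV finding unless an approved, unexpired waiver exists) can be sketched as follows. The `Waiver` fields, the verdict strings, and the `kev_gate` function are hypothetical and do not reflect the current `/policy/evaluate` handler; the identifiers in the sample data are placeholders.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class Waiver:
    """Hypothetical waiver record with auditable approval metadata."""
    cve_id: str
    approved_by: str
    expires_at: datetime


def kev_gate(
    cve_id: str,
    in_kev: bool,
    waivers: Iterable[Waiver],
    now: Optional[datetime] = None,
) -> str:
    """Deny by default: a KEV finding blocks regardless of severity."""
    if not in_kev:
        return "allow"
    now = now or datetime.now(timezone.utc)
    for waiver in waivers:
        # Only an approved waiver for this exact finding that has not expired counts.
        if waiver.cve_id == cve_id and waiver.approved_by and waiver.expires_at > now:
            return "allow_with_waiver"
    return "block"
```

Severity is intentionally absent from the signature: under the stated requirement it should not influence whether a KEV finding blocks.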

### 9. Key management: KMS/HSM integration and rotation policy
- **Status:** ⚠️ Partial
- **Current coverage:** The environment-backed `EnvKeyProvider` ships with RSA signing, on-demand rotation, and operator documentation that spells out how to rotate local keys.
- **Missing work:**
- Flesh out the AWS KMS and Azure Key Vault providers so they can load, rotate, and attest to keys managed remotely.
- Surface configuration flags in settings overlays/CLI to allow tenant-level provider selection.
- Automate rotation health checks and alerts to satisfy the DecisionFactory.ai rotation SLAs.
- **References:** `fixops-blended-enterprise/src/utils/crypto.py`, `docs/SECURITY.md`
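
A rotation health check could be as small as the sketch below. It assumes the provider can report when the active key was last rotated, which is not confirmed by this PR; the function name is hypothetical, and the default of 45 days mirrors the `rotation_sla_days` value added to `config/fixops.overlay.yml`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def rotation_overdue(
    last_rotated: datetime,
    sla_days: int = 45,  # matches "rotation_sla_days" in the overlay
    now: Optional[datetime] = None,
) -> bool:
    """Return True when the key is older than the rotation SLA allows."""
    now = now or datetime.now(timezone.utc)
    return now - last_rotated > timedelta(days=sla_days)
```

An alerting job could call this on a schedule and raise an alert whenever it returns `True`.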

### 11. Observability: Prometheus metrics for hot path
- **Status:** ⚠️ Partial
- **Current coverage:** A `/metrics` endpoint exposes counters for decision verdicts, enabling Prometheus scrapes of core automation throughput.
- **Missing work:**
- Instrument HTTP request latency and error ratios for decision, evidence, and policy endpoints.
- Publish histograms and gauges that map directly to the DecisionFactory.ai hot-path metrics checklist.
- Provide a Grafana dashboard (JSON + screenshots) so adopters can deploy a ready-made view.
- **References:** `fixops-blended-enterprise/src/main.py`, `fixops-blended-enterprise/src/services/metrics.py`
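
To illustrate what a latency histogram looks like on a `/metrics` endpoint, here is a dependency-free sketch that renders the Prometheus text exposition format. The class, the bucket boundaries, and the metric name are hypothetical; a real deployment would more likely use an established Prometheus client library than hand-rolled rendering.

```python
from bisect import bisect_left
from typing import Sequence


class LatencyHistogram:
    """Minimal cumulative histogram rendered in Prometheus text format."""

    def __init__(self, name: str, buckets: Sequence[float] = (0.05, 0.1, 0.5, 1.0)) -> None:
        self.name = name
        self.bounds = sorted(buckets)
        self.counts = [0] * (len(self.bounds) + 1)  # final slot is the +Inf bucket
        self.total = 0.0

    def observe(self, seconds: float) -> None:
        # bisect_left picks the first bound >= seconds, i.e. the "le" bucket.
        self.counts[bisect_left(self.bounds, seconds)] += 1
        self.total += seconds

    def render(self) -> str:
        lines = [f"# TYPE {self.name} histogram"]
        cumulative = 0
        for bound, count in zip(self.bounds, self.counts):
            cumulative += count
            lines.append(f'{self.name}_bucket{{le="{bound}"}} {cumulative}')
        cumulative += self.counts[-1]
        lines.append(f'{self.name}_bucket{{le="+Inf"}} {cumulative}')
        lines.append(f"{self.name}_sum {self.total}")
        lines.append(f"{self.name}_count {cumulative}")
        return "\n".join(lines) + "\n"


histogram = LatencyHistogram("fixops_decision_latency_seconds")  # hypothetical metric name
for sample in (0.07, 0.1, 2.0):
    histogram.observe(sample)
exposition = histogram.render()
```

Bucket counts are cumulative, so each `le` line includes every observation at or below that bound, and the `+Inf` bucket always equals `_count`.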

### 12. CLI demo/enterprise overlays
- **Status:** ⚠️ Partial
- **Current coverage:** The CLI profiles and overlay YAML let operators toggle demo vs. enterprise modules and core automation settings.
- **Missing work:**
- Introduce switches/fields for selecting the signing provider, enabling RL/SHAP experiments, and pointing to external OPA endpoints.
- Validate overlay schema updates with automated tests to ensure flags round-trip into the runtime configuration.
- Document overlay examples for each DecisionFactory.ai deployment persona.
- **References:** `fixops/fixops/cli.py`, `config/fixops.overlay.yml`

### 13. CI/CD adapters & Postman collections kept in sync
- **Status:** ⚠️ Partial
- **Current coverage:** Postman suites already cover health checks, baseline decision outcomes, and happy-path CI/CD interactions.
- **Missing work:**
- Add KEV hard-block scenarios, signed evidence retrieval flows, and negative signature verification tests.
- Keep the CI/CD adapters and Postman collections versioned together with automation that fails when they drift.
- Capture regression data for RL/SHAP toggles so new explainability features remain exercised.
- **References:** `fixops-blended-enterprise/postman/POSTMAN_COMPLETION.md`

### 14. Kubernetes manifests reflect new env vars and readiness
- **Status:** ⚠️ Partial
- **Current coverage:** Deployments ship readiness probes and surface the legacy secret/env var set.
- **Missing work:**
- Add ConfigMap entries and deployment wiring for `SIGNING_PROVIDER`, `KEY_ID`, `OPA_SERVER_URL`, and the proposed RL/SHAP feature toggles.
- Ensure the manifests expose liveness/readiness gates for the new metrics and policy services.
- Provide Helm/Kustomize overlays (or manifest snippets) that map to DecisionFactory.ai’s reference environments.
- **References:** `fixops-blended-enterprise/kubernetes/backend-deployment.yaml`, `fixops-blended-enterprise/kubernetes/configmap.yaml`
62 changes: 62 additions & 0 deletions docs/decisionfactory_alignment/part-3-missing.md
@@ -0,0 +1,62 @@
# DecisionFactory Alignment — Part 3: Missing Capabilities ❌

The following DecisionFactory.ai requirements have not yet been started in FixOps. Six distinct capability areas remain open, each requiring net-new implementation work.

## 2. OPA/Rego policy-as-code runtime (demo + enterprise)
- **Status:** ❌ Missing
- **Why it matters:** DecisionFactory.ai assumes every deployment enforces policies via production OPA/Rego bundles, so skipping the real adapter leaves policy evaluations non-compliant.
- **What to build:**
- Instantiate the `RealOPAEngine` client in non-demo modes and ship configuration for pointing at external OPA endpoints.
- Implement policy input marshalling plus health checks that prove Rego bundles load and evaluate requests end-to-end.
- Add automated tests and documentation covering policy bundle deployment and failure handling.
- Evidence: `fixops-blended-enterprise/src/services/policy_engine.py` still executes inline helpers while `fixops-blended-enterprise/src/services/real_opa_engine.py` remains unused in non-demo flows.
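
A health check driven by the `policy_engine.opa` block added to `config/fixops.overlay.yml` might look like the sketch below. The function names are hypothetical and this is not the `RealOPAEngine` implementation; it only assumes that the OPA health path returns HTTP 200 when the server (and, with `?bundles`, its bundles) is ready, and that the token named by `auth_token_env` is sent as a bearer token.

```python
import os
import urllib.request
from typing import Any, Dict, Mapping


def build_health_request(
    opa_cfg: Mapping[str, Any],
    environ: Mapping[str, str] = os.environ,
) -> urllib.request.Request:
    """Build the GET request for the configured OPA health path."""
    url = opa_cfg["url"].rstrip("/") + opa_cfg["health_path"]
    headers: Dict[str, str] = {}
    token = environ.get(opa_cfg.get("auth_token_env", ""), "")
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers, method="GET")


def opa_is_healthy(opa_cfg: Mapping[str, Any], environ: Mapping[str, str] = os.environ) -> bool:
    """Return True only when OPA answers the health probe with HTTP 200."""
    request = build_health_request(opa_cfg, environ)
    timeout = opa_cfg.get("request_timeout_seconds", 5)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # covers URLError, HTTPError, and socket timeouts
        return False


# Values copied from the overlay hunk in this PR.
opa_cfg = {
    "url": "https://opa.fixops.local:8181",
    "health_path": "/health?bundles",
    "auth_token_env": "FIXOPS_OPA_TOKEN",
    "request_timeout_seconds": 5,
}
```

Keeping request construction separate from the network call makes the URL and header logic testable without a running OPA instance.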

## 3. Explainability with SHAP/LIME alongside LLM narratives
- **Status:** ❌ Missing
- **Why it matters:** DecisionFactory.ai expects both deterministic narratives and data-driven feature attribution so security reviewers can validate each recommendation.
- **What to build:**
- Introduce a SHAP/LIME service that can run against the decision engine’s feature vectors.
- Provide storage and API responses that return attribution artefacts with each decision/evidence record.
- Update documentation and demos so explainability toggles are visible to operators.
- Evidence: repository search returns no SHAP/LIME modules.

## 4. RL/MDP learning loop for actions (defer/patch/accept)
- **Status:** ❌ Missing
- **Why it matters:** DecisionFactory.ai highlights a reinforcement-learning control loop that continuously tunes defer/patch/accept policies based on outcomes.
- **What to build:**
- Capture experience tuples from deployment outcomes and store them for training.
- Implement policy evaluation + improvement routines (e.g., Q-learning or policy gradients) and expose a feature toggle for rollout.
- Instrument observability hooks so RL performance can be reviewed.
- Evidence: repository search returns no reinforcement learning hooks.
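
As a rough sketch of the policy-improvement step, the tabular Q-learning update below operates on the three actions named in the requirement. The state labels, reward values, and hyperparameters are hypothetical; nothing like this exists in the repository yet, and a production loop might use a different algorithm entirely.

```python
from collections import defaultdict
from typing import DefaultDict, Dict

ACTIONS = ("defer", "patch", "accept")
QTable = DefaultDict[str, Dict[str, float]]


def new_q_table() -> QTable:
    """Every unseen state starts with a zero value for each action."""
    return defaultdict(lambda: {action: 0.0 for action in ACTIONS})


def q_update(
    q: QTable,
    state: str,
    action: str,
    reward: float,
    next_state: str,
    alpha: float = 0.1,  # learning rate (assumed)
    gamma: float = 0.9,  # discount factor (assumed)
) -> float:
    """Apply Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[next_state].values())
    td_target = reward + gamma * best_next
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


q_table = new_q_table()
# One hypothetical experience tuple: patching a high-risk finding paid off.
first = q_update(q_table, "high_risk", "patch", reward=1.0, next_state="resolved")
```

Each experience tuple `(state, action, reward, next_state)` captured from deployment outcomes would feed one such update.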

## 5. VEX ingestion (SPDX/CycloneDX) to suppress `not_affected`
- **Status:** ❌ Missing
- **Why it matters:** Without VEX ingestion, customers cannot rely on supplier attestations to automatically downgrade unaffected findings.
- **What to build:**
- Parse SPDX/CycloneDX VEX documents and merge supplier assertions into the evidence store.
- Wire suppression logic into decision evaluation so `not_affected` findings skip remediation queues.
- Add regression tests and documentation covering VEX ingestion workflows.
- Evidence: repository search shows only documentation mentions of VEX without runtime ingestion.
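
The suppression step could be sketched as follows for CycloneDX-style VEX data, where each vulnerability carries an `analysis.state` and a list of `affects` references. The finding field names (`cve`, `component_ref`) and both function names are hypothetical, and SPDX handling is not covered.

```python
from typing import Any, Dict, Iterable, List, Mapping, Set, Tuple


def not_affected_pairs(vex_doc: Mapping[str, Any]) -> Set[Tuple[str, str]]:
    """Collect (vulnerability id, component ref) pairs asserted as not_affected."""
    pairs: Set[Tuple[str, str]] = set()
    for vuln in vex_doc.get("vulnerabilities", []):
        if vuln.get("analysis", {}).get("state") != "not_affected":
            continue
        for target in vuln.get("affects", []):
            pairs.add((vuln.get("id", ""), target.get("ref", "")))
    return pairs


def suppress(
    findings: Iterable[Dict[str, str]],
    pairs: Set[Tuple[str, str]],
) -> List[Dict[str, str]]:
    """Drop findings that a supplier has declared not affected."""
    return [f for f in findings if (f["cve"], f["component_ref"]) not in pairs]


# Placeholder identifiers, not real advisories.
vex_doc = {
    "vulnerabilities": [
        {"id": "CVE-EXAMPLE-1", "analysis": {"state": "not_affected"},
         "affects": [{"ref": "pkg-a"}]},
        {"id": "CVE-EXAMPLE-2", "analysis": {"state": "exploitable"},
         "affects": [{"ref": "pkg-a"}]},
    ]
}
findings = [
    {"cve": "CVE-EXAMPLE-1", "component_ref": "pkg-a"},
    {"cve": "CVE-EXAMPLE-2", "component_ref": "pkg-a"},
]
remaining = suppress(findings, not_affected_pairs(vex_doc))
```

Matching on the pair rather than the vulnerability id alone keeps a `not_affected` assertion for one component from hiding the same vulnerability in another.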

## 8. Evidence export: signed JSON + printable PDF bundle
- **Status:** ❌ Missing
- **Why it matters:** Auditors demand tamper-evident artefacts plus a human-readable packet when exporting DecisionFactory evidence.
- **What to build:**
- Assemble a bundle generator that signs JSON payloads, renders a PDF summary, and packages them for download.
- Publish a `/evidence/{id}/download` endpoint that enforces RBAC and streams the signed bundle.
- Verify signatures during export tests and document the operational flow.
- References: `fixops/evidence.py`, `fixops-blended-enterprise/src/api/v1`

## 10. Multi-tenant RBAC (owner, approver, auditor, integrator)
- **Status:** ❌ Missing
- **Why it matters:** DecisionFactory.ai scopes access by tenant and persona; without that mapping, shared environments lack the minimum access guarantees.
- **What to build:**
- Extend the user/tenant data model with the owner/approver/auditor/integrator roles.
- Enforce role checks across decision, evidence, policy, and configuration APIs.
- Provide migration scripts and admin tooling so operators can assign roles safely.
- References: `fixops-blended-enterprise/src/models/user.py`
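
A tenant-scoped role check might take the shape below. The four roles come from the requirement title, but the permission names, their assignment to roles, and the `Principal` type are purely illustrative assumptions, not a proposal for the actual access matrix.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, FrozenSet


class Role(str, Enum):
    OWNER = "owner"
    APPROVER = "approver"
    AUDITOR = "auditor"
    INTEGRATOR = "integrator"


# Hypothetical permission matrix for illustration only.
PERMISSIONS: Dict[Role, FrozenSet[str]] = {
    Role.OWNER: frozenset({"policy:write", "evidence:read", "waiver:approve", "config:write"}),
    Role.APPROVER: frozenset({"evidence:read", "waiver:approve"}),
    Role.AUDITOR: frozenset({"evidence:read"}),
    Role.INTEGRATOR: frozenset({"decision:submit"}),
}


@dataclass(frozen=True)
class Principal:
    user_id: str
    tenant_id: str
    role: Role


def is_allowed(principal: Principal, tenant_id: str, action: str) -> bool:
    """Allow only when the tenant matches and the role grants the action."""
    return principal.tenant_id == tenant_id and action in PERMISSIONS[principal.role]


auditor = Principal(user_id="u-1", tenant_id="tenant-a", role=Role.AUDITOR)
```

Checking the tenant before the role means a valid role in one tenant never grants access to another tenant's evidence.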

---

### Snapshot
Six capability tracks remain missing. Closing them requires production OPA/Rego enforcement, net-new explainability tooling, a reinforcement-learning decision loop, VEX suppression support, signed evidence exports, and multi-tenant RBAC aligned with the DecisionFactory.ai role taxonomy.
40 changes: 40 additions & 0 deletions fastapi/__init__.py
@@ -22,6 +22,10 @@ def Depends(dependency: Callable[..., Any] | None = None) -> Callable[..., Any]
    return dependency


def Query(default: Any = None, description: str | None = None) -> Any:
    return default


def File(default: Any) -> Any:
    return default

@@ -109,6 +113,32 @@ def invoke(self, params: Mapping[str, str], body: Optional[Dict[str, Any]]) -> A
        return self.endpoint(**kwargs)


class APIRouter:
    def __init__(self, prefix: str = "", tags: Optional[List[str]] | None = None) -> None:
        self.prefix = prefix or ""
        self.tags = tags or []
        self._routes: List[_Route] = []

    def _register(self, method: str, path: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        full_path = f"{self.prefix}{path}"

        def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
            self._routes.append(_Route(method, full_path, func))
            return func

        return decorator

    def post(self, path: str, **_: Any) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        return self._register("POST", path)

    def get(self, path: str, **_: Any) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        return self._register("GET", path)

    def add_api_route(self, path: str, endpoint: Callable[..., Any], methods: Optional[List[str]] = None, **_: Any) -> None:
        for method in methods or ["GET"]:
            self._routes.append(_Route(method, f"{self.prefix}{path}", endpoint))


class FastAPI:
def __init__(self, title: str | None = None, version: str | None = None) -> None:
self.title = title
@@ -141,14 +171,24 @@ def _handle(self, method: str, path: str, body: Optional[Dict[str, Any]]) -> Any
        raise HTTPException(status_code=404, detail="Not Found")


class _StatusCodes:
    HTTP_201_CREATED = 201


status = _StatusCodes()


from .testclient import TestClient  # noqa: E402 (import after FastAPI definition)

__all__ = [
    "FastAPI",
    "APIRouter",
    "HTTPException",
    "Depends",
    "Query",
    "File",
    "UploadFile",
    "RequestValidationError",
    "status",
    "TestClient",
]