REST API server for FailSafe - expose failure management and risk detection via HTTP.
A production-ready REST API for submitting, querying, and analyzing failure reports from any system or language. Includes authentication, rate limiting, and full audit trails.
npm install -g @failsafe/api
Start the server:
failsafe-api --port 3000 --db ./failures.db
Or with environment variables:
export FAILSAFE_PORT=3000
export FAILSAFE_DB=./failures.db
failsafe-api
POST /api/failures - Report new failure
{ "title": "Model timeout", "description": "Claude exceeded timeout limit", "severity": "high", "systemId": "my-system", "agentId": "claude", "failureCategory": "timeout", "context": { "timeoutMs": 300000 } }
Response:
{ "id": "fail_abc123", "fingerprint": "sha256...", "timestamp": "2024-03-01T12:00:00Z", "isDuplicate": false }
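For typed clients, the request and response bodies above can be modeled roughly as follows. This is a minimal sketch rather than an official client: the field names come from the examples above, while `buildReportRequest`, `parseReportResult`, `HttpRequest`, and the choice to mark `context` as optional are assumptions made for illustration.

```typescript
// Severity values as listed for the `severity` query parameter.
type Severity = "critical" | "high" | "medium" | "low";

// Mirrors the example request body. Which fields are optional is an assumption.
interface FailureReport {
  title: string;
  description: string;
  severity: Severity;
  systemId: string;
  agentId: string;
  failureCategory: string;
  context?: Record<string, unknown>;
}

// Mirrors the example response body.
interface ReportResult {
  id: string;
  fingerprint: string;
  timestamp: string;
  isDuplicate: boolean;
}

// Plain description of an HTTP request, independent of any HTTP library.
interface HttpRequest {
  method: string;
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: describes the POST /api/failures call without sending it.
function buildReportRequest(baseUrl: string, report: FailureReport): HttpRequest {
  return {
    method: "POST",
    url: `${baseUrl.replace(/\/+$/, "")}/api/failures`, // drop trailing slashes
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(report),
  };
}

// Hypothetical helper: parses a response body and checks two key fields.
function parseReportResult(json: string): ReportResult {
  const data = JSON.parse(json) as ReportResult;
  if (typeof data.id !== "string" || typeof data.isDuplicate !== "boolean") {
    throw new Error("Unexpected response shape");
  }
  return data;
}

const reportRequest = buildReportRequest("http://localhost:3000/", {
  title: "Model timeout",
  description: "Claude exceeded timeout limit",
  severity: "high",
  systemId: "my-system",
  agentId: "claude",
  failureCategory: "timeout",
  context: { timeoutMs: 300000 },
});

const reportResult = parseReportResult(
  '{"id":"fail_abc123","fingerprint":"sha256...","timestamp":"2024-03-01T12:00:00Z","isDuplicate":false}'
);
```

The resulting `reportRequest` can be passed to any HTTP client, for example by forwarding `method`, `headers`, and `body` to `fetch(reportRequest.url, ...)`.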
GET /api/failures - List failures
Query params:
- systemId - Filter by system
- severity - Filter by severity (critical, high, medium, low)
- category - Filter by failure category
- limit - Results per page (default 20)
- offset - Pagination offset
- after - ISO timestamp, only failures after this time
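The query parameters above combine into an ordinary query string. The sketch below shows one way to build the list URL and step through pages. `buildListUrl` and `nextPage` are hypothetical helpers, and a starting offset of 0 is an assumption (only the default limit of 20 is documented).

```typescript
// Query parameters accepted by GET /api/failures, as listed above.
interface ListFailuresQuery {
  systemId?: string;
  severity?: "critical" | "high" | "medium" | "low";
  category?: string;
  limit?: number;
  offset?: number;
  after?: string; // ISO timestamp
}

// Fixed key order keeps the generated URL deterministic.
const QUERY_KEYS = ["systemId", "severity", "category", "limit", "offset", "after"] as const;

// Hypothetical helper: builds the list URL, skipping parameters that are not set.
function buildListUrl(baseUrl: string, query: ListFailuresQuery = {}): string {
  const pairs: string[] = [];
  for (const key of QUERY_KEYS) {
    const value = query[key];
    if (value !== undefined) {
      pairs.push(`${key}=${encodeURIComponent(String(value))}`);
    }
  }
  const path = `${baseUrl.replace(/\/+$/, "")}/api/failures`;
  return pairs.length > 0 ? `${path}?${pairs.join("&")}` : path;
}

// Hypothetical helper: advances the offset by one page.
// Assumes offsets start at 0; the default limit of 20 is the documented one.
function nextPage(query: ListFailuresQuery): ListFailuresQuery {
  const limit = query.limit ?? 20;
  const offset = query.offset ?? 0;
  return { ...query, limit, offset: offset + limit };
}

const firstPageUrl = buildListUrl("http://localhost:3000", {
  systemId: "my-system",
  severity: "high",
  limit: 50,
  after: "2024-03-01T00:00:00Z",
});

const secondPageUrl = buildListUrl(
  "http://localhost:3000",
  nextPage({ systemId: "my-system", severity: "high", limit: 50 })
);
```

Note that `encodeURIComponent` escapes the colons in the `after` timestamp, so the value arrives intact on the server side.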
GET /api/failures/:id - Get failure details
Returns full failure report with all context.
GET /api/systems - List all systems with failure stats
GET /api/systems/:systemId - System statistics
{ "systemId": "my-system", "totalFailures": 156, "criticalCount": 12, "highCount": 34, "byCategory": { "timeout": 45, "misunderstanding": 67, "reasoning_error": 44 }, "trend": "increasing" }
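A client might summarize these statistics before displaying them. The sketch below is illustrative only: the `SystemStats` shape is inferred from the example response, and `topCategory` and `severeShare` are hypothetical helpers, not part of the API.

```typescript
// Shape inferred from the example response above.
interface SystemStats {
  systemId: string;
  totalFailures: number;
  criticalCount: number;
  highCount: number;
  byCategory: Record<string, number>;
  trend: string;
}

// Hypothetical helper: name of the category with the most failures, if any.
function topCategory(stats: SystemStats): string | undefined {
  let best: string | undefined;
  let bestCount = -1;
  for (const name of Object.keys(stats.byCategory)) {
    const count = stats.byCategory[name] ?? 0;
    if (count > bestCount) {
      best = name;
      bestCount = count;
    }
  }
  return best;
}

// Hypothetical helper: fraction of failures that are critical or high severity.
function severeShare(stats: SystemStats): number {
  if (stats.totalFailures === 0) return 0;
  return (stats.criticalCount + stats.highCount) / stats.totalFailures;
}

const exampleStats: SystemStats = {
  systemId: "my-system",
  totalFailures: 156,
  criticalCount: 12,
  highCount: 34,
  byCategory: { timeout: 45, misunderstanding: 67, reasoning_error: 44 },
  trend: "increasing",
};

const dominant = topCategory(exampleStats); // "misunderstanding"
const share = severeShare(exampleStats);    // (12 + 34) / 156
```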
GET /api/patterns - Analyze failure patterns
Query params:
- systemId - System to analyze
- timeRange - Time window (1h, 24h, 7d, 30d)
Returns:
{ "patterns": [ { "pattern": "timeout on complex queries", "frequency": 45, "affectedSystems": 3, "trend": "increasing", "preventionSignal": "Add query complexity limit" } ] }
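One possible use of this response is to surface the most frequent patterns that are still growing, together with their prevention signals. In the sketch below the types are inferred from the example response, `risingPatterns` is a hypothetical helper, and the second pattern in the sample data is invented purely to show the filtering.

```typescript
// Shapes inferred from the example response above.
interface FailurePattern {
  pattern: string;
  frequency: number;
  affectedSystems: number;
  trend: string;
  preventionSignal: string;
}

interface PatternsResponse {
  patterns: FailurePattern[];
}

// Hypothetical helper: patterns whose trend is "increasing", most frequent first.
// filter() returns a new array, so sorting does not mutate the response.
function risingPatterns(response: PatternsResponse, minFrequency = 1): FailurePattern[] {
  return response.patterns
    .filter((p) => p.trend === "increasing" && p.frequency >= minFrequency)
    .sort((a, b) => b.frequency - a.frequency);
}

const sampleResponse: PatternsResponse = {
  patterns: [
    {
      pattern: "timeout on complex queries",
      frequency: 45,
      affectedSystems: 3,
      trend: "increasing",
      preventionSignal: "Add query complexity limit",
    },
    {
      // Invented entry for illustration; "stable" is an assumed trend value.
      pattern: "hypothetical stable pattern",
      frequency: 80,
      affectedSystems: 1,
      trend: "stable",
      preventionSignal: "None",
    },
  ],
};

const rising = risingPatterns(sampleResponse);
const signals = rising.map((p) => p.preventionSignal);
```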
GET /api/taxonomy - Get failure taxonomy
Returns full failure taxonomy with categories and descriptions.
- --port - Server port (default 3000)
- --db - Database file path (required)
- --host - Bind address (default localhost)
- --auth-key - API key for authentication
- --log-level - Log level (debug, info, warn, error)
When --auth-key is set, include the key in the X-API-Key header of every request:
curl -H "X-API-Key: your-key" http://localhost:3000/api/failures
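In a programmatic client, the same header can be attached in one place so that individual calls do not need to remember it. This is a minimal sketch: `withApiKey` is a hypothetical helper, and only the `X-API-Key` header name comes from the curl example above.

```typescript
// Hypothetical helper: returns a copy of the headers with X-API-Key added
// when a key is configured, and an unchanged copy otherwise.
function withApiKey(
  headers: Record<string, string>,
  apiKey?: string
): Record<string, string> {
  return apiKey ? { ...headers, "X-API-Key": apiKey } : { ...headers };
}

const baseHeaders = { Accept: "application/json" };

// Server started with --auth-key: the header is attached.
const authedHeaders = withApiKey(baseHeaders, "your-key");

// Server started without --auth-key: headers stay as they were.
const openHeaders = withApiKey(baseHeaders);
```

The returned object can be passed as the `headers` option of whichever HTTP client is in use.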
- npm run dev - Development mode with hot reload
- npm run build - Build for production
- npm test - Run tests
- npm run lint - Lint code
Docker:
docker build -t failsafe-api .
docker run -p 3000:3000 -v /data:/data failsafe-api
Environment:
FAILSAFE_PORT=3000
FAILSAFE_DB=/data/failures.db
FAILSAFE_NODE_ENV=production
FAILSAFE_LOG_LEVEL=info
- Handles 1000+ reports/second
- Sub-100ms query response
- Automatic indexing
- Compression enabled
- Main README: ../../README.md
- Taxonomy: ../../spec/taxonomy/failure-taxonomy-v0.1.md
- Quickstart: ../../spec/failsafe-quickstart.md
- Full API Ref: ../../spec/failsafe-api-reference.md
MIT - See LICENSE