kozz36/python-ai-backend-specialist

🐍 python-ai-backend-specialist

Specialized backend architecture skill for AI-native Python backends. Covers SSE streaming, Pydantic AI + LangGraph orchestration, MCP integration, tiered memory architecture, advanced RAG/GraphRAG, and Zero-Trust security for autonomous agents. Based on a 390-line research document analyzing the 2026 AI backend ecosystem, cross-checked against live sources (Playwright verification).

Why This Exists

AI agents now write backend code autonomously. Without an opinionated, validated skill reference, agents hallucinate obsolete patterns, overuse bidirectional transports for one-way token streaming, choose standalone vector databases without relational context, or inject .env files into agent environments.

This skill is a validated, opinionated reference for Python backends powering LLM agents, AI APIs, streaming inference, and production AI infrastructure. It defines non-negotiable architectural rules for 2026.

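Built from a 390-line research document, "Arquitectura de Backends Nativos de IA: Estándares, Protocolos y Patrones de Producción (Abril 2026)" (in English: "AI-Native Backend Architecture: Standards, Protocols, and Production Patterns (April 2026)"), then cross-checked against live sources.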


📦 Versions

| Version | File | Size | When to Use |
| --- | --- | --- | --- |
| v3.0 (Current) | versions/v3.0/SKILL.md | ~55 lines + references | Compact runtime skill with curated references/ and May 2026 source index |
| v1.0 (Historical) | versions/v1.0/SKILL.md | ~373 lines | Preserved for backward compatibility; verify claims against v3 before reuse |
| v1.0-lite (Historical) | versions/v1.0-lite/SKILL.md | ~250 lines | Preserved for backward compatibility; v3.0 replaces it for active runtime ingestion |

What's Covered in v3.0 (May 2026)

  • SSE token streaming: EventSourceResponse is the default for one-way streams; WebSockets remain valid for bidirectional realtime use cases
  • OpenTelemetry: production observability baseline; the GenAI semantic conventions are in Development status and require deliberate opt-in
  • LangGraph 1.2.x + Pydantic AI 1.96.x: stateful graph workflows + typed agent framework
  • MCP Integration: Streamable HTTP transport, JSON-RPC 2.0 error handling
  • Tiered Memory (L0/L1/L2): Redis for episodic, PostgreSQL for semantic/relational
  • Advanced RAG/GraphRAG: hybrid retrieval, RRF, knowledge graphs, vector quantization
  • Zero-Trust Security: Firecracker MicroVMs, Credential Brokering Proxies, A-JWT
  • Staff-Level Snippet: FastAPI + Pydantic AI + SSE + OTel + A-JWT in one module
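
The RRF (Reciprocal Rank Fusion) item in the list above can be illustrated with a small, framework-free sketch. It is not taken from the skill files: the function name, the document IDs, and the constant `k=60` (a commonly used default) are assumptions chosen for illustration. Each document scores `1 / (k + rank)` in every ranked list that contains it, and the scores are summed across lists.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    `rankings` is a list of lists, each ordered best-first.
    `k` dampens the influence of top ranks; 60 is an assumed default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword search and a vector search.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Because RRF uses only rank positions, it can merge keyword and vector results without normalizing their raw scores against each other. That is one reason it is a common choice for hybrid retrieval.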

🚀 Quick Start

For AI Agents (Cursor, Claude Code, etc.)

# Clone into your skills directory
git clone https://github.com/kozz36/python-ai-backend-specialist.git

# Use the version that matches your need:
# - Full  → detailed architectural planning
# - Lite  → rapid stack selection under constraints

For Human Architects

Open versions/v3.0/SKILL.md for the runtime contract, then use versions/v3.0/references/technical-reference.md for detailed matrices. Key reference areas:

  • Section 1 — Transport & Streaming (SSE vs WebSockets decision matrix)
  • Section 5 — Tiered Memory Architecture (L0/L1/L2)
  • Section 7 — Zero-Trust Security (Firecracker, Credential Proxies, A-JWT)
  • Section 8 — Staff-Level Integration Snippet (production-ready FastAPI module)
  • Section 9 — Architectural Dictates (5 non-negotiable rules)
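
To give a rough idea of the one-way streaming case in Section 1, here is a minimal sketch of the SSE wire format using only the standard library. It does not use FastAPI's `EventSourceResponse`, and it is not the skill's own snippet. The helper names, the `token` and `done` event names, and the `[DONE]` sentinel are hypothetical choices for illustration.

```python
import json
from typing import Iterable, Iterator, Optional


def format_sse(data: str, event: Optional[str] = None) -> str:
    """Serialize one Server-Sent Events frame as text."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # A multi-line payload needs one "data:" field per line.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    # A blank line terminates the frame.
    return "\n".join(lines) + "\n\n"


def stream_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Yield one SSE frame per token, then a final 'done' frame."""
    for token in tokens:
        yield format_sse(json.dumps({"token": token}), event="token")
    yield format_sse("[DONE]", event="done")


# Hypothetical token stream standing in for model output.
frames = list(stream_tokens(["Hel", "lo"]))
for frame in frames:
    print(frame, end="")
```

A generator like `stream_tokens` could in principle be handed to whatever streaming response type the web framework provides. The frames travel over a plain HTTP response, so a one-way token stream needs no WebSocket upgrade or bidirectional channel.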

📁 Structure

versions/
├── v1.0/
│   └── SKILL.md              # Historical full reference
├── v1.0-lite/
│   └── SKILL.md              # Historical condensed reference
└── v3.0/
    ├── SKILL.md              # Current compact runtime contract
    └── references/
        ├── technical-reference.md
        └── source-index.md
docs/
├── CHANGELOG.md              # Verified version history
└── CONTRIBUTING.md           # How to contribute improvements

🔍 Verification Methodology

Every claim in this skill was sourced from the 390-line research document and validated against live sources where possible:

| Claim | Verification Method | Status |
| --- | --- | --- |
| SSE as default for one-way LLM token streams | FastAPI SSE docs + transport tradeoff review | ✅ Default, not exclusive |
| EventSourceResponse in FastAPI 0.135.0+ | GitHub issues, FastAPI release notes | ✅ Real |
| OpenTelemetry GenAI semconv | opentelemetry.io specs | ⚠️ Development status; opt in deliberately |
| LangGraph 1.2.x graph workflows | PyPI + LangGraph docs | ✅ Real |
| Pydantic AI type-safe agents | pydantic.dev docs, Real Python tutorial | ✅ Real |
| MCP Streamable HTTP transport | WorkOS blog, MCP spec | ✅ Real |
| pgvector + pgvectorscale + pgai | GitHub repos, SoftwareSeni blog | ✅ Real |
| GraphRAG multi-hop retrieval | Project-specific eval required | ⚠️ Do not reuse exact benchmark numbers without source re-verification |
| Firecracker <125ms cold start | Northflank blog, Firecracker docs | ✅ Real |
| A-JWT IETF draft | datatracker.ietf.org | ✅ Real (draft-goswami-agentic-jwt) |

Limitation: Version numbers and performance claims reflect the research document's citations as of April–May 2026. Always verify against live sources before production deployment. See docs/CONTRIBUTING.md for verification requirements.


🤝 Contributing

This skill is maintained as a living document. See docs/CONTRIBUTING.md for:

  • How to propose additions (new frameworks, updated versions)
  • Verification requirements before merging
  • Style guide (tables > narrative, decision trees > lists)

📝 License

Apache-2.0


Maintained by: @kozz36
Research base: "Arquitectura de Backends Nativos de IA: Estándares, Protocolos y Patrones de Producción (Abril 2026)" (390-line ecosystem analysis, 2026)
