35 patterns | 7 categories | Python + TypeScript | Benchmarks | 4 framework integrations
Context engineering is the discipline of building the right information environment so an LLM can solve your actual problem. The term was popularized by Tobi Lutke and Andrej Karpathy in 2025, and it is quickly becoming one of the most important skills in AI engineering.
This is not another blog post or awesome-list. This is a pattern catalog: 35 battle-tested patterns with runnable code, decision frameworks, and documented anti-patterns. Pick a problem, find the pattern, ship it.
- Why Context Engineering
- Quick Start
- Pattern Catalog
- Interactive Decision Tree
- Pattern Structure
- Anti-Patterns
- Benchmarks
- Framework Integrations
- Examples
- Roadmap
- Contributing
- Star History
- License
Most LLM failures aren't model failures -- they're context failures. You gave the model the wrong information, too much information, or information in the wrong structure. Context engineering fixes this systematically.
Anthropic, Manus, LangChain, and others have published foundational articles on the topic. But until now there has been no single resource combining a comprehensive taxonomy, runnable code, and decision frameworks for practitioners who ship AI to production.
Find the right pattern for your problem:
Your agent is forgetting things mid-conversation?
--> Conversation Compaction (#14) or Episodic Memory (#23)
Your RAG pipeline returns relevant chunks but the LLM still hallucinates?
--> RAG Context Assembly (#9) or Few-Shot Curation (#3)
Your system prompt is a wall of text and the model ignores half of it?
--> System Prompt Architecture (#1) or Progressive Disclosure (#2)
Your agent calls the wrong tools?
--> Semantic Tool Selection (#10) or Observation Masking (#15)
Your multi-agent system produces inconsistent results?
--> Sub-Agent Delegation (#19) or Multi-Agent Context Orchestration (#20)
Your context window is filling up and responses are degrading?
--> KV-Cache Optimization (#28) or Context Rot Detection (#33)
Your agent keeps repeating the same mistakes?
--> Error Preservation (#29) or Filesystem-as-Memory (#24)
Or use the Interactive Decision Tree for a guided walkthrough.
### Construction

| # | Pattern | Description | Complexity |
|---|---|---|---|
| 1 | System Prompt Architecture | Structure system prompts for maximum instruction adherence | Low |
| 2 | Progressive Disclosure | Reveal context incrementally based on task state | Medium |
| 3 | Few-Shot Curation | Select and order examples for optimal in-context learning | Medium |
| 4 | Dynamic Persona Assembly | Compose agent personas from trait modules at runtime | Medium |
| 5 | Schema-Guided Generation | Constrain output with schemas for structured, validated responses | Low |
| 6 | Template Composition | Build prompts from reusable template fragments with inheritance | Medium |
| 7 | Constraint Injection | Dynamically inject rules based on environment, tier, or compliance | Low |
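To make the Construction idea concrete, here is a minimal sketch of System Prompt Architecture (#1): assembling a prompt from ordered, clearly delimited sections instead of one wall of text. The `SystemPrompt` class and section names are illustrative assumptions, not an API from this repo.

```python
from dataclasses import dataclass, field

@dataclass
class SystemPrompt:
    """Assemble a system prompt from ordered, clearly delimited sections."""
    role: str
    constraints: list[str] = field(default_factory=list)
    output_format: str = ""

    def render(self) -> str:
        parts = [f"# Role\n{self.role}"]
        if self.constraints:
            # Numbered rules are easier for the model to reference and obey.
            rules = "\n".join(f"{i}. {c}" for i, c in enumerate(self.constraints, 1))
            parts.append(f"# Constraints\n{rules}")
        if self.output_format:
            parts.append(f"# Output Format\n{self.output_format}")
        return "\n\n".join(parts)

prompt = SystemPrompt(
    role="You are a support agent for Acme Corp.",
    constraints=["Never reveal internal ticket IDs.", "Answer in under 100 words."],
    output_format="Reply in plain text, no markdown.",
)
rendered = prompt.render()
```

Keeping sections in a fixed order also pays off later for KV-Cache Optimization (#28), since the prompt prefix stays stable across calls.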
### Retrieval

| # | Pattern | Description | Complexity |
|---|---|---|---|
| 8 | Just-in-Time Retrieval | Fetch context only when the model signals it needs it | Medium |
| 9 | RAG Context Assembly | Assemble retrieved chunks into coherent, structured context | High |
| 10 | Semantic Tool Selection | Dynamically select which tools to present based on the task | Medium |
| 11 | Hybrid Search Fusion | Combine keyword, semantic, and graph retrieval with rank fusion | High |
| 12 | Context-Aware Re-ranking | Re-rank results using full conversation context, not just the query | High |
| 13 | Temporal Context Selection | Prioritize recent and version-correct context with time-decay | Medium |
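The fusion step at the heart of Hybrid Search Fusion (#11) is often Reciprocal Rank Fusion: each retriever contributes a ranked list, and documents are scored by summed reciprocal ranks. A minimal sketch (the document IDs are made up; `k=60` is the commonly used constant, not a value from this repo):

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs via Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) per document; higher total wins.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits  = ["doc_a", "doc_b", "doc_c"]   # e.g. BM25 results
semantic_hits = ["doc_b", "doc_d", "doc_a"]   # e.g. vector search results
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

`doc_b` wins because it ranks highly in both lists, even though neither retriever ranked it first everywhere.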
### Compression

| # | Pattern | Description | Complexity |
|---|---|---|---|
| 14 | Conversation Compaction | Summarize conversation history without losing critical details | Medium |
| 15 | Observation Masking | Filter tool outputs to keep only what matters | Low |
| 16 | Hierarchical Summarization | Multi-tier summaries: full detail recent, compressed older | Medium |
| 17 | Token Budget Allocation | Budget context window across competing components | Medium |
| 18 | Lossy Context Distillation | Extract only task-relevant facts, discard everything else | High |
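Token Budget Allocation (#17) in its simplest form is proportional division of the window across competing components. A sketch under the assumption of fixed relative weights (the component names and weights are illustrative):

```python
def allocate_tokens(window: int, weights: dict[str, float]) -> dict[str, int]:
    """Split a context window across components proportionally to their weights."""
    total = sum(weights.values())
    # Truncation toward zero guarantees the allocations never exceed the window.
    return {name: int(window * w / total) for name, w in weights.items()}

budget = allocate_tokens(8000, {"system": 1, "history": 4, "retrieval": 3})
# Each component is then truncated/compacted to fit its share before assembly.
```

Real allocators usually add per-component minimums and let unused share spill over to retrieval or history.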
### Isolation

| # | Pattern | Description | Complexity |
|---|---|---|---|
| 19 | Sub-Agent Delegation | Spawn focused sub-agents with minimal, task-specific context | High |
| 20 | Multi-Agent Context Orchestration | Coordinate context flow across multiple collaborating agents | High |
| 21 | Sandbox Contexts | Disposable environments for risky or exploratory operations | Medium |
| 22 | Role-Based Context Partitioning | Filter context visibility based on the agent's current role | Medium |
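Role-Based Context Partitioning (#22) reduces to filtering the shared message store by a visibility tag before building each agent's context. A minimal sketch; the `visible_to` field and role names are assumptions for illustration:

```python
def partition_context(messages: list[dict], role: str) -> list[dict]:
    """Return only the messages the given agent role is allowed to see."""
    return [m for m in messages
            if "all" in m["visible_to"] or role in m["visible_to"]]

messages = [
    {"text": "user goal",      "visible_to": ["all"]},
    {"text": "db credentials", "visible_to": ["executor"]},
    {"text": "style guide",    "visible_to": ["writer"]},
]
writer_view = partition_context(messages, "writer")
# The writer agent never sees the credentials message at all.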
### Persistence

| # | Pattern | Description | Complexity |
|---|---|---|---|
| 23 | Episodic Memory | Store and retrieve task-specific memories across sessions | Medium |
| 24 | Filesystem-as-Memory | Use structured files as durable, inspectable agent memory | Low |
| 25 | Semantic Memory Indexing | Vector-indexed retrieval across all stored knowledge | High |
| 26 | Cross-Session State Sync | Synchronize agent state across concurrent sessions | High |
| 27 | Memory Consolidation | Merge, deduplicate, and prune accumulated memories | Medium |
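Filesystem-as-Memory (#24) is the lowest-tech Persistence pattern: one inspectable file per memory key, readable by both the agent and a human debugging it. A minimal sketch (the `FileMemory` class and file layout are illustrative, not this repo's API):

```python
import json
import tempfile
from pathlib import Path

class FileMemory:
    """Durable agent memory stored as one JSON file per key."""

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def write(self, key: str, value: dict) -> None:
        (self.root / f"{key}.json").write_text(json.dumps(value, indent=2))

    def read(self, key: str):
        path = self.root / f"{key}.json"
        return json.loads(path.read_text()) if path.exists() else None

mem = FileMemory(Path(tempfile.mkdtemp()) / "memory")
mem.write("task_42", {"status": "done", "notes": "used filesystem memory"})
```

Because the store is plain files, `grep` and version control work on agent memory for free.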
### Optimization

| # | Pattern | Description | Complexity |
|---|---|---|---|
| 28 | KV-Cache Optimization | Structure prompts to maximize key-value cache hit rates | Medium |
| 29 | Error Preservation | Persist error context to prevent repeated failures | Low |
| 30 | Prompt Caching Strategies | Multi-level caching for prompts, responses, and components | High |
| 31 | Parallel Context Assembly | Fetch context from multiple sources concurrently | Medium |
| 32 | Incremental Context Updates | Patch context with diffs instead of rebuilding from scratch | Medium |
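The core move in KV-Cache Optimization (#28) is ordering the prompt so everything stable (system prompt, tool schemas) forms an identical prefix on every call, and only the per-turn suffix varies. A sketch of that ordering discipline (function and argument names are illustrative):

```python
def build_prompt(static_system: str, tools: list[str], turns: list[str]) -> str:
    """Put the stable parts first so the KV-cache prefix is identical across calls.

    Sorting the tool schemas makes the prefix deterministic even if callers
    pass tools in a different order.
    """
    prefix = static_system + "\n" + "\n".join(sorted(tools))
    return prefix + "\n---\n" + "\n".join(turns)

p1 = build_prompt("You are an agent.", ["search", "calc"], ["hello"])
p2 = build_prompt("You are an agent.", ["calc", "search"], ["hello", "next step"])
# Both prompts share a byte-identical prefix up to the '---' separator,
# so the provider can reuse cached KV entries for that prefix.
```

The anti-pattern here is injecting timestamps or request IDs near the top of the prompt, which invalidates the cache on every call.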
### Evaluation

| # | Pattern | Description | Complexity |
|---|---|---|---|
| 33 | Context Rot Detection | Detect when accumulated context degrades model performance | High |
| 34 | Context Coverage Analysis | Check if context contains all info needed for the current query | Medium |
| 35 | Ablation Testing | Measure each context component's contribution to output quality | High |
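Ablation Testing (#35) is leave-one-out over context components: score the output with the full context, then with each component removed, and attribute the difference. A sketch with a toy scorer standing in for a real quality metric (the scorer and component names are assumptions):

```python
def ablation_scores(components: dict[str, str], score_fn) -> dict[str, float]:
    """Attribute output quality to each context component via leave-one-out:
    contribution = score(full context) - score(context without the component)."""
    full = score_fn(list(components.values()))
    contrib = {}
    for name in components:
        rest = [v for k, v in components.items() if k != name]
        contrib[name] = full - score_fn(rest)
    return contrib

# Toy stand-in for a real metric (e.g. an LLM judge or task pass rate):
# quality = number of context pieces that mention "api".
def toy_score(pieces: list[str]) -> float:
    return float(sum("api" in p for p in pieces))

scores = ablation_scores(
    {"docs": "api reference", "history": "small talk", "schema": "api schema"},
    toy_score,
)
# "history" contributes nothing here and is a candidate for removal.
```

Components with near-zero (or negative) contribution are exactly the tokens Observation Masking (#15) and Lossy Context Distillation (#18) should strip.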
Not sure which pattern to use? The Interactive Decision Tree walks you through a series of questions about your problem and recommends the best pattern.
```
What are you trying to solve?
|
|-- Agent isn't following instructions --> Construction patterns
|-- Agent lacks the right knowledge --> Retrieval patterns
|-- Context window filling up --> Compression patterns
|-- Cross-contamination between tasks --> Isolation patterns
|-- Agent forgets between sessions --> Persistence patterns
|-- Slow or expensive inference --> Optimization patterns
|-- Quality degrading over time --> Evaluation patterns
```
Every pattern follows a consistent template so you can evaluate and implement quickly:
```
patterns/<category>/<pattern-name>.md   # Full pattern documentation with inline code
```
Each pattern includes:
| Section | Purpose |
|---|---|
| Problem | The specific failure mode this pattern addresses |
| Context | When you'd encounter this problem |
| Solution | The pattern itself, with architecture diagram |
| Implementation | Step-by-step guide with code |
| Decision Tree | When to use this vs. alternatives |
| Anti-Patterns | Common mistakes when applying this pattern |
| Metrics | How to measure if it's working |
| References | Papers, blog posts, prior art |
Knowing what NOT to do is just as important. The anti-patterns directory documents common context engineering mistakes:
- The Kitchen Sink -- Dumping everything into the system prompt
- Context Amnesia -- Losing critical details during compaction
- The Echo Chamber -- Agent outputs become repetitive over long sessions
- Stale Context Poisoning -- Retrieved context is outdated but presented as current
- Tool Schema Overload -- Including all tool schemas regardless of relevance
- The Infinite Loop -- Retrying failures with no new information
- Context Isolation Neglect -- Running all work in a single context window
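The Infinite Loop anti-pattern above has a direct fix in Error Preservation (#29): feed each failure back into the next attempt so the retry carries new information. A minimal sketch (the step function and error wording are illustrative):

```python
def retry_with_error_context(attempt_fn, build_input, max_tries: int = 3):
    """Retry a failing step, injecting prior errors into the next attempt's
    input so the agent does not repeat the same mistake blindly."""
    errors: list[str] = []
    for _ in range(max_tries):
        try:
            return attempt_fn(build_input(errors))
        except ValueError as exc:  # whatever error type your step raises
            errors.append(str(exc))
    raise RuntimeError(f"failed after {max_tries} tries: {errors}")

# Toy step that only succeeds once told about its previous failure.
def step(prompt: str) -> str:
    if "previous error" not in prompt:
        raise ValueError("missing qualifier")
    return "ok"

result = retry_with_error_context(
    step,
    lambda errs: "task" + ("\nprevious error: " + errs[-1] if errs else ""),
)
```

Without the injected error history, every retry is identical and the loop can never converge.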
Measure how well your context engineering is working with 5 benchmarks:
| Benchmark | What It Measures | Low Score Means |
|---|---|---|
| Needle in Haystack | Fact retrieval across context positions | Apply Progressive Disclosure |
| Instruction Adherence | System prompt rule compliance | Apply System Prompt Architecture |
| Compression Fidelity | Info preservation after compaction | Apply Conversation Compaction |
| Retrieval Relevance | Retrieved chunk usefulness | Apply RAG Context Assembly |
| Token Efficiency | Signal-to-noise ratio | Apply Observation Masking |
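The core mechanic of the Needle in Haystack benchmark is simple to sketch: plant a known fact at a controlled depth inside filler text, then ask the model to retrieve it and score accuracy by position. The generator below is illustrative, not the suite's actual implementation:

```python
def needle_prompt(filler: str, needle: str, position: float,
                  total_chars: int = 2000) -> str:
    """Place a known fact (the 'needle') at a relative depth in filler text,
    so retrieval accuracy can be measured across context positions."""
    body = (filler * (total_chars // len(filler) + 1))[:total_chars]
    cut = int(len(body) * position)  # 0.0 = start of context, 1.0 = end
    return body[:cut] + "\n" + needle + "\n" + body[cut:]

prompt = needle_prompt("lorem ipsum ", "The passcode is 7421.", position=0.5)
# Sweep position over [0.0, 0.25, 0.5, 0.75, 1.0] and ask the model for the
# passcode at each depth to chart where retrieval degrades.
```

Mid-context dips in the resulting curve are the classic "lost in the middle" signature that Progressive Disclosure (#2) targets.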
```bash
# Python
cd benchmarks/python && pip install -r requirements.txt
python runner.py --all --model gpt-4o

# TypeScript
cd benchmarks/typescript && npm install
npx tsx src/runner.ts --all --model gpt-4o
```

See the benchmarks README for score interpretation and full docs.
Apply handbook patterns using your framework of choice:
| Framework | Patterns | Languages |
|---|---|---|
| LangChain | Progressive Disclosure, Conversation Compaction, RAG Assembly, Tool Selection, Sub-Agent Delegation | Python, TypeScript |
| LlamaIndex | RAG Assembly, Episodic Memory, Context Rot Detection | Python, TypeScript |
| Semantic Kernel | System Prompt Architecture, Tool Selection, KV-Cache Optimization | Python |
| Vercel AI SDK | Progressive Disclosure, Conversation Compaction, Error Preservation | TypeScript |
See the integrations README for setup guides and full docs.
Every pattern ships with runnable examples in both Python and TypeScript.
Python:

```bash
cd examples/python
pip install -r requirements.txt
python run_example.py --pattern system-prompt-architecture
```

TypeScript:

```bash
cd examples/typescript
npm install
npx tsx run-example.ts --pattern system-prompt-architecture
```

Browse all examples in the examples directory.
- v1.0 -- 15 core patterns with Python + TypeScript examples
- v1.1 -- Interactive decision tree (HTML/JS)
- v1.2 -- Anti-patterns documentation (7 anti-patterns)
- v2.0 -- 20 additional patterns (35 total)
- v2.1 -- Benchmark suite for context quality evaluation
- v2.2 -- Framework integrations (LangChain, LlamaIndex, Semantic Kernel, Vercel AI SDK)
- v3.0 -- Visual context debugger
Context engineering is a young discipline and evolving fast. Contributions are welcome.
Ways to contribute:
- Add a new pattern (use the pattern template)
- Improve an existing pattern's examples or documentation
- Add an anti-pattern you've encountered in production
- Port examples to additional languages (Go, Rust, Java)
- Fix bugs or improve clarity
See CONTRIBUTING.md for guidelines.
MIT License. See LICENSE for details.
Built for developers who ship AI to production.