Context Engineering Handbook

The practitioner's guide to building effective context for AI agents and LLM applications.

35 patterns | 7 categories | Python + TypeScript | Benchmarks | 4 framework integrations

Context engineering is the discipline of building the right information environment so an LLM can solve your actual problem. The term was popularized by Tobi Lütke and Andrej Karpathy in 2025, and it is quickly becoming one of the most important skills in AI engineering.

This is not another blog post or awesome-list. This is a pattern catalog: 35 battle-tested patterns with runnable code, decision frameworks, and documented anti-patterns. Pick a problem, find the pattern, ship it.

Why Context Engineering

Most LLM failures aren't model failures -- they're context failures. You gave the model the wrong information, too much information, or information in the wrong structure. Context engineering fixes this systematically.

Anthropic, Manus, LangChain, and others have published foundational articles on the topic. But until now, no single resource combined a comprehensive taxonomy, runnable code, and decision frameworks for practitioners who ship AI to production.

Quick Start

Find the right pattern for your problem:

Your agent is forgetting things mid-conversation?
  --> Conversation Compaction (#14) or Episodic Memory (#23)

Your RAG pipeline returns relevant chunks but the LLM still hallucinates?
  --> RAG Context Assembly (#9) or Few-Shot Curation (#3)

Your system prompt is a wall of text and the model ignores half of it?
  --> System Prompt Architecture (#1) or Progressive Disclosure (#2)

Your agent calls the wrong tools?
  --> Semantic Tool Selection (#10) or Observation Masking (#15)

Your multi-agent system produces inconsistent results?
  --> Sub-Agent Delegation (#19) or Multi-Agent Context Orchestration (#20)

Your context window is filling up and responses are degrading?
  --> KV-Cache Optimization (#28) or Context Rot Detection (#33)

Your agent keeps repeating the same mistakes?
  --> Error Preservation (#29) or Filesystem-as-Memory (#24)

Or use the Interactive Decision Tree for a guided walkthrough.

Pattern Catalog

Construction -- Building context from scratch

#	Pattern	Description	Complexity
1	System Prompt Architecture	Structure system prompts for maximum instruction adherence	Low
2	Progressive Disclosure	Reveal context incrementally based on task state	Medium
3	Few-Shot Curation	Select and order examples for optimal in-context learning	Medium
4	Dynamic Persona Assembly	Compose agent personas from trait modules at runtime	Medium
5	Schema-Guided Generation	Constrain output with schemas for structured, validated responses	Low
6	Template Composition	Build prompts from reusable template fragments with inheritance	Medium
7	Constraint Injection	Dynamically inject rules based on environment, tier, or compliance	Low
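
As an illustration of Constraint Injection (#7), here is a minimal sketch of rule selection at request time. It is not the handbook's implementation: RequestContext, RULES, BASE_PROMPT, and the example rule texts are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestContext:
    """Hypothetical description of the current request."""
    environment: str  # e.g. "prod" or "staging"
    tier: str         # e.g. "free" or "enterprise"
    region: str       # e.g. "eu" or "us"


# Each rule pairs a predicate with the constraint text to inject when it matches.
RULES = [
    (lambda c: c.environment == "prod",
     "Never run destructive commands without confirmation."),
    (lambda c: c.tier == "free",
     "Keep answers under 200 words."),
    (lambda c: c.region == "eu",
     "Do not store or repeat personal data."),
]

BASE_PROMPT = "You are a support assistant."


def build_system_prompt(ctx: RequestContext) -> str:
    """Return the base prompt plus only the constraints that apply to ctx."""
    constraints = [text for applies, text in RULES if applies(ctx)]
    if not constraints:
        return BASE_PROMPT
    lines = [BASE_PROMPT, "", "Constraints:"]
    lines += [f"- {text}" for text in constraints]
    return "\n".join(lines)


prompt = build_system_prompt(RequestContext("prod", "enterprise", "eu"))
```

Keeping rules as data rather than string concatenation scattered through the code makes it easier to audit which constraints a given request received.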

Retrieval -- Pulling the right context at the right time

#	Pattern	Description	Complexity
8	Just-in-Time Retrieval	Fetch context only when the model signals it needs it	Medium
9	RAG Context Assembly	Assemble retrieved chunks into coherent, structured context	High
10	Semantic Tool Selection	Dynamically select which tools to present based on the task	Medium
11	Hybrid Search Fusion	Combine keyword, semantic, and graph retrieval with rank fusion	High
12	Context-Aware Re-ranking	Re-rank results using full conversation context, not just the query	High
13	Temporal Context Selection	Prioritize recent and version-correct context with time-decay	Medium
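
To make the "rank fusion" in Hybrid Search Fusion (#11) concrete, the sketch below uses reciprocal rank fusion (RRF), one common fusion method. The catalog does not say which method the handbook uses, so treat RRF, the function name, and the document IDs as assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in (rank starts at 1).
    k = 60 is a conventional default, not a tuned value.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from three retrievers for the same query.
keyword_hits = ["a", "b", "c"]
semantic_hits = ["b", "c", "a"]
graph_hits = ["b", "a", "d"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits, graph_hits])
```

RRF only needs ranks, not raw scores, which is why it is a convenient default when the retrievers produce scores on incomparable scales.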

Compression -- Fitting more signal into fewer tokens

#	Pattern	Description	Complexity
14	Conversation Compaction	Summarize conversation history without losing critical details	Medium
15	Observation Masking	Filter tool outputs to keep only what matters	Low
16	Hierarchical Summarization	Multi-tier summaries: full detail recent, compressed older	Medium
17	Token Budget Allocation	Budget context window across competing components	Medium
18	Lossy Context Distillation	Extract only task-relevant facts, discard everything else	High
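
Token Budget Allocation (#17) can be sketched as a proportional split of the window across components. This is a simplified illustration under stated assumptions: the component names and weights are made up, and whitespace-separated words stand in for real tokenizer counts.

```python
def allocate_budget(total_tokens: int, weights: dict[str, float]) -> dict[str, int]:
    """Split total_tokens across components in proportion to their weights."""
    weight_sum = sum(weights.values())
    budget = {
        name: int(total_tokens * weight / weight_sum)
        for name, weight in weights.items()
    }
    # Rounding down can leave a few tokens unassigned; give them to the
    # highest-weight component so the budget always sums to total_tokens.
    leftover = total_tokens - sum(budget.values())
    budget[max(weights, key=weights.get)] += leftover
    return budget


def truncate_to_budget(text: str, max_tokens: int) -> str:
    """Keep the first max_tokens whitespace-separated words.

    Assumption: word count approximates token count. Use the model's
    tokenizer in a real system.
    """
    return " ".join(text.split()[:max_tokens])


# Hypothetical components competing for a 1000-token window.
budget = allocate_budget(1000, {"system": 1, "history": 3, "retrieved": 4})
```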

Isolation -- Scoping context to prevent contamination

#	Pattern	Description	Complexity
19	Sub-Agent Delegation	Spawn focused sub-agents with minimal, task-specific context	High
20	Multi-Agent Context Orchestration	Coordinate context flow across multiple collaborating agents	High
21	Sandbox Contexts	Disposable environments for risky or exploratory operations	Medium
22	Role-Based Context Partitioning	Filter context visibility based on the agent's current role	Medium
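
A minimal sketch of Role-Based Context Partitioning (#22): each context item declares which roles may see it, and the context for an agent is filtered by its current role. ContextItem, context_for_role, and the role names are hypothetical and not taken from the handbook's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextItem:
    text: str
    visible_to: frozenset[str]  # roles allowed to see this item


def context_for_role(items: list[ContextItem], role: str) -> list[str]:
    """Return only the context texts visible to the given role, in order."""
    return [item.text for item in items if role in item.visible_to]


# Hypothetical shared context for a planner/coder/reviewer setup.
items = [
    ContextItem("Project goal: migrate billing to v2.", frozenset({"planner", "coder", "reviewer"})),
    ContextItem("Full repository file tree.", frozenset({"coder"})),
    ContextItem("Security review checklist.", frozenset({"reviewer"})),
]

coder_context = context_for_role(items, "coder")
```

Filtering by an explicit allow-list means an item with no listed roles is visible to nobody, which fails closed rather than leaking context.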

Persistence -- Remembering across sessions and runs

#	Pattern	Description	Complexity
23	Episodic Memory	Store and retrieve task-specific memories across sessions	Medium
24	Filesystem-as-Memory	Use structured files as durable, inspectable agent memory	Low
25	Semantic Memory Indexing	Vector-indexed retrieval across all stored knowledge	High
26	Cross-Session State Sync	Synchronize agent state across concurrent sessions	High
27	Memory Consolidation	Merge, deduplicate, and prune accumulated memories	Medium
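
Filesystem-as-Memory (#24) can be as small as one JSON file per memory key. The sketch below is an assumption about what such a store might look like, not the handbook's implementation; FileMemory and the key names are hypothetical, and keys are assumed to be plain file-safe names.

```python
import json
import tempfile
from pathlib import Path


class FileMemory:
    """Durable, human-inspectable memory: one JSON file per key."""

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, key: str) -> Path:
        # Assumption: key contains no path separators.
        return self.root / f"{key}.json"

    def write(self, key: str, value: dict) -> None:
        target = self._path(key)
        tmp = target.with_suffix(".tmp")
        tmp.write_text(json.dumps(value, indent=2, sort_keys=True))
        # Rename over the target so readers never see a half-written file.
        tmp.replace(target)

    def read(self, key: str, default=None):
        path = self._path(key)
        if not path.exists():
            return default
        return json.loads(path.read_text())


# Usage: a second FileMemory instance (for example, a later session)
# reads back what the first one wrote.
with tempfile.TemporaryDirectory() as tmp_dir:
    memory = FileMemory(Path(tmp_dir) / "agent-memory")
    memory.write("todo", {"open": ["fix login bug"]})
    restored = FileMemory(Path(tmp_dir) / "agent-memory").read("todo")
```

Plain files keep the memory inspectable with ordinary tools such as a text editor or version control.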

Optimization -- Squeezing more performance from your context budget

#	Pattern	Description	Complexity
28	KV-Cache Optimization	Structure prompts to maximize key-value cache hit rates	Medium
29	Error Preservation	Persist error context to prevent repeated failures	Low
30	Prompt Caching Strategies	Multi-level caching for prompts, responses, and components	High
31	Parallel Context Assembly	Fetch context from multiple sources concurrently	Medium
32	Incremental Context Updates	Patch context with diffs instead of rebuilding from scratch	Medium
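
One way to read KV-Cache Optimization (#28) is: keep the stable parts of the prompt first and serialize them deterministically, so consecutive requests share the longest possible prefix. The sketch below illustrates that idea only. build_prompt and shared_prefix_len are hypothetical helpers, and how much of a shared prefix a provider actually caches depends on that provider.

```python
import json
import os


def build_prompt(system: str, tools: list[dict], history: list[str], user_msg: str) -> str:
    """Order content from most stable to most volatile."""
    # Sort tools by name and sort keys so the same tool set always
    # serializes to the same bytes, regardless of input order.
    tool_block = json.dumps(sorted(tools, key=lambda t: t["name"]), sort_keys=True)
    return "\n".join([system, tool_block, *history, user_msg])


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading characters two prompts have in common."""
    return len(os.path.commonprefix([a, b]))


# The same tools arrive in a different order and key order on the second call.
tools_a = [{"name": "search", "description": "x"}, {"name": "calc", "description": "y"}]
tools_b = [{"description": "y", "name": "calc"}, {"description": "x", "name": "search"}]

p1 = build_prompt("SYS", tools_a, ["hi"], "first question")
p2 = build_prompt("SYS", tools_b, ["hi"], "second question")

# Everything before the final user message is identical across both prompts.
stable_prefix = p1[: p1.rindex("\n") + 1]
```

Putting a timestamp or request ID at the top of the prompt would make every prefix unique, which is the failure this ordering avoids.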

Evaluation -- Measuring context quality over time

#	Pattern	Description	Complexity
33	Context Rot Detection	Detect when accumulated context degrades model performance	High
34	Context Coverage Analysis	Check if context contains all info needed for the current query	Medium
35	Ablation Testing	Measure each context component's contribution to output quality	High
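
Ablation Testing (#35) can be sketched as "score the full context, then score it again with each component removed". In the sketch below, ablation_report and toy_score are hypothetical. In a real evaluation, the score function would run the model and an evaluation metric, which is outside the scope of this example.

```python
from typing import Callable


def ablation_report(
    components: dict[str, str],
    score: Callable[[dict[str, str]], float],
) -> dict[str, float]:
    """Return, per component, how much the score drops when it is removed."""
    baseline = score(components)
    report = {}
    for name in components:
        without = {k: v for k, v in components.items() if k != name}
        report[name] = baseline - score(without)
    return report


def toy_score(context: dict[str, str]) -> float:
    """Stand-in scorer with fixed, made-up contributions per component."""
    contribution = {"system": 3.0, "retrieved": 5.0, "history": 0.0}
    return sum(contribution[name] for name in context)


report = ablation_report(
    {"system": "...", "retrieved": "...", "history": "..."},
    toy_score,
)
```

A component whose removal changes nothing (here, the made-up "history" entry) is a candidate for trimming from the context.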

Interactive Decision Tree

Not sure which pattern to use? The Interactive Decision Tree walks you through a series of questions about your problem and recommends the best pattern.

What are you trying to solve?
  |
  |-- Agent isn't following instructions --> Construction patterns
  |-- Agent lacks the right knowledge   --> Retrieval patterns
  |-- Context window filling up         --> Compression patterns
  |-- Cross-contamination between tasks --> Isolation patterns
  |-- Agent forgets between sessions    --> Persistence patterns
  |-- Slow or expensive inference       --> Optimization patterns
  |-- Quality degrading over time       --> Evaluation patterns

How Each Pattern is Structured

Every pattern follows a consistent template so you can evaluate and implement quickly:

patterns/<category>/<pattern-name>.md    # Full pattern documentation with inline code

Each pattern includes:

Section	Purpose
Problem	The specific failure mode this pattern addresses
Context	When you'd encounter this problem
Solution	The pattern itself, with architecture diagram
Implementation	Step-by-step guide with code
Decision Tree	When to use this vs. alternatives
Anti-Patterns	Common mistakes when applying this pattern
Metrics	How to measure if it's working
References	Papers, blog posts, prior art

Anti-Patterns

Knowing what NOT to do is just as important. The anti-patterns directory documents common context engineering mistakes:

The Kitchen Sink -- Dumping everything into the system prompt
Context Amnesia -- Losing critical details during compaction
The Echo Chamber -- Agent outputs become repetitive over long sessions
Stale Context Poisoning -- Retrieved context is outdated but presented as current
Tool Schema Overload -- Including all tool schemas regardless of relevance
The Infinite Loop -- Retrying failures with no new information
Context Isolation Neglect -- Running all work in a single context window
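
To illustrate the difference between The Infinite Loop and a retry that carries new information, here is a minimal sketch in the spirit of Error Preservation (#29). The function names and the attempt signature are assumptions made for this example. Each retry receives the errors from earlier attempts, so it never repeats a call with identical context.

```python
from typing import Callable


def run_with_error_context(
    task: str,
    attempt: Callable[[str, list[str]], str],
    max_attempts: int = 3,
) -> str:
    """Retry a task, feeding every previous error back into the next attempt."""
    errors: list[str] = []
    for _ in range(max_attempts):
        try:
            # Pass a copy so the callee sees prior failures but cannot erase them.
            return attempt(task, list(errors))
        except Exception as exc:
            errors.append(f"{type(exc).__name__}: {exc}")
    raise RuntimeError(f"gave up after {max_attempts} attempts: {errors}")


# Hypothetical attempt function: it only succeeds once it has seen an error,
# standing in for a model call whose prompt includes the preserved errors.
def flaky_attempt(task: str, errors: list[str]) -> str:
    if not errors:
        raise ValueError("missing column 'id'")
    return f"{task} done after seeing {len(errors)} error(s)"


result = run_with_error_context("import csv", flaky_attempt)
```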

Benchmarks

Measure how well your context engineering is working with 5 benchmarks:

Benchmark	What It Measures	Low Score Means
Needle in Haystack	Fact retrieval across context positions	Apply Progressive Disclosure
Instruction Adherence	System prompt rule compliance	Apply System Prompt Architecture
Compression Fidelity	Info preservation after compaction	Apply Conversation Compaction
Retrieval Relevance	Retrieved chunk usefulness	Apply RAG Context Assembly
Token Efficiency	Signal-to-noise ratio	Apply Observation Masking

# Python
cd benchmarks/python && pip install -r requirements.txt
python runner.py --all --model gpt-4o

# TypeScript
cd benchmarks/typescript && npm install
npx tsx src/runner.ts --all --model gpt-4o

See the benchmarks README for score interpretation and full docs.
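
For readers unfamiliar with the Needle in Haystack idea, the sketch below shows the general shape of such a test: place a known fact at several depths in filler text, then check whether the model can retrieve it from each position. It is not the handbook's runner. build_haystack, needle_cases, and the example text are hypothetical, and the model call itself is omitted.

```python
def build_haystack(filler: list[str], needle: str, depth: float) -> str:
    """Insert needle into filler sentences at a relative depth (0.0 = start, 1.0 = end)."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    index = int(depth * len(filler))
    return " ".join(filler[:index] + [needle] + filler[index:])


def needle_cases(
    filler: list[str],
    needle: str,
    depths: tuple[float, ...] = (0.0, 0.25, 0.5, 0.75, 1.0),
) -> dict[float, str]:
    """Build one test context per depth so retrieval can be compared by position."""
    return {depth: build_haystack(filler, needle, depth) for depth in depths}


filler = ["s0.", "s1.", "s2.", "s3."]
needle = "The code is 7421."
cases = needle_cases(filler, needle)
```

Each generated context would then be sent to the model with a question such as "What is the code?", and accuracy compared across depths.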

Framework Integrations

Apply handbook patterns using your framework of choice:

Framework	Patterns	Languages
LangChain	Progressive Disclosure, Conversation Compaction, RAG Context Assembly, Semantic Tool Selection, Sub-Agent Delegation	Python, TypeScript
LlamaIndex	RAG Context Assembly, Episodic Memory, Context Rot Detection	Python, TypeScript
Semantic Kernel	System Prompt Architecture, Semantic Tool Selection, KV-Cache Optimization	Python
Vercel AI SDK	Progressive Disclosure, Conversation Compaction, Error Preservation	TypeScript

See the integrations README for setup guides and full docs.

Examples

Every pattern ships with runnable examples in both Python and TypeScript.

Python:

cd examples/python
pip install -r requirements.txt
python run_example.py --pattern system-prompt-architecture

TypeScript:

cd examples/typescript
npm install
npx tsx run-example.ts --pattern system-prompt-architecture

Browse all examples in the examples directory.

Roadmap

v1.0 -- 15 core patterns with Python + TypeScript examples
v1.1 -- Interactive decision tree (HTML/JS)
v1.2 -- Anti-patterns documentation (7 anti-patterns)
v2.0 -- 20 additional patterns (35 total)
v2.1 -- Benchmark suite for context quality evaluation
v2.2 -- Framework integrations (LangChain, LlamaIndex, Semantic Kernel, Vercel AI SDK)
v3.0 -- Visual context debugger

Contributing

Context engineering is a young, fast-evolving discipline. Contributions are welcome.

Ways to contribute:

Add a new pattern (use the pattern template)
Improve an existing pattern's examples or documentation
Add an anti-pattern you've encountered in production
Port examples to additional languages (Go, Rust, Java)
Fix bugs or improve clarity

See CONTRIBUTING.md for guidelines.

License

MIT License. See LICENSE for details.

Built for developers who ship AI to production.

Report an Issue | Request a Pattern | Discussions
