25 commits
89b2484
Pull in code-mode changes from ai-code-mode
jherr Feb 21, 2026
5dbebb7
more refinements
jherr Feb 26, 2026
f1f4baf
Simpler example
jherr Feb 26, 2026
2172f2c
Fewer examples
jherr Feb 26, 2026
798a1bf
Working skills example
jherr Feb 27, 2026
573298a
Home page really coming together
jherr Feb 27, 2026
a1f0859
Home page done
jherr Feb 27, 2026
7859d48
More realistic tools
jherr Feb 27, 2026
2f38fc0
Finished home page
jherr Feb 27, 2026
c87092c
Reworking the demo
jherr Mar 1, 2026
eb24789
final fixups
jherr Mar 8, 2026
708cb57
Merge remote-tracking branch 'origin/main' into code-mode
jherr Mar 10, 2026
bfc9173
last few fixups
jherr Mar 10, 2026
0d172ab
ci: apply automated fixes
autofix-ci[bot] Mar 10, 2026
8a9854d
fix: replace md-to-pdf with puppeteer+marked to fix provenance check
jherr Mar 10, 2026
b73b24f
Merge branch 'code-mode' of github.com:TanStack/ai into code-mode
jherr Mar 11, 2026
920f7a2
ci: apply automated fixes
autofix-ci[bot] Mar 11, 2026
2833b3b
add changeset for code mode packages and revert ai-devtools changes
jherr Mar 11, 2026
5550327
Merge remote-tracking branch 'origin/main' into code-mode
jherr Mar 12, 2026
1b05796
feat: add database tools demo route with in-memory DB
claude Mar 12, 2026
ca69fb5
database demo
jherr Mar 12, 2026
3c4b671
database demo
jherr Mar 12, 2026
6677dd5
feat: add gold-standard judging, export, layout fixes, and nav rename…
jherr Mar 12, 2026
904d81d
ci: apply automated fixes
autofix-ci[bot] Mar 12, 2026
48de0b4
Merge branch 'main' into code-mode
AlemTuzlak Mar 13, 2026
10 changes: 10 additions & 0 deletions .changeset/code-mode-packages.md
@@ -0,0 +1,10 @@
---
'@tanstack/ai': minor
'@tanstack/ai-code-mode': minor
'@tanstack/ai-code-mode-skills': minor
'@tanstack/ai-isolate-cloudflare': minor
'@tanstack/ai-isolate-node': minor
'@tanstack/ai-isolate-quickjs': minor
---

Add code mode and isolate packages for secure AI code execution
3 changes: 2 additions & 1 deletion .gitignore
@@ -60,6 +60,7 @@ test-results

STATUS_*.md

.skills
# Only .claude.settings.json should be committed
.claude/settings.local.json
.claude/worktrees/*
.claude/worktrees/*
326 changes: 326 additions & 0 deletions docs/guides/code-mode-with-skills.md
@@ -0,0 +1,326 @@
---
title: Code Mode with Skills
id: code-mode-with-skills
order: 20
---

Skills extend [Code Mode](./code-mode.md) with a persistent library of reusable TypeScript snippets. When the LLM writes a useful piece of code — say, a function that fetches and ranks NPM packages — it can save that code as a _skill_. On future requests, relevant skills are loaded from storage and made available as first-class tools the LLM can call without rewriting the logic.

## Overview

The skills system has two integration paths:

| Approach | Entry point | Skill selection | Best for |
|----------|-------------|----------------|----------|
| **High-level** | `codeModeWithSkills()` | Automatic (LLM-based) | New projects, turnkey setup |
| **Manual** | Individual functions (`skillsToTools`, `createSkillManagementTools`, etc.) | You decide which skills to load | Full control, existing setups |

Both paths share the same storage, trust, and execution primitives — they differ only in how skills are selected and assembled.

## How It Works

A request with skills enabled goes through these stages:

```
┌───────────────────────────────────────────────────┐
│ 1. Load skill index (metadata only, no code)      │
├───────────────────────────────────────────────────┤
│ 2. Select relevant skills (LLM call — fast model) │
├───────────────────────────────────────────────────┤
│ 3. Build tool registry                            │
│    ├── execute_typescript (Code Mode sandbox)     │
│    ├── search_skills / get_skill / register_skill │
│    └── skill tools (one per selected skill)       │
├───────────────────────────────────────────────────┤
│ 4. Generate system prompt                         │
│    ├── Code Mode type stubs                       │
│    └── Skill library documentation                │
├───────────────────────────────────────────────────┤
│ 5. Main chat() call (strong model)                │
│    ├── Can call skill tools directly              │
│    ├── Can write code via execute_typescript      │
│    └── Can register new skills for future use     │
└───────────────────────────────────────────────────┘
```

### LLM calls

There are **two** LLM interactions per request when using the high-level API:

1. **Skill selection** (`selectRelevantSkills`) — A single chat call using the adapter you provide. It sends the last 5 conversation messages plus a catalog of skill names/descriptions, and asks the model to return a JSON array of relevant skill names. This should be a cheap/fast model (e.g., `gpt-4o-mini`, `claude-haiku-4-5`).

2. **Main chat** — The primary `chat()` call with your full model. This is where the LLM reasons, calls tools, writes code, and registers skills.

The selection call is lightweight — it only sees skill metadata (names, descriptions, usage hints), not full code. If there are no skills in storage or no messages, it short-circuits and skips the LLM call entirely.
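The catalog-plus-short-circuit behavior can be pictured with a small sketch. This is illustrative only, not the library's implementation; the `SkillMeta` shape and `buildSelectionCatalog` helper are assumptions for the example:

```typescript
// Illustrative sketch of the selection step: build the lightweight catalog
// the fast model sees, and short-circuit when there is nothing to select.
// Not the library's actual code; types and names are hypothetical.
interface SkillMeta {
  name: string
  description: string
  usageHints?: Array<string>
}

function buildSelectionCatalog(
  index: Array<SkillMeta>,
  messageCount: number,
): string | null {
  // Short-circuit: no skills or no messages means no LLM call at all.
  if (index.length === 0 || messageCount === 0) return null
  return index
    .map((s) => {
      const hint = s.usageHints?.[0] ? ` (hint: ${s.usageHints[0]})` : ''
      return `- ${s.name}: ${s.description}${hint}`
    })
    .join('\n')
}

const catalog = buildSelectionCatalog(
  [
    { name: 'fetch_github_stats', description: 'Fetch repo star counts' },
    {
      name: 'compare_npm_packages',
      description: 'Rank NPM packages',
      usageHints: ['Use for package comparisons'],
    },
  ],
  5,
)
```

Only this catalog string (never skill code) would be sent to the selection model, which keeps the call cheap.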

## High-Level API: `codeModeWithSkills()`

```typescript
import { chat, maxIterations } from '@tanstack/ai'
import { createNodeIsolateDriver } from '@tanstack/ai-isolate-node'
import { codeModeWithSkills } from '@tanstack/ai-code-mode-skills'
import { createFileSkillStorage } from '@tanstack/ai-code-mode-skills/storage'
import { openaiText } from '@tanstack/ai-openai'

const storage = createFileSkillStorage({ directory: './.skills' })
const driver = createNodeIsolateDriver()

const { registry, systemPrompt, selectedSkills } = await codeModeWithSkills({
  config: {
    driver,
    tools: [myTool1, myTool2],
    timeout: 60_000,
    memoryLimit: 128,
  },
  adapter: openaiText('gpt-4o-mini'), // cheap model for skill selection
  skills: {
    storage,
    maxSkillsInContext: 5,
  },
  messages, // current conversation
})

const stream = chat({
  adapter: openaiText('gpt-4o'), // strong model for reasoning
  toolRegistry: registry,
  messages,
  systemPrompts: ['You are a helpful assistant.', systemPrompt],
  agentLoopStrategy: maxIterations(15),
})
```

`codeModeWithSkills` returns:

| Property | Type | Description |
|----------|------|-------------|
| `registry` | `ToolRegistry` | Mutable registry containing all tools. Pass to `chat()` via `toolRegistry`. |
| `systemPrompt` | `string` | Combined Code Mode + skill library documentation. |
| `selectedSkills` | `Array<Skill>` | Skills the selection model chose for this conversation. |

### What goes into the registry

The registry is populated with:

- **`execute_typescript`** — The Code Mode sandbox tool. Inside the sandbox, skills are also available as `skill_*` functions (loaded dynamically at execution time).
- **`search_skills`** — Search the skill library by query. Returns matching skill metadata.
- **`get_skill`** — Retrieve full details (including code) for a specific skill.
- **`register_skill`** — Save working code as a new skill. Newly registered skills are immediately added to the registry as callable tools.
- **One tool per selected skill** — Each selected skill becomes a direct tool (prefixed with `[SKILL]` in its description) that the LLM can call without going through `execute_typescript`.

## Manual API

If you want full control — for example, loading all skills instead of using LLM-based selection — use the lower-level functions directly. This is the approach used in the `ts-code-mode-web` example.

```typescript
import { chat, maxIterations } from '@tanstack/ai'
import { createCodeModeToolAndPrompt } from '@tanstack/ai-code-mode'
import { createNodeIsolateDriver } from '@tanstack/ai-isolate-node'
import {
  createAlwaysTrustedStrategy,
  createSkillManagementTools,
  createSkillsSystemPrompt,
  skillsToTools,
} from '@tanstack/ai-code-mode-skills'
import { createFileSkillStorage } from '@tanstack/ai-code-mode-skills/storage'
import { openaiText } from '@tanstack/ai-openai'

const BASE_PROMPT = 'You are a helpful assistant.'

const trustStrategy = createAlwaysTrustedStrategy()
const storage = createFileSkillStorage({
  directory: './.skills',
  trustStrategy,
})
const driver = createNodeIsolateDriver()

// 1. Create Code Mode tool + prompt
const { tool: codeModeTool, systemPrompt: codeModePrompt } =
  createCodeModeToolAndPrompt({
    driver,
    tools: [myTool1, myTool2],
    timeout: 60_000,
    memoryLimit: 128,
  })

// 2. Load all skills and convert to tools
const allSkills = await storage.loadAll()
const skillIndex = await storage.loadIndex()

const skillTools = allSkills.length > 0
  ? skillsToTools({
      skills: allSkills,
      driver,
      tools: [myTool1, myTool2],
      storage,
      timeout: 60_000,
      memoryLimit: 128,
    })
  : []

// 3. Create management tools
const managementTools = createSkillManagementTools({
  storage,
  trustStrategy,
})

// 4. Generate skill library prompt
const skillsPrompt = createSkillsSystemPrompt({
  selectedSkills: allSkills,
  totalSkillCount: skillIndex.length,
  skillsAsTools: true,
})

// 5. Assemble and call chat()
const stream = chat({
  adapter: openaiText('gpt-4o'),
  tools: [codeModeTool, ...managementTools, ...skillTools],
  messages,
  systemPrompts: [BASE_PROMPT, codeModePrompt, skillsPrompt],
  agentLoopStrategy: maxIterations(15),
})
```

This approach skips the selection LLM call entirely — you load whichever skills you want and pass them in directly.

## Skill Storage

Skills are persisted through the `SkillStorage` interface. Two implementations are provided:

### File storage (production)

```typescript
import { createFileSkillStorage } from '@tanstack/ai-code-mode-skills/storage'

const storage = createFileSkillStorage({
  directory: './.skills',
  trustStrategy, // optional, defaults to createDefaultTrustStrategy()
})
```

Creates a directory structure:

```
.skills/
  _index.json               # Lightweight catalog for fast loading
  fetch_github_stats/
    meta.json               # Description, schemas, hints, stats
    code.ts                 # TypeScript source
  compare_npm_packages/
    meta.json
    code.ts
```
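The exact on-disk format is internal to the package, but based on the fields described in this guide, a `meta.json` might plausibly look like the following. Every field name and value here is a hypothetical illustration, not the actual schema:

```json
{
  "name": "fetch_github_stats",
  "description": "Fetch star and fork counts for a GitHub repository",
  "inputSchema": "{ \"type\": \"object\", \"properties\": { \"repo\": { \"type\": \"string\" } } }",
  "usageHints": ["Use when the user asks how popular a repository is"],
  "dependsOn": [],
  "stats": { "executions": 12, "successes": 11 }
}
```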

### Memory storage (testing)

```typescript
import { createMemorySkillStorage } from '@tanstack/ai-code-mode-skills/storage'

const storage = createMemorySkillStorage()
```

Keeps everything in memory. Useful for tests and demos.

### Storage interface

Both implementations satisfy this interface:

| Method | Description |
|--------|-------------|
| `loadIndex()` | Load lightweight metadata for all skills (no code) |
| `loadAll()` | Load all skills with full details including code |
| `get(name)` | Get a single skill by name |
| `save(skill)` | Create or update a skill |
| `delete(name)` | Remove a skill |
| `search(query, options?)` | Search skills by text query |
| `updateStats(name, success)` | Record an execution result for trust tracking |
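To make the interface concrete, here is a minimal in-memory sketch of the same shape. It mirrors the methods in the table above, but the `Skill` type and internals are illustrative assumptions, not the package's actual types:

```typescript
// Minimal in-memory sketch of the SkillStorage shape from the table above.
// Field and type names are illustrative, not the package's real types.
interface Skill {
  name: string
  description: string
  code: string
  stats: { executions: number; successes: number }
}

function createTinySkillStorage() {
  const skills = new Map<string, Skill>()
  return {
    async loadIndex() {
      // Metadata only, no code: mirrors the lightweight _index.json.
      return [...skills.values()].map(({ name, description }) => ({ name, description }))
    },
    async loadAll() {
      return [...skills.values()]
    },
    async get(name: string) {
      return skills.get(name) ?? null
    },
    async save(skill: Skill) {
      skills.set(skill.name, skill)
    },
    async delete(name: string) {
      skills.delete(name)
    },
    async search(query: string) {
      const q = query.toLowerCase()
      return [...skills.values()].filter(
        (s) => s.name.includes(q) || s.description.toLowerCase().includes(q),
      )
    },
    async updateStats(name: string, success: boolean) {
      const s = skills.get(name)
      if (!s) return
      s.stats.executions += 1
      if (success) s.stats.successes += 1
    },
  }
}
```

A custom backend (database, KV store, etc.) would follow the same shape.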

## Trust Strategies

Skills start untrusted and earn trust through successful executions. The trust level is metadata only — it does not currently gate execution. Four built-in strategies are available:

```typescript
import {
createDefaultTrustStrategy,
createAlwaysTrustedStrategy,
createRelaxedTrustStrategy,
createCustomTrustStrategy,
} from '@tanstack/ai-code-mode-skills'
```

| Strategy | Initial level | Provisional | Trusted |
|----------|--------------|-------------|---------|
| **Default** | `untrusted` | 10+ runs, ≥90% success | 100+ runs, ≥95% success |
| **Relaxed** | `untrusted` | 3+ runs, ≥80% success | 10+ runs, ≥90% success |
| **Always trusted** | `trusted` | — | — |
| **Custom** | Configurable | Configurable | Configurable |

```typescript
const strategy = createCustomTrustStrategy({
  initialLevel: 'untrusted',
  provisionalThreshold: { executions: 5, successRate: 0.85 },
  trustedThreshold: { executions: 50, successRate: 0.95 },
})
```
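The threshold logic in the table reduces to a simple comparison. The following sketch applies the default thresholds to a skill's stats; it is an illustration of the rule, not the package's internal code, and the function name is hypothetical:

```typescript
// Illustrative mapping from execution stats to a trust level, using the
// default thresholds from the table. Not the package's actual code.
type TrustLevel = 'untrusted' | 'provisional' | 'trusted'

function trustLevelFor(
  executions: number,
  successRate: number,
  thresholds = {
    provisional: { executions: 10, successRate: 0.9 },
    trusted: { executions: 100, successRate: 0.95 },
  },
): TrustLevel {
  if (
    executions >= thresholds.trusted.executions &&
    successRate >= thresholds.trusted.successRate
  ) {
    return 'trusted'
  }
  if (
    executions >= thresholds.provisional.executions &&
    successRate >= thresholds.provisional.successRate
  ) {
    return 'provisional'
  }
  return 'untrusted'
}
```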

## Skill Lifecycle

### Registration

When the LLM produces useful code via `execute_typescript`, the system prompt instructs it to call `register_skill` with:

- `name` — snake_case identifier (becomes the tool name)
- `description` — what the skill does
- `code` — TypeScript source that receives an `input` variable
- `inputSchema` / `outputSchema` — JSON Schema strings
- `usageHints` — when to use this skill
- `dependsOn` — other skills this one calls

The skill is saved to storage and (if a `ToolRegistry` was provided) immediately added as a callable tool in the current session.
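A `register_skill` call from the LLM might carry a payload like the following. All values, including the `external_fetch` binding in the code, are hypothetical illustrations of the field list above:

```json
{
  "name": "fetch_github_stats",
  "description": "Fetch star and fork counts for a GitHub repository",
  "code": "const res = await external_fetch({ url: `https://api.github.com/repos/${input.repo}` });\nreturn { stars: res.stargazers_count, forks: res.forks_count };",
  "inputSchema": "{ \"type\": \"object\", \"properties\": { \"repo\": { \"type\": \"string\" } }, \"required\": [\"repo\"] }",
  "outputSchema": "{ \"type\": \"object\", \"properties\": { \"stars\": { \"type\": \"number\" }, \"forks\": { \"type\": \"number\" } } }",
  "usageHints": ["Use when the user asks how popular a GitHub repository is"],
  "dependsOn": []
}
```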

### Execution

When a skill tool is called, the system:

1. Wraps the skill code with `const input = <serialized input>;`
2. Strips TypeScript syntax to plain JavaScript
3. Creates a fresh sandbox context with `external_*` bindings
4. Executes the code and returns the result
5. Updates execution stats (success/failure count) asynchronously
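Step 1 of that flow can be sketched as a one-line wrapper. This is an illustration of the idea, not the library's actual pipeline (which also handles TypeScript stripping and sandbox setup):

```typescript
// Illustrative sketch of step 1: prepend the serialized input so the
// stored skill code can reference `input`. Not the library's actual code.
function wrapSkillCode(code: string, input: unknown): string {
  return `const input = ${JSON.stringify(input)};\n${code}`
}

const wrapped = wrapSkillCode('return input.repo.length', { repo: 'tanstack/ai' })
```

The wrapped source is then what gets stripped to plain JavaScript and run in the fresh sandbox context.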

### Selection (high-level API only)

On each new request, `selectRelevantSkills`:

1. Takes the last 5 conversation messages as context
2. Builds a catalog from the skill index (name + description + first usage hint)
3. Asks the adapter to return a JSON array of relevant skill names (max `maxSkillsInContext`)
4. Loads full skill data for the selected names

If parsing fails or the model returns invalid JSON, it falls back to an empty selection — the request proceeds without pre-loaded skills, but the LLM can still search and use skills via the management tools.
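The tolerant-parse fallback amounts to something like the sketch below. It is illustrative, not the library's implementation, and the function name is an assumption:

```typescript
// Illustrative sketch of the fallback: accept only a JSON array of
// strings, and treat anything else as an empty selection.
function parseSelectedSkillNames(reply: string): Array<string> {
  try {
    const parsed = JSON.parse(reply)
    if (Array.isArray(parsed) && parsed.every((n) => typeof n === 'string')) {
      return parsed
    }
  } catch {
    // Malformed JSON: fall through to the empty selection.
  }
  return []
}
```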

## Skills as Tools vs. Sandbox Bindings

The `skillsAsTools` option (default: `true`) controls how skills are exposed:

| Mode | How the LLM calls a skill | Pros | Cons |
|------|--------------------------|------|------|
| **As tools** (`true`) | Direct tool call: `skill_name({ ... })` | Simpler for the LLM, shows in tool-call UI, proper input validation | One tool per skill in the tool list |
| **As bindings** (`false`) | Inside `execute_typescript`: `await skill_fetch_data({ ... })` | Skills composable in code, fewer top-level tools | LLM must write code to use them |

When `skillsAsTools` is enabled, the system prompt documents each skill with its schema, usage hints, and example calls. When disabled, skills appear as typed `skill_*` functions in the sandbox type stubs.
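In bindings mode, the LLM composes skills inside `execute_typescript`. The snippet below sketches what such sandbox code might look like; the skill name is hypothetical, and the stand-in binding is defined locally only so the sketch is self-contained (in the real sandbox, `skill_*` functions are injected for you):

```typescript
// Stand-in for a sandbox-injected skill binding (hypothetical skill).
// In the real sandbox this would be provided automatically.
const skill_fetch_github_stats = async (input: { repo: string }) => ({
  repo: input.repo,
  stars: 123,
})

// What the LLM might write inside execute_typescript:
const stats = await skill_fetch_github_stats({ repo: 'tanstack/ai' })
const summary = `${stats.repo} has ${stats.stars} stars`
```

Because the call happens in code, its result can feed directly into further computation without another model round-trip, which is the main draw of bindings mode.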

## Custom Events

Skill execution emits events through the TanStack AI event system:

| Event | When | Payload |
|-------|------|---------|
| `code_mode:skill_call` | Skill tool invoked | `{ skill, input, timestamp }` |
| `code_mode:skill_result` | Skill completed successfully | `{ skill, result, duration, timestamp }` |
| `code_mode:skill_error` | Skill execution failed | `{ skill, error, duration, timestamp }` |
| `skill:registered` | New skill saved via `register_skill` | `{ id, name, description, timestamp }` |

## Tips

- **Use a cheap model for selection.** The selection call only needs to match skill names to conversation context — `gpt-4o-mini` or `claude-haiku-4-5` work well.
- **Start without skills.** Get Code Mode working first, then add `@tanstack/ai-code-mode-skills` once you have tools that produce reusable patterns.
- **Monitor the skill count.** As the library grows, consider increasing `maxSkillsInContext` or switching to the manual API where you control which skills load.
- **Newly registered skills are available on the next message,** not in the current turn's tool list (unless using `ToolRegistry` with the high-level API, which adds them immediately).
- **Skills can call other skills.** Inside the sandbox, both `external_*` and `skill_*` functions are available. Set `dependsOn` when registering to document these relationships.