# Prompt Caching Examples (AI SDK v5)

Examples demonstrating prompt caching with Vercel AI SDK v5.

## Documentation

For full prompt caching documentation, including all providers, pricing, and configuration details, see:
- **[Prompt Caching Guide](../../../../docs/prompt-caching.md)**

## Examples in This Directory

- `user-message-cache.ts` - Cache large context in user messages
- `multi-message-cache.ts` - Cache a system prompt across multi-turn conversations (see the sketch after this list)
- `no-cache-control.ts` - Control scenario (validates methodology)
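
The multi-turn pattern looks roughly like this: mark the large, stable block with `cacheControl`, keep it byte-identical at the front of the conversation, and append turns after it. This is a minimal sketch only; the `ask` helper and the instruction text are illustrative, and `multi-message-cache.ts` may structure its conversation differently:

```typescript
import { createOpenRouter } from '@openrouter/ai-sdk-provider';
import { generateText, type ModelMessage } from 'ai';

const openrouter = createOpenRouter({
  extraBody: { stream_options: { include_usage: true } }, // Required for cache metrics
});

// Hypothetical large, stable instructions (2048+ tokens in practice).
const instructions = 'You are an assistant for a large codebase...';

const history: ModelMessage[] = [];

async function ask(question: string): Promise<string> {
  if (history.length === 0) {
    // First turn carries the cached block and the question as separate parts.
    history.push({
      role: 'user',
      content: [
        {
          type: 'text',
          text: instructions,
          providerOptions: {
            openrouter: { cacheControl: { type: 'ephemeral' } },
          },
        },
        { type: 'text', text: question },
      ],
    });
  } else {
    history.push({ role: 'user', content: question });
  }

  const { text } = await generateText({
    model: openrouter('anthropic/claude-3.5-sonnet'),
    messages: history,
  });
  history.push({ role: 'assistant', content: text });
  return text;
}

// Turn 1 creates the cache; turn 2 re-sends the identical prefix and should hit it.
await ask('Summarize the conventions in these instructions.');
await ask('Which rules apply to error handling?');
```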

## Quick Start

```bash
# Run an example
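export OPENROUTER_API_KEY=...  # createOpenRouter falls back to this env var when no apiKey is passed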
bun run typescript/ai-sdk-v5/src/prompt-caching/user-message-cache.ts
```

## AI SDK v5 Usage

```typescript
import { createOpenRouter } from '@openrouter/ai-sdk-provider';
import { generateText } from 'ai';

const openrouter = createOpenRouter({
  extraBody: {
    stream_options: { include_usage: true }, // Required for cache metrics
  },
});

// Use providerOptions.openrouter.cacheControl on content items
const result = await generateText({
  model: openrouter('anthropic/claude-3.5-sonnet'),
  messages: [{
    role: 'user',
    content: [{
      type: 'text',
      text: 'Large context...',
      providerOptions: {
        openrouter: { cacheControl: { type: 'ephemeral' } },
      },
    }],
  }],
});

// Check cache metrics
const cached = result.providerMetadata?.openrouter?.usage?.promptTokensDetails?.cachedTokens ?? 0;
```
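
To confirm caching end to end, make the identical request twice: the first call reports `cachedTokens = 0` (cache miss, cache created) and the second `cachedTokens > 0` (cache hit). The cache only hits on an exact content match, blocks under roughly 2048 tokens may not be cached at all, and ephemeral caches expire after about 5 minutes, so run the calls back to back. A verification sketch reusing `openrouter` and the imports above (`context` is a hypothetical large string):

```typescript
// Hypothetical stable context; must be byte-identical on every call.
const context = 'Large context...';

async function cachedTokensFor(): Promise<number> {
  const result = await generateText({
    model: openrouter('anthropic/claude-3.5-sonnet'),
    messages: [{
      role: 'user',
      content: [{
        type: 'text',
        text: context,
        providerOptions: {
          openrouter: { cacheControl: { type: 'ephemeral' } },
        },
      }],
    }],
  });
  return result.providerMetadata?.openrouter?.usage?.promptTokensDetails?.cachedTokens ?? 0;
}

console.log('first call:', await cachedTokensFor());  // expected: 0 (miss, cache created)
console.log('second call:', await cachedTokensFor()); // expected: > 0 (hit)
```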