DeepSeek-native agent framework: Cache-First Loop, R1 Thought Harvesting, Tool-Call Repair. TypeScript + Ink TUI.
Updated Apr 30, 2026 - TypeScript
The Multi-Agent Reasoning framework creates an interactive chatbot in which AI agents collaborate through structured reasoning and Swarm Integration to produce optimal answers. By simulating a team that discusses, debates, and refines responses, it supports complex problem solving and more precise results. Now with Prompt Caching to reduce latency and costs.
Anthropic Claude API wrapper for Go
Independent research on Claude Code internals, Claude Agent SDK, and related tooling.
Automatic prompt caching for Claude Code. Cuts token costs by up to 90% on repeated file reads, bug fix sessions, and long coding conversations - zero config.
🚀 Autocache - Intelligent Anthropic API Cache Proxy. Automatically injects cache_control fields into Claude API requests to reduce costs by up to 90% and latency by up to 85%. Works as a transparent drop-in replacement for popular AI platforms like n8n, Flowise, Make.com, LangChain, and LlamaIndex; no code changes required.
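A proxy like this rewrites the request body before forwarding it. A minimal sketch of the injection step, assuming the Anthropic Messages API body shape and the simple heuristic of marking the last system block as cacheable (the types and function name here are illustrative, not Autocache's actual code):

```typescript
// Illustrative sketch: add Anthropic's cache_control marker to the last
// system block so the whole system prefix becomes cacheable.
// Types loosely mirror the Messages API body; not Autocache's real code.
type ContentBlock = {
  type: string;
  text?: string;
  cache_control?: { type: "ephemeral" };
};

interface MessagesBody {
  model: string;
  system?: ContentBlock[];
  messages: { role: "user" | "assistant"; content: string | ContentBlock[] }[];
}

function injectCacheControl(body: MessagesBody): MessagesBody {
  // The system prompt is identical across turns, so marking its last
  // block tells the API to cache everything up to and including it.
  if (body.system && body.system.length > 0) {
    body.system[body.system.length - 1].cache_control = { type: "ephemeral" };
  }
  return body;
}
```

Because the rewrite touches only the body, a proxy can apply it transparently between any client and the API, which is what makes the "no code changes" claim plausible.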
Build production-ready apps for GPT using Node.js & TypeScript
Global, unlimited persistent memory for Claude Code agents. Context-activated hints injected automatically via hooks using scatter-gather map-reduce.
Interact with Anthropic and Anthropic-compatible chat completion APIs in a simple and elegant way. Supports vision, prompt caching, and more.
Agentic-AI framework w/o the headaches
Prompt caching to save dollars on generative AI API usage.
Cache-aware orchestration for LLM agents. Fork helpers that share cached prefixes, detect cache breaks, and cut token costs by 38%+.
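Sharing cached prefixes comes down to never mutating the common head of a conversation: a cache break is any edit before the point where two histories diverge. A small sketch of the detection idea, with hypothetical helper names that are not this project's API (turns are assumed to compare by exact string equality):

```typescript
// Illustrative helpers (hypothetical names): forked agents stay
// cache-friendly as long as they only append after the shared prefix
// of serialized conversation turns.
function sharedPrefixLength(a: string[], b: string[]): number {
  let i = 0;
  while (i < a.length && i < b.length && a[i] === b[i]) i++;
  return i;
}

// A cache break occurs when `next` edits a turn inside `prev`'s prefix:
// every cached token after the divergence point is invalidated.
function isCacheBreak(prev: string[], next: string[]): boolean {
  return sharedPrefixLength(prev, next) < prev.length;
}
```

Appending turns leaves isCacheBreak false, while rewriting an early turn (say, editing the system prompt mid-run) flips it to true; that is the condition a cache-aware orchestrator wants to catch before dispatching a fork.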
A curated list of strategies, tools, papers, and resources for reducing LLM token costs and improving efficiency in production.
FastAPI proxy that strips volatile fields from OpenClaw requests to dramatically improve llama-server KV cache hit rates (~22× faster prompt eval)
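Stripping volatile fields helps because llama-server's KV cache reuses work only for exact token prefixes, so any field that changes per request (a timestamp, a request ID) invalidates everything serialized after it. A hedged sketch of the normalization step; the volatile key list is an assumption for illustration, not OpenClaw's actual schema:

```typescript
// Illustrative sketch: recursively drop per-request fields so repeated
// requests serialize to byte-identical prefixes and hit the llama-server
// KV cache. The key list is an assumption, not OpenClaw's schema.
const VOLATILE_KEYS = new Set(["timestamp", "request_id", "session_id", "nonce"]);

function stripVolatile(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(stripVolatile);
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value)) {
      if (!VOLATILE_KEYS.has(k)) out[k] = stripVolatile(v);
    }
    return out;
  }
  return value; // primitives pass through unchanged
}
```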
Makes using local AI as easy and lightweight as a crab eating cheese
Agent Skill that caches LLM image descriptions as XMP metadata inside image files, reducing token usage by ~92% on repeated reads. Works with 30+ compatible agents.
91 production-proven patterns for building AI agents, extracted from a 512K-line codebase. Covers agentic loops, tool systems, permissions, MCP, prompt caching, multi-agent orchestration.
Self-hosted AI agent teams inside your messaging apps.
Save 60-90% on LLM token costs with intelligent memory compression for multi-agent systems
🚀 Comprehensive Claude API cost optimization toolkit - reduce costs by 50-95%