Releases: InferQ/ART
v0.4.11: The "Unbreakable State" & Observability Update
🚀 ART Framework v0.4.11: The "Unbreakable State" & Observability Update
Date: December 28, 2025
We are closing out the year with a massive focus on robustness and consistency. Version 0.4.11 is all about ensuring that your long-running agents—specifically those involving human intervention (HITL) and multi-agent delegation (A2A)—never lose their train of thought, even in edge-case scenarios.
We’ve also unified our observability model. Whether your agent is Planning, Executing, or Synthesizing, the stream of "Thoughts" now looks and behaves exactly the same.
🛡️ Critical Resilience & Data Integrity
We identified specific scenarios where complex agent workflows could drop data. We've plugged those gaps to ensure enterprise-grade reliability.
- HITL Partial Result Preservation: Previously, if a batch of parallel tool calls contained one tool that required human approval (suspension), the results of the successful tools in that same batch could be lost. Fixed. We now persist `partialToolResults` immediately, so when the human approves the action, the agent resumes with the full context of what already happened.
- A2A Crash Recovery: Agent-to-Agent delegation is powerful but complex. We've added `pendingA2ATasks` to the persistent state. If the host process crashes while waiting for a sub-agent, it can now recover and pick up exactly where it left off during the polling phase.
- Smarter "Think" Parsing: We've hardened our OutputParser. If an LLM opens a `<think>` tag but forgets to close it (a common issue with reasoning models), we now treat the content as valid output rather than discarding it. No more silent failures on long chains of thought.
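The unclosed-tag behavior described above can be sketched as follows. This is a minimal illustration, not ART's actual OutputParser: the `parseThinkBlocks` function and the `ParsedOutput` shape are hypothetical, and the sketch assumes unclosed content is kept as reasoning text (the framework may route it differently).

```typescript
// Hypothetical helper for illustration only; not part of the ART API.
interface ParsedOutput {
  thoughts: string; // text found inside <think> blocks
  answer: string;   // everything outside <think> blocks
}

function parseThinkBlocks(raw: string): ParsedOutput {
  const open = "<think>";
  const close = "</think>";
  const thoughts: string[] = [];
  let answer = "";
  let cursor = 0;

  while (cursor < raw.length) {
    const start = raw.indexOf(open, cursor);
    if (start === -1) {
      // No more think blocks: the remainder is answer text.
      answer += raw.slice(cursor);
      break;
    }
    answer += raw.slice(cursor, start);
    const bodyStart = start + open.length;
    const end = raw.indexOf(close, bodyStart);
    if (end === -1) {
      // Unclosed tag: keep the rest of the text instead of discarding it.
      thoughts.push(raw.slice(bodyStart));
      break;
    }
    thoughts.push(raw.slice(bodyStart, end));
    cursor = end + close.length;
  }

  return { thoughts: thoughts.join("\n").trim(), answer: answer.trim() };
}
```

With this approach, an input such as `"Hi <think>still reasoning"` yields the trailing text as thoughts rather than an empty result.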
🧠 Enhanced Reasoning Context
Your agents are only as smart as the context we feed them. We’ve significantly upgraded what the agent "sees" during execution.
- Full History Visibility: The execution prompt now includes ALL tool results from previous steps, not just the immediately preceding one. This allows the agent to correlate data from step 1 with step 5 without hallucinating.
- Execution Summary Memory: We now persist a structured summary of completed steps to the `ConversationManager`. This means follow-up queries (after the agent finishes) have full awareness of exactly what actions were taken.
- Preserving Tool Metadata: We fixed an issue where `tool_call_id` was sometimes stripped during history formatting. This ensures perfect compatibility with strict provider requirements (like OpenAI/Anthropic) when replaying history.
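To illustrate why the `tool_call_id` fix matters: in OpenAI-style chat formats, each tool message must carry the id of the tool call it answers. The sketch below shows a history formatter that keeps that id. The `StoredMessage` and `ProviderMessage` types and the `toolCallId` field are assumptions made for this example, not ART's internal types.

```typescript
// Hypothetical internal message shape (assumption, not ART's real type).
interface StoredMessage {
  role: "user" | "assistant" | "tool";
  content: string;
  toolCallId?: string;
}

// Simplified OpenAI-style message shape.
interface ProviderMessage {
  role: "user" | "assistant" | "tool";
  content: string;
  tool_call_id?: string;
}

function formatHistory(history: StoredMessage[]): ProviderMessage[] {
  return history.map((msg) => {
    const out: ProviderMessage = { role: msg.role, content: msg.content };
    if (msg.role === "tool") {
      // Fail loudly rather than silently dropping the id.
      if (!msg.toolCallId) {
        throw new Error("tool message is missing its call id");
      }
      out.tool_call_id = msg.toolCallId;
    }
    return out;
  });
}
```

Throwing on a missing id is a design choice for the sketch: it surfaces the problem at formatting time instead of as a provider-side rejection later.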
⚠️ Breaking Change: Unified Observability Standard
We have standardized how the Thinking/Reasoning process is observed across the entire lifecycle of the PES (Plan-Execute-Synthesize) Agent.
Why? Previously, "Planning" thoughts looked different from "Execution" thoughts in the socket stream. This made building UIs difficult.
The Change:
All phases now emit consistent THOUGHTS observations.
- New Metadata: Every thought observation now includes `phase` ('planning' | 'execution' | 'synthesis').
- Execution Specifics: Execution thoughts now include `stepId` and `stepDescription` automatically.
Migration Guide:
If you are listening to socket events, you need to update your enumerated types.
| Old (Deprecated) | New (Standardized) |
|---|---|
| `AGENT_THOUGHT_LLM_THINKING` | `PLANNING_LLM_THINKING` or `EXECUTION_LLM_THINKING` |
| `FINAL_SYNTHESIS_LLM_RESPONSE` | `SYNTHESIS_LLM_RESPONSE` |
| `AGENT_THOUGHT` (Context) | `PLANNING_THOUGHTS` or `EXECUTION_THOUGHTS` |
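One way to apply the migration table in client code is a small mapping function. The sketch below is an assumption-laden illustration: it treats the names in the table as plain string literals, and the `toStandardType` helper is hypothetical rather than something ART exports. Where the table lists two replacements, the caller supplies the phase to pick between them.

```typescript
// Names taken from the migration table; modeled as string literals (assumption).
type LegacyType =
  | "AGENT_THOUGHT_LLM_THINKING"
  | "FINAL_SYNTHESIS_LLM_RESPONSE"
  | "AGENT_THOUGHT";

type StandardType =
  | "PLANNING_LLM_THINKING"
  | "EXECUTION_LLM_THINKING"
  | "SYNTHESIS_LLM_RESPONSE"
  | "PLANNING_THOUGHTS"
  | "EXECUTION_THOUGHTS";

// Hypothetical helper: maps a deprecated name to its standardized replacement.
function toStandardType(
  legacy: LegacyType,
  phase: "planning" | "execution",
): StandardType {
  switch (legacy) {
    case "AGENT_THOUGHT_LLM_THINKING":
      return phase === "planning"
        ? "PLANNING_LLM_THINKING"
        : "EXECUTION_LLM_THINKING";
    case "FINAL_SYNTHESIS_LLM_RESPONSE":
      // Synthesis has a single replacement, so the phase is ignored here.
      return "SYNTHESIS_LLM_RESPONSE";
    case "AGENT_THOUGHT":
      return phase === "planning" ? "PLANNING_THOUGHTS" : "EXECUTION_THOUGHTS";
  }
}
```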
🔧 Developer Experience & Quality of Life
- Drastically Reduced Truncation: We raised the default safety limit for JSON stringification from 200 characters to 10,000 characters. You will no longer see useful debugging data arbitrarily cut off in the logs.
- Flexible Tool Outputs: We discovered that some custom tools return `{ data: ... }` while others return `{ output: ... }` or `{ result: ... }`. The framework now intelligently scans for all three properties, ensuring we capture the tool's actual return value regardless of how you built it.
- New Documentation: Check out `docs/concepts/interface-contracts.md` for a definitive guide on `IToolExecutor` and `ToolResult`.
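The flexible tool output handling can be pictured with a short sketch. The `extractToolValue` function is hypothetical, and the lookup order (`data`, then `output`, then `result`) is an assumption; the release notes only state that all three properties are checked.

```typescript
// Hypothetical helper: returns the payload of a tool result regardless of
// which of the three property names the tool author used.
function extractToolValue(result: unknown): unknown {
  if (result !== null && typeof result === "object") {
    const record = result as Record<string, unknown>;
    // Lookup order is an assumption made for this sketch.
    for (const key of ["data", "output", "result"]) {
      if (record[key] !== undefined) {
        return record[key];
      }
    }
  }
  // Primitives and objects without a known key are returned unchanged.
  return result;
}
```

For example, `extractToolValue({ output: 42 })` and `extractToolValue({ result: 42 })` both yield `42`, while a bare string passes through as-is.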
Thank you for building with ART. Please check the CHANGELOG.md for the raw commit history.
What's Changed
- fix: Security hardening and PESAgent result handling by @hashangit in #40
Full Changelog: v0.4.6...v0.4.11
v0.4.6: The "Trust & Control" Update
🚀 v0.4.6: The "Trust & Control" Update
We are thrilled to introduce ART Framework v0.4.6, a release focused on two critical pillars of autonomous agents: Reliability and Human Control.
As agents become more autonomous, the gap between what they plan to do and what they actually do becomes a major reliability risk. Simultaneously, giving users the power to intervene safely is paramount. This release bridges both gaps.
🌟 Key Highlights
1. Tool-Aware Execution Framework (TAEF)
"Plans are nothing; planning is everything." - Eisenhower (now applied to AI)
We've fundamentally upgraded the agent's brain to ensure it walks the talk.
- The Problem: Previously, agents might plan to "check the weather" but then hallucinate the result without actually calling the weather tool.
- The Fix: With TAEF, the planning phase now explicitly tags steps as `tool` or `reasoning`.
- The Result: If an agent skips a required tool call, the framework automatically intervenes, forcing a retry. This guarantees that critical actions are actually performed, not just hallucinated.
2. Human-in-the-Loop (HITL) V2
Seamless collaboration between Human and AI.
We've hardened the "Blocking Tools" system to make it production-ready.
- Resilience: Agents can now robustly suspend execution to wait for your input, surviving page refreshes and long delays.
- Smart Rejection: If you say "No" to an action, the agent now understands this is a directive to find another way, rather than stubbornly trying the same thing again.
- Developer Experience: New APIs like `checkForSuspendedState` make building persistent, stateful agent UIs significantly easier.
🛠️ Technical Improvements
- New TodoItem Schema: Added `stepType`, `requiredTools`, and `expectedOutcome` for granular execution control.
- Strict Validation Mode: Agents now default to 'strict' mode for tool steps, ensuring high-fidelity execution.
- Infinite Loop Protection: Added `MAX_VALIDATION_RETRIES` to prevent agents from getting stuck in validation cycles.
- Enhanced Testing: Added a suite of 24+ new tests covering complex validation and suspension scenarios.
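The interaction between strict validation and the retry cap can be sketched as a bounded loop. Only the field names `stepType`, `requiredTools`, `expectedOutcome` and the constant name `MAX_VALIDATION_RETRIES` come from the notes above; the value of the constant, the `StepAttempt` type, and the `runWithValidation` function are assumptions made for illustration.

```typescript
interface TodoItem {
  id: string;
  stepType: "tool" | "reasoning";
  requiredTools: string[];
  expectedOutcome: string;
}

// Hypothetical record of what one execution attempt actually did.
interface StepAttempt {
  toolsCalled: string[];
  output: string;
}

// The real default is not stated in the release notes; 2 is an assumption.
const MAX_VALIDATION_RETRIES = 2;

// Lists the required tools that an attempt failed to call.
function missingTools(item: TodoItem, attempt: StepAttempt): string[] {
  if (item.stepType !== "tool") return [];
  return item.requiredTools.filter(
    (tool) => attempt.toolsCalled.indexOf(tool) === -1,
  );
}

// Runs a step, retrying with feedback while required tools were skipped.
function runWithValidation(
  item: TodoItem,
  attemptStep: (feedback: string | null) => StepAttempt,
): StepAttempt {
  let feedback: string | null = null;
  for (let tries = 0; tries <= MAX_VALIDATION_RETRIES; tries++) {
    const attempt = attemptStep(feedback);
    const missing = missingTools(item, attempt);
    if (missing.length === 0) return attempt;
    feedback = `Required tools not called: ${missing.join(", ")}`;
  }
  // The cap keeps the agent from cycling forever on a step it cannot satisfy.
  throw new Error(
    `Step ${item.id} failed validation after ${MAX_VALIDATION_RETRIES} retries`,
  );
}
```

Passing the validation feedback into the next attempt is one plausible way to "force a retry"; the framework's actual mechanism may differ.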
📦 Upgrading
`npm install art-framework@0.4.6`
No breaking changes for existing agents, but we recommend checking out the new TAEF Documentation to leverage the new strict validation capabilities.
Built with ❤️ by the ART Framework Team
v0.4.5-beta Release - Added Groq Adapter, Improved reliability on Todo List and observation parsing for follow up queries with the new PES agent and some bug fixes.
🚀 ART Framework v0.4.5
This release marks a significant milestone in inference speed for the ART Framework with the introduction of the Groq Adapter, alongside a series of major enhancements to our documentation and marketing experience.
⚡️ Groq Integration (LPU™ Powered)
We've added a native GroqAdapter to leverage Groq's ultra-fast inference speeds.
- Ultra-Fast Inference: Support for Groq's LPU-powered models (Llama 3.3, Mixtral, etc.) with sub-500ms response times.
- Advanced Streaming: Full support for real-time token streaming and reasoning visualization.
- Tool-Calling: Seamless integration with ART's tool-use system in an OpenAI-compatible format.
- Proven Performance: Integration tests verified sub-400ms time-to-first-token.
✨ Highlights from v0.4.4 (Beta)
Included in this release cycle:
- Cinematic Marketing site: A total redesign of the PESAgent Workflow section using high-tech "scrollytelling" animations.
- Embedded Docs Viewer: Read concept guides and how-to documentation directly within the marketing site.
- Enhanced OpenRouter Support: Better compatibility with legacy reasoning models and improved token streaming logic.
- DX Fixes: Optimized GitHub Pages routing for SPA support and added `safeStringify` utility for more robust observation logging.
📦 Getting Started
To use the new Groq adapter, simply update your ART installation and configure your provider:
```typescript
import { GroqAdapter } from 'art-framework';

const adapter = new GroqAdapter({
  apiKey: process.env.GROQ_API_KEY!,
  model: 'llama-3.3-70b-versatile',
});
```
Full Changelog: v0.4.0...v0.4.5
v0.4.1 Release - LLM adapters upgraded to the latest versions (Gemini, OpenAI, Anthropic)
[0.4.1] - 2025-12-18
🤖 Next-Gen LLM Adapter Upgrades
- Gemini 3 Family Support:
  - Full support for Gemini 3 Pro, Flash, and Deep Think models.
  - Implemented native `systemInstruction` support for improved behavioral steering.
  - Added `thinkingLevel` control (low, minimal, medium, high) to balance reasoning depth and latency.
  - Updated default model to `gemini-3-flash`.
- Claude 4.5 Family Support:
  - Support for Claude 4.5 Opus, Sonnet, and Haiku.
  - Maintained and optimized support for thinking tokens and reasoning blocks.
  - Updated default model to `claude-4.5-sonnet`.
- GPT-5 Family Support:
  - Full support for GPT-5, GPT-5.1, and GPT-5.2 (including Instant, Thinking, and Pro variants).
  - Updated default model to `gpt-5.2-instant`.
- SDK Dependencies: Upgraded `@google/genai`, `@anthropic-ai/sdk`, and `openai` to their latest December 2025 versions.
🛠 Refactors & Maintenance
- Improved integration test resilience by ensuring tests skip correctly when API keys are missing.
- Updated documentation and README to reflect v0.4.1 status.
What's Changed
- feat: upgrade LLM adapters to support Gemini 3, Claude 4.5, and GPT-5… by @hashangit in #31
Full Changelog: v0.4.0-beta...v0.4.1-beta
v0.4.0 Release - Core PES Overhaul & React Chat Showcase
Pull Request: v0.4.0 Release - Core PES Overhaul & React Chat Showcase
🎯 Overview
This release marks a significant milestone in the ART Framework's evolution, moving from basic reasoning to a state-aware Plan-Execute-Synthesize (PES) model. It also introduces a flagship React Chat Application to demonstrate the framework's real-time capabilities.
🧱 Framework Core Changes
PES Agent Model (v2)
We have overhauled the agent's internal reasoning loop to support complex, multi-step tasks:
- State-Driven Tasks: Replaced simple history with a `todoList` architecture. Each task/sub-task now has a trackable lifecycle (Pending, In Progress, Completed).
- Self-Correcting Plans: The agent can now re-evaluate its intent and modify its own plan mid-execution based on tool outputs.
- Deep Reasoning Extraction: High-fidelity extraction of "thoughts" and "decisions" from LLM output, enabling transparent reasoning traces.
Gemini Adapter & Resilience
- Auto-Retry Mechanism: Integrated `withRetry` logic for all core LLM generating calls, significantly improving reliability in production environments.
- Advanced Stream Handling: Optimized memory and data extraction for Gemini streams, specifically ensuring metadata (usage tokens, finish reasons) is captured accurately at the end of streams.
- Improved Provider Configuration: Support for `baseOptions` in provider registration, allowing developers to set global defaults (like API keys) while allowing per-thread overrides.
🚀 Featured: The ART Chat App
Located in examples/chat-app, this new showcase provides a blueprint for building agentic UIs:
- Reactive State Monitor: Uses ART's `ObservationSocket` to live-update the UI as the agent progresses through its plan.
- Reasoning Visualization: A dedicated "Live Thoughts" console shows character-by-character reasoning tokens as they stream from the model.
- Local-First Storage: Demonstrates seamless integration with IndexedDB for persisting agent state and conversation history.
🛠 Developer Experience (DX) Updates
- Standardized Roo Rules: Added a suite of rule sets (`.roo/`) to guide agentic assistants in architecting, coding, and debugging within the framework.
- New Concept Docs: Published `observations.md` to formalize how developers can monitor and respond to agent state changes.
- Codebase Sanitization: Removed legacy systems including `PromptManager` and `WARP.md` in favor of more efficient, streamlined alternatives.
🚢 Release Checklist
- All core interfaces updated and exported in `src/index.ts`.
- `CHANGELOG.md` updated with v0.4.0 details.
- Example app verified through build and manual testing.
- Documentation accuracy check for PES and Observations.
Version: 0.4.0
Target Branch: main
Primary Focus: State Management & Developer Visibility