Releases: InferQ/ART

v0.4.11: The "Unbreakable State" & Observability Update

27 Dec 19:09

🚀 ART Framework v0.4.11: The "Unbreakable State" & Observability Update

Date: December 28, 2025

We are closing out the year with a massive focus on robustness and consistency. Version 0.4.11 is all about ensuring that your long-running agents—specifically those involving human intervention (HITL) and multi-agent delegation (A2A)—never lose their train of thought, even in edge-case scenarios.

We’ve also unified our observability model. Whether your agent is Planning, Executing, or Synthesizing, the stream of "Thoughts" now looks and behaves exactly the same.


🛡️ Critical Resilience & Data Integrity

We identified specific scenarios where complex agent workflows could drop data. We've plugged those gaps to ensure enterprise-grade reliability.

  • HITL Partial Result Preservation: Previously, if a batch of parallel tool calls contained one tool that required human approval (suspension), the results of the successful tools in that same batch could be lost. Fixed. We now persist partialToolResults immediately, so when the human approves the action, the agent resumes with the full context of what already happened.
  • A2A Crash Recovery: Agent-to-Agent delegation is powerful but complex. We’ve added pendingA2ATasks to the persistent state. If the host process crashes while waiting for a sub-agent, it can now recover and pick up exactly where it left off during the polling phase.
  • Smarter "Think" Parsing: We’ve hardened our OutputParser. If an LLM opens a <think> tag but forgets to close it (a common issue with reasoning models), we now treat the content as valid output rather than discarding it. No more silent failures on long chains of thought.
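To illustrate the lenient parsing idea from the last bullet, here is a minimal sketch of a parser that tolerates an unclosed <think> tag. The names parseThinkTags and ParsedOutput are hypothetical and are not ART's OutputParser API. The sketch also assumes the unclosed remainder is kept as thought text; the notes above do not say whether the framework routes it to thoughts or to the final content.

```typescript
// Hypothetical helper for illustration; not ART's actual OutputParser.
interface ParsedOutput {
  thoughts: string; // text found inside <think> blocks
  content: string;  // text found outside <think> blocks
}

function parseThinkTags(raw: string): ParsedOutput {
  const OPEN = '<think>';
  const CLOSE = '</think>';
  const thoughts: string[] = [];
  const content: string[] = [];
  let cursor = 0;

  while (cursor < raw.length) {
    const open = raw.indexOf(OPEN, cursor);
    if (open === -1) {
      // No more think blocks: the rest is regular content.
      content.push(raw.slice(cursor));
      break;
    }
    content.push(raw.slice(cursor, open));
    const bodyStart = open + OPEN.length;
    const close = raw.indexOf(CLOSE, bodyStart);
    if (close === -1) {
      // Unclosed tag: keep the remainder instead of discarding it
      // (assumption: it is kept as thought text).
      thoughts.push(raw.slice(bodyStart));
      break;
    }
    thoughts.push(raw.slice(bodyStart, close));
    cursor = close + CLOSE.length;
  }

  return {
    thoughts: thoughts.join('\n').trim(),
    content: content.join('').trim(),
  };
}
```

With this approach, a response that ends mid-thought still yields its reasoning text rather than an empty result.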

🧠 Enhanced Reasoning Context

Your agents are only as smart as the context we feed them. We’ve significantly upgraded what the agent "sees" during execution.

  • Full History Visibility: The execution prompt now includes ALL tool results from previous steps, not just the immediately preceding one. The agent can correlate data from step 1 with step 5 directly instead of guessing at earlier results.
  • Execution Summary Memory: We now persist a structured summary of completed steps to the ConversationManager. This means follow-up queries (after the agent finishes) have full awareness of exactly what actions were taken.
  • Preserving Tool Metadata: We fixed an issue where tool_call_id was sometimes stripped during history formatting. This ensures perfect compatibility with strict provider requirements (like OpenAI/Anthropic) when replaying history.
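As a rough sketch of what "full history with preserved tool metadata" can look like, the helper below converts every recorded tool result into an OpenAI-style tool message and carries the call ID through unchanged. ToolResultRecord, ToolMessage, and toToolMessages are hypothetical names for illustration; ART's internal types may differ.

```typescript
// Hypothetical shapes for illustration; ART's real types may differ.
interface ToolResultRecord {
  stepId: string;
  toolName: string;
  callId: string;
  output: unknown;
}

// OpenAI-style tool message: the tool_call_id must match the original call.
interface ToolMessage {
  role: 'tool';
  tool_call_id: string;
  content: string;
}

// Convert every recorded result (not just the latest) into provider messages.
function toToolMessages(results: ToolResultRecord[]): ToolMessage[] {
  return results.map((r) => ({
    role: 'tool' as const,
    tool_call_id: r.callId, // never dropped or regenerated
    content:
      typeof r.output === 'string' ? r.output : JSON.stringify(r.output ?? null),
  }));
}
```

Strict providers reject a replayed tool message whose ID does not match an earlier tool call, which is why the ID is copied verbatim here.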

⚠️ Breaking Change: Unified Observability Standard

We have standardized how the Thinking/Reasoning process is observed across the entire lifecycle of the PES (Plan-Execute-Synthesize) Agent.

Why? Previously, "Planning" thoughts looked different from "Execution" thoughts in the socket stream. This made building UIs difficult.

The Change:
All phases now emit consistent THOUGHTS observations.

  • New Metadata: Every thought observation now includes phase ('planning' | 'execution' | 'synthesis').
  • Execution Specifics: Execution thoughts now include stepId and stepDescription automatically.
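For UI code, the new metadata could be modeled roughly as follows. The field names phase, stepId, and stepDescription come from the notes above; the surrounding observation shape (type, content, metadata) and the thoughtLabel helper are assumptions made for this sketch.

```typescript
// phase, stepId, and stepDescription are named in the release notes;
// the rest of this shape is assumed for illustration.
type ThoughtPhase = 'planning' | 'execution' | 'synthesis';

interface ThoughtObservation {
  type: 'THOUGHTS';
  content: string;
  metadata: {
    phase: ThoughtPhase;
    stepId?: string;          // expected on execution thoughts
    stepDescription?: string; // expected on execution thoughts
  };
}

// Build a short label a UI could show next to each streamed thought.
function thoughtLabel(obs: ThoughtObservation): string {
  const { phase, stepId, stepDescription } = obs.metadata;
  if (phase === 'execution' && stepId) {
    return `execution/${stepId}: ${stepDescription ?? ''}`.trim();
  }
  return phase;
}
```

Because every phase shares one shape, a single renderer can handle planning, execution, and synthesis thoughts.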

Migration Guide:
If you are listening to socket events, you need to update your enumerated types.

Old (Deprecated)             | New (Standardized)
-----------------------------|------------------------------------------------
AGENT_THOUGHT_LLM_THINKING   | PLANNING_LLM_THINKING or EXECUTION_LLM_THINKING
FINAL_SYNTHESIS_LLM_RESPONSE | SYNTHESIS_LLM_RESPONSE
AGENT_THOUGHT (Context)      | PLANNING_THOUGHTS or EXECUTION_THOUGHTS
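If your listener code has the old names hard-coded, a small translation helper can ease the transition. The event names below are taken from the migration guide; migrateEventName is a hypothetical helper, and it assumes the caller knows which phase the old event belonged to, since two of the old names split by phase.

```typescript
// Event names come from the migration guide; the helper is hypothetical.
type LegacyEvent =
  | 'AGENT_THOUGHT_LLM_THINKING'
  | 'FINAL_SYNTHESIS_LLM_RESPONSE'
  | 'AGENT_THOUGHT';

type SplitPhase = 'planning' | 'execution';

// Map a deprecated event name to its standardized replacement.
// The phase is only consulted for names that were split in two.
function migrateEventName(legacy: LegacyEvent, phase: SplitPhase): string {
  switch (legacy) {
    case 'AGENT_THOUGHT_LLM_THINKING':
      return phase === 'planning'
        ? 'PLANNING_LLM_THINKING'
        : 'EXECUTION_LLM_THINKING';
    case 'FINAL_SYNTHESIS_LLM_RESPONSE':
      return 'SYNTHESIS_LLM_RESPONSE';
    case 'AGENT_THOUGHT':
      return phase === 'planning' ? 'PLANNING_THOUGHTS' : 'EXECUTION_THOUGHTS';
  }
}
```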

🔧 Developer Experience & Quality of Life

  • Drastically Reduced Truncation: We raised the default safety limit for JSON stringification from 200 characters to 10,000 characters. You will no longer see useful debugging data arbitrarily cut off in the logs.
  • Flexible Tool Outputs: We discovered that some custom tools return { data: ... } while others return { output: ... } or { result: ... }. The framework now intelligently scans for all three properties, ensuring we capture the tool's actual return value regardless of how you built it.
  • New Documentation: Check out docs/concepts/interface-contracts.md for a definitive guide on IToolExecutor and ToolResult.
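The two behaviors above (scanning data, output, and result, and the larger stringification limit) can be sketched as follows. Both helpers are hypothetical, and the lookup order data, then output, then result is an assumption; the notes only say that all three properties are checked.

```typescript
// Hypothetical helpers; the precedence data > output > result is assumed.
const RESULT_KEYS: string[] = ['data', 'output', 'result'];

// Return the tool's payload regardless of which wrapper key it used.
function extractToolValue(raw: unknown): unknown {
  if (raw !== null && typeof raw === 'object') {
    const record = raw as Record<string, unknown>;
    for (const key of RESULT_KEYS) {
      if (key in record && record[key] !== undefined) {
        return record[key];
      }
    }
  }
  return raw; // primitives and unrecognized shapes pass through unchanged
}

// Stringify for logs, truncating only past maxLength characters.
function stringifyForLog(value: unknown, maxLength = 10000): string {
  let text: string;
  try {
    text = JSON.stringify(value) ?? String(value);
  } catch {
    text = String(value); // e.g. circular structures
  }
  return text.length > maxLength
    ? text.slice(0, maxLength) + '...[truncated]'
    : text;
}
```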

Thank you for building with ART. Please check the CHANGELOG.md for the raw commit history.

What's Changed

  • fix: Security hardening and PESAgent result handling by @hashangit in #40

Full Changelog: v0.4.6...v0.4.11

v0.4.6: The "Trust & Control" Update

26 Dec 16:23
d06039b

🚀 v0.4.6: The "Trust & Control" Update

We are thrilled to introduce ART Framework v0.4.6, a release focused on two critical pillars of autonomous agents: Reliability and Human Control.

As agents become more autonomous, the gap between what they plan to do and what they actually do becomes a major reliability risk. Simultaneously, giving users the power to intervene safely is paramount. This release bridges both gaps.


🌟 Key Highlights

1. Tool-Aware Execution Framework (TAEF)

"Plans are nothing; planning is everything." - Eisenhower (now applied to AI)

We've fundamentally upgraded the agent's brain to ensure it walks the talk.

  • The Problem: Previously, agents might plan to "check the weather" but then hallucinate the result without actually calling the weather tool.
  • The Fix: With TAEF, the planning phase now explicitly tags steps as tool or reasoning.
  • The Result: If an agent skips a required tool call, the framework automatically intervenes, forcing a retry. This guarantees that critical actions are actually performed, not just hallucinated.

2. Human-in-the-Loop (HITL) V2

Seamless collaboration between Human and AI.

We've hardened the "Blocking Tools" system to make it production-ready.

  • Resilience: Agents can now robustly suspend execution to wait for your input, surviving page refreshes and long delays.
  • Smart Rejection: If you say "No" to an action, the agent now understands this is a directive to find another way, rather than stubbornly trying the same thing again.
  • Developer Experience: New APIs like checkForSuspendedState make building persistent, stateful agent UIs significantly easier.

🛠️ Technical Improvements

  • New TodoItem Schema: Added stepType, requiredTools, and expectedOutcome for granular execution control.
  • Strict Validation Mode: Agents now default to 'strict' mode for tool steps, ensuring high-fidelity execution.
  • Infinite Loop Protection: Added MAX_VALIDATION_RETRIES to prevent agents from getting stuck in validation cycles.
  • Enhanced Testing: Added a suite of 24+ new tests covering complex validation and suspension scenarios.
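A minimal sketch of how strict validation with a retry cap could work. The field names stepType, requiredTools, and expectedOutcome and the constant name MAX_VALIDATION_RETRIES come from the notes above; the value 2, the extra fields, and both functions are assumptions for illustration, not the framework's implementation.

```typescript
// stepType, requiredTools, expectedOutcome are named in the release notes;
// id and description are assumed.
interface TodoItem {
  id: string;
  description: string;
  stepType: 'tool' | 'reasoning';
  requiredTools?: string[];
  expectedOutcome?: string;
}

const MAX_VALIDATION_RETRIES = 2; // assumed value; the real default is not stated

// Required tools that were never called during an attempt.
function missingTools(step: TodoItem, calledTools: string[]): string[] {
  if (step.stepType !== 'tool') return [];
  return (step.requiredTools ?? []).filter((t) => calledTools.indexOf(t) === -1);
}

// runAttempt is caller-supplied: it executes the step once and reports
// which tools were actually invoked.
function executeWithValidation(
  step: TodoItem,
  runAttempt: (attempt: number) => string[],
): { ok: boolean; attempts: number } {
  for (let attempt = 0; attempt <= MAX_VALIDATION_RETRIES; attempt++) {
    const called = runAttempt(attempt);
    if (missingTools(step, called).length === 0) {
      return { ok: true, attempts: attempt + 1 };
    }
  }
  // Retry budget exhausted: stop instead of looping forever.
  return { ok: false, attempts: MAX_VALIDATION_RETRIES + 1 };
}
```

Reasoning steps pass validation immediately, while a tool step that never calls its required tool fails after the capped number of attempts.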

📦 Upgrading

npm install art-framework@0.4.6

No breaking changes for existing agents, but we recommend checking out the new TAEF Documentation to leverage the new strict validation capabilities.


Built with ❤️ by the ART Framework Team

v0.4.5-beta Release - Added Groq Adapter, improved reliability of the Todo List and observation parsing for follow-up queries with the new PES agent, and some bug fixes

24 Dec 07:45

🚀 ART Framework v0.4.5

This release marks a significant milestone in inference speed for the ART Framework with the introduction of the Groq Adapter, alongside a series of major enhancements to our documentation and marketing experience.

⚡️ Groq Integration (LPU™ Powered)

We've added a native GroqAdapter to leverage Groq's ultra-fast inference speeds.

  • Ultra-Fast Inference: Support for Groq's LPU-powered models (Llama 3.3, Mixtral, etc.) with sub-500ms response times.
  • Advanced Streaming: Full support for real-time token streaming and reasoning visualization.
  • Tool-Calling: Seamless integration with ART's tool-use system in an OpenAI-compatible format.
  • Proven Performance: Integration tests verified sub-400ms time-to-first-token.

✨ Highlights from v0.4.4 (Beta)

Included in this release cycle:

  • Cinematic Marketing Site: A total redesign of the PESAgent Workflow section using high-tech "scrollytelling" animations.
  • Embedded Docs Viewer: Read concept guides and how-to documentation directly within the marketing site.
  • Enhanced OpenRouter Support: Better compatibility with legacy reasoning models and improved token streaming logic.
  • DX Fixes: Optimized GitHub Pages routing for SPA support and added safeStringify utility for more robust observation logging.

📦 Getting Started

To use the new Groq adapter, simply update your ART installation and configure your provider:

import { GroqAdapter } from 'art-framework';

const adapter = new GroqAdapter({
  apiKey: process.env.GROQ_API_KEY!,
  model: 'llama-3.3-70b-versatile',
});

Full Changelog: v0.4.0...v0.4.5

v0.4.1 Release - LLM adapters upgraded to the latest versions (Gemini, OpenAI, Anthropic)

18 Dec 11:16
6881a9b

[0.4.1] - 2025-12-18

🤖 Next-Gen LLM Adapter Upgrades

  • Gemini 3 Family Support:
    • Full support for Gemini 3 Pro, Flash, and Deep Think models.
    • Implemented native systemInstruction support for improved behavioral steering.
    • Added thinkingLevel control (minimal, low, medium, high) to balance reasoning depth and latency.
    • Updated default model to gemini-3-flash.
  • Claude 4.5 Family Support:
    • Support for Claude 4.5 Opus, Sonnet, and Haiku.
    • Maintained and optimized support for thinking tokens and reasoning blocks.
    • Updated default model to claude-4.5-sonnet.
  • GPT-5 Family Support:
    • Full support for GPT-5, GPT-5.1, and GPT-5.2 (including Instant, Thinking, and Pro variants).
    • Updated default model to gpt-5.2-instant.
  • SDK Dependencies: Upgraded @google/genai, @anthropic-ai/sdk, and openai to their latest December 2025 versions.

🛠 Refactors & Maintenance

  • Improved integration test resilience by ensuring tests skip correctly when API keys are missing.
  • Updated documentation and README to reflect v0.4.1 status.

What's Changed

  • feat: upgrade LLM adapters to support Gemini 3, Claude 4.5, and GPT-5… by @hashangit in #31

Full Changelog: v0.4.0-beta...v0.4.1-beta

v0.4.0 Release - Core PES Overhaul & React Chat Showcase

18 Dec 06:50
73afb4e

Pull Request: v0.4.0 Release - Core PES Overhaul & React Chat Showcase

🎯 Overview

This release marks a significant milestone in the ART Framework's evolution, moving from basic reasoning to a state-aware Plan-Execute-Synthesize (PES) model. It also introduces a flagship React Chat Application to demonstrate the framework's real-time capabilities.


🧱 Framework Core Changes

PES Agent Model (v2)

We have overhauled the agent's internal reasoning loop to support complex, multi-step tasks:

  • State-Driven Tasks: Replaced simple history with a todoList architecture. Each task/sub-task now has a trackable lifecycle (Pending, In Progress, Completed).
  • Self-Correcting Plans: The agent can now re-evaluate its intent and modify its own plan mid-execution based on tool outputs.
  • Deep Reasoning Extraction: High-fidelity extraction of "thoughts" and "decisions" from LLM output, enabling transparent reasoning traces.
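The task lifecycle described above can be pictured as a small state machine. The status identifiers, the TrackedTodo shape, and the advance helper below are assumptions for this sketch and are not ART's actual todoList types.

```typescript
// Status names mirror the lifecycle described above; identifiers are assumed.
type TodoStatus = 'pending' | 'in_progress' | 'completed';

interface TrackedTodo {
  id: string;
  title: string;
  status: TodoStatus;
}

// Allowed forward transition for each status; completed is terminal.
const NEXT_STATUS: Record<TodoStatus, TodoStatus | null> = {
  pending: 'in_progress',
  in_progress: 'completed',
  completed: null,
};

// Returns a new item with the next status, leaving the input untouched.
function advance(todo: TrackedTodo): TrackedTodo {
  const next = NEXT_STATUS[todo.status];
  if (next === null) {
    throw new Error(`Todo ${todo.id} is already completed`);
  }
  return { ...todo, status: next };
}
```

Returning a new object on each transition keeps earlier snapshots intact, which is convenient when state is persisted between steps.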

Gemini Adapter & Resilience

  • Auto-Retry Mechanism: Integrated withRetry logic for all core LLM generating calls, significantly improving reliability in production environments.
  • Advanced Stream Handling: Optimized memory and data extraction for Gemini streams, specifically ensuring metadata (usage tokens, finish reasons) is captured accurately at the end of streams.
  • Improved Provider Configuration: Support for baseOptions in provider registration, allowing developers to set global defaults (like API keys) while allowing per-thread overrides.
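The "global defaults with per-thread overrides" idea behind baseOptions can be sketched as a simple merge. ProviderOptions and resolveOptions are hypothetical names, and the option fields shown are assumptions; only the name baseOptions and the override behavior come from the notes above.

```typescript
// Hypothetical option shape; only the override behavior is from the notes.
interface ProviderOptions {
  apiKey?: string;
  model?: string;
  temperature?: number;
}

// Per-thread values win; anything left unset falls back to the base defaults.
function resolveOptions(
  baseOptions: ProviderOptions,
  threadOverrides: ProviderOptions = {},
): ProviderOptions {
  return {
    apiKey: threadOverrides.apiKey ?? baseOptions.apiKey,
    model: threadOverrides.model ?? baseOptions.model,
    temperature: threadOverrides.temperature ?? baseOptions.temperature,
  };
}
```

Using ?? rather than || means an explicit temperature of 0 in a thread override is respected instead of being replaced by the default.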

🚀 Featured: The ART Chat App

Located in examples/chat-app, this new showcase provides a blueprint for building agentic UIs:

  • Reactive State Monitor: Uses ART's ObservationSocket to live-update the UI as the agent progresses through its plan.
  • Reasoning Visualization: A dedicated "Live Thoughts" console shows character-by-character reasoning tokens as they stream from the model.
  • Local-First Storage: Demonstrates seamless integration with IndexedDB for persisting agent state and conversation history.

🛠 Developer Experience (DX) Updates

  • Standardized Roo Rules: Added a suite of rule sets (.roo/) to guide agentic assistants in architecting, coding, and debugging within the framework.
  • New Concept Docs: Published observations.md to formalize how developers can monitor and respond to agent state changes.
  • Codebase Sanitization: Removed legacy systems including PromptManager and WARP.md in favor of more efficient, streamlined alternatives.

🚢 Release Checklist

  • All core interfaces updated and exported in src/index.ts.
  • CHANGELOG.md updated with v0.4.0 details.
  • Example app verified through build and manual testing.
  • Documentation accuracy check for PES and Observations.

Version: 0.4.0
Target Branch: main
Primary Focus: State Management & Developer Visibility