[FEATURE] Support provider-specific metadata in message content block types #2014

@pgrayy

Description

Problem Statement

Model providers like the OpenAI Responses API require provider-specific metadata to be round-tripped with message content across turns. For example, the Responses API expects reasoning to be sent back as a top-level input item with an id, structured summary list, and optionally encrypted_content. These fields are available during streaming (via response.output_item.done) but have no place to live in the current ReasoningContentBlock type, which was designed around the Bedrock API format.

Currently, OpenAIResponsesModel filters reasoning content from requests on subsequent turns (#2013) because the required metadata cannot be stored alongside the message. This means reasoning context is lost in multi-turn conversations, reducing model performance for reasoning models on the Responses API.

Storing this metadata in model_state was considered but rejected because model_state has a different lifecycle than messages: it overwrites on session save rather than appending, causing it to fall out of sync with messages after conversation manager trimming or session resume.

Proposed Solution

Provide a mechanism for model providers to attach provider-specific metadata to message content blocks. This could take the form of:

  • An optional extensible field (e.g., providerMetadata: dict[str, Any]) on ContentBlock or ReasoningContentBlock
  • Additional optional fields on ReasoningContentBlock for the OpenAI-specific data (id, summary, encryptedContent)

The metadata would travel with the message, getting trimmed by conversation managers and persisted/restored with sessions automatically.
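A minimal sketch of the first option, assuming a `TypedDict`-style content block like the existing message types; all names here (`ReasoningTextBlock`, the `providerMetadata` key and its contents) are illustrative, not the SDK's actual definitions:

```python
from typing import Any, TypedDict


class ReasoningTextBlock(TypedDict, total=False):
    # Illustrative stand-in for the existing reasoning text shape.
    text: str
    signature: str


class ReasoningContentBlock(TypedDict, total=False):
    reasoningText: ReasoningTextBlock
    redactedContent: bytes
    # Proposed: opaque bag for provider-specific round-trip data.
    providerMetadata: dict[str, Any]


# A provider (e.g. OpenAIResponsesModel) could stash what it needs here
# during streaming; the values below are placeholders.
block: ReasoningContentBlock = {
    "reasoningText": {"text": "step-by-step thoughts..."},
    "providerMetadata": {
        "id": "rs_123",
        "summary": [{"type": "summary_text", "text": "..."}],
        "encryptedContent": None,
    },
}
```

Because the bag lives on the content block itself, conversation-manager trimming and session persistence handle it with no extra machinery.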

Use Case

When using OpenAIResponsesModel with reasoning models (gpt-oss, o-series), the model produces reasoning content that should be passed back on subsequent turns for better multi-turn reasoning performance. The Responses API requires the reasoning item id and summary to accept it. With provider-specific metadata support, OpenAIResponsesModel could capture this data during streaming and reconstruct the proper reasoning input item when formatting the next request.
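The capture/reconstruct flow could look roughly like the following; both helper names are hypothetical, and the payload shapes assume the `response.output_item.done` reasoning item carries `id`, `summary`, and `encrypted_content` as described above:

```python
from typing import Any


def capture_reasoning_metadata(output_item: dict[str, Any]) -> dict[str, Any]:
    """Sketch: pull round-trip fields from a response.output_item.done payload."""
    return {
        "id": output_item["id"],
        "summary": output_item.get("summary", []),
        "encrypted_content": output_item.get("encrypted_content"),
    }


def format_reasoning_input_item(metadata: dict[str, Any]) -> dict[str, Any]:
    """Sketch: rebuild the Responses API reasoning input item for the next turn."""
    item: dict[str, Any] = {
        "type": "reasoning",
        "id": metadata["id"],
        "summary": metadata["summary"],
    }
    # encrypted_content is optional; only send it back if we captured it.
    if metadata.get("encrypted_content"):
        item["encrypted_content"] = metadata["encrypted_content"]
    return item
```

The first helper would run while streaming (storing its result in the proposed metadata field); the second would run in request formatting, replacing the current filter-and-warn behavior from #2013.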

Alternative Solutions

  • Store metadata in model_state: rejected due to lifecycle mismatch with messages (overwrites vs appends in session persistence).
  • Add OpenAI-specific fields directly to ReasoningContentBlock: works but adds provider-specific concerns to a shared type. There is precedent (redactedContent is Bedrock-specific), but a more general mechanism would be cleaner.

Additional Context

Related PR: #2013 (filter + warn fix for the immediate crash)

The OpenAI Responses API ResponseReasoningItemParam requires these fields:

  • id (Required): reasoning item ID (e.g., rs_xxx)
  • summary (Required): list of {"type": "summary_text", "text": "..."}
  • type (Required): always "reasoning"
  • content (Optional): list of {"type": "reasoning_text", "text": "..."}
  • encrypted_content (Optional): opaque encrypted reasoning tokens for stateless multi-turn
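For illustration, a payload matching the field list above might look like this (the id and text values are placeholders):

```python
# Illustrative ResponseReasoningItemParam-shaped dict; values are placeholders.
reasoning_item = {
    "type": "reasoning",  # Required: always "reasoning"
    "id": "rs_xxx",       # Required: reasoning item ID
    "summary": [          # Required: list of summary_text entries
        {"type": "summary_text", "text": "Planned the change in two steps."}
    ],
    "content": [          # Optional: raw reasoning text
        {"type": "reasoning_text", "text": "First, inspect the stream events..."}
    ],
    "encrypted_content": None,  # Optional: opaque tokens for stateless multi-turn
}
```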

Labels: enhancement (New feature or request)