OpenAI Harmony tokens (to=functions.X, channel markers) leaking into delta.content on openai/gpt-5.4 via Inference #1632

### Describe the bug

I'm running a Node.js voice agent on @livekit/agents@1.4.4. Pipeline: inference.STT (Deepgram nova-3) → inference.LLM (openai/gpt-5.4) → inference.TTS (ElevenLabs). Two function_tools are registered, everything else is left at defaults, and preemptiveGeneration.enabled is set to true.


Occasionally, raw tool-call protocol text leaks into my TTS output and the persisted transcript. Verbatim sample:

```
  to=functions.saveAnswer  天天中彩票能json
  {"question":"Who are the most important people in your
  life?","questionId":"3c6bb71f-f7e7-4c75-acb9-d9d5e4a1c563","value":"Wife Patty and their thirteen
  children"}цҳауеит
  Patty and thirteen kids — well, that's a full house. What kind of work, roles, or responsibilities
  have kept you busy over the years?
```

The structured tool call still fires correctly; the leaked text streams in parallel. This seems similar to #1339.




### Relevant log output

N/A

### Describe your environment

This is occurring on LiveKit Cloud.

- @livekit/agents: ^1.4.1
- @livekit/agents-plugin-livekit: ^1.4.1
- @livekit/agents-plugin-silero: ^1.4.1
- @livekit/noise-cancellation-node: ^0.1.9
- @livekit/rtc-node: ^0.13.27

### Minimal reproducible example

I don't have reproduction steps because this has been notoriously difficult to replicate, but I've walked the codebase with Claude Code and generated the hypothesis below.

 Hypothesis

  > The hypothesis and source-inspection findings below were generated by walking the SDK with Claude
  > Code. I haven't independently verified every line citation, so I'm flagging this in case I've
  > misread something.

  This looks like an OpenAI Harmony response format leak:

  - to=functions.saveAnswer is Harmony's tool-call recipient syntax.
  - The garbled Unicode bracketing the payload (天天中彩票能json, цҳауеит) matches the byte signature of
   Harmony control tokens (<|channel|>commentary, <|constrain|>json, <|message|>, <|end|>) being decoded
   as raw text rather than consumed as special tokens.
  - Pattern: to=<recipient> <channel-tokens> <json-args> <end-tokens> <final-channel-text>.

  Reads as the Inference gateway routing Harmony channel content through delta.content instead of
  consuming channel markers and emitting tool calls only via delta.tool_calls.
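
As a stopgap on the application side, the pattern above can be detected and cut out of a finished transcript. This is only a minimal sketch built from the single sample in this report: `stripHarmonyLeak`, `endOfJsonObject`, `RECIPIENT`, and `StripResult` are hypothetical names, not part of @livekit/agents, and it assumes the leaked arguments are one JSON object and that the trailing token debris stays on the line of the closing brace.

```typescript
// Hypothetical helper, not part of @livekit/agents.
// Matches Harmony's tool-call recipient syntax, e.g. "to=functions.saveAnswer".
const RECIPIENT = /to=functions\.([A-Za-z_$][\w$]*)/;

// Returns the index just past the JSON object that opens at `start`,
// or -1 if the object never closes. Braces inside string literals are ignored.
function endOfJsonObject(text: string, start: number): number {
  let depth = 0;
  let inString = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (ch === '\\') i++; // skip the escaped character
      else if (ch === '"') inString = false;
    } else if (ch === '"') {
      inString = true;
    } else if (ch === '{') {
      depth++;
    } else if (ch === '}') {
      depth--;
      if (depth === 0) return i + 1;
    }
  }
  return -1;
}

interface StripResult {
  text: string; // input with the leaked segment removed
  recipient?: string; // function name found after "to=functions.", if any
}

function stripHarmonyLeak(input: string): StripResult {
  const match = RECIPIENT.exec(input);
  if (!match) return { text: input };

  const open = input.indexOf('{', match.index);
  const close = open === -1 ? -1 : endOfJsonObject(input, open);
  // Assumption: if the JSON payload is incomplete, leave the text untouched.
  if (close === -1) return { text: input, recipient: match[1] };

  // Assumption: token debris after the payload ends at the next newline.
  const newline = input.indexOf('\n', close);
  const resume = newline === -1 ? input.length : newline + 1;
  return {
    text: (input.slice(0, match.index) + input.slice(resume)).trim(),
    recipient: match[1],
  };
}
```

Run against a shortened version of the sample, this keeps only the final spoken sentence and reports `saveAnswer` as the recipient. It works on complete text, so it could clean the persisted transcript but would not stop the tokens from reaching TTS while streaming.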

  Findings from source inspection

  - No Harmony-aware decoding in @livekit/agents@1.4.4 — grep for harmony, to=functions, <|channel|>,
  <|message|> in src/ returned nothing.
  - src/inference/llm.ts parseChoice (~L615–627): delta.content falls through to emit a text chunk even
  while a tool call is mid-stream (this.toolCallId !== undefined).
  - src/voice/generation.ts (~L537–563): delta.toolCalls and delta.content are independent branches;
  delta.content is written to textWriter (→ tts_node) and appended to data.generatedText.
  - Default ttsTextTransforms (filter_markdown, filter_emoji) don't touch Harmony tokens.
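
To illustrate the second and third findings, here is a rough sketch of what gating content on an open tool call could look like. The `Delta` and `ToolCallDelta` shapes and the `ContentGate` class are hypothetical stand-ins written for this issue, not the SDK's real types, and the rule "suppress all content while a tool call is open" is my assumption about a possible mitigation, not a known fix.

```typescript
// Hypothetical shapes for illustration; these are not the SDK's real types.
interface ToolCallDelta {
  id?: string;
  name?: string;
  arguments?: string;
}

interface Delta {
  content?: string;
  toolCalls?: ToolCallDelta[];
}

// Hypothetical gate: holds back text that arrives while a tool call is mid-stream.
class ContentGate {
  private toolCallOpen = false;
  private suppressed = '';

  // Returns the text to forward to TTS for this delta ('' if nothing should be spoken).
  push(delta: Delta): string {
    if (delta.toolCalls && delta.toolCalls.length > 0) {
      this.toolCallOpen = true;
    }
    if (!delta.content) return '';
    if (this.toolCallOpen) {
      // Keep the text for logging instead of speaking or persisting it.
      this.suppressed += delta.content;
      return '';
    }
    return delta.content;
  }

  // Call once the tool call is complete; returns whatever was held back.
  closeToolCall(): string {
    const held = this.suppressed;
    this.suppressed = '';
    this.toolCallOpen = false;
    return held;
  }
}
```

The trade-off is that any legitimate text the model emits alongside a tool call would also be held back, so this may be too blunt if the fix really belongs in the Inference gateway.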


### Additional information

LiveKit Cloud URL of an impacted session: https://cloud.livekit.io/projects/p_5vda1ybjh9k/sessions/RM_pfaxxUtAmJiY
