Confirm this is an issue with the Python library and not an underlying OpenAI API
Describe the bug
When a streaming chunk contains two tool_calls entries with the same index, accumulate_delta stores them as two separate list elements instead of merging them. Observed in practice against vLLM with speculative decoding enabled. Subsequent chunks then merge into element [0] only, leaving element [1] orphaned. The reconstructed tool_calls[0].function.arguments is missing whichever fragment landed on [1] and is no longer parseable as JSON.
The OpenAPI spec permits this shape. tool_calls is an unconstrained array, and ChatCompletionMessageToolCallChunk only requires index:
ChatCompletionStreamResponseDelta:
  properties:
    tool_calls:
      type: array
      items:
        $ref: "#/components/schemas/ChatCompletionMessageToolCallChunk"
See also #3201, which is the same issue with a narrower scope. (I can move this to a comment there if needed.)
The cause is two early-exit branches (each ends in continue) at the top of the per-key loop in _deltas.py:
for key, delta_value in delta.items():
    if key not in acc:  # L8-10: key absent
        acc[key] = delta_value
        continue

    acc_value = acc[key]
    if acc_value is None:  # L13-15: value is None
        acc[key] = delta_value
        continue
Both assign the delta list wholesale and skip the index merge logic that handles duplicate indexes in steady state. So any duplicate-index entries in the very first chunk slip through unmerged.
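One possible direction for a fix is to coalesce entries that share an index before the list is assigned wholesale. The following is only a sketch under simplified merge rules (string fragments concatenate, nested dicts recurse, everything else is overwritten). merge_entry and coalesce_by_index are hypothetical names, not functions in the SDK, and the real merge rules in _deltas.py may differ.

```python
from __future__ import annotations

import copy
from typing import Any


def merge_entry(acc: dict[str, Any], delta: dict[str, Any]) -> dict[str, Any]:
    """Merge one delta entry into an accumulated entry (simplified rules)."""
    for key, value in delta.items():
        current = acc.get(key)
        if current is None or key in ("index", "type"):
            # Absent/None values and identity-like fields take the incoming value.
            acc[key] = copy.deepcopy(value)
        elif isinstance(current, str) and isinstance(value, str):
            # String fragments (e.g. function.arguments) are concatenated.
            acc[key] = current + value
        elif isinstance(current, dict) and isinstance(value, dict):
            # Nested objects (e.g. function) are merged recursively.
            merge_entry(current, value)
        else:
            acc[key] = copy.deepcopy(value)
    return acc


def coalesce_by_index(entries: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Collapse entries sharing an `index` into one, in first-seen order."""
    merged: dict[int, dict[str, Any]] = {}
    for entry in entries:
        index = entry["index"]
        if index in merged:
            merge_entry(merged[index], entry)
        else:
            # Copy so the caller's delta objects are never mutated.
            merged[index] = copy.deepcopy(entry)
    return list(merged.values())


first_chunk = [
    {"index": 0, "id": "call_abc", "function": {"name": "get_weather"}, "type": "function"},
    {"index": 0, "function": {"arguments": '{"city"'}},
]
coalesced = coalesce_by_index(first_chunk)
print(coalesced)
```

Applied to the delta list in both early-exit branches, something along these lines would leave a single index=0 entry for later chunks to merge into.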
_assistants.py has identical merge logic with the same bug. _convert_initial_chunk_into_snapshot in _completions.py uses "message": choice.delta.to_dict() to seed the snapshot, which copies the raw delta array without an index merge. So the bug also triggers on the very first chunk of a stream.
In the Node SDK (openai@6.36.0), ChatCompletionStream.ts (lines 316-337) doesn't have the short circuit. It always loops and dispatches by index, and produces the correct result on this input. AssistantStream.accumulateDelta (lines 213-270) has the same fast-path logic and the same bug.
To Reproduce
import json

from openai.lib.streaming._deltas import accumulate_delta

acc = {"tool_calls": None}

# Simulates a first chunk with two entries at index=0
# (e.g. tool name + first argument fragment in one SSE event)
accumulate_delta(acc, {"tool_calls": [
    {"index": 0, "id": "call_abc", "function": {"name": "get_weather"}, "type": "function"},
    {"index": 0, "function": {"arguments": '{"city"'}},
]})

accumulate_delta(acc, {"tool_calls": [
    {"index": 0, "function": {"arguments": ': "London"}'}},
]})

args = acc["tool_calls"][0]["function"]["arguments"]
try:
    json.loads(args)
except json.JSONDecodeError as e:
    print("arguments not parseable:", e)
    print("orphan entry:", acc["tool_calls"][1])
Expected: arguments == '{"city": "London"}', parses to {"city": "London"}.
Actual: arguments == ': "London"}'. The '{"city"' prefix ended up on acc["tool_calls"][1].
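For anyone hitting this before a fix lands, the orphaned state can at least be detected after accumulation. A minimal sketch: duplicate_indexes is a hypothetical helper (not part of the SDK), and the accumulated literal below is hand-written to mirror the "Actual" state described above rather than captured from a live run.

```python
from __future__ import annotations

from collections import Counter
from typing import Any


def duplicate_indexes(tool_calls: list[dict[str, Any]]) -> list[int]:
    """Return every `index` value that appears on more than one entry."""
    counts = Counter(entry["index"] for entry in tool_calls)
    return sorted(index for index, seen in counts.items() if seen > 1)


# Hand-written copy of the accumulated state from the repro above.
accumulated = [
    {
        "index": 0,
        "id": "call_abc",
        "function": {"name": "get_weather", "arguments": ': "London"}'},
        "type": "function",
    },
    {"index": 0, "function": {"arguments": '{"city"'}},
]

dupes = duplicate_indexes(accumulated)
if dupes:
    print("tool_calls entries were not merged for index:", dupes)
```

A non-empty result means at least one fragment was stranded on a separate list element, so the arguments string on the first element for that index should not be trusted.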
Code snippets
OS
macOS
Python version
Python v3.14.2
Library version
openai v2.35.0