fix(deepseek): extract prompt_cache_hit_tokens and reasoning_tokens from usage by Lintume · Pull Request #1021 · prism-php/prism

Lintume · 2026-05-22T17:45:59Z

Description

`DeepSeek\Handlers\Text::addStep()` and `Stream::extractUsage()` build `Usage` from only `prompt_tokens` and `completion_tokens` in the API response, silently dropping two DeepSeek-specific fields:

- `usage.prompt_cache_hit_tokens`: the cached input portion of the prompt. DeepSeek offers a 98% discount on cache hits (their headline feature) and exposes this as a separate counter.
- `usage.completion_tokens_details.reasoning_tokens`: internal thinking tokens emitted by reasoning models (`deepseek-reasoner`, `deepseek-v4-flash` thinking mode).
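To make the two field paths concrete, here is a minimal sketch of reading them from a decoded `usage` array. The helper name `readDeepSeekExtras()` and the sample numbers are hypothetical and are not taken from the PR; only the two field paths come from the description above.

```php
<?php

declare(strict_types=1);

/**
 * Pull the two DeepSeek-specific counters out of a decoded `usage` array.
 * Hypothetical helper for illustration; not part of Prism.
 *
 * @param array<string, mixed> $usage
 * @return array{cacheHit: int, reasoning: int}
 */
function readDeepSeekExtras(array $usage): array
{
    return [
        // Top-level counter; default to 0 when the field is absent.
        'cacheHit' => (int) ($usage['prompt_cache_hit_tokens'] ?? 0),
        // Nested one level down; default to 0 when the field is absent.
        'reasoning' => (int) ($usage['completion_tokens_details']['reasoning_tokens'] ?? 0),
    ];
}

// Made-up numbers, shaped like the fields named above.
$usage = [
    'prompt_tokens' => 1200,
    'completion_tokens' => 300,
    'prompt_cache_hit_tokens' => 1000,
    'completion_tokens_details' => ['reasoning_tokens' => 180],
];

$extras = readDeepSeekExtras($usage);
```

The `??` operator keeps the lookup safe when either key is missing, so a response without these fields yields zeros rather than a warning.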

Impact

- Apps that compute cost from Prism's `Usage` bill the full `prompt_tokens` at the fresh rate, overstating real spend by roughly 3-5x once the prompt cache warms up.
- There is no signal from which to derive a `cacheHitRatio` for monitoring prompt-prefix stability.
- Reasoning-mode token consumption is invisible to observability tooling (Langfuse, custom dashboards, etc.).
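The overstatement can be checked with a little arithmetic. The sketch below assumes a hypothetical fresh rate of 1.0 price unit per token and takes the 98% cache discount from the description; the function names and token counts are illustrative only and do not reflect real DeepSeek pricing.

```php
<?php

declare(strict_types=1);

/**
 * Input cost when cache hits are billed at a discount.
 * $promptTokens is the raw API total, which includes the cache-hit portion.
 */
function inputCost(
    int $promptTokens,
    int $cacheHitTokens,
    float $freshRatePerToken,
    float $cacheDiscount = 0.98, // discount figure taken from the PR text
): float {
    $freshTokens = $promptTokens - $cacheHitTokens;

    return $freshTokens * $freshRatePerToken
        + $cacheHitTokens * $freshRatePerToken * (1.0 - $cacheDiscount);
}

/** Share of the prompt that was served from cache, in the range 0.0 to 1.0. */
function cacheHitRatio(int $promptTokens, int $cacheHitTokens): float
{
    return $promptTokens > 0 ? $cacheHitTokens / $promptTokens : 0.0;
}

// Hypothetical warm-cache request: 10,000 prompt tokens, 8,000 of them cached.
$rate = 1.0; // assumed price unit per token, for illustration only

$naive = inputCost(10_000, 0, $rate);      // what a tracker charges if it never sees cache hits
$actual = inputCost(10_000, 8_000, $rate); // what the discounted split costs
$ratio = cacheHitRatio(10_000, 8_000);
$overstatement = $naive / $actual;         // about 4.6 for these numbers
```

With an 80% hit ratio the naive figure lands inside the 3-5x range quoted above; lower hit ratios shrink the gap, higher ones widen it.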

Changes

- `src/Providers/DeepSeek/Handlers/Text.php`, `addStep()`: read both fields from `usage`, subtract the cache-hit count from `prompt_tokens` to derive the fresh-prompt count, and populate `Usage` with `cacheReadInputTokens` and `thoughtTokens`. Mirrors what the Gemini and OpenAI handlers already do for their analogous fields.
- `src/Providers/DeepSeek/Handlers/Stream.php`, `extractUsage()`: same fix in the streaming path.
- `tests/Providers/DeepSeek/TextTest.php`: multi-step tools test updated to assert the new semantics (fresh-only `promptTokens` aggregated across steps, plus the previously invisible `cacheReadInputTokens`).
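The mapping and the multi-step aggregation described above can be sketched roughly as follows. This is not the PR's code: `UsageSketch` is a hypothetical stand-in for Prism's `Usage` value object (only the field names follow the PR text), `mapDeepSeekUsage()` and `sumUsage()` are illustrative names, and the `max(0, …)` clamp is a defensive assumption of this sketch. It needs PHP 8.1 or later for `readonly` properties.

```php
<?php

declare(strict_types=1);

/** Hypothetical stand-in for Prism's Usage value object. */
final class UsageSketch
{
    public function __construct(
        public readonly int $promptTokens,
        public readonly int $completionTokens,
        public readonly int $cacheReadInputTokens = 0,
        public readonly int $thoughtTokens = 0,
    ) {
    }
}

/** Map one decoded DeepSeek `usage` array to fresh-only prompt tokens plus the two extra counters. */
function mapDeepSeekUsage(array $usage): UsageSketch
{
    $promptTotal = (int) ($usage['prompt_tokens'] ?? 0);
    $cacheHit = (int) ($usage['prompt_cache_hit_tokens'] ?? 0);

    return new UsageSketch(
        // Fresh prompt count = total prompt tokens minus the cached portion.
        promptTokens: max(0, $promptTotal - $cacheHit),
        completionTokens: (int) ($usage['completion_tokens'] ?? 0),
        cacheReadInputTokens: $cacheHit,
        thoughtTokens: (int) ($usage['completion_tokens_details']['reasoning_tokens'] ?? 0),
    );
}

/**
 * Sum usage across the steps of a multi-step (tool-calling) request.
 *
 * @param list<UsageSketch> $steps
 */
function sumUsage(array $steps): UsageSketch
{
    $sum = static fn (string $field): int => array_sum(
        array_map(static fn (UsageSketch $u): int => $u->{$field}, $steps)
    );

    return new UsageSketch(
        promptTokens: $sum('promptTokens'),
        completionTokens: $sum('completionTokens'),
        cacheReadInputTokens: $sum('cacheReadInputTokens'),
        thoughtTokens: $sum('thoughtTokens'),
    );
}

// Made-up two-step exchange: a cold first step, then a warm second step.
$total = sumUsage([
    mapDeepSeekUsage(['prompt_tokens' => 500, 'completion_tokens' => 40, 'prompt_cache_hit_tokens' => 0]),
    mapDeepSeekUsage([
        'prompt_tokens' => 620,
        'completion_tokens' => 55,
        'prompt_cache_hit_tokens' => 448,
        'completion_tokens_details' => ['reasoning_tokens' => 30],
    ]),
]);
```

Under this scheme `promptTokens + cacheReadInputTokens` still adds up to the raw `prompt_tokens` total, so callers that want the old number can recover it.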

All 28 DeepSeek tests pass. Pint clean.

Testing

Verified against the `deepseek-v4-flash` API directly, in production:

- Cold cache request → `cacheReadInputTokens=0`, `promptTokens=`
- Warm cache (same prefix within ~hour TTL) → `cacheReadInputTokens=`, `promptTokens=`
- Reasoning model → `thoughtTokens` populated and matches the DeepSeek platform dashboard

References

Does not overlap with #1020 (which fixes the `reasoning_content` round-trip). The two fixes are complementary; this one only exposes usage detail that is already in the response.

…rom usage

The DeepSeek Text and Stream handlers hardcode `Usage` to only `prompt_tokens` and `completion_tokens`, silently dropping two DeepSeek-specific usage fields:

- `usage.prompt_cache_hit_tokens`: cached input portion of the prompt. DeepSeek offers a 98% discount on cache hits (their headline feature) and reports the hit/miss split as separate counters.
- `usage.completion_tokens_details.reasoning_tokens`: internal thinking tokens emitted by reasoning models (`deepseek-reasoner`, `deepseek-v4-flash` thinking mode).

Without these, cost trackers that subscribe to `cacheReadInputTokens` see zero and charge the full `prompt_tokens` at the fresh rate, overstating real spend ~3-5x once the prompt cache warms up. Reasoning-mode token usage is invisible to observability tooling.

Both handlers now subtract `prompt_cache_hit_tokens` from `prompt_tokens` to derive the fresh-prompt count, and populate `Usage` with `cacheReadInputTokens` and `thoughtTokens`. Mirrors what the Gemini and OpenAI handlers already do for their analogous fields.

The multi-step tools test asserts the new semantics: aggregated `promptTokens` reflects fresh-only counts and the previously invisible `cacheReadInputTokens` is now exposed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Lintume wants to merge 1 commit into prism-php:main from Lintume:fix/deepseek-usage-extraction
