fix: Accumulate retry costs instead of overwriting in _LAST_CALLBACK_DATA by Serhan-Asad · Pull Request #519 · promptdriven/pdd

Serhan-Asad · 2026-02-13T20:50:10Z

Summary

Fixes cost under-reporting when llm_invoke retries because of None content, malformed JSON, or invalid Python.
_LAST_CALLBACK_DATA["cost"] was overwritten on each retry call, silently losing the original call's cost.
The fix saves cost/tokens before each retry and adds them back to the retry's totals afterwards, across all 3 retry paths (~lines 2455, 2503, 2749).

Verified with real API calls

| Branch | Reported Cost | LLM Calls |
|---|---|---|
| main (buggy) | $0.000340 | 2 (only retry cost reported) |
| fix branch | $0.000528 | 2 (both costs accumulated) |
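As a rough reading of these figures, the gap between the two reported costs indicates how much spend the buggy branch dropped. This assumes the two runs had comparable per-call costs, which is only approximate because they were separate API calls.

```python
from decimal import Decimal

# Figures taken from the verification results above.
reported_on_main = Decimal("0.000340")  # only the retry's cost survived
reported_on_fix = Decimal("0.000528")   # original call + retry

# Approximate cost of the original call that main failed to report.
unreported = reported_on_fix - reported_on_main

# Fraction of the accumulated total that went missing on main.
share_missing = unreported / reported_on_fix
```

Under that assumption, about a third of the total cost went unreported for a single retry.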

Test plan

Unit tests: tests/test_llm_invoke_retry_cost.py (4 tests)
E2E tests: tests/test_e2e_issue_509_retry_cost.py (2 tests)
All 6 tests fail on main, pass on fix branch
Full test suite: 773 passed, no regressions
Manual test with real OpenAI API confirming cost accumulation

🤖 Generated with Claude Code

gltanaka · 2026-02-13T23:20:22Z

you need to fix the unit tests

Copilot

Pull request overview

Fixes under-reported LLM spend when llm_invoke() triggers cache-bypass retries by accumulating _LAST_CALLBACK_DATA cost/token totals across retry paths, and adds regression tests for issue #509.

Changes:

Accumulate cost + token totals across retries (None content, malformed JSON, invalid Python) instead of overwriting _LAST_CALLBACK_DATA.
Expand malformed-JSON detection to include excessive actual trailing newlines.
Add unit + “E2E-style” tests covering retry cost accumulation behavior.
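The trailing-newline heuristic mentioned in the changes might look something like the sketch below. The real `_is_malformed_json_response` is not shown on this page, so the function names, the threshold of 10, and the exact checks are all assumptions made for illustration.

```python
def count_trailing(text: str, unit: str) -> int:
    """Count how many times `unit` repeats at the very end of `text`."""
    count = 0
    while unit and text.endswith(unit):
        text = text[: -len(unit)]
        count += 1
    return count


def looks_malformed_json(content: str, threshold: int = 10) -> bool:
    """Hypothetical heuristic: flag JSON-like output that looks truncated."""
    if not content.lstrip().startswith("{"):
        return False  # not JSON-like; this heuristic has nothing to say

    escaped = count_trailing(content, "\\n")  # literal backslash + n pairs
    actual = count_trailing(content, "\n")    # real newline characters
    if escaped >= threshold or actual >= threshold:
        return True

    # An object that never closes is treated as truncated.
    return not content.rstrip().endswith("}")
```

A response flagged this way would be a candidate for the cache-bypass retry, which is where the cost accumulation matters.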

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.

| File | Description |
|---|---|
| `pdd/llm_invoke.py` | Saves pre-retry callback totals and adds them back after retry completion; expands malformed-JSON heuristic. |
| `tests/test_llm_invoke_retry_cost.py` | Adds unit tests asserting cost accumulation across retry scenarios. |
| `tests/test_e2e_issue_509_retry_cost.py` | Adds higher-level tests simulating retry behavior and verifying accumulated cost. |

The CSV had empty run_test_command for JavaScript, TypeScript, and TypeScriptReact, causing 5 CI test failures in test_agentic_langtest, test_get_language, and test_get_test_command. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…y condition
- Add token accumulation assertions to test that claims to verify tokens
- Remove unused imports (csv, CliRunner) from e2e test
- Fix misleading docstring about CliRunner usage
- Trim 3-call setup to 2-call to match actual retry behavior
- Simplify redundant endswith('}') check in _is_malformed_json_response
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

… retry failure
All 3 retry paths (None content, malformed JSON, invalid Python) set litellm.cache = None but only restored it on success. If the retry call raised an exception, cache stayed permanently disabled for the process.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
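One common way to guarantee the cache is restored even when the retry raises is a try/finally block. The sketch below uses a `SimpleNamespace` stand-in rather than the real litellm module, and `call_without_cache` is a hypothetical helper; the actual change in pdd/llm_invoke.py may be structured differently.

```python
from types import SimpleNamespace
from typing import Callable, TypeVar

T = TypeVar("T")

# Stand-in for the litellm module so the sketch runs without dependencies.
fake_litellm = SimpleNamespace(cache="configured-cache")


def call_without_cache(retry_call: Callable[[], T]) -> T:
    """Disable the cache for one call and always restore it afterwards."""
    saved_cache = fake_litellm.cache
    fake_litellm.cache = None  # bypass the cache for the retry
    try:
        return retry_call()
    finally:
        # Runs on success and on exception, so the cache cannot stay disabled.
        fake_litellm.cache = saved_cache


def failing_retry() -> None:
    raise RuntimeError("simulated provider error")


try:
    call_without_cache(failing_retry)
except RuntimeError:
    pass  # the error still propagates to the caller; only the cache is restored

cache_after_failure = fake_litellm.cache
```

Restoring only on the success path is what left the cache disabled for the rest of the process in the behavior this commit describes.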

Serhan-Asad and others added 3 commits February 12, 2026 14:57

Add failing tests for retry cost under-reporting bug (promptdriven#509)

f05ad9e

Unit and E2E tests that verify _LAST_CALLBACK_DATA["cost"] accumulates across retries instead of being overwritten. Tests currently fail, confirming the bug. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: Cost under-reporting: retry calls overwrite original call cost i…

928ac98

…n _LAST_CALLBACK_DATA Fixes promptdriven#509

fix: test isolation bug - use module ref for _LAST_CALLBACK_DATA to s…

615f32d

…urvive importlib.reload Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
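The isolation problem this commit names can be reproduced in miniature: a name imported with `from module import X` keeps pointing at the old object after `importlib.reload`, while attribute access through the module sees the new one. The sketch below uses a throwaway module called `fake_invoke` (hypothetical) written to a temporary directory; it is not the project's test code.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Throwaway module that defines a module-level dict, like the one in the PR.
    Path(tmp, "fake_invoke.py").write_text('_LAST_CALLBACK_DATA = {"cost": 0.0}\n')
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import fake_invoke
        from fake_invoke import _LAST_CALLBACK_DATA as stale_ref  # bound once

        # Reloading re-executes the module body and creates a brand-new dict.
        importlib.reload(fake_invoke)
        fake_invoke._LAST_CALLBACK_DATA["cost"] = 0.5

        stale_cost = stale_ref["cost"]                       # old dict, untouched
        live_cost = fake_invoke._LAST_CALLBACK_DATA["cost"]  # new dict
        same_object = stale_ref is fake_invoke._LAST_CALLBACK_DATA
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("fake_invoke", None)
```

A test that asserts on the stale reference would read totals from a dict the reloaded module no longer writes to, which is why looking the dict up through the module reference is the more robust choice.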

gltanaka requested a review from Copilot February 13, 2026 23:20

gltanaka marked this pull request as draft February 13, 2026 23:20

Copilot started reviewing on behalf of gltanaka February 13, 2026 23:20

Copilot AI reviewed Feb 13, 2026

Serhan-Asad and others added 7 commits February 13, 2026 18:30

Merge remote-tracking branch 'upstream/main' into fix/issue-509

e10622f

Delete pdd/data/language_format.csv

e3def9b

Merge remote-tracking branch 'upstream/main' into fix/issue-509

0bec60c

Merge upstream/main into fix/issue-509

3c79251

Resolve conflict: keep upstream's language_format.csv (was accidentally deleted on this branch). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Serhan-Asad marked this pull request as ready for review February 15, 2026 22:57

gltanaka merged commit 1cb477c into promptdriven:main Feb 16, 2026
2 checks passed

gltanaka merged 10 commits into promptdriven:main from Serhan-Asad:fix/issue-509