[Feature] Support set PREEMPTED_TOKEN_ID in GET_SAVE_OUTPUT_V1 by rainyfly · Pull Request #7159 · PaddlePaddle/FastDeploy

rainyfly · 2026-04-02T09:24:25Z

Motivation

💡 If this PR is a Cherry Pick, the PR title needs to follow the format by adding the [Cherry-Pick] label at the very beginning and appending the original PR ID at the end. For example, [Cherry-Pick][CI] Add check trigger and logic(#5191)

Modifications

Usage or Command

Accuracy Tests

Checklist

Add at least one tag to the PR title.
- Tag list: [FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
- You can add new tags based on the PR content, but the semantics must be clear.
Format your code, run pre-commit before commit.
Add unit tests. Please write the reason in this PR if no unit tests.
Provide accuracy results.
If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

paddle-bot · 2026-04-02T09:24:31Z

Thanks for your contribution!

fastdeploy-bot

🤖 AI Code Review | 2026-04-02 20:02 CST

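📋 Review Summary

PR overview: In GET_SAVE_OUTPUT_V1 mode, support setting PREEMPTED_TOKEN_ID to mark preempted requests
Scope of changes: the _postprocess method in worker/gpu_model_runner.py
Impact tags: Engine Worker

Issues

Severity	File	Summary
🔴 Bug	`gpu_model_runner.py:2407`	Wrong environment variable name; `envs.GET_SAVE_OUTPUT_V1` does not exist

Overall assessment

The intent of the PR is clear: it adds preemption-marker support to the V1 output mode. However, it contains a critical variable-name error that causes a runtime exception and must be fixed before merging.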

fastdeploy-bot · 2026-04-02T12:02:23Z

fastdeploy/worker/gpu_model_runner.py


            # 5.1. Async cpy
            post_process_event = paddle.device.cuda.create_event()
+            if envs.GET_SAVE_OUTPUT_V1:


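🔴 Bug: wrong variable name. envs.GET_SAVE_OUTPUT_V1 does not exist; it should be envs.FD_USE_GET_SAVE_OUTPUT_V1

The environment variable defined in fastdeploy/envs.py is FD_USE_GET_SAVE_OUTPUT_V1. The current code uses envs.GET_SAVE_OUTPUT_V1, which raises an AttributeError at runtime, so execution never reaches the body of the conditional branch.

Other code that uses this environment variable (for example, lines 241 and 2300 of the same file) uses the correct name, envs.FD_USE_GET_SAVE_OUTPUT_V1.

Suggested fix: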

if envs.FD_USE_GET_SAVE_OUTPUT_V1:
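
To illustrate why the misspelled name fails loudly rather than silently evaluating to false, here is a minimal sketch. The `_Envs` class is a hypothetical stand-in for `fastdeploy/envs.py`, and the way it reads the variable is an assumption, not FastDeploy's actual implementation; only the two attribute names come from the review above.

```python
import os


class _Envs:
    """Hypothetical stand-in for fastdeploy/envs.py (not the real module)."""

    # Assumed registry: only names listed here can be resolved.
    _known = {
        "FD_USE_GET_SAVE_OUTPUT_V1": lambda: bool(
            int(os.getenv("FD_USE_GET_SAVE_OUTPUT_V1", "0"))
        ),
    }

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        if name in self._known:
            return self._known[name]()
        raise AttributeError(f"envs has no attribute {name!r}")


envs = _Envs()


def read_flag(name):
    """Return the flag value, or the AttributeError text for an unknown name."""
    try:
        return getattr(envs, name)
    except AttributeError as exc:
        return f"AttributeError: {exc}"


if __name__ == "__main__":
    print(read_flag("FD_USE_GET_SAVE_OUTPUT_V1"))  # a bool, depends on the environment
    print(read_flag("GET_SAVE_OUTPUT_V1"))         # the error text for the misspelled name
```

An ordinary Python module behaves the same way for a missing attribute, so the conclusion does not depend on the lookup mechanism assumed here.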

fastdeploy-bot

🤖 AI Code Review | 2026-04-03 15:24 CST

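📋 Review Summary

PR overview: In FD_USE_GET_SAVE_OUTPUT_V1 mode, set PREEMPTED_TOKEN_ID (-9) for preempted requests to notify the server that the abort operation has completed.

Scope of changes: fastdeploy/worker/gpu_model_runner.py, the _postprocess method

Impact tags: Engine Scheduler

📝 PR Convention Check

The Motivation and Modifications sections of the PR description have not been filled in; please add the details.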

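Description template (can be copied as-is):

## Motivation
When GET_SAVE_OUTPUT_V1 mode is in use, a preempted request has no corresponding sampled token. This PR sets PREEMPTED_TOKEN_ID (-9) to notify the server that the abort operation has completed, so that the preemption flow can finish correctly.

## Modifications
- In the `_postprocess` method of `gpu_model_runner.py`, check `last_preempted_idx` when `FD_USE_GET_SAVE_OUTPUT_V1` is enabled
- For preempted requests, set their `sampled_token_ids` to `PREEMPTED_TOKEN_ID`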

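Issues

No blocking issues found.

Overall assessment

The code logic is clear and consistent with the existing handling of PREEMPTED_TOKEN_ID in token_processor.py. Please fill in the PR description to make future maintenance and code review easier.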
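
The change described in this review can be sketched roughly as follows. This is not the PR's code: it uses plain Python lists instead of Paddle tensors, and it assumes `last_preempted_idx` is a per-slot flag sequence (non-zero meaning "preempted"), which the page does not confirm. The helper names `mark_preempted` and `is_preempted` are hypothetical; only `PREEMPTED_TOKEN_ID = -9`, `sampled_token_ids`, and `last_preempted_idx` come from the review.

```python
PREEMPTED_TOKEN_ID = -9  # value quoted in the review summary


def mark_preempted(sampled_token_ids, last_preempted_idx):
    """Return a copy of the per-request token rows in which every row
    belonging to a preempted request holds only PREEMPTED_TOKEN_ID.

    Assumption: last_preempted_idx[i] is truthy when request slot i was
    preempted during this step.
    """
    return [
        [PREEMPTED_TOKEN_ID] * len(row) if flag else list(row)
        for row, flag in zip(sampled_token_ids, last_preempted_idx)
    ]


def is_preempted(token_id):
    """Consumer-side check: does this token signal a completed abort?"""
    return token_id == PREEMPTED_TOKEN_ID


if __name__ == "__main__":
    tokens = [[11], [22], [33]]   # one sampled token per request slot
    flags = [0, 1, 0]             # slot 1 was preempted
    marked = mark_preempted(tokens, flags)
    print(marked)
    print([is_preempted(row[0]) for row in marked])
```

Using a negative sentinel keeps the signal in the same channel as ordinary tokens, so the consumer needs no extra message type; this relies on real token IDs never being negative.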

codecov-commenter · 2026-04-03T08:41:49Z

Codecov Report

❌ Patch coverage is 66.66667% with 1 line in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@938e7dd). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
fastdeploy/worker/gpu_model_runner.py	66.66%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #7159   +/-   ##
==========================================
  Coverage           ?   73.86%           
==========================================
  Files              ?      376           
  Lines              ?    52888           
  Branches           ?     8250           
==========================================
  Hits               ?    39064           
  Misses             ?    11095           
  Partials           ?     2729

Flag	Coverage Δ
GPU	`73.86% <66.66%> (?)`
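
As a quick sanity check on the figures above, the short script below recomputes the project coverage from the reported hits, misses, and partials. The patch figure is reproduced under the assumption that the patch has three coverable lines with no partial hits, which is consistent with "1 line missing" at 66.66667% but is not stated in the report.

```python
# Figures copied from the Codecov report.
hits, misses, partials = 39064, 11095, 2729
reported_lines = 52888

total = hits + misses + partials
project_coverage = 100 * hits / total

# Assumption: 3 coverable patch lines, 1 of them missing, no partials.
patch_lines, patch_missing = 3, 1
patch_coverage = 100 * (patch_lines - patch_missing) / patch_lines

if __name__ == "__main__":
    print(total == reported_lines)
    print(round(project_coverage, 2))
    print(round(patch_coverage, 5))
```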


[Feature] Support set PREEMPTED_TOKEN_ID in GET_SAVE_OUTPUT_V1

8dd4eec

rainyfly had a problem deploying to Metax_ci April 2, 2026 09:24 — with GitHub Actions Failure

[Feature] Support set PREEMPTED_TOKEN_ID in GET_SAVE_OUTPUT_V1

d940eb6

rainyfly had a problem deploying to Metax_ci April 2, 2026 09:30 — with GitHub Actions Failure

fastdeploy-bot suggested changes Apr 2, 2026

fix

556c78a

rainyfly temporarily deployed to Metax_ci April 3, 2026 07:13 — with GitHub Actions Inactive

fastdeploy-bot reviewed Apr 3, 2026

rainyfly wants to merge 3 commits into PaddlePaddle:develop from rainyfly:support_abort_token_id_in_get_output_v1

3 participants
