[Speculative Decoding] fix mtp stop_seqs and limit thinking bugs #7166

Open
lonelygsh wants to merge 1 commit into PaddlePaddle:develop from lonelygsh:fix-speculate-decoding-index-bugs

lonelygsh commented Apr 2, 2026

Motivation

This PR fixes several issues in two speculative decoding kernels: speculate_set_stop_value_multi_seqs and speculate_limit_thinking_content_length.

Modifications

speculate_set_stop_value_multi_seqs

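  1. stop_seqs is no longer compared against the last accept token: the loop condition changes from accept_idx <= accept_num - 1 to accept_idx < accept_num - 1.
  2. Add rollback logic: once a stop sequence is matched, the surplus tokens after it are discarded and step_idx is rolled back by the same number of steps, so the state stays consistent.
  3. Fix the truncation logic: once a stop sequence is matched, the complete stop sequence is kept and an EOS token is appended after it (matching the non-MTP behavior: ...<|im_end|> <eos>).
  4. Fix the index computation: correct the boundary checks and index calculations for accept_tokens and pre_ids.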
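The four changes above can be read as one small algorithm. The following is a minimal pure-Python sketch of that reading, not the CUDA kernel itself: the function name `apply_stop_seq`, its signature, and the assumption that `step_idx` counts every token generated so far (including the ones accepted in this step) are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def apply_stop_seq(
    pre_ids: List[int],
    accept_tokens: List[int],
    step_idx: int,
    stop_seq: List[int],
    eos_id: int,
) -> Tuple[List[int], int]:
    """Hypothetical reference model: return (kept accept tokens, adjusted step_idx)."""
    accept_num = len(accept_tokens)
    if not stop_seq:
        return list(accept_tokens), step_idx
    # The last accept token is skipped (accept_idx < accept_num - 1), so a
    # match always leaves one slot after the stop sequence for the EOS token.
    for accept_idx in range(accept_num - 1):
        # History as seen after accepting tokens 0..accept_idx of this step.
        history = pre_ids + accept_tokens[: accept_idx + 1]
        if history[-len(stop_seq):] == stop_seq:
            # Keep the complete stop sequence and append EOS after it.
            kept = accept_tokens[: accept_idx + 1] + [eos_id]
            # Roll step_idx back by the number of discarded tokens.
            dropped = accept_num - len(kept)
            return kept, step_idx - dropped
    # No match: nothing is truncated and step_idx is unchanged.
    return list(accept_tokens), step_idx


# The stop sequence [2, 3] straddles pre_ids and the first accept token.
print(apply_stop_seq([1, 2], [3, 4, 5, 6], 6, [2, 3], 0))  # -> ([3, 0], 4)
```

In this model a stop sequence that ends exactly on the last accept token is not matched in the current step, which mirrors the new loop bound. Whether the real kernel picks that match up on the following step is an assumption this sketch does not verify.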

speculate_limit_thinking_content_length

  1. Fix the truncation condition: change current_step - 1 == max_think_len to current_step == max_think_len, correcting an off-by-one error.
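The effect of the changed condition is plain arithmetic and can be checked in isolation. The sketch below is not the kernel: `first_trigger_step` is a hypothetical helper that only reports the first step at which each condition becomes true, and it makes no claim about how the kernel defines current_step or what it does once the condition fires.

```python
def first_trigger_step(max_think_len: int, use_old_condition: bool) -> int:
    """Return the first step (counting from 0) at which the truncation check fires."""
    step = 0
    while True:
        if use_old_condition:
            hit = step - 1 == max_think_len  # condition before this PR
        else:
            hit = step == max_think_len      # condition after this PR
        if hit:
            return step
        step += 1


print(first_trigger_step(8, use_old_condition=True))   # -> 9
print(first_trigger_step(8, use_old_condition=False))  # -> 8
```

With the old condition the check fires one step after max_think_len is reached, which is the off-by-one the fix removes.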

Tests

  • Update test_speculate_set_stop_value_multi_seqs.py to match the new kernel semantics and add verification of the step_idx rollback.

Usage or Command

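No new interfaces; this PR only fixes existing logic. It can be verified by running speculative decoding inference and checking that stop-sequence truncation and the thinking length limit behave correctly.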

Accuracy Tests

Unit tests pass.

Checklist

  • Add at least a tag in the PR title.
  • Format your code, run pre-commit before commit.
  • Add unit tests. Updated test_speculate_set_stop_value_multi_seqs.py.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.


paddle-bot bot commented Apr 2, 2026

Thanks for your contribution!

@paddle-bot paddle-bot bot added the contributor External developers label Apr 2, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


guanshihui seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.

@lonelygsh lonelygsh force-pushed the fix-speculate-decoding-index-bugs branch from ba88df0 to 0f4325c Compare April 2, 2026 13:37
@lonelygsh lonelygsh force-pushed the fix-speculate-decoding-index-bugs branch from 0f4325c to 41a8185 Compare April 2, 2026 13:40
@lonelygsh lonelygsh force-pushed the fix-speculate-decoding-index-bugs branch from 41a8185 to 8dea198 Compare April 2, 2026 13:42

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@98f3fc9). Learn more about missing BASE report.

Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #7166   +/-   ##
==========================================
  Coverage           ?   73.90%           
==========================================
  Files              ?      376           
  Lines              ?    52867           
  Branches           ?     8243           
==========================================
  Hits               ?    39072           
  Misses             ?    11071           
  Partials           ?     2724           
Flag    Coverage Δ
GPU     73.90% <ø> (?)

@lonelygsh lonelygsh changed the title [Speculative Decoding] fix mtp stop_seqs bugs [Speculative Decoding] fix mtp stop_seqs and limit thinging bugs Apr 3, 2026
@lonelygsh lonelygsh changed the title [Speculative Decoding] fix mtp stop_seqs and limit thinging bugs [Speculative Decoding] fix mtp stop_seqs and limit thinking bugs Apr 3, 2026

4 participants