[Draft] Enable text-only deployment for multimodal models by K11OntheBoat · Pull Request #7183 · PaddlePaddle/FastDeploy

K11OntheBoat · 2026-04-03T07:49:24Z

Motivation

💡 If this PR is a Cherry Pick, the PR title needs to follow the format by adding the [Cherry-Pick] label at the very beginning and appending the original PR ID at the end. For example, [Cherry-Pick][CI] Add check trigger and logic(#5191)


Modifications

Usage or Command

Accuracy Tests

Checklist

Add at least one tag to the PR title.
- Tag list: [FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
- You can add new tags based on the PR content, but the semantics must be clear.
Format your code, run pre-commit before commit.
Add unit tests. Please write the reason in this PR if no unit tests.
Provide accuracy results.
If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

paddle-bot · 2026-04-03T07:49:32Z

Thanks for your contribution!

CLAassistant · 2026-04-03T07:50:14Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

liuruian seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

codecov-commenter · 2026-04-03T09:17:29Z

Codecov Report

❌ Patch coverage is 69.49153% with 18 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@6cae9b1). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
fastdeploy/worker/input_batch.py	37.50%	8 Missing and 2 partials ⚠️
fastdeploy/config.py	57.14%	4 Missing and 2 partials ⚠️
fastdeploy/engine/async_llm.py	0.00%	0 Missing and 1 partial ⚠️
...executor/layers/attention/dsa_attention_backend.py	0.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #7183   +/-   ##
==========================================
  Coverage           ?   73.58%           
==========================================
  Files              ?      376           
  Lines              ?    52939           
  Branches           ?     8257           
==========================================
  Hits               ?    38954           
  Misses             ?    11243           
  Partials           ?     2742

Flag	Coverage Δ
GPU	`73.58% <69.49%> (?)`

Flags with carried forward coverage won't be shown. Click here to find out more.


fastdeploy-bot

🤖 AI Code Review | 2026-04-03 17:37 CST

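📋 Review Summary

PR overview: Enables a text-only deployment mode for multimodal models and introduces the enable_mm_runtime property to control runtime multimodal features in one place.
Scope of changes: FDConfig, Engine, Worker, Scheduler, Attention Backends, Speculative Decoding
Affected tags: [FDConfig] [Engine] [Scheduler] [Speculative Decoding] [XPU] [HPU] [GCU] [Iluvatar] [Metax]

📝 PR Convention Check

The PR title uses [Draft], which is not a valid tag, and the Motivation, Modifications, and Usage sections of the description are all empty.

Suggested title (can be copied as-is):

[Feature] Enable text-only deployment for multimodal models

Description template (can be copied as-is):

## Motivation
Support deploying multimodal models in text-only mode. Setting `deploy_modality=TEXT` disables multimodal runtime features (such as 3D RoPE and the encoder cache), which reduces resource usage and improves performance in text-only scenarios.

## Modifications
1. Add the `enable_mm_runtime` and `enable_rope_3d_runtime` properties to `FDConfig`
2. Replace the checks on `model_config.enable_mm` in each module with `fd_config.enable_mm_runtime`
3. When `deploy_modality=TEXT`, force-disable `rope_3d` and `use_3d_rope`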

## Usage
```bash
python -m fastdeploy.entrypoints.openai.api_server \
    --model /path/to/multimodal_model \
    --deploy-modality text
```

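### Issues

| Severity | File | Summary |
|------|------|------|
| 🔴 Bug | `engine/common_engine.py:1282` | Contains debug print statements; should not be merged into develop |
| 🔴 Bug | `entrypoints/engine_client.py:364` | Contains debug print statements |
| 🔴 Bug | `output/token_processor.py:952` | Contains debug print statements |
| 🔴 Bug | `inter_communicator/engine_worker_queue.py:554` | Contains debug print statements; frequent calls hurt performance |
| ❓ Question | `worker/input_batch.py:235` | The logic that distinguishes `has_mm_model` from `enable_mm` needs confirmation |

### Overall Assessment

The design of this PR is clear: it introduces the `enable_mm_runtime` property to manage runtime multimodal features in one place, and the changes are broad and consistent. However, several debug print statements must be removed before merging. These `print` statements would produce a large volume of output in production, severely affecting performance and log readability.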
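
To make the design described in the review concrete, here is a minimal sketch of how two derived runtime properties could gate multimodal features on the deployment modality. It is not the PR's actual implementation: the class names `FDConfigSketch`, `ModelConfigSketch`, and `DeployModality`, the `MIXED` value, and the exact rule for `enable_rope_3d_runtime` are assumptions made for illustration. Only the property names, `deploy_modality=TEXT`, `enable_mm`, and `rope_3d` come from the review.

```python
from dataclasses import dataclass
from enum import Enum


class DeployModality(Enum):
    # Hypothetical enum; only the TEXT value is named in the review.
    MIXED = "mixed"
    TEXT = "text"


@dataclass
class ModelConfigSketch:
    # Hypothetical stand-in for the model-level configuration.
    enable_mm: bool = False  # the checkpoint is a multimodal model
    rope_3d: bool = False    # the model uses 3D RoPE


@dataclass
class FDConfigSketch:
    # Hypothetical stand-in for FDConfig.
    model_config: ModelConfigSketch
    deploy_modality: DeployModality = DeployModality.MIXED

    @property
    def enable_mm_runtime(self) -> bool:
        # Multimodal runtime paths are active only when the model is
        # multimodal AND the deployment is not text-only.
        return (
            self.model_config.enable_mm
            and self.deploy_modality is not DeployModality.TEXT
        )

    @property
    def enable_rope_3d_runtime(self) -> bool:
        # Assumed rule: 3D RoPE is only used when the multimodal runtime is on.
        return self.enable_mm_runtime and self.model_config.rope_3d
```

With this shape, call sites check `fd_config.enable_mm_runtime` instead of `model_config.enable_mm`, so a multimodal checkpoint deployed with the text modality takes the text-only code paths.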

fastdeploy-bot · 2026-04-03T09:37:27Z

fastdeploy/engine/common_engine.py

                        request = Request.from_dict(data)
+
+                        # [DEBUG] Key fields of the request received by the engine
+                        print(


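🔴 Bug This code contains debug print statements and should not be merged into the develop branch.

These print statements emit a large amount of debug information for every request processed, which hurts production performance and log readability.

Suggestion: remove the debug prints, or replace them with logger.debug() so the output is controlled by the log level.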
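
As an illustration of the suggested replacement, the sketch below routes the same kind of per-request diagnostic through Python's standard `logging` module. The logger name, the function `describe_request`, and the field names `request_id` and `prompt_token_ids` are hypothetical; the fields actually printed in the PR are not shown in the diff excerpt.

```python
import logging

# Hypothetical logger name; FastDeploy's own logger objects are not used here.
logger = logging.getLogger("fastdeploy.sketch")


def describe_request(data: dict) -> None:
    """Log a short summary of a request at DEBUG level."""
    # %-style arguments defer building the message string until a handler
    # actually emits the record, so nothing is written at INFO or above.
    logger.debug(
        "engine received request_id=%s prompt_tokens=%d",
        data.get("request_id"),
        len(data.get("prompt_token_ids", [])),
    )
```

Because the output is tied to the log level, the diagnostic stays available for troubleshooting without writing to stdout on every request in production.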

fastdeploy-bot · 2026-04-03T09:37:27Z

fastdeploy/entrypoints/engine_client.py

            min_tokens = task.get("min_tokens", 1)
+
+            # [DEBUG] Key fields before sending to the engine
+            print(


🔴 Bug This code contains debug print statements and should not be merged into the develop branch.

Suggestion: remove them or switch to logger.debug().

fastdeploy-bot · 2026-04-03T09:37:27Z

fastdeploy/output/token_processor.py

                    llm_logger.info(f"task {task_id} received eos token. Recycling.")
+
+                    # [DEBUG] Debug comparison of output tokens
+                    print(


🔴 Bug This code contains debug print statements and should not be merged into the develop branch.

Suggestion: remove them or switch to logger.debug().

fastdeploy-bot · 2026-04-03T09:37:27Z

fastdeploy/inter_communicator/engine_worker_queue.py

            time.sleep(0.001)
            self.lock.acquire()
+        wait_ms = (time.perf_counter() - wait_start) * 1000.0
+        print(


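🔴 Bug This code contains debug print statements. They sit in put_tasks and get_tasks, which are called frequently, so they severely affect performance.

Suggestion: remove these debug prints.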
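
If the lock-wait measurement is still useful for diagnosis, one option is to keep the timing but guard the log call so the hot path pays almost nothing when DEBUG is off. The sketch below is an assumption-laden illustration, not the queue's real code: `acquire_with_timing`, the logger name, and the non-blocking polling loop are hypothetical, and only the `time.perf_counter()` arithmetic and the 0.001 s sleep mirror the diff excerpt.

```python
import logging
import threading
import time

# Hypothetical logger name used only for this sketch.
logger = logging.getLogger("fastdeploy.sketch.queue")


def acquire_with_timing(lock: threading.Lock, poll_s: float = 0.001) -> float:
    """Acquire `lock` by polling and return the wait time in milliseconds."""
    wait_start = time.perf_counter()
    # Poll without blocking, sleeping briefly between attempts.
    while not lock.acquire(blocking=False):
        time.sleep(poll_s)
    wait_ms = (time.perf_counter() - wait_start) * 1000.0
    # Skip the logging call entirely unless DEBUG is enabled for this logger.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("lock acquired after %.3f ms", wait_ms)
    return wait_ms
```

The caller still owns the lock after the function returns and is responsible for releasing it.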

fastdeploy-bot · 2026-04-03T09:37:27Z

fastdeploy/worker/input_batch.py

                model_config=self.model_config,
                partial_rotary_factor=self.model_config.partial_rotary_factor,
            )
+            if self.has_mm_model:


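❓ Question This code initializes image_features under the not self.enable_mm condition, but only when has_mm_model=True.

This means the variables are initialized when the model supports multimodal input but is deployed in text-only mode. Please confirm whether this is the intended behavior. If text-only deployment does not need these variables, consider removing this initialization logic.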

Split enable_mm

f2a696b

K11OntheBoat had a problem deploying to Metax_ci April 3, 2026 07:49 — with GitHub Actions Failure

paddle-bot bot added the contributor External developers label Apr 3, 2026

fastdeploy-bot suggested changes Apr 3, 2026

K11OntheBoat wants to merge 1 commit into PaddlePaddle:develop from K11OntheBoat:dev_split_mm
