Skip to content

<fix>[query]: ZSTAC-80742 limit API query concurrency#3975

Open
MatheMatrix wants to merge 1 commit into
5.5.22from
sync/shan.wu/fix/5.5.22/ZSTAC-80742
Open

<fix>[query]: ZSTAC-80742 limit API query concurrency#3975
MatheMatrix wants to merge 1 commit into
5.5.22from
sync/shan.wu/fix/5.5.22/ZSTAC-80742

Conversation

@MatheMatrix
Copy link
Copy Markdown
Owner

Summary

Limit backend AutoQuery concurrency to prevent bursty GraphQL VM list refreshes from overwhelming the management node.

Root Cause

Refreshing the VM list can fan out into many GraphQL-backed APIQueryMessage requests such as GuestToolsState and ModelServiceInstance queries. ZQL and BatchQuery were already routed through the query sync queue, but ordinary AutoQuery handling executed directly, allowing large bursts to run concurrently and overload the management node.

Changes

  • Route APIQueryMessage handling through the existing query sync queue.
  • Reuse the same query sync level as ZQL/BatchQuery.
  • Preserve existing AutoQuery response construction and error propagation.

Testing

  • JAVA_HOME=$(/usr/libexec/java_home -v 1.8) mvn -pl search -am -DskipTests compile
  • CI pipeline
  • Manual reproduction with 100 VM batch migration and VM list refresh

Resolves: ZSTAC-80742

sync from gitlab !9866

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 14, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)

Review profile: CHILL

Plan: Pro

Run ID: 70946b51-6e8b-48ff-b677-efd60f0c83b2

📥 Commits

Reviewing files that changed from the base of the PR and between 0374da1 and bb01702.

📒 Files selected for processing (1)
  • search/src/main/java/org/zstack/query/QueryFacadeImpl.java
🚧 Files skipped from review as they are similar to previous changes (1)
  • search/src/main/java/org/zstack/query/QueryFacadeImpl.java

Walkthrough

QueryFacadeImpl 将查询构建与结果封装从 handle(APIQueryMessage) 提取为私有方法 doQuery(APIQueryMessage)handlethdf.syncSubmit 中调用 doQuery,捕获 OperationFailureException 并通过 bus.reply 返回错误或结果回复。

Changes

查询处理逻辑提取

Layer / File(s) Summary
查询逻辑方法提取与 handle 重构
search/src/main/java/org/zstack/query/QueryFacadeImpl.java
新增私有方法 doQuery(APIQueryMessage) 封装查询构建、@AutoQuery 注解解析、反射调用 setInventories 写入结果并在可用时设置 total;统一包装非 OperationFailureException 异常;handle(APIQueryMessage) 被重构为在 thdf.syncSubmit 同步任务中调用 doQuery,捕获 OperationFailureException 时构建错误 APIQueryReply 并通过 bus.reply 分发。

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 查询从心里跳出,
方法拆分不慌不忙,
doQuery 把结果捧来,
handle 只管递送回响,
异常有序不再慌。

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed 标题符合 [scope]: 格式要求,长度为53字符(在72字符限制内),且清晰描述了PR的主要变更意图。
Description check ✅ Passed 描述详细阐述了PR的目的、根本原因、具体变更和测试计划,与代码变更(限制AutoQuery并发)高度相关。
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch sync/shan.wu/fix/5.5.22/ZSTAC-80742

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
search/src/main/java/org/zstack/query/QueryFacadeImpl.java (1)

352-362: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

并发查询路径下的 HashMap 缓存存在线程安全风险

handle(APIQueryMessage) 已改为并发 SyncTask 执行(getSyncLevel() > 1),但 Line 352-362/Line 369 仍会并发访问 autoQueryMapreplySetter(当前是 HashMap)。这会引入竞态与可见性问题,建议改为并发容器。

🔧 建议修改
@@
 import java.util.*;
+import java.util.concurrent.ConcurrentHashMap;
 import java.util.concurrent.TimeUnit;
@@
-    private Map<Class, Method> replySetter = new HashMap<>();
-    private Map<Class, AutoQuery> autoQueryMap = new HashMap<>();
+    private final Map<Class, Method> replySetter = new ConcurrentHashMap<>();
+    private final Map<Class, AutoQuery> autoQueryMap = new ConcurrentHashMap<>();

Also applies to: 369-370

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@search/src/main/java/org/zstack/query/QueryFacadeImpl.java` around lines 352
- 362, The code accesses shared maps (autoQueryMap and replySetter) concurrently
from handle(APIQueryMessage)/doQuery causing race/visibility bugs; replace their
HashMap declarations with thread-safe ConcurrentHashMap and update usages to
atomic helpers (e.g., use computeIfAbsent for autoQueryMap to load/put the
AutoQuery annotation atomically) so gets/puts are safe under concurrent SyncTask
execution; ensure any put-then-get patterns on replySetter are converted to
atomic operations or synchronized blocks if necessary.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@search/src/main/java/org/zstack/query/QueryFacadeImpl.java`:
- Around line 352-362: The code accesses shared maps (autoQueryMap and
replySetter) concurrently from handle(APIQueryMessage)/doQuery causing
race/visibility bugs; replace their HashMap declarations with thread-safe
ConcurrentHashMap and update usages to atomic helpers (e.g., use computeIfAbsent
for autoQueryMap to load/put the AutoQuery annotation atomically) so gets/puts
are safe under concurrent SyncTask execution; ensure any put-then-get patterns
on replySetter are converted to atomic operations or synchronized blocks if
necessary.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)

Review profile: CHILL

Plan: Pro

Run ID: 081c0bc1-1a81-4b2a-8598-491f887515db

📥 Commits

Reviewing files that changed from the base of the PR and between 4cdaed7 and 0374da1.

📒 Files selected for processing (1)
  • search/src/main/java/org/zstack/query/QueryFacadeImpl.java

Route AutoQuery handling through the query sync queue so bursty GraphQL refreshes cannot run unlimited APIQueryMessage requests concurrently.

Resolves: ZSTAC-80742

Change-Id: If2c052e539526cd0511e037d1718838bdb65a7a3
@MatheMatrix MatheMatrix force-pushed the sync/shan.wu/fix/5.5.22/ZSTAC-80742 branch from 0374da1 to bb01702 Compare May 17, 2026 11:16
@zstack-robot-1
Copy link
Copy Markdown
Collaborator

Comment from gitlab:

自上次添加REVIEWED标签(2026-05-17 17:43:28.000Z)后, 有新的COMMIT更新(2026-05-17 19:16:23.853Z), 所以移除了REVIEWED标签

@ZStack-Robot
Copy link
Copy Markdown
Collaborator

Comment from gitlab:

检测到REVIEWED标签添加者(shan.wu)不属于Maintainers:[ye.tian wenhao.zhang xin.zhang yaohua.wu lock-files yuerong.su gitlab qun.li jin.ma shixin.ruan lei.liu zstackio wei.wang ye.zou zhangjianjun], 所以移除了REVIEWED标签

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants