<fix>[compute]: fix VM clone quota check fail #3980
<fix>[loadBalancer]: block SLB deletion during grayscale upgrade See merge request zstackio/zstack!9187
<fix>[volumebackup]: add backup cancel timeout error code See merge request zstackio/zstack!9200
<fix>[core]: add @nologging to sensitive config fields See merge request zstackio/zstack!9171
…plit-brain When a management node departs, its VM skip-trace entries were immediately removed. If VMs were still being started by kvmagent, the next VM sync would falsely detect them as Stopped and trigger HA, causing split-brain. Fix: transfer departed MN skip-trace entries to an orphaned set with 10-minute TTL instead of immediate deletion. VMs in the orphaned set remain skip-traced until the TTL expires or they are explicitly continued, preventing false HA triggers during MN restart scenarios. Resolves: ZSTAC-80821 Change-Id: I3222e260b2d7b33dc43aba0431ce59a788566b34
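The orphaned-set mechanism above can be sketched in a few lines. This is an illustrative model only, with hypothetical class and method names, not the actual zstack HA code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch of the TTL-based orphan set described above; the class
// and method names are hypothetical, not the real kvmagent/HA implementation.
class OrphanedSkipTraceSet {
    private static final long TTL_MILLIS = 10 * 60 * 1000L; // 10-minute TTL

    // vmUuid -> expiry timestamp
    private final Map<String, Long> orphans = new ConcurrentHashMap<>();

    // Called when a management node departs: keep its skip-trace entries
    // alive instead of deleting them immediately.
    void adopt(String vmUuid, long now) {
        orphans.put(vmUuid, now + TTL_MILLIS);
    }

    // VM sync consults this before treating a "missing" VM as Stopped.
    boolean isSkipTraced(String vmUuid, long now) {
        Long expiry = orphans.get(vmUuid);
        if (expiry == null) {
            return false;
        }
        if (now >= expiry) {
            orphans.remove(vmUuid); // TTL elapsed, resume normal tracing
            return false;
        }
        return true;
    }

    // Explicit "continue" removes the entry before the TTL expires.
    void resume(String vmUuid) {
        orphans.remove(vmUuid);
    }
}
```

Until either the TTL elapses or `resume` is called, the sync path keeps skipping the VM, which is what prevents the false Stopped detection.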
…anup Resolves: ZSTAC-80821 Change-Id: I59284c4e69f5d2ee357b1836b7c243200e30949a
Resolves: ZSTAC-77544 Change-Id: I1f711bff9c1e87a8cbf6a2eb310ca6086f0f99ba
Resolves: ZSTAC-80821 Change-Id: Ia9a9597feceb96b3e6e22259e2d0be7bde8ae499
…V bond check add new error code constant for SR-IOV bond slave NIC count validation. Resolves: ZSTAC-81163 Change-Id: Ie2a74411129a98c3c03a4a085e94a3bd45922da5
…tion Resolves: ZSTAC-80991 Change-Id: I7677ddc25c8859e35e8ba80fd3105406bc761a76
Resolves: ZSTAC-74908 Change-Id: I48054139babb1e8092ab81e4367743ae3fd8aefb
Resolves: ZSTAC-81354 Change-Id: Iff2131b3a878444fa27641f24dd727fe4fa176fb
- UpdateQueryImpl: guard val.getClass() NPE
- LogSafeGson: return JsonNull when input is null
- HostAllocatorChain: null check completion
- VmInstanceVO: use Objects.equals to avoid NPE
- SessionManagerImpl: guard null session
- VmCapabilitiesJudger: guard null PS type result
- CephPSMonBase: guard null self when evicted
- CephPSBase: guard null mon.getSelf()
- HostBase: guard null self when HostVO deleted
- ExternalPSFactory: guard null URI protocol
- LocalStorageBase: guard null errorCode.getCause()

Resolves: ZSTAC-69300, ZSTAC-69957, ZSTAC-71973, ZSTAC-81294, ZSTAC-70180, ZSTAC-70181, ZSTAC-78309, ZSTAC-78310, ZSTAC-70668, ZSTAC-71909, ZSTAC-80555, ZSTAC-81270, ZSTAC-70101, ZSTAC-72034, ZSTAC-73197, ZSTAC-79921, ZSTAC-81160, ZSTAC-81224, ZSTAC-81805, ZSTAC-72304, ZSTAC-81804, ZSTAC-74898, ZSTAC-69215, ZSTAC-70151, ZSTAC-68933 Change-Id: I910e9b542ecd254fdf7e956f943316988a56a1f9
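As a small illustration of the `Objects.equals` item above (the field names are hypothetical, not the real `VmInstanceVO` accessors):

```java
import java.util.Objects;

// Minimal illustration of the null-safe comparison pattern; the hostUuid
// field names are hypothetical examples, not the actual VmInstanceVO fields.
class NullSafeCompare {
    static boolean sameHost(String lastHostUuid, String hostUuid) {
        // lastHostUuid.equals(hostUuid) would throw NPE when lastHostUuid is
        // null; Objects.equals treats (null, null) as equal and (null, x)
        // as not equal, without dereferencing either argument.
        return Objects.equals(lastHostUuid, hostUuid);
    }
}
```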
- LdapUtil: CRE to OperationFailureException
- QueryFacadeImpl: CRE to OperationFailureException
- HostAllocatorManagerImpl: CRE to warn + clamp
- CloudOperationsErrorCode: add LDAP/PROMETHEUS codes

Resolves: ZSTAC-81334 Change-Id: Iab947b0476e9174d5a61baa095847b521b1f59fa
<fix>[pciDevice]: add error code ORG_ZSTACK_PCIDEVICE_10077 See merge request zstackio/zstack!9205
Add ORG_ZSTACK_AI_10134 for ModelCenter disconnected check and ORG_ZSTACK_PCIDEVICE_10077 for SR-IOV bond validation. Resolves: ZSTAC-72783 Change-Id: I504a415a6e822513df955be600188ae88e2e1058
fix(ai): add error codes for AI and PCI modules [ZSTAC-72783] See merge request zstackio/zstack!9215
<fix>[multi]: batch fix CRE quality issues See merge request zstackio/zstack!9214
<fix>[volumebackup]: add backup cancel timeout error code See merge request zstackio/zstack!9204
Companion DB migration for premium MR !13025. Adds normalizedModelName column and index to GpuDeviceSpecVO for GPU spec dedup by normalized model name. Resolves: ZSTAC-75319 Change-Id: If15e615bcbda955cc1d6c58527bae27d4af4b497
<fix>[compute]: respect vm.migrationQuantity during host maintenance See merge request zstackio/zstack!9209
…ttachedPSMountPath NPE Resolves: ZSTAC-69300 Change-Id: I1b39a9e7b76751e8a4ef4cc53e9ac2028386e334
Resolves: ZSTAC-61988 Change-Id: Id3d5a48801fda21e2a51d96949c743bac254b2e6
<fix>[kvm]: configurable orphan skip timeout See merge request zstackio/zstack!9203
Resolves: ZSTAC-61988 Change-Id: I0908fce97128904f9954198645290f4e5709252e
- Add Javadoc: NULL resourceType = universal (backward compatible)
- Add resourceType to TagPatternVO_ metamodel and TagPatternInventory
- Add groovy integration test (3 scenarios: universal/scoped/combined filter)

Resolves: ZSTAC-74908 Change-Id: I6fc05535ae688e50290759f1e129501f0240696c
<fix>[multi]: batch guard NPE quality issues See merge request zstackio/zstack!9213
<fix>[telemetry]: fix Sentry transaction loss and add debug logging See merge request zstackio/zstack!9220
<fix>[gpu]: add normalizedModelName migration SQL See merge request zstackio/zstack!9218
<fix>[zbs]: sync MDS node statuses to DB when reconnect fails See merge request zstackio/zstack!9161
Add missing error codes ORG_ZSTACK_STORAGE_PRIMARY_10048 and ORG_ZSTACK_STORAGE_PRIMARY_10051 to all 10 language files. Replace zh_CN mistranslations with the correct term. Fix zh_TW garbled characters in error messages. Resolves: ZSTAC-72656 Change-Id: I5f08109d1c415b751ec130285b9d92522f1e0a34
br_conn_all_ns (169.254.64.1, mevoco.py CONNECT_ALL_NETNS_BR_OUTER_IP) is created on the host only when the first flat L3 network lands, AFTER the add-host ansible playbook has generated the TLS cert. The cert SAN therefore lacks this IP, and on every subsequent reconnect the check-tls-certs flow detects the gap and triggers an ansible redeploy plus libvirtd/kvmagent restart, breaking PID stability. This IP is host-internal (used only for VM <-> root-netns userdata / pushgateway / kvmagent HTTP) and has no business in a libvirtd TLS cert. Add EXCLUDED_INTERNAL_IPS as a single source of truth used by both the check flow (buildIpList) and the deploy flow (unionTlsCertIps) so the two flows can never disagree, regardless of when the bridge appears. host_plugin.fact() on the agent side is unchanged. Resolves: ZSTAC-84446 Change-Id: I10821d8c68190f2cbc8a0679d19ed053916d9184
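A minimal sketch of the single-source-of-truth idea, assuming a shared constant consulted by both flows; only the constant name and the IP come from the commit message, the helper method is hypothetical:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Sketch of the shared exclusion described above. EXCLUDED_INTERNAL_IPS and
// 169.254.64.1 come from the commit message; filterInternal is a hypothetical
// stand-in for the logic shared by buildIpList and unionTlsCertIps.
class TlsCertIps {
    // Host-internal IPs that must never appear in the libvirtd TLS cert SAN.
    static final Set<String> EXCLUDED_INTERNAL_IPS =
            Collections.singleton("169.254.64.1"); // br_conn_all_ns

    // Used by both the check flow and the deploy flow so they can never
    // disagree, regardless of when the bridge appears on the host.
    static List<String> filterInternal(List<String> hostIps) {
        return hostIps.stream()
                .filter(ip -> !EXCLUDED_INTERNAL_IPS.contains(ip))
                .collect(Collectors.toList());
    }
}
```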
<fix>[kvm]: exclude br_conn_all_ns IP from TLS cert SAN See merge request zstackio/zstack!9758
DBImpact Change-Id: I6d757a76676d6763697161776a61796175737775
…erviceRefVO.isDefault/createDate/lastOpDate Resolves: ZSTAC-84025 Change-Id: Ifbb68c9d475aa19762bb053e6545381a305dfe5b
…l.defaultModelServiceUuid Resolves: ZSTAC-84025 Change-Id: I4046976b0c4437ba4c14069946b8075393f152ed
…Model DSL Resolves: ZSTAC-84025 Change-Id: I0df26f7367a28da9c6c825dcddcb829db133374e
Resolves: ZSTAC-84025 Change-Id: I7a6c657369736978676164676b616b626d707974
Review found the new ModelServiceRefVO createDate column used a zero-date default, which is unsafe under strict SQL mode. The upgrade now adds the column as nullable, backfills existing rows, then tightens it to NOT NULL with CURRENT_TIMESTAMP. This also adds the shared AI error code consumed by the premium review fix.

Constraint: Premium branch references CloudOperationsErrorCode from zstack utils
Rejected: Keep zero-date default | fails SQL coding rules and strict-mode upgrades
Confidence: high
Scope-risk: narrow
Tested: JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 mvn -pl utils -DskipTests install
Tested: JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 mvn -pl sdk -DskipTests install
Not-tested: Full database migration against a live strict-mode MySQL instance

Resolves: ZSTAC-84025 Change-Id: Ic0a63eec1e3db23e4bb8843efd8e2aee143dce21
CI hit MySQL error 1293 because ModelServiceRefVO.lastOpDate already uses TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, and older MySQL/MariaDB versions allow only one automatic TIMESTAMP column per table. Keep createDate non-zero with a fixed default while the migration backfills existing rows with CURRENT_TIMESTAMP and premium ModelServiceRefVO prePersist sets real creation time for new JPA inserts.

Constraint: Older MySQL/MariaDB permits only one automatic TIMESTAMP column per table
Rejected: DEFAULT CURRENT_TIMESTAMP on createDate | conflicts with lastOpDate automatic timestamp
Confidence: high
Scope-risk: narrow
Tested: git diff --check
Tested: rg createDate/current timestamp/default checks in V5.5.22__schema.sql
Tested: JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 mvn -pl utils -DskipTests install
Not-tested: Full Flyway migration against the CI MySQL image

Resolves: ZSTAC-84025 Change-Id: I095a6818443441409c84988dafbb6c16ef5d8e54
Code review noted that adding ModelServiceRefVO.isDefault as NOT NULL DEFAULT 0 already initializes existing rows, so the immediate UPDATE to the same value only adds unnecessary migration work. The schema keeps the defaulted column and removes the redundant write.

Constraint: MySQL initializes existing rows when adding a NOT NULL column with a DEFAULT
Rejected: Keep explicit UPDATE | unnecessary table write during upgrade
Confidence: high
Scope-risk: narrow
Tested: git diff --check
Tested: rg check for redundant isDefault UPDATE and zero-date defaults in V5.5.22__schema.sql
Tested: JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 mvn -pl utils -DskipTests install
Not-tested: Full Flyway migration against CI database image

Resolves: ZSTAC-84025 Change-Id: I062d884ed6232fa603a6ecfd68c2124db85c5ed6
<feature>[ai] ZSTAC-84025: auto-match schema + SDK See merge request zstackio/zstack!9772
DBImpact Resolves: ZSTAC-81413 Change-Id: I6a7574656467636e686377756f716d67707a7063
XInfiniStorageController.blacklist was a no-op (// todo, comp.success()). When ExternalPrimaryStorageKvmFactory's beforeStartVmOnKvm fell back to blacklist after a deactivate failure, the no-op silently allowed the VM to start while a stale client on another host still held write access, risking a split-brain on the volume.

1. Why? xinfini does not implement volume path isolation yet. The blacklist no-op masked the missing capability and let the caller proceed as if the stale client had been fenced.
2. How? Throw OperationFailureException from blacklist with a clear message. Because beforeStartVmOnKvm invokes blacklist with NopeCompletion (which ignores fail()), only an exception can abort the VM start. This makes the unsupported case explicit and conservative until xinfini supports path isolation.
3. Side effects? VM start is aborted when an old active client cannot be deactivated on xinfini. This is the safe behavior; previously such VMs would start with split-brain risk.

# Summary of changes (by module):
- plugin/xinfini: throw OperationFailureException in blacklist instead of silently returning success.

Related: ZSTAC-84963 Change-Id: Ib0092f39a7b23cf7b06442ac9616cb2ce39240b7 (cherry picked from commit 024c445)
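The completion-swallows-fail behavior can be modeled in isolation. These are simplified stand-ins for the real zstack Completion/NopeCompletion types, showing why only a thrown exception can abort the caller:

```java
// Simplified model of why only an exception can abort here: the caller passes
// a completion whose fail() is a no-op, so comp.fail(...) is silently
// swallowed. These types are illustrative stand-ins, not the real zstack API.
class BlacklistDemo {
    interface Completion {
        void success();
        void fail(String error);
    }

    // Mirrors the NopeCompletion behavior described above: fail() discards
    // the error instead of propagating it.
    static final Completion NOPE = new Completion() {
        public void success() { }
        public void fail(String error) { /* ignored */ }
    };

    // Old behavior: "failing" through the completion never reaches the
    // caller, so the VM start proceeds despite the stale writer.
    static boolean startVmWithNoOpBlacklist() {
        NOPE.fail("stale client still active"); // silently swallowed
        return true; // VM start proceeds
    }

    // New behavior: throwing is the only signal the caller cannot ignore.
    static boolean startVmWithThrowingBlacklist() {
        try {
            throw new IllegalStateException("xinfini does not support path isolation yet");
        } catch (IllegalStateException e) {
            return false; // VM start aborted
        }
    }
}
```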
Regenerated by ./runMavenProfile sdk from updated @APIParam.validValues on APIAddLabelToAlarmMsg and APIUpdateAlarmLabelMsg in premium repo (same Jira). Lets SDK clients pass RegexAgainst and NotEqual as the operator field for alarm labels, matching the server-side API interceptor. APIImpact: AddLabelToAlarm/UpdateAlarmLabel SDK Action now accepts RegexAgainst and NotEqual operator values. Resolves: ZSTAC-84454 Change-Id: I71636c616567697566786f667974766a6c757073
<fix>[xinfini]: fail blacklist to prevent split-brain on VM start See merge request zstackio/zstack!9797
<feature>[sdk]: regenerate SDK for RegexAgainst/NotEqual operators See merge request zstackio/zstack!9798
Existing host reconnect can fail before kvmagent echo. Ansible masks libvirt sockets with systemctl during deploy. If host systemd D-Bus is stuck, that optional step times out. Continue only for this known mask timeout on reconnect. New host deploy and other ansible failures still fail. Resolves: ZSTAC-77120 Change-Id: I0ef4a535065ff797c9e4cfae5b39c2daa321a4cc
fix(ZSTAC-77120): continue reconnect on systemd timeout See merge request zstackio/zstack!9810
Remove an unused MariaDB socket before zstack-server start. Restart MariaDB so MN can recover after power loss. Resolves: ZSTAC-83507 Change-Id: Ifb8255f7c9021ef12353f058a0230fe66989b590
…to avoid stale config cache Resolves: ZSTAC-79075 Change-Id: I6868776c0ea5ef4905ec409181fbf2a431f3034a
<feature>[utils]: Network group high availability strategy See merge request zstackio/zstack!9779
<feature>[conf]: recover stale mariadb socket See merge request zstackio/zstack!9809
<fix>[virtualRouter]: skip grayscale upgrade check on auto reconnect to avoid stale config cache See merge request zstackio/zstack!9814
Store VmModelMountVO lastAttachedEpoch in database so restore cleanup can distinguish successful asynchronous attach from stale failure callbacks.

Tested: docker verify-case runMavenProfile premium
Tested: docker verify-case VmModelMountCase

Resolves: ZSTAC-84246 Change-Id: Ib426797059be9401c3e2556ecc31c1879dd04049
Resolves: ZSTAC-84919 Change-Id: Iff44de2bbb1dad50e75678180cb89d3430c562b1
<fix>[ai]: persist mount restore epoch schema See merge request zstackio/zstack!9833
<fix>[compute]: ZSTAC-84919 avoid stale iso detach NPE See merge request zstackio/zstack!9834
Walkthrough: This change adds plugin and external-service framework interfaces; extends core error codes, localization, CloudBus/REST/Telemetry, and database query capabilities; refactors compute, VM allocation, and network logic; and updates database migrations, install/deploy scripts, documentation, and version info accordingly. Changes: platform, compute, and release updates.
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~150 minutes
Actionable comments posted: 5
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (8)
conf/db/upgrade/V5.1.8.1__schema.sql (1)
1-233: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift: The zero-date defaults throughout this file need a unified fix

This script contains many occurrences of DEFAULT '0000-00-00 00:00:00' (e.g., lines 15, 33, 58, 94, 197, 218, and 230). These are risky in upgrade scenarios; replace them uniformly with a repo-compatible non-zero sentinel timestamp and do one batch fix across the whole file, so later branches do not trip over the same issue.

Example fix (apply the same pattern across the file):

- `createDate` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
+ `createDate` timestamp NOT NULL DEFAULT '2000-01-01 00:00:00',

As per coding guidelines: "Do not use DEFAULT 0000-00-00 00:00:00, use DEFAULT CURRENT_TIMESTAMP instead."
Based on learnings: "in this repo's MySQL 5.7 setup, when another TIMESTAMP already has ON UPDATE CURRENT_TIMESTAMP, use a non-CURRENT_TIMESTAMP sentinel such as '2000-01-01 00:00:00'."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@conf/db/upgrade/V5.1.8.1__schema.sql` around lines 1 - 233, This migration uses many DEFAULT '0000-00-00 00:00:00' values (e.g., ModelCenterVO.createDate, ModelVO.createDate, ModelServiceVO.createDate, ModelServiceInstanceVO.createDate, SSOServerTokenVO.createDate/lastOpDate, ContainerManagementVmVO.createDate/lastOpDate, DatasetVO.createDate/lastOpDate, UserProxyConfigVO.createDate/lastOpDate, UserProxyConfigResourceRefVO.createDate/lastOpDate, ModelEvaluationTaskVO.createDate/lastOpDate, TrainedModelRecordVO.createDate/lastOpDate) which must be replaced: for each TIMESTAMP column that already uses ON UPDATE CURRENT_TIMESTAMP keep the ON UPDATE but set a non-zero sentinel like '2000-01-01 00:00:00' as the default; for TIMESTAMP columns that do not rely on another ON UPDATE CURRENT_TIMESTAMP use DEFAULT CURRENT_TIMESTAMP (or DEFAULT '2000-01-01 00:00:00' if CURRENT_TIMESTAMP would conflict with repo MySQL compatibility rules); apply this change consistently across all occurrences in the file, ensuring constraints and ON UPDATE clauses (e.g., columns with "ON UPDATE CURRENT_TIMESTAMP") are preserved.

conf/deployuidb.sh (1)
1-21: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win: Shell compatibility problem: the shebang does not match the syntax

The script declares #!/bin/sh (POSIX sh) but uses bash-only syntax:

- [[ ]] double-bracket tests (lines 12 and 18)
- &> redirections (lines 16, 28, 40, and 78)

On systems where /bin/sh is not bash (e.g., Debian/Ubuntu use dash), the script will fail. 🔧 Suggested fix

-#!/bin/sh
+#!/bin/bash
 set -e

Or stay POSIX-compatible:

-if [[ `id -u` -ne 0 ]] && [[ x"$user" = x"root" ]]; then
+if [ "$(id -u)" -ne 0 ] && [ "$user" = "root" ]; then

-if command -v greatdb &> /dev/null; then
+if command -v greatdb > /dev/null 2>&1; then

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@conf/deployuidb.sh` around lines 1 - 21, The script uses bash-only constructs while claiming a POSIX shebang; either update the shebang to #!/bin/bash or make the script POSIX-compliant: replace all [[ ... ]] tests (e.g., the checks around `id -u` and `x"$user" = x"root"`) with POSIX [ ... ] or test, and replace all `&>` redirections with POSIX equivalents (e.g., redirect stdout and stderr using >file 2>&1 or >/dev/null 2>&1); update usages around the `MYSQL` variable assignment, the `command -v greatdb` block and any `&>` occurrences to follow the chosen approach so the script runs on shells like dash.

core/src/main/java/org/zstack/core/job/JobQueueFacadeImpl2.java (1)
286-299: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win: Handle the null-job branch after a deserialization failure to avoid a second NPE.

After the re-query at line 298, jobe may be null; the next loop iteration would then throw again at jobe.getContext() or jobe.getId(), breaking the processing chain for that queue. 💡 Suggested fix

 while (true) {
     try {
         JobContextObject ctx = SerializableHelper.readObject(jobe.getContext());
         Job theJob = ctx.load();
         jobe.setState(JobState.Processing);
         jobe = dbf.updateAndRefresh(jobe);
         return Bucket.newBucket(jobe, theJob);
     } catch (Exception e1) {
         ErrorCode ierr = inerr(ORG_ZSTACK_CORE_JOB_10001,
                 "[Job de-serialize failed, the job will be marked as Error] queue name: %s, job id: %s, %s",
                 qvo.getName(), jobe.getId(), e1.getMessage());
         jobFail(jobe, ierr);
         logger.warn(ierr.getDetails(), e1);
         jobe = findJob(qvo);
+        if (jobe == null) {
+            dbf.remove(qvo);
+            logger.debug(String.format("[JobQueue released, no pending task, delete the queue] last owner: %s, queue name: %s, queue id: %s",
+                    qvo.getOwner(), qvo.getName(), qvo.getId()));
+            return null;
+        }
     }
 }

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/src/main/java/org/zstack/core/job/JobQueueFacadeImpl2.java` around lines 286 - 299, In JobQueueFacadeImpl2 inside the loop that deserializes job context, handle the case where findJob(qvo) returns null after a deserialization failure: after the catch where you call jobFail(jobe, ierr) and set jobe = findJob(qvo), check if jobe is null and if so skip processing that iteration (e.g., continue the while loop or return an empty/none Bucket as appropriate) to avoid a second NPE on jobe.getContext() or jobe.getId(); ensure you reference the jobe local, SerializableHelper.readObject/JobContextObject load, jobFail(...) and Bucket.newBucket(...) paths when applying the fix.

console/src/main/java/org/zstack/console/ManagementServerConsoleProxyBackend.java (1)
565-572: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win: A consoleProxyPort configuration failure is now treated as success.

setConsoleProxyPort(newProxyPort) is executed first, but the branch only checks rst == 0, so if the IP is configured successfully while the port configuration fails, the flow still continues. In addition, the failure message contains two %d placeholders but no arguments are passed, so the error path itself loses diagnostic information. Both should be fixed together.
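The placeholder problem is easy to demonstrate on its own: `String.format` with unmatched `%d` placeholders fails at runtime, so the original error path would itself throw. A minimal sketch using the message template from the review:

```java
import java.util.MissingFormatArgumentException;

// Demonstrates the second problem flagged above: format placeholders without
// matching arguments fail at runtime. Only the message template comes from
// the review; the helper methods here are illustrative.
class FormatArgDemo {
    static String safeMessage(int rst, int portRst) {
        // Correct usage: every %d has a matching argument.
        return String.format(
                "failed to configure consoleProxyOverriddenIp[code:%d] or consoleProxyPort[code:%d]",
                rst, portRst);
    }

    static boolean missingArgsThrows() {
        try {
            // Same template with the arguments forgotten, as in the original code.
            String.format("failed to configure consoleProxyOverriddenIp[code:%d] or consoleProxyPort[code:%d]");
            return false;
        } catch (MissingFormatArgumentException e) {
            return true; // the error path itself blows up
        }
    }
}
```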
- if (rst == 0) {
+ if (rst == 0 && portRst == 0) {
      trigger.next();
  } else {
-     trigger.fail(operr(ORG_ZSTACK_CONSOLE_10003, "failed to configure consoleProxyOverriddenIp[code:%d] or consoleProxyPort[code:%d]"));
+     trigger.fail(operr(
+         ORG_ZSTACK_CONSOLE_10003,
+         "failed to configure consoleProxyOverriddenIp[code:%d] or consoleProxyPort[code:%d]",
+         rst, portRst));
  }

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@console/src/main/java/org/zstack/console/ManagementServerConsoleProxyBackend.java` around lines 565 - 572, in run(FlowTrigger trigger, Map data) only rst (from setConsoleProxyOverridenIp) is currently checked, so a setConsoleProxyPort(newProxyPort) failure is also treated as success; change the logic to check the return values of both rst and portRst, call trigger.next() only when both are 0, and otherwise call trigger.fail(...) including both return codes and the underlying error information in the operr format arguments (using the return codes and/or exception info of setConsoleProxyOverridenIp and setConsoleProxyPort), making sure each %d placeholder has a matching argument and the error carries diagnosable context.

conf/tools/install.sh (1)
107-121: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win: The wrapper script must forward arguments verbatim with "$@".

The current unquoted $@ is word-split again, so inventory paths containing spaces, -e argument values, and other complex arguments get mangled, and the actual behavior diverges from calling ansible directly. 🛠️ Suggested change

-\${VIRTUAL_ENV}/bin/ansible \$@
+\${VIRTUAL_ENV}/bin/ansible "$@"

-\${VIRTUAL_ENV}/bin/ansible-playbook \$@
+\${VIRTUAL_ENV}/bin/ansible-playbook "$@"

Also applies to: 124-138
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@conf/tools/install.sh` around lines 107 - 121, The wrapper script that writes /usr/bin/ansible uses unquoted $@ which re-splits arguments and breaks spaces/complex args; update the generated script to forward parameters exactly using "$@" (and prefer exec "${VIRTUAL_ENV}/bin/ansible" "$@" so the wrapper doesn't add an extra process). Apply the same fix to the second wrapper block referenced (lines 124-138) so both instances use "$@" and exec with ${VIRTUAL_ENV}/bin/ansible to preserve argument quoting and behavior.

core/src/main/java/org/zstack/core/cloudbus/ResourceDestinationMakerImpl.java (1)
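The `Map.put` return-value pitfall raised in the comment that follows can be demonstrated in isolation (a plain `String` cache stands in for the `NodeInfo` map; the names here are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

// Demonstrates the bug class flagged below: Map.put returns the PREVIOUS
// mapping (null on a cache miss), not the value just inserted.
class PutReturnDemo {
    static String brokenLookup(Map<String, String> cache, String key) {
        String info = cache.get(key);
        if (info == null) {
            // BUG: put returns the old value, which is null on a miss,
            // so info stays null even though the entry was inserted.
            info = cache.put(key, "node-info:" + key);
        }
        return info; // null on every first lookup
    }

    static String fixedLookup(Map<String, String> cache, String key) {
        // computeIfAbsent inserts AND returns the new value in one step.
        return cache.computeIfAbsent(key, k -> "node-info:" + k);
    }
}
```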
80-92: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win: A cache miss returns null here.

Map.put() returns the previous value, not the newly inserted object. So even after the vo has been found in the database, the statement info = nodes.put(...) still sets info to null, and the caller receives an empty value. 🛠️ Suggested change
  if (info == null) {
      ManagementNodeVO vo = dbf.findByUuid(nodeUuid, ManagementNodeVO.class);
      if (vo == null) {
          throw new ManagementNodeNotFoundException(nodeUuid);
      }
      nodeHash.add(nodeUuid);
-     info = nodes.put(nodeUuid, new NodeInfo(vo));
+     info = new NodeInfo(vo);
+     nodes.put(nodeUuid, info);
  }

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/src/main/java/org/zstack/core/cloudbus/ResourceDestinationMakerImpl.java` around lines 80 - 92, In getNodeInfo the code sets info = nodes.put(nodeUuid, new NodeInfo(vo)), but Map.put returns the previous value so info becomes null; instead construct the NodeInfo (new NodeInfo(vo)), assign it to a local variable, put it into nodes (nodes.put(nodeUuid, info) or nodes.putIfAbsent and then retrieve), then add nodeUuid to nodeHash and return that local NodeInfo; refer to getNodeInfo, NodeInfo, nodes.put/nodes.putIfAbsent, nodeHash.add, dbf.findByUuid, and ManagementNodeNotFoundException when making the change.

compute/src/main/java/org/zstack/compute/vm/VmQuotaOperator.java (1)
44-73: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win: The admin path still bypasses the new runtime quota checks.

The two new branches for APIChangeInstanceOfferingMsg and APIUpdateVmInstanceMsg hang only under type != AccountType.SystemAdmin. When an admin performs these operations on behalf of a tenant, the newly added check(...) below is never invoked, so CPU/memory scale-up of a running VM is still not gated by the target tenant's quota. Since currentAccountUuid and resourceTargetOwnerAccountUuid are already separated here, these two message types should also be wired into the admin branch, just like APIChangeResourceOwnerMsg. 🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@compute/src/main/java/org/zstack/compute/vm/VmQuotaOperator.java` around lines 44 - 73, In VmQuotaOperator.checkQuota, admin users currently bypass the new runtime quota checks for APIChangeInstanceOfferingMsg and APIUpdateVmInstanceMsg because those branches are only under the non-admin path; update the admin branch to also invoke check(...) for APIChangeInstanceOfferingMsg and APIUpdateVmInstanceMsg (just like APIChangeResourceOwnerMsg) so admin-acting-on-behalf operations validate quotas against resourceTargetOwnerAccountUuid/currentAccountUuid separation and enforce target tenant runtime CPU/memory limits. Ensure you call the same check((APIChangeInstanceOfferingMsg) msg, pairs) and check((APIUpdateVmInstanceMsg) msg, pairs) in the else (AccountType.SystemAdmin) branch.

compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java (1)
1112-1114: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win: Do not short-circuit when the IP is unchanged.

Only the literal IP values are compared here. This patch adds gateway/netmask/IPv6-prefix overrides via overrideInfo; if a request keeps the IP unchanged and only updates those fields, this branch calls success() immediately, the later reallocation never runs, and the UsedIpVO keeps the old network parameters. 🛠️ One viable fix
- if (targetNic.getUsedIps().stream().map(UsedIpVO::getIp).collect(Collectors.toList()).containsAll(staticIpMap.values())) {
+ boolean sameIp = targetNic.getUsedIps().stream()
+         .map(UsedIpVO::getIp)
+         .collect(Collectors.toList())
+         .containsAll(staticIpMap.values());
+ boolean hasOverride = overrideInfo != null && (
+         StringUtils.isNotBlank(overrideInfo.getNetmask()) ||
+         StringUtils.isNotBlank(overrideInfo.getGateway()) ||
+         StringUtils.isNotBlank(overrideInfo.getIpv6Gateway()) ||
+         StringUtils.isNotBlank(overrideInfo.getIpv6Prefix())
+ );
+ if (sameIp && !hasOverride) {
      completion.success();
      return;
  }

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java` around lines 1112 - 1114, The current early-return in VmInstanceBase (the if that checks targetNic.getUsedIps().stream().map(UsedIpVO::getIp).collect(Collectors.toList()).containsAll(staticIpMap.values())) only compares IP string values and therefore incorrectly short-circuits when overrideInfo (gateway/netmask/ipv6Prefix) needs to be applied; change the check to verify for each desired IP in staticIpMap there exists a matching UsedIpVO whose ip AND network parameters (gateway, netmask, and IPv6 prefix where applicable) equal the requested values from overrideInfo (or null/empty equivalently), and only call completion.success() if all fields match; otherwise let the existing reallocation/update logic run so UsedIpVO entries are updated. Ensure you reference targetNic, getUsedIps, UsedIpVO and staticIpMap when locating and updating the logic.
🟡 Minor comments (27)
conf/db/upgrade/V5.1.19__schema.sql-7-8 (1)
7-8: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win: The timestamp column defaults violate the coding guidelines.

Per the coding guidelines:

- Do not use DEFAULT '0000-00-00 00:00:00'.
- lastOpDate should use DEFAULT CURRENT_TIMESTAMP.
- createDate: because one TIMESTAMP column already uses ON UPDATE CURRENT_TIMESTAMP, MySQL 5.7 compatibility requires a sentinel such as '2000-01-01 00:00:00' (not '0000-00-00 00:00:00').

As per coding guidelines, SQL scripts should not use DEFAULT '0000-00-00 00:00:00'. Based on learnings, use '2000-01-01 00:00:00' as sentinel default for the second TIMESTAMP column to avoid MySQL 5.7 Error 1293.

🔧 Suggested fix

- `lastOpDate` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00' ON UPDATE CURRENT_TIMESTAMP,
+ `lastOpDate` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
- `createDate` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
+ `createDate` timestamp NOT NULL DEFAULT '2000-01-01 00:00:00',

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@conf/db/upgrade/V5.1.19__schema.sql` around lines 7 - 8, Replace the illegal '0000-00-00 00:00:00' defaults: set the lastOpDate TIMESTAMP column to use DEFAULT CURRENT_TIMESTAMP (and keep ON UPDATE CURRENT_TIMESTAMP) and set the createDate TIMESTAMP column to use the sentinel DEFAULT '2000-01-01 00:00:00' to comply with MySQL 5.7 compatibility and the coding guidelines.

abstraction/src/main/java/org/zstack/abstraction/PluginEndpointValidator.java-22-35 (1)
22-35: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win: The exception lacks an error message and the logic can be improved.
- No error message is provided when the exception is thrown, so there is no way to tell which plugin types are duplicated.
- Throwing only when the plugin count differs from the deduplicated count is somewhat indirect; state "duplicate plugin types exist" directly.

🔧 Suggested fix
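The suggested improvement of naming the duplicated types can be sketched with streams (plain strings stand in for `PluginDriver.type()` values; the class name is hypothetical):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Sketch of reporting WHICH types are duplicated instead of only comparing
// sizes. Strings stand in for PluginDriver.type() values.
class DuplicateTypeFinder {
    static List<String> duplicates(List<String> types) {
        return types.stream()
                // count occurrences of each type
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                .entrySet().stream()
                // keep only types seen more than once
                .filter(e -> e.getValue() > 1)
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }
}
```

The resulting list can be embedded in the exception message so operators see the conflicting types directly.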
  public void validateAllPlugins(List<PluginDriver> pluginDriverList) {
      int pluginNumber = pluginDriverList.size();
      int distinctPluginByEndpointTypeNumber = pluginDriverList
              .stream()
              .map(PluginDriver::type)
              .collect(Collectors.toSet()).size();
      if (pluginNumber == distinctPluginByEndpointTypeNumber) {
          return;
      }
-     throw new InvalidPluginDefinitionException();
+     throw new InvalidPluginDefinitionException(
+             String.format("Duplicate plugin types detected. Total plugins: %d, Unique types: %d",
+                     pluginNumber, distinctPluginByEndpointTypeNumber));
  }

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@abstraction/src/main/java/org/zstack/abstraction/PluginEndpointValidator.java` around lines 22 - 35, validateAllPlugins currently throws InvalidPluginDefinitionException without details and uses an indirect size comparison; update PluginEndpointValidator.validateAllPlugins to detect duplicate plugin types by comparing pluginDriverList.size() with the distinct set size and, when they differ, compute the duplicate types via pluginDriverList.stream().map(PluginDriver::type) and grouping/counting to produce a list or set of duplicated type names, then throw new InvalidPluginDefinitionException with a clear message like "duplicate plugin types found: [...]" that includes the duplicate type names so callers can see which PluginDriver.type values conflict.

abstraction/src/main/java/org/zstack/abstraction/PluginEndpointValidator.java-16-18 (1)
16-18: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win: The exception lacks an error message.

InvalidPluginDefinitionException is thrown without any message, which makes debugging difficult. Add a descriptive message stating which plugin's type is null. 🔧 Suggested fix
  public void validate(PluginDriver register) {
      if (register.type() == null) {
-         throw new InvalidPluginDefinitionException();
+         throw new InvalidPluginDefinitionException(
+                 String.format("Plugin type cannot be null for plugin class: %s",
+                         register.getClass().getName()));
      }
  }

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@abstraction/src/main/java/org/zstack/abstraction/PluginEndpointValidator.java` around lines 16 - 18, in PluginEndpointValidator when register.type() is null, the InvalidPluginDefinitionException thrown in that branch lacks descriptive information; modify this branch to attach context when throwing (e.g., register's identifier or its toString() output) indicating which plugin definition has a null type (reference symbols: PluginEndpointValidator, register.type(), InvalidPluginDefinitionException, register). Make sure the exception constructor receives a string containing the plugin identifier, such as "Plugin registration has null type: " + register, so the problem can be located later.

README.zh-CN.md-98-119 (1)
98-119: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win: Heading levels skip a level; make them consecutive.

Fourth-level #### headings are used directly under ## (e.g., lines 100 and 125), which triggers MD001 and hurts structural parsing of the document. Change these subsections under second-level headings uniformly to ###. 📝 Suggested change
-#### 伸缩性
+### 伸缩性
-#### 插件架构
+### 插件架构
-#### 资源操作框架
+### 资源操作框架
-#### 整体技术架构概述
+### 整体技术架构概述
-#### 问题反馈
+### 问题反馈
-#### 功能需求
+### 功能需求
-#### 代码贡献
+### 代码贡献
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@README.zh-CN.md` around lines 98 - 119, The section under the "## ZStack架构设计" heading uses fourth-level headings (e.g., "#### 伸缩性", "#### 插件架构", "#### 资源操作框架", "#### 整体技术架构概述") which causes a jump from H2 to H4 and triggers MD001; change those "####" headings to third-level "###" so the hierarchy is continuous (e.g., update "#### 伸缩性" → "### 伸缩性", "#### 插件架构" → "### 插件架构", etc.), and ensure any nested subpoints below those remain at lower levels if intended.

conf/db/upgrade/V1.3__schema.sql-73-73 (1)
73-73: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win: This view statement still has unquoted identifiers; backtick them all in one pass.

Only `system` was fixed this time; the other column names and the WHERE-condition column on the same line are still unquoted, which does not meet the current SQL guidelines. 🧩 Suggested change
-CREATE VIEW `zstack`.`ImageVO` AS SELECT uuid, name, description, status, state, size, actualSize, md5Sum, platform, type, format, url, `system`, mediaType, createDate, lastOpDate, guestOsType FROM `zstack`.`ImageEO` WHERE deleted IS NULL;
+CREATE VIEW `zstack`.`ImageVO` AS SELECT `uuid`, `name`, `description`, `status`, `state`, `size`, `actualSize`, `md5Sum`, `platform`, `type`, `format`, `url`, `system`, `mediaType`, `createDate`, `lastOpDate`, `guestOsType` FROM `zstack`.`ImageEO` WHERE `deleted` IS NULL;
system= 1),以避免 MySQL 8.0 / GreatSQL 保留关键字冲突导致的语法错误"。🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@conf/db/upgrade/V1.3__schema.sql` at line 73, The CREATE VIEW definition for ImageVO still has unquoted identifiers; update the CREATE VIEW `zstack`.`ImageVO` ... SELECT ... FROM `zstack`.`ImageEO` statement so that every table and column identifier is wrapped in backticks (e.g., `uuid`, `name`, `description`, `status`, `state`, `size`, `actualSize`, `md5Sum`, `platform`, `type`, `format`, `url`, `system`, `mediaType`, `createDate`, `lastOpDate`, `guestOsType`) and also quote the WHERE column (`deleted`) to comply with the guideline that all table/column names must use backticks.

abstraction/src/main/java/org/zstack/abstraction/sns/EndpointDriver.java-5-8 (1)
5-8: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

The Javadoc does not match the actual interface; fix it right away.

The comment on lines 6-7 mentions `PluginEndpointSender`/`PluginRegister`, which no longer matches `EndpointDriver extends PluginDriver` and will mislead future implementers.

Suggested fix

```diff
 /**
- * PluginEndpointSender extends PluginRegister and contains sender
- * of sns endpoint.
+ * EndpointDriver extends PluginDriver and defines the sender
+ * behavior for SNS endpoints.
  */
```

As per coding guidelines: longer comments must be proofread carefully and kept up to date as the code changes.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@abstraction/src/main/java/org/zstack/abstraction/sns/EndpointDriver.java` around lines 5 - 8, the class Javadoc is out of sync: it refers to PluginEndpointSender and PluginRegister but the class is declared as EndpointDriver extends PluginDriver; update the class-level comment to correctly describe EndpointDriver, its relationship to PluginDriver, and its responsibility (sending/handling SNS endpoints) so readers aren't misled by stale names like PluginEndpointSender/PluginRegister; locate the Javadoc above the EndpointDriver class and replace the incorrect references with accurate wording mentioning EndpointDriver and PluginDriver and a brief description of SNS endpoint behavior.

abstraction/src/main/java/org/zstack/abstraction/sso/OAuth2PluginConstants.java-20-20 (1)
20-20: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix the typo.

`GRANT_TYPE_OCDE_VALUE` is misspelled; it should be `GRANT_TYPE_CODE_VALUE`.

```diff
-public static final String GRANT_TYPE_OCDE_VALUE = "authorization_code";
+public static final String GRANT_TYPE_CODE_VALUE = "authorization_code";
```

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@abstraction/src/main/java/org/zstack/abstraction/sso/OAuth2PluginConstants.java` at line 20, fix the typo in OAuth2PluginConstants by renaming the constant GRANT_TYPE_OCDE_VALUE to GRANT_TYPE_CODE_VALUE and updating every usage/reference to the new name (e.g., imports/usages across the codebase) to avoid compilation errors; modify the declaration in class OAuth2PluginConstants and run a project-wide search/replace to update references (ensure no duplicate constants remain and adjust any tests or docs that referenced the old symbol).

core/src/main/java/org/zstack/core/CoreManagerImpl.java-141-143 (1)
141-143: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Improve the efficiency and design clarity of the task-filtering logic.

When `isOnlyRunningTask()` is true, the code builds a full ChainInfo object (containing both running and pending tasks) and then clears the pendingTask list. This has two problems:

- Efficiency: the tasks in pendingQueue are iterated and added to the ChainInfo, then immediately cleared, wasting computation and memory operations.
- Design: performing a side effect (clearing a list) on the returned object is not a functional style and obscures the code's intent.

Suggested refactor: add a parameter to `ChainTaskQueueWrapper.getChainInfo()` that controls whether pending tasks are included, or add a dedicated method (e.g. `getRunningTasksOnly()`) used from `getChainTaskInfo()`, so the wasted build-then-clear is avoided.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/src/main/java/org/zstack/core/CoreManagerImpl.java` around lines 141 - 143, when msg.isOnlyRunningTask() is true, avoid building full ChainInfo and then mutating its pendingTask lists; instead update CoreManagerImpl to call a new or overloaded method that returns only running tasks (e.g., add getRunningTasksOnly() or add a boolean includePending parameter to ChainTaskQueueWrapper.getChainInfo) and use that from getChainTaskInfo() so pendingQueue entries are not added then cleared and no side-effect (clear()) is performed on the returned ChainInfo objects.
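The refactor suggested in the CoreManagerImpl finding above can be sketched in isolation. The class and field names below are hypothetical stand-ins for illustration, not ZStack's actual `ChainTaskQueueWrapper` API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for ZStack's ChainInfo.
class ChainInfo {
    final List<String> runningTasks = new ArrayList<>();
    final List<String> pendingTasks = new ArrayList<>();
}

public class TaskQueueSketch {
    private final List<String> runningQueue = new ArrayList<>();
    private final List<String> pendingQueue = new ArrayList<>();

    public void submit(String task, boolean running) {
        (running ? runningQueue : pendingQueue).add(task);
    }

    // Instead of building a full ChainInfo and clearing pendingTasks afterwards,
    // let the caller state up front whether pending tasks are wanted.
    public ChainInfo getChainInfo(boolean includePending) {
        ChainInfo info = new ChainInfo();
        info.runningTasks.addAll(runningQueue);
        if (includePending) {
            info.pendingTasks.addAll(pendingQueue);
        }
        return info; // no side effects performed on the returned object
    }
}
```

The caller that today clears `pendingTasks` would instead call `getChainInfo(false)`.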
core/src/main/java/org/zstack/core/With.java-21-23 (1)

21-23: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix the malformed comment.

The `/**` comment opener at the end of line 21 is misplaced and should be moved to its own line or deleted. The current form can break documentation tooling.

🔧 Suggested fix

```diff
 import static org.zstack.core.Platform.inerr;
-import static org.zstack.utils.clouderrorcode.CloudOperationsErrorCode.*;/**
- */
-
+import static org.zstack.utils.clouderrorcode.CloudOperationsErrorCode.*;
+
+/**
+ */
 @Configurable(preConstruction = true, autowire = Autowire.BY_TYPE)
 public class With {
```

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/src/main/java/org/zstack/core/With.java` around lines 21 - 23, the stray Javadoc start token `/**` appended to the import statement should be removed or moved to its own proper comment line to avoid parsing errors; edit the import line that currently reads `import static org.zstack.utils.clouderrorcode.CloudOperationsErrorCode.*;/**` in the With class and either delete the trailing `/**` or place it on a separate line as a valid comment block opener so the import statement is clean.

conf/db/upgrade/V5.1.20__schema.sql-10-10 (1)
10-10: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove Chinese text from the SQL comments.

The SQL comments on lines 10 and 17 contain Chinese text, which violates the coding guidelines. Use English comments instead:

- Line 10: change `-- 开始事务` to `-- Start transaction`
- Line 17: change `-- 提交事务` to `-- Commit transaction`

As per coding guidelines, code must not contain Chinese; error messages, comments, and the like should all be written in correct, typo-free English.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@conf/db/upgrade/V5.1.20__schema.sql` at line 10, replace the Chinese SQL comments in V5.1.20__schema.sql with English equivalents: change the comment "-- 开始事务" to "-- Start transaction" and change "-- 提交事务" to "-- Commit transaction"; ensure no other Chinese text remains in comments or messages in this file and keep spelling correct.

core/src/main/java/org/zstack/core/plugin/PluginApiInterceptor.java-25-27 (1)
25-27: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Trim the UUID before handing it to the extension points.

The code uses `msg.getUuid()` directly; when the request parameter carries leading/trailing whitespace, the extension points may receive a wrong value, causing false rejections or failed lookups.

Suggested change

```diff
 private void validate(APIDeletePluginDriversMsg msg) {
+    String pluginUuid = msg.getUuid() == null ? null : msg.getUuid().trim();
     for (PluginDriversExtensionPoint ext : pluginRgty.getExtensionList(PluginDriversExtensionPoint.class)) {
-        ErrorCode result = ext.validateDeletePluginDrivers(msg.getUuid());
+        ErrorCode result = ext.validateDeletePluginDrivers(pluginUuid);
         if (result != null) {
             throw new ApiMessageInterceptionException(result);
         }
     }
 }
```

As per coding guidelines (`**/*Interceptor.java`): "Check whether parameters from a Message have been trimmed; data users copy-paste from a browser may carry spaces, newlines, etc."

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/src/main/java/org/zstack/core/plugin/PluginApiInterceptor.java` around lines 25 - 27, in PluginApiInterceptor's validate(APIDeletePluginDriversMsg msg) ensure the UUID is trimmed before passing to extension points: retrieve msg.getUuid(), call trim() (and handle null safely), then pass the trimmed value to ext.validateDeletePluginDrivers(...) (or set the trimmed value back on msg if other code expects msg.getUuid()); update references in the validate method so pluginRgty.getExtensionList(...)/ext.validateDeletePluginDrivers use the trimmed UUID to avoid failures when the incoming message contains leading/trailing whitespace.

build/bump_version.py-27-27 (1)
27-27: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Comments contain full-width punctuation.

Per the coding guidelines, code (including comments) must not contain Chinese text or Chinese punctuation. The comments on lines 27, 49, 60, and 83 use the full-width comma `,`, which should be a half-width comma `,` — and per the same guideline the comment text itself should be English.

🔧 Suggested fix

```diff
- # 使用 'rb' 模式读取,保留原始字节
+ # Read in 'rb' mode to preserve the raw bytes
```

Apply the same change to the other three occurrences.

Also applies to: 49-49, 60-60, 83-83

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@build/bump_version.py` at line 27, replace full-width Chinese punctuation in comments with ASCII punctuation: locate the comments that contain the text "使用 'rb' 模式读取,保留原始字节" and the other comment lines noted (the occurrences that use the full-width comma ',') and change ',' to a standard half-width comma ',' so all comments use ASCII punctuation; apply the same replacement to the other two comment instances referenced (the ones at the other similar comment lines).

conf/install/zstack-server-105-120 (1)
105-120: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

The return-value handling of `recover_mariadb_stale_socket` needs to be more explicit.

In the current implementation, the function returns 0 (success) when `ZSTACK_RECOVER_MARIADB_SOCKET` is set to false or the MariaDB service does not exist. But on lines 112-113, the function also returns 0 when `socket_is_used` returns 2 (meaning the socket state could not be probed). That can mask real environment problems, such as both `lsof` and `fuser` being unavailable.

Suggestions:

- When `socket_status` is 2, log a warning rather than silently returning success.
- Handle and log failures of `start_mariadb_service` explicitly.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@conf/install/zstack-server` around lines 105 - 120, in recover_mariadb_stale_socket, don't silently treat socket_status==2 as success: call socket_is_used and if socket_status equals 2, emit a warning via log_zstack_server that socket detection tools are unavailable (include $MARIADB_SOCKET and suggest lsof/fuser missing) and return a non-zero status; also after calling start_mariadb_service check its exit code and if it fails, log an error with details via log_zstack_server and return a non-zero status so callers see the failure (keep existing early returns for ZSTACK_RECOVER_MARIADB_SOCKET, mariadb_service_exists and mariadb_is_active).

conf/db/upgrade/V5.3.6__schema.sql-15-41 (1)
15-41: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

SQL comments are written in Chinese.

Per the coding guidelines, code comments should be in English. The current SQL statements contain several Chinese comments (e.g., "生成不含连字符的uuid", "使用DatasetVO的uuid作为resourceUuid").

🔧 Suggested fix

```diff
 INSERT INTO SystemTagVO (`uuid`, `resourceUuid`, `resourceType`, `inherent`, `type`, `tag`, `createDate`, `lastOpDate`)
 SELECT
-    REPLACE(UUID(), '-', ''), -- 生成不含连字符的uuid
-    uuid, -- 使用DatasetVO的uuid作为resourceUuid
+    REPLACE(UUID(), '-', ''), -- generate uuid without hyphens
+    uuid, -- use DatasetVO uuid as resourceUuid
     'DatasetVO', -- resourceType
     1, -- inherent
     'System', -- type
     'dataset::usage::scenarios::ModelEval', -- tag
-    CURRENT_TIMESTAMP(), -- createDate
-    CURRENT_TIMESTAMP() -- lastOpDate
+    CURRENT_TIMESTAMP(),
+    CURRENT_TIMESTAMP()
 FROM DatasetVO
 WHERE `system` = true;
```

As per coding guidelines: "Code must not contain Chinese; error messages, comments, and the like should all be written in correct, typo-free English."

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@conf/db/upgrade/V5.3.6__schema.sql` around lines 15 - 41, the SQL migration contains Chinese comments in the INSERT...SELECT statements that populate SystemTagVO from DatasetVO (comments like "生成不含连字符的uuid" and "使用DatasetVO的uuid作为resourceUuid"); update all inline comments to English while preserving meaning (e.g. explain REPLACE(UUID(), '-', '') generates a UUID without hyphens and uuid is used as resourceUuid), ensure comments referencing SystemTagVO, DatasetVO, tag values ('dataset::usage::scenarios::ModelEval', 'dataset::datatype::Text'), and column names (`uuid`, `resourceUuid`, `resourceType`, `inherent`, `type`, `tag`, `createDate`, `lastOpDate`) remain accurate and concise in English.

core/src/main/java/org/zstack/core/jsonlabel/JsonLabelInventoryDoc_zh_cn.groovy-7-7 (1)
7-7: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Doc strings contain Chinese, violating the repository text conventions.

Lines 7, 11, 17, 23, 29, 35, and 41 use Chinese descriptions. Change them to English to avoid later consistency and automated-check problems.

As per coding guidelines: code must not contain Chinese; error messages, comments, and the like should be written in correct, typo-free English.

Also applies to: 11-11, 17-17, 23-23, 29-29, 35-35, 41-41

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/src/main/java/org/zstack/core/jsonlabel/JsonLabelInventoryDoc_zh_cn.groovy` at line 7, the file JsonLabelInventoryDoc_zh_cn.groovy contains Chinese text in its documentation strings (e.g., the title "JsonLabelInventory的数据结构" and the other doc lines in the same file: lines 7, 11, 17, 23, 29, 35, 41); replace each Chinese string with clear, grammatically correct English equivalents (for example change the title to "JsonLabelInventory data structure" and translate the other Chinese descriptions to English), ensuring you only modify the human-readable doc strings in JsonLabelInventoryDoc_zh_cn.groovy and preserve identifiers and code structure.

conf/db/upgrade/beforeMigrate.sql-251-251 (1)
251-251: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

The dynamic `RENAME TABLE` SQL should qualify the schema and add backticks.

On line 251, the target table name is not explicitly qualified with the `zstack` schema, and the table names are not backticked. Use the schema-qualified, backticked form consistently so the upgrade script does not break on reserved keywords or environment differences.

Suggested fix

```diff
- SET @alter_sql = CONCAT('RENAME TABLE zstack.', old_name, ' TO ', new_name);
+ SET @alter_sql = CONCAT('RENAME TABLE `zstack`.`', old_name, '` TO `zstack`.`', new_name, '`');
```

Based on learnings: in ZStack upgrade SQL, the schema name is fixed as `zstack` and should be explicitly qualified.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@conf/db/upgrade/beforeMigrate.sql` at line 251, the dynamic RENAME TABLE SQL built into `@alter_sql` uses unqualified and unquoted identifiers; update the CONCAT in the RENAME TABLE construction so it explicitly qualifies the schema and wraps identifiers in backticks (i.e. produce RENAME TABLE `zstack`.`old_name` TO `zstack`.`new_name`) by referencing the variables used (old_name, new_name) and the `@alter_sql`/CONCAT expression that builds the statement; ensure both source and target table names are schema-qualified with backticks to avoid issues with keywords or environment differences.

core/src/main/java/org/zstack/core/errorcode/LocaleUtils.java-40-43 (1)
40-43: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a null guard for `availableLocales` to avoid an NPE.

Only `acceptLanguage` is currently validated, not `availableLocales`. When the latter is null, line 61 throws an NPE directly.

💡 Suggested fix

```diff
 public static String resolveLocale(String acceptLanguage, Set<String> availableLocales) {
+    if (availableLocales == null || availableLocales.isEmpty()) {
+        return DEFAULT_LOCALE;
+    }
+
     if (acceptLanguage == null || acceptLanguage.trim().isEmpty()) {
         return DEFAULT_LOCALE;
     }
```

Also applies to: 61-68
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/src/main/java/org/zstack/core/errorcode/LocaleUtils.java` around lines 40 - 43, the resolveLocale method fails to null-check the availableLocales parameter, causing a possible NPE; update resolveLocale to guard against availableLocales being null or empty (e.g., treat null as Collections.emptySet() or return DEFAULT_LOCALE immediately) and adjust the subsequent logic that iterates or queries availableLocales (the code around the resolveLocale method and the block referenced at lines 61-68) to safely handle an empty set when matching acceptLanguage; ensure you reference the resolveLocale method and DEFAULT_LOCALE constant while making the change.
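The guard proposed in the LocaleUtils finding above, sketched as a standalone class. The matching logic and the `en_US` default are simplifying assumptions for illustration; only the two null/empty guards mirror the suggested fix:

```java
import java.util.Set;

public class LocaleResolverSketch {
    static final String DEFAULT_LOCALE = "en_US"; // assumed default for this sketch

    public static String resolveLocale(String acceptLanguage, Set<String> availableLocales) {
        // Guard both parameters; the original code only checked acceptLanguage,
        // so a null availableLocales blew up later with an NPE.
        if (availableLocales == null || availableLocales.isEmpty()) {
            return DEFAULT_LOCALE;
        }
        if (acceptLanguage == null || acceptLanguage.trim().isEmpty()) {
            return DEFAULT_LOCALE;
        }
        // Simplified matching: take the first Accept-Language entry, e.g. "zh-CN,zh;q=0.9".
        String wanted = acceptLanguage.split(",")[0].trim().replace('-', '_');
        return availableLocales.contains(wanted) ? wanted : DEFAULT_LOCALE;
    }
}
```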
conf/db/upgrade/V4.5.1.2__schema.sql-37-37 (1)

37-37: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove the Chinese constant text from the SQL.

Line 37 writes a Chinese string literal, which is inconsistent with the repository conventions; change it to English text or a localizable key.

As per coding guidelines: code must not contain Chinese; error messages, comments, and the like should be written in correct, typo-free English.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@conf/db/upgrade/V4.5.1.2__schema.sql` at line 37, the SQL update uses a Chinese literal in SystemTagVO.tag for resourceUuid='eecc7b576c05391bb01fd956964d3ba4'; replace the Chinese constant "name::cn::密码资源状态异常" with an English string or a localization key (e.g. "name::en::password_resource_status_error" or "name::key::PASSWORD_RESOURCE_STATUS_ERROR") so the update statement on SystemTagVO sets a non-Chinese value; keep the same UPDATE statement but change only the tag value.

compute/src/main/java/org/zstack/compute/host/HostApiInterceptor.java-84-87 (1)
84-87: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

`trim()` the `serviceType` before creating anything.

The uniqueness check uses `msg.getServiceType()` raw, so `"foo"` and `"foo "` are treated as two different values: the duplicate check can be bypassed, and a whitespace-padded value can easily be persisted later. As per coding guidelines: "Check whether parameters from a Message have been trimmed; data users copy-paste from a browser may carry spaces, newlines, etc."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@compute/src/main/java/org/zstack/compute/host/HostApiInterceptor.java` around lines 84 - 87, the uniqueness check in HostApiInterceptor.validate(APICreateHostNetworkServiceTypeMsg msg) uses msg.getServiceType() raw, so values like "foo" and "foo " bypass validation; trim the serviceType before using it for the DB existence check and normalize the message payload (i.e., obtain String svc = msg.getServiceType(); if non-null call svc = svc.trim(); use svc in the Q.New(...).eq(HostNetworkLabelVO_.serviceType, svc).isExists() check and, if the message has a setter, set the trimmed value back on msg so later persistence also uses the trimmed value).
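The trim-before-validate pattern asked for in the HostApiInterceptor finding, as a minimal standalone helper (the class name is hypothetical):

```java
public class TrimSketch {
    // Normalize a user-supplied string the way the interceptor guidelines require:
    // browsers often paste values with leading/trailing spaces or newlines.
    public static String normalize(String raw) {
        if (raw == null) {
            return null;
        }
        String trimmed = raw.trim();
        // Treat whitespace-only input as absent rather than as an empty key.
        return trimmed.isEmpty() ? null : trimmed;
    }
}
```

With this normalization, `"foo"` and `"foo \n"` collapse to the same key before the uniqueness check, and blank input is treated as missing.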
compute/src/main/java/org/zstack/compute/host/HostIpmiPowerExecutor.java-223-225 (1)

223-225: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix the English phrasing in this error message.

`is not reachable.because` is missing a space and will surface verbatim in operator-facing error output. `is not reachable: %s` or `is not reachable, because %s` reads cleaner. As per coding guidelines, code must not contain Chinese; error messages, comments, and the like should be written in correct, typo-free English.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@compute/src/main/java/org/zstack/compute/host/HostIpmiPowerExecutor.java` around lines 223 - 225, the error message in HostIpmiPowerExecutor's operr call uses a malformed sentence fragment ("is not reachable.because %s"); update the operr invocation that formats ipmi.getIpmiAddress() and rst.getStderr() so the message reads cleanly (e.g. "host ipmi[%s] is not reachable: %s" or "host ipmi[%s] is not reachable, because %s") and ensure there is a space/colon before the stderr substitution to avoid concatenated text.

conf/db/upgrade/V5.3.20__schema.sql-100-103 (1)
100-103: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Table and column names are not wrapped in backticks.

Per the coding guidelines, all table and column names must be wrapped in backticks to avoid syntax errors caused by reserved-keyword conflicts in MySQL 8.0 / GreatSQL.

🔧 Suggested fix

```diff
-Update ModelServiceVO set framework = 'Other' where framework not in
+UPDATE `zstack`.`ModelServiceVO` SET `framework` = 'Other' WHERE `framework` NOT IN
     ('vLLM', 'Diffusers', 'Transformers', 'sentence_transformers', 'llama.cpp', 'BentoML', 'Other', 'Ollama')
-    AND source != 'Bentoml';
-Update ModelServiceVO set framework = 'Other' where framework is NULL;
+    AND `source` != 'Bentoml';
+UPDATE `zstack`.`ModelServiceVO` SET `framework` = 'Other' WHERE `framework` IS NULL;

-UPDATE `zstack`.`L2NetworkVO` set vSwitchType='TfL2Network' where type='TfL2Network';
+UPDATE `zstack`.`L2NetworkVO` SET `vSwitchType`='TfL2Network' WHERE `type`='TfL2Network';
```

Also applies to: 186-186

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@conf/db/upgrade/V5.3.20__schema.sql` around lines 100 - 103, the SQL updates referencing ModelServiceVO should wrap identifiers with backticks to avoid reserved-word conflicts: update the two statements that set framework to 'Other' to use `ModelServiceVO`, and enclose column names `framework` and `source` in backticks (e.g., UPDATE `ModelServiceVO` SET `framework` = 'Other' WHERE `framework` NOT IN (...) AND `source` != 'Bentoml'; and the NULL-check UPDATE `ModelServiceVO` SET `framework` = 'Other' WHERE `framework` IS NULL;), leaving string literals unchanged.

core/src/main/java/org/zstack/core/externalservice/ExternalServiceManagerImpl.java-263-263 (1)
263-263: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Inconsistent error codes.

Some `operr()` calls use `ORG_ZSTACK_CORE_EXTERNALSERVICE_*` error codes (lines 54, 129, 135, 141), but the calls on lines 263, 429, and 434 use none. Keep this consistent.

🔧 Suggested fix

```diff
-completion.fail(operr("unable to find external service configuration with uuid [%s]", msg.getUuid()));
+completion.fail(operr(ORG_ZSTACK_CORE_EXTERNALSERVICE_10004, "unable to find external service configuration with uuid [%s]", msg.getUuid()));

-completion.fail(operr("failed to apply external service configuration for type [%s]: %s", serviceType, e.getMessage()));
+completion.fail(operr(ORG_ZSTACK_CORE_EXTERNALSERVICE_10005, "failed to apply external service configuration for type [%s]: %s", serviceType, e.getMessage()));

-completion.fail(operr("unable to find external service type [%s]", serviceType));
+completion.fail(operr(ORG_ZSTACK_CORE_EXTERNALSERVICE_10006, "unable to find external service type [%s]", serviceType));
```

Also applies to: 429-429, 434-434

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/src/main/java/org/zstack/core/externalservice/ExternalServiceManagerImpl.java` at line 263, the call completion.fail(operr("unable to find external service configuration with uuid [%s]", msg.getUuid())) lacks the error code used consistently by the other operr(...) calls in this class; change that call (and the similar operr calls at lines 429 and 434 of the same file) to use an error-code constant consistent with the rest of the class (i.e., the ORG_ZSTACK_CORE_EXTERNALSERVICE_* constants already in use), in the form operr(ORG_ZSTACK_CORE_EXTERNALSERVICE_<SAME_CODE>, "unable to find external service configuration with uuid [%s]", msg.getUuid()), keeping the same error-code naming style for unified logging and error-code management.

core/src/main/java/org/zstack/core/cloudbus/CloudBusImpl3.java-143-144 (1)
143-144: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

A static field is modified in an instance initializer block.

`SEND_CONFIRMED` is a `static final` field, but `SEND_CONFIRMED.success()` is called in an instance initializer block. Even if `FutureCompletion.success()` is idempotent, the call runs once for every `CloudBusImpl3` instance created.

Move the `success()` call into a static initializer block:

🔧 Suggested fix

```diff
 public static final String HTTP_BASE_URL = "/cloudbus";
-public static final FutureCompletion SEND_CONFIRMED = new FutureCompletion(null);
+public static final FutureCompletion SEND_CONFIRMED;
+
+static {
+    SEND_CONFIRMED = new FutureCompletion(null);
+    SEND_CONFIRMED.success();
+}

 // ... instance initializer block ...
 {
     // ...
-    SEND_CONFIRMED.success();
 }
```

Also applies to: 178-179
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/src/main/java/org/zstack/core/cloudbus/CloudBusImpl3.java` around lines 143 - 144, the static final SEND_CONFIRMED is being mutated by calling SEND_CONFIRMED.success() inside an instance initializer; move that invocation into a static initialization block so the success() call runs once per class load (not per instance). Specifically, remove the SEND_CONFIRMED.success() call from the instance initialization area in CloudBusImpl3 and add a static { SEND_CONFIRMED.success(); } block in the class; do the same for any other occurrences where a static final future is completed in an instance initializer (the similar calls around the other instance-init region mentioned) so all static futures are completed in a static init block.
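Why moving `success()` into a static initializer matters can be shown with a counting stand-in for `FutureCompletion` (hypothetical, for illustration only):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class StaticInitSketch {
    // Hypothetical stand-in for FutureCompletion: counts how often success() runs.
    public static class Future {
        static final AtomicInteger successCalls = new AtomicInteger();
        void success() { successCalls.incrementAndGet(); }
    }

    public static final Future SEND_CONFIRMED;

    static {
        // Runs exactly once, when the class is loaded -- not once per instance.
        SEND_CONFIRMED = new Future();
        SEND_CONFIRMED.success();
    }

    // Compare: placing SEND_CONFIRMED.success() in an instance initializer
    // block here would re-run it for every `new StaticInitSketch()`.
}
```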
core/src/main/java/org/zstack/core/cloudbus/CloudBusImpl3.java-940-940 (1)

940-940: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

The code comment contains Chinese.

Per the coding guidelines, code must not contain Chinese, including comments. Translate this comment to English.

🔧 Suggested fix

```diff
-// Sentry 仅由 TelemetryFacade 在注册全局 OpenTelemetry 后 init;此处只判断是否已启用
+// Sentry is only initialized by TelemetryFacade after registering global OpenTelemetry; here we only check if it's enabled
```

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/src/main/java/org/zstack/core/cloudbus/CloudBusImpl3.java` at line 940, the inline comment in CloudBusImpl3 (around the Telemetry/Sentry init check) is written in Chinese; replace it with an English comment that conveys the same meaning (e.g., "Sentry is only initialized by TelemetryFacade after global OpenTelemetry registration; here we only check whether it is enabled"). Update the comment near the Telemetry/Sentry check in the CloudBusImpl3 class accordingly, preserving intent and clarity.

compute/src/main/java/org/zstack/compute/vm/VmInstanceManagerImpl.java-1147-1150 (1)
1147-1150: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Do not append error codes to an immutable empty list.

`result` here is `Collections.emptyList()`, so `result.add(...)` throws `UnsupportedOperationException` and masks the "msg is null" error that was meant to be returned. Switching to `new ArrayList<>()`, or returning a single-element list directly, is safer.

Also applies to: 1160-1163
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@compute/src/main/java/org/zstack/compute/vm/VmInstanceManagerImpl.java` around lines 1147 - 1150, in VmInstanceManagerImpl (the method containing the shown snippet) avoid mutating Collections.emptyList(); replace result initialization with a mutable list (e.g., new ArrayList<>()) or simply return a single-element list containing operr(ORG_ZSTACK_COMPUTE_VM_10238, "CreateVmInstanceMsg cannot be null"); do the same fix for the other occurrence around lines 1160-1163 so you don't trigger UnsupportedOperationException when adding the error code.
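The failure mode from the VmInstanceManagerImpl finding, reproduced in a standalone sketch alongside the fixed shape (class and message text are illustrative, not the actual ZStack code):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class EmptyListSketch {
    // Buggy shape: Collections.emptyList() is immutable, so add() throws
    // UnsupportedOperationException and masks the error we meant to report.
    public static List<String> collectErrorsBuggy() {
        List<String> result = Collections.emptyList();
        result.add("CreateVmInstanceMsg cannot be null"); // throws at runtime
        return result;
    }

    // Fixed shape: use a mutable list (or return a single-element list directly).
    public static List<String> collectErrorsFixed() {
        List<String> result = new ArrayList<>();
        result.add("CreateVmInstanceMsg cannot be null");
        return result;
    }
}
```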
compute/src/main/java/org/zstack/compute/vm/VmInstanceApiInterceptor.java-409-425 (1)

409-425: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

`trim()` the DNS addresses before validating them.

The raw `dnsAddresses` are validated directly, so leading/trailing whitespace from a browser copy-paste gets rejected as an invalid address. The interceptor should `trim()` first and write the cleaned values back into the message. As per coding guidelines: "Check whether parameters from a Message have been trimmed; data users copy-paste from a browser may carry spaces, newlines, etc."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@compute/src/main/java/org/zstack/compute/vm/VmInstanceApiInterceptor.java` around lines 409 - 425, in validateDnsAddresses (VmInstanceApiInterceptor) trim each DNS string (use String.trim()), drop empty results, and replace the original dnsAddresses list with the cleaned values before performing size and format checks; then enforce the MAXIMUM_NIC_DNS_NUMBER limit and run the existing NetworkUtils.isIpv4Address / IPv6NetworkUtils.isIpv6Address validations against the trimmed addresses so browser-copied entries with surrounding whitespace are handled correctly.
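A standalone sketch of the clean-then-validate order the finding asks for. The regex stands in for `NetworkUtils.isIpv4Address`, and the limit of 2 is an assumed value rather than ZStack's actual `MAXIMUM_NIC_DNS_NUMBER`:

```java
import java.util.ArrayList;
import java.util.List;

public class DnsValidationSketch {
    static final int MAXIMUM_NIC_DNS_NUMBER = 2; // assumed limit for this sketch

    // Trim entries, drop empties, then validate -- the order the finding asks for.
    // A simple regex stands in for NetworkUtils.isIpv4Address.
    public static List<String> cleanAndValidate(List<String> dnsAddresses) {
        List<String> cleaned = new ArrayList<>();
        for (String dns : dnsAddresses) {
            String t = dns == null ? "" : dns.trim();
            if (t.isEmpty()) {
                continue; // browser paste artifacts like "\n" are treated as absent
            }
            if (!t.matches("(\\d{1,3}\\.){3}\\d{1,3}")) {
                throw new IllegalArgumentException("invalid DNS address: " + t);
            }
            cleaned.add(t);
        }
        if (cleaned.size() > MAXIMUM_NIC_DNS_NUMBER) {
            throw new IllegalArgumentException("too many DNS addresses");
        }
        return cleaned; // write this back to the message so later code sees trimmed values
    }
}
```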
compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java-3537-3537 (1)

3537-3537: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

The write and delete paths for static DNS are asymmetric.

This change starts persisting DNS as companion state of the static IP, but `handle(APIDeleteVmStaticIpMsg)` in the same file still deletes only the static IP tag; it does not clear the static DNS when the last static IP on that L3 network is removed. After a user deletes the static IP, the stale DNS lingers into subsequent start and network-change flows.

Also applies to: 3719-3722, 3768-3771
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@compute/src/main/java/org/zstack/compute/vm/VmInstanceBase.java` at line 3537, The code writes DNS as companion state of static IPs via cmsg.setDnsAddresses(...) but the delete path in handle(APIDeleteVmStaticIpMsg) only removes the static IP tag and does not clear/update the static DNS tag, leaving stale DNS entries; update the delete handlers (the APIDeleteVmStaticIpMsg handler and similar delete flows referenced around the other spots) to check whether the removed static IP was the last one on that L3 and, if so, clear the stored static DNS (or remove the static DNS tag) for that L3—mirror the logic used when creating/setting DNS (cmsg.setDnsAddresses) so DNS persistence and deletion are symmetric. Ensure you reference and modify the same methods/classes handling static IP deletion and tag updates so the DNS state is cleaned when the last static IP for an L3 is removed.
build/deploydb.sh-30-62 (1)

```shell
# Detect MySQL version
# Extract major version number from various MySQL/MariaDB/GreatDB output formats
db_version=$(${MYSQL} --version 2>/dev/null | grep -oP '\d+\.\d+\.\d+' | head -1 | cut -d'.' -f1 || echo "5")

# GreatDB and MySQL 8.0+ require CREATE USER before GRANT
if command -v greatdb &> /dev/null || [ "$db_version" -ge 8 ] 2>/dev/null; then
    ${MYSQL} ${loginCmd} << EOF
set global log_bin_trust_function_creators=1;
DROP DATABASE IF EXISTS zstack;
CREATE DATABASE zstack;
DROP DATABASE IF EXISTS zstack_rest;
CREATE DATABASE zstack_rest;
CREATE USER IF NOT EXISTS 'root'@'%' IDENTIFIED BY "${password}";
CREATE USER IF NOT EXISTS 'root'@'127.0.0.1' IDENTIFIED BY "${password}";
grant all privileges on zstack.* to root@'%';
grant all privileges on zstack_rest.* to root@'%';
grant all privileges on zstack.* to root@'127.0.0.1';
grant all privileges on zstack_rest.* to root@'127.0.0.1';
EOF
else
    # MySQL 5.x: GRANT with IDENTIFIED BY auto-creates users
    ${MYSQL} ${loginCmd} << EOF
set global log_bin_trust_function_creators=1;
DROP DATABASE IF EXISTS zstack;
CREATE DATABASE zstack;
DROP DATABASE IF EXISTS zstack_rest;
CREATE DATABASE zstack_rest;
grant all privileges on zstack.* to root@'%' identified by "${password}";
grant all privileges on zstack_rest.* to root@'%' identified by "${password}";
grant all privileges on zstack.* to root@'127.0.0.1' identified by "${password}";
grant all privileges on zstack_rest.* to root@'127.0.0.1' identified by "${password}";
EOF
fi
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Test how the database version detection logic behaves in different environments

# Check the MySQL/GreatDB version output format in the current environment
if command -v mysql &> /dev/null; then
    echo "=== MySQL version output ==="
    mysql --version
fi
if command -v greatdb &> /dev/null; then
    echo "=== GreatDB version output ==="
    greatdb --version
fi

# Check the version extraction logic
test_version_extraction() {
    local version_output="$1"
    local extracted=$(echo "$version_output" | grep -oP '\d+\.\d+\.\d+' | head -1 | cut -d'.' -f1 || echo "5")
    echo "Input: $version_output"
    echo "Extracted major version: $extracted"
    echo "---"
}

# Test the various possible output formats
test_version_extraction "mysql Ver 5.7.38 for Linux"
test_version_extraction "mysql Ver 8.0.32 for Linux"
test_version_extraction "mysql Ver 15.1 Distrib 10.6.12-MariaDB"
test_version_extraction "greatdb Ver 8.0.25 for Linux"
```

Repository: MatheMatrix/zstack

Length of output: 341
🏁 Script executed:

```shell
# Look at the actual deploydb.sh file
find . -name "deploydb.sh" -type f | head -5
```

Repository: MatheMatrix/zstack

Length of output: 101
🏁 Script executed:

```shell
#!/bin/bash
# Test edge cases of MariaDB version extraction
test_version_extraction() {
    local version_output="$1"
    local extracted=$(echo "$version_output" | grep -oP '\d+\.\d+\.\d+' | head -1 | cut -d'.' -f1 || echo "5")
    echo "Input: $version_output"
    echo "Extracted major version: $extracted"
    echo "---"
}

# Test the MariaDB format that may cause problems
# MariaDB 10.6.12 may report itself as "Ver 15.1 Distrib 10.6.12-MariaDB"
test_version_extraction "mysql Ver 15.1 Distrib 10.6.12-MariaDB"
# Test a version string with only X.Y format
test_version_extraction "mysql Ver 5.5 for Linux"
# Test output without any version number
test_version_extraction "some random output"
# Test empty output
test_version_extraction ""
```

Repository: MatheMatrix/zstack

Length of output: 297
🏁 Script executed:

```shell
# Read the relevant part of build/deploydb.sh
head -80 ./build/deploydb.sh | tail -60
```

Repository: MatheMatrix/zstack

Length of output: 2361
🏁 Script executed:

```shell
#!/bin/bash
# Test the full version detection logic (including the || fallback)

# Test the fallback mechanism in version extraction
test_with_fallback() {
    local version_output="$1"
    # Mimic the actual script logic
    local result=$(echo "$version_output" | grep -oP '\d+\.\d+\.\d+' | head -1 | cut -d'.' -f1 || echo "5")
    echo "Input: '$version_output'"
    echo "Result: '$result'"
    echo "---"
}

# Test the various formats
test_with_fallback "mysql Ver 5.7.38 for Linux"
test_with_fallback "mysql Ver 8.0.32 for Linux"
test_with_fallback "mysql Ver 5.5 for Linux"
test_with_fallback "mysql Ver 15.1 Distrib 10.6.12-MariaDB"
test_with_fallback "some random output"
test_with_fallback ""

# Test grep -oP return codes
echo "=== Testing grep return codes ==="
echo "mysql Ver 5.5 for Linux" | grep -oP '\d+\.\d+\.\d+'
echo "Return code: $?"
echo "some random output" | grep -oP '\d+\.\d+\.\d+'
echo "Return code: $?"
```

Repository: MatheMatrix/zstack

Length of output: 415
Fix the serious flaws in the version-detection logic.

The version-extraction logic has critical problems:

- Extraction failure: MySQL 5.5 (an X.Y version string) does not match the regex `\d+\.\d+\.\d+`, leaving `db_version` empty instead of falling back to "5". The root cause is that `cut` exits with code 0 on empty input, so `|| echo "5"` never runs.
- Wrong version: for MariaDB "Ver 15.1 Distrib 10.6.12-MariaDB", the logic extracts "10" (the distribution version) rather than "15" (the reported version), skewing the version comparison.
- Empty-value risk: when `db_version` is empty, the `[ "$db_version" -ge 8 ]` test on line 36 fails to evaluate, and the branch logic misbehaves.

Suggested rewrite: use a more tolerant pattern for version detection (e.g., extract the first number with `\d+`), or parse a different form of the version command's output. Also add double quotes around "${MYSQL}" and ${loginCmd} on lines 34 and 49 to prevent word splitting.
🧰 Tools
🪛 Shellcheck (0.11.0)
[info] 36-36: Double quote to prevent globbing and word splitting.
(SC2086)
[info] 51-51: Double quote to prevent globbing and word splitting.
(SC2086)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@build/deploydb.sh` around lines 30 - 62, The db version detection and MySQL
invocation are brittle: update the extraction logic for db_version to reliably
get the major version (e.g., use a regex that grabs the first numeric token like
'\d+' or parse the "Distrib" token so MariaDB returns 10 not the client report
number), ensure the fallback to "5" actually triggers if extraction yields empty
(assign db_version="$(... || echo 5)" or set db_version="${db_version:-5}"), and
guard the numeric compare by treating db_version as an integer (e.g., defaulting
empty to 5 before [ "$db_version" -ge 8 ]). Also quote the invocations of MYSQL
and loginCmd (use "${MYSQL}" "${loginCmd}") where used in the here-doc blocks so
word-splitting is prevented; locate and change uses of db_version, MYSQL, and
loginCmd in the script (variables: db_version, MYSQL, loginCmd) accordingly.
  } else if (isNoHypervisor(spec.getHypervisorType())) {
-     error = operr("no Enabled hosts found in the [%s] candidate hosts having the hypervisor type [%s]", candidates.size(), spec.getHypervisorType());
+     error = operr(ORG_ZSTACK_COMPUTE_ALLOCATOR_10031, "no Enabled hosts found in the [%s] candidate hosts having the hypervisor type [%s]", candidates.size(), spec.getHypervisorType());
NPE risk when spec.getHypervisorType() is null
Line 95 calls isNoHypervisor(...) without a null check; hyType.equals(...) throws an NPE and the allocation flow aborts abnormally.
Suggested fix
- } else if (isNoHypervisor(spec.getHypervisorType())) {
+ } else if (spec.getHypervisorType() != null && isNoHypervisor(spec.getHypervisorType())) {
error = operr(ORG_ZSTACK_COMPUTE_ALLOCATOR_10031, "no Enabled hosts found in the [%s] candidate hosts having the hypervisor type [%s]", candidates.size(), spec.getHypervisorType());
} else {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@compute/src/main/java/org/zstack/compute/allocator/HostStateAndHypervisorAllocatorFlow.java`
around lines 95 - 96, The call to isNoHypervisor(spec.getHypervisorType()) can
throw an NPE when spec.getHypervisorType() is null; update the
HostStateAndHypervisorAllocatorFlow code to null-safe this check by first
verifying spec.getHypervisorType() is not null (e.g., if
(spec.getHypervisorType() != null && isNoHypervisor(spec.getHypervisorType())) )
or alternatively make isNoHypervisor(...) defensively handle null inputs, and
keep the existing operr(...) usage unchanged so the error path still executes
when no hypervisor type is present.
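The alternative the prompt mentions, making isNoHypervisor(...) itself defensively handle null, can be sketched as follows. This is a minimal standalone illustration: the constant name NO_HYPERVISOR_TYPE is a placeholder, not ZStack's actual field.

```java
// Minimal sketch of a null-tolerant isNoHypervisor.
// NO_HYPERVISOR_TYPE is a placeholder constant, not ZStack's real field name.
class HypervisorTypeCheck {
    static final String NO_HYPERVISOR_TYPE = "NoHypervisor";

    static boolean isNoHypervisor(String hyType) {
        // Constant-first equals: a null hyType returns false instead of throwing an NPE.
        return NO_HYPERVISOR_TYPE.equals(hyType);
    }
}
```

With this shape, the call site at line 95 can stay unchanged while still being safe against a missing hypervisor type.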
private void deleteHostNetworkLabel(APIDeleteHostNetworkServiceTypeMsg msg, Completion completion) {
    HostNetworkLabelVO vo = dbf.findByUuid(msg.getUuid(), HostNetworkLabelVO.class);
    CollectionUtils.safeForEach(pluginRgty.getExtensionList(HostNetworkLabelExtensionPoint.class),
        arg -> arg.deleteHostNetworkLabel(vo.toInventory(), completion));
}

private void updateHostNetworkLabel(APIUpdateHostNetworkServiceTypeMsg msg, Completion completion) {
    HostNetworkLabelVO vo = dbf.findByUuid(msg.getUuid(), HostNetworkLabelVO.class);
    CollectionUtils.safeForEach(pluginRgty.getExtensionList(HostNetworkLabelExtensionPoint.class),
        arg -> arg.updateHostNetworkLabel(vo.toInventory(), msg.getServiceType(), completion));
The extension-point callback pattern can hang or re-enter the update/delete APIs.
The same Completion is passed directly to every HostNetworkLabelExtensionPoint. If no extension implementation exists, completion is never invoked and the API hangs forever; if there are multiple implementations, any of them may fire the same completion repeatedly, re-entering chain.next()/event publication. The extensions need to be aggregated serially, like beforeAddHost earlier, and completion.success() should be called immediately when the extension list is empty.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@compute/src/main/java/org/zstack/compute/host/HostManagerImpl.java` around
lines 222 - 231, The current deleteHostNetworkLabel and updateHostNetworkLabel
pass the same Completion to every HostNetworkLabelExtensionPoint via
CollectionUtils.safeForEach, which can leave the API hanging if the extension
list is empty or cause duplicate/ reentrant completion calls when multiple
extensions invoke the same Completion; fix by retrieving
pluginRgty.getExtensionList(HostNetworkLabelExtensionPoint.class) into a list,
if the list is empty call completion.success() immediately, otherwise invoke the
extensions serially (like beforeAddHost) by chaining calls so each extension
receives a per-extension callback that advances to the next extension and only
the final step calls the original completion.success()/fail(), ensuring
HostNetworkLabelExtensionPoint.deleteHostNetworkLabel/updateHostNetworkLabel are
invoked in order and completion is invoked exactly once.
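The serial-aggregation pattern the prompt describes can be illustrated with a self-contained sketch. The Completion and extension interfaces below are simplified stand-ins for ZStack's actual types, not their real signatures:

```java
// Sketch: invoke extensions one at a time, guaranteeing the original
// completion fires exactly once (Completion/LabelExtension are stand-ins).
class SerialExtensionChain {
    interface Completion {
        void success();
        void fail(String error);
    }

    interface LabelExtension {
        void deleteHostNetworkLabel(String labelUuid, Completion completion);
    }

    static void runSerially(java.util.List<LabelExtension> exts, String labelUuid, Completion done) {
        if (exts.isEmpty()) {
            done.success();   // no extensions: complete immediately instead of hanging
            return;
        }
        java.util.Iterator<LabelExtension> it = exts.iterator();
        Completion step = new Completion() {
            @Override
            public void success() {
                if (it.hasNext()) {
                    it.next().deleteHostNetworkLabel(labelUuid, this);  // advance the chain
                } else {
                    done.success();  // last extension finished; fire the original completion once
                }
            }

            @Override
            public void fail(String error) {
                done.fail(error);    // abort the chain on first failure
            }
        };
        it.next().deleteHostNetworkLabel(labelUuid, step);
    }
}
```

Each extension receives a per-step callback that advances to the next one, so the original completion can never be invoked twice and never starves when the list is empty.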
  AttachNicToVmReply r = new AttachNicToVmReply();
  if (!reply.isSuccess()) {
-     r.setError(err(VmErrors.ATTACH_NETWORK_ERROR, r.getError(), r.getError().getDetails()));
+     r.setError(err(ORG_ZSTACK_COMPUTE_VM_10274, VmErrors.ATTACH_NETWORK_ERROR, r.getError(), r.getError().getDetails()));
  }
This overwrites the real error with a null pointer.
Line 2979 reads the freshly created AttachNicToVmReply r, whose error is still null. Once the host side reports failure, r.getError().getDetails() throws an NPE, and the original attach-failure cause is lost with it.
🛠️ Suggested fix
AttachNicToVmReply r = new AttachNicToVmReply();
if (!reply.isSuccess()) {
- r.setError(err(ORG_ZSTACK_COMPUTE_VM_10274, VmErrors.ATTACH_NETWORK_ERROR, r.getError(), r.getError().getDetails()));
+ r.setError(err(ORG_ZSTACK_COMPUTE_VM_10274, VmErrors.ATTACH_NETWORK_ERROR,
+ reply.getError(), reply.getError().getDetails()));
}
bus.reply(msg, r);

if (!_conditions.isEmpty()) {
    for (Condition condition : _conditions) {
        q.condition(condition._attr, condition._op, condition._val[0]);
    }
Out-of-bounds and lost semantics in condition-parameter conversion
Line [433] passes only condition._val[0]:
- For Op.NULL/Op.NOT_NULL the array may be empty, triggering an out-of-bounds exception.
- For Op.IN/Op.NOT_IN the remaining values are dropped, corrupting the query condition.
💡 Suggested fix
if (!_conditions.isEmpty()) {
for (Condition condition : _conditions) {
- q.condition(condition._attr, condition._op, condition._val[0]);
+ if (condition._val == null || condition._val.length == 0) {
+ q.condition(condition._attr, condition._op);
+ } else {
+ q.condition(condition._attr, condition._op, condition._val);
+ }
}
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@core/src/main/java/org/zstack/core/db/SimpleQueryImpl.java` around lines 431
- 434, The loop in SimpleQueryImpl that calls q.condition(condition._attr,
condition._op, condition._val[0]) is unsafe and lossy: it will IndexOutOfBounds
for Op.NULL/Op.NOT_NULL when _val is empty and will drop values for
Op.IN/Op.NOT_IN. Fix by branching on condition._op inside the _conditions
iteration in SimpleQueryImpl: for Op.NULL/Op.NOT_NULL call q.condition with no
value (or the API equivalent that accepts only attr+op); for Op.IN/Op.NOT_IN
pass the whole condition._val array or converted Collection to q.condition; for
single-value ops verify condition._val.length>0 and pass condition._val[0] (or
handle/throw when missing). Update the call sites that use q.condition
accordingly and reference the Condition class fields _attr, _op, _val to
implement these branches.
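The op-aware branching the suggested fix describes can be illustrated with a standalone sketch. The Op enum and the condition renderer below are simplified stand-ins, not SimpleQueryImpl's real API:

```java
// Sketch of op-aware condition handling: no-value ops take no operand,
// set ops keep the whole value array, single-value ops validate then use vals[0].
class ConditionSketch {
    enum Op { EQ, GT, IN, NOT_IN, NULL, NOT_NULL }

    static String condition(String attr, Op op, Object... vals) {
        switch (op) {
            case NULL:
            case NOT_NULL:
                // No operand needed; indexing vals[0] here would throw on an empty array.
                return attr + " " + op;
            case IN:
            case NOT_IN:
                // Pass every value through so none are silently dropped.
                return attr + " " + op + " " + java.util.Arrays.toString(vals);
            default:
                if (vals.length == 0) {
                    throw new IllegalArgumentException(op + " requires a value");
                }
                return attr + " " + op + " " + vals[0];
        }
    }
}
```

The same three-way split maps directly onto the q.condition overloads named in the suggested diff: attr+op only, attr+op+array, and attr+op+single value.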
1. Cloning a tenant VM with the admin account: the quota check was not based on the current role.
2. The quota error was overwritten during exception handling.
http://jira.zstack.io/browse/ZSTAC-83646
Resolves: ZSTAC-83646
Change-Id: I776c6b7a786d79746f616b716f736f646e686868
sync from gitlab !9873