ddl: assign index name for anonymous add index ddl#4270
ddl: assign index name for anonymous add index ddl#4270wk989898 wants to merge 13 commits intopingcap:masterfrom
Conversation
|
Skipping CI for Draft Pull Request. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Summary of ChangesHello @wk989898, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a mechanism to correctly handle DDL statements that add anonymous indexes. Previously, such statements could lead to issues, especially during change data capture (CDC) retries, due to the lack of explicit index names. The changes parse Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a mechanism to restore anonymous index names in DDL queries, specifically for ADD INDEX operations. It adds a new file ddl_index_rewrite.go with functions to parse and modify the DDL statement, and modifies mysql_writer_ddl.go to incorporate this functionality. Additionally, it includes a test file mysql_writer_ddl_index_rewrite_test.go to verify the index name restoration. The integration test script run.sh is also updated to include index-related operations.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughCentralizes SQL restoration into a new Restore(stmt) helper, threads IndexIDs through persisted and emitted DDL events, adds logic to rewrite anonymous ALTER TABLE ADD CONSTRAINT/INDEX to named indexes using IndexIDs and TableInfo, and adds tests plus integration updates exercising anonymous-index scenarios. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as DDL_Event
participant Exec as execDDL
participant Rewriter as IndexRewriter
participant Parser as TiDB_Parser
participant Table as TableInfo
participant Restore as Restore()
participant DB as MySQL_DB
Client->>Exec: emit DDL (ActionAddIndex, Query, IndexIDs)
Exec->>Rewriter: rewriteAnonymousIndexes(Query, TableInfo, IndexIDs)
Rewriter->>Parser: Parse(Query)
Parser-->>Rewriter: AST StmtNode
Rewriter->>Table: lookup index names by ID
Table-->>Rewriter: index name map
Rewriter->>Rewriter: substitute unnamed constraints with names
Rewriter->>Restore: Restore(modified StmtNode)
Restore-->>Rewriter: formatted SQL
Rewriter-->>Exec: (changed?, new Query)
Exec->>DB: execute (possibly rewritten) DDL
DB-->>Exec: result
Exec-->>Client: emit result/ack
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/common/event/util.go`:
- Around line 439-458: The exported Restore(stmt ast.StmtNode) helper is placed
in a file marked "for testing only" but is used by production code; move it into
a dedicated non-test utility file/package (create a new production utility file
and keep the same function signature and behavior of Restore, including the
format flags and error handling), remove it from the test-only file, and update
all call sites (ddl_index_rewrite.go, format_ddl.go, logservice schemastore
utils) to import the new package; ensure the function remains exported and unit
tests (if any) are updated to use the new location.
In `@pkg/sink/mysql/ddl_index_rewrite.go`:
- Around line 129-132: The loop in ddl_index_rewrite.go currently skips indices
with index.State == timodel.StatePublic which removes the newly added index from
the DDLEvent snapshot; stop filtering out public indices so the added index
remains in event.TableInfo for matching. Locate the loop that references
index.State (the code block with "Only use non-public indices." and the variable
index) and remove or change the conditional that continues on StatePublic so
public indices are preserved; ensure subsequent matching logic uses the
DDLEvent.TableInfo (from job.BinlogInfo.TableInfo) as-is so the retry path can
find the newly added index.
In `@pkg/sink/mysql/mysql_writer_ddl_index_rewrite_test.go`:
- Around line 87-90: The test passes raw SQL (restoredQuery) to
sqlmock.ExpectExec which treats its argument as a regexp; because restoredQuery
contains parentheses/backticks it can fail to match. Update the ExpectExec call
to pass a regexp-escaped string (e.g. use regexp.QuoteMeta(restoredQuery)) so
the literal SQL is matched; ensure you import the regexp package and apply the
same QuoteMeta wrapping for any other ExpectExec calls that contain characters
meaningful to regex (e.g., the "USE `test`;" or "SET TIMESTAMP = DEFAULT"
expectations if they include such characters).
In `@tests/integration_tests/ddl_wait/run.sh`:
- Around line 65-74: The test currently uses a fixed sleep before restarting CDC
which can miss overlapping async ADD INDEX jobs; replace the sleep 10 with the
same ddl_jobs polling used earlier to wait until the downstream add-index job
for the table is in state "running" or "queueing" (poll the ddl_jobs helper or
equivalent until it reports the add-index job is running/queueing), then call
cleanup_process and run_cdc_server to ensure the restart occurs while the DDL
job is active; keep the surrounding calls (run_sql, cleanup_process,
run_cdc_server, check_sync_diff) unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 197f318b-89d4-46ed-b754-dccda15fda50
📒 Files selected for processing (7)
logservice/schemastore/utils.gopkg/common/event/util.gopkg/sink/mysql/ddl_index_rewrite.gopkg/sink/mysql/format_ddl.gopkg/sink/mysql/mysql_writer_ddl.gopkg/sink/mysql/mysql_writer_ddl_index_rewrite_test.gotests/integration_tests/ddl_wait/run.sh
| func Restore(stmt ast.StmtNode) (string, error) { | ||
| var sb strings.Builder | ||
| // translate TiDB feature to special comment | ||
| restoreFlags := format.RestoreTiDBSpecialComment | ||
| // escape the keyword | ||
| restoreFlags |= format.RestoreNameBackQuotes | ||
| // upper case keyword | ||
| restoreFlags |= format.RestoreKeyWordUppercase | ||
| // wrap string with single quote | ||
| restoreFlags |= format.RestoreStringSingleQuotes | ||
| // remove placement rule | ||
| restoreFlags |= format.SkipPlacementRuleForRestore | ||
| // force disable ttl | ||
| restoreFlags |= format.RestoreWithTTLEnableOff | ||
| err := stmt.Restore(format.NewRestoreCtx(restoreFlags, &sb)) | ||
| if err != nil { | ||
| return "", errors.Trace(err) | ||
| } | ||
| return sb.String(), nil | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Move Restore into a production utility file.
Lines 49-51 say everything in pkg/common/event/util.go is for testing only, but this new exported helper is now used from production paths (pkg/sink/mysql/ddl_index_rewrite.go, pkg/sink/mysql/format_ddl.go, and logservice/schemastore/utils.go). Please move it to a dedicated non-test file/package so the boundary is explicit.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/common/event/util.go` around lines 439 - 458, The exported Restore(stmt
ast.StmtNode) helper is placed in a file marked "for testing only" but is used
by production code; move it into a dedicated non-test utility file/package
(create a new production utility file and keep the same function signature and
behavior of Restore, including the format flags and error handling), remove it
from the test-only file, and update all call sites (ddl_index_rewrite.go,
format_ddl.go, logservice schemastore utils) to import the new package; ensure
the function remains exported and unit tests (if any) are updated to use the new
location.
| mock.ExpectBegin() | ||
| mock.ExpectExec("USE `test`;").WillReturnResult(sqlmock.NewResult(1, 1)) | ||
| mock.ExpectExec("SET TIMESTAMP = DEFAULT").WillReturnResult(sqlmock.NewResult(1, 1)) | ||
| mock.ExpectExec(restoredQuery).WillReturnResult(sqlmock.NewResult(1, 1)) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cd /tmp && find . -name "mysql_writer_ddl_index_rewrite_test.go" -type f 2>/dev/null | head -5Repository: pingcap/ticdc
Length of output: 39
🏁 Script executed:
find . -name "mysql_writer_ddl_index_rewrite_test.go" -type f 2>/dev/nullRepository: pingcap/ticdc
Length of output: 112
🏁 Script executed:
fd "mysql_writer_ddl_index_rewrite_test.go"Repository: pingcap/ticdc
Length of output: 110
🏁 Script executed:
cat -n pkg/sink/mysql/mysql_writer_ddl_index_rewrite_test.go | sed -n '1,120p'Repository: pingcap/ticdc
Length of output: 3554
🏁 Script executed:
web_search
sqlmock ExpectExec regexp pattern matchingRepository: pingcap/ticdc
Length of output: 153
🏁 Script executed:
rg "ExpectExec" --type go -B 2 -A 2 | head -80Repository: pingcap/ticdc
Length of output: 6855
🏁 Script executed:
rg "regexp.QuoteMeta" --type go | head -20Repository: pingcap/ticdc
Length of output: 1132
🏁 Script executed:
rg "ExpectExec|ExpectQuery" --type go -B 1 -A 1 | grep -E "(ExpectExec|ExpectQuery|QuoteMeta)" | head -40Repository: pingcap/ticdc
Length of output: 6761
🏁 Script executed:
cat -n pkg/sink/mysql/mysql_writer_ddl_index_rewrite_test.go | sed -n '55,95p'Repository: pingcap/ticdc
Length of output: 1571
🏁 Script executed:
rg "func restoreAnonymousIndexToNamedIndex" --type go -A 30Repository: pingcap/ticdc
Length of output: 1931
🏁 Script executed:
rg "ExpectExec.*regexp.QuoteMeta" --type goRepository: pingcap/ticdc
Length of output: 39
🏁 Script executed:
rg "ExpectQuery.*regexp.QuoteMeta" --type go | head -10Repository: pingcap/ticdc
Length of output: 1008
🏁 Script executed:
cat -n pkg/sink/mysql/ddl_index_rewrite.go | sed -n '1,60p'Repository: pingcap/ticdc
Length of output: 2089
🏁 Script executed:
rg "sqlmock\." pkg/sink/mysql/mysql_writer_ddl_index_rewrite_test.goRepository: pingcap/ticdc
Length of output: 290
🏁 Script executed:
rg "github.com/DATA-DOG/go-sqlmock" --type go -lRepository: pingcap/ticdc
Length of output: 568
🏁 Script executed:
rg "go-sqlmock" go.modRepository: pingcap/ticdc
Length of output: 95
🏁 Script executed:
rg "sqlmock.*regexp|regexp.*sqlmock" --type goRepository: pingcap/ticdc
Length of output: 39
🏁 Script executed:
rg "ExpectExec.*\(" pkg/sink/mysql/ --type go | grep -E "VALUES|INSERT|ALTER|CREATE" | head -15Repository: pingcap/ticdc
Length of output: 2954
🏁 Script executed:
python3 << 'EOF'
# Let's check the sqlmock library documentation pattern
# Based on go-sqlmock usage in the codebase, both ExpectQuery and ExpectExec use regexp matching
# Evidence found:
print("Evidence from codebase search:")
print("1. ExpectQuery uses regexp.QuoteMeta() in multiple places:")
print(" - pkg/check/cluster_test.go")
print(" - pkg/check/active_active_tso_indexes_test.go")
print()
print("2. go-sqlmock version: v1.5.0")
print()
print("3. Pattern: When passing dynamic SQL strings to ExpectQuery,")
print(" regexp.QuoteMeta() is used to escape special characters")
print()
print("4. Since both ExpectQuery and ExpectExec are in the same library,")
print(" they should have the same behavior (treating input as regexp)")
print()
print("Conclusion: The review is CORRECT")
EOFRepository: pingcap/ticdc
Length of output: 545
🏁 Script executed:
cat -n pkg/sink/mysql/ddl_index_rewrite.go | sed -n '30,80p'Repository: pingcap/ticdc
Length of output: 1514
Escape the SQL before handing it to sqlmock.
ExpectExec interprets its input as a regexp. restoredQuery contains parentheses, so this expectation can miss the literal statement even when execDDL is correct.
Suggested fix
import (
+ "regexp"
"testing"
@@
- mock.ExpectExec(restoredQuery).WillReturnResult(sqlmock.NewResult(1, 1))
+ mock.ExpectExec(regexp.QuoteMeta(restoredQuery)).WillReturnResult(sqlmock.NewResult(1, 1))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| mock.ExpectBegin() | |
| mock.ExpectExec("USE `test`;").WillReturnResult(sqlmock.NewResult(1, 1)) | |
| mock.ExpectExec("SET TIMESTAMP = DEFAULT").WillReturnResult(sqlmock.NewResult(1, 1)) | |
| mock.ExpectExec(restoredQuery).WillReturnResult(sqlmock.NewResult(1, 1)) | |
| mock.ExpectBegin() | |
| mock.ExpectExec("USE `test`;").WillReturnResult(sqlmock.NewResult(1, 1)) | |
| mock.ExpectExec("SET TIMESTAMP = DEFAULT").WillReturnResult(sqlmock.NewResult(1, 1)) | |
| mock.ExpectExec(regexp.QuoteMeta(restoredQuery)).WillReturnResult(sqlmock.NewResult(1, 1)) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/sink/mysql/mysql_writer_ddl_index_rewrite_test.go` around lines 87 - 90,
The test passes raw SQL (restoredQuery) to sqlmock.ExpectExec which treats its
argument as a regexp; because restoredQuery contains parentheses/backticks it
can fail to match. Update the ExpectExec call to pass a regexp-escaped string
(e.g. use regexp.QuoteMeta(restoredQuery)) so the literal SQL is matched; ensure
you import the regexp package and apply the same QuoteMeta wrapping for any
other ExpectExec calls that contain characters meaningful to regex (e.g., the
"USE `test`;" or "SET TIMESTAMP = DEFAULT" expectations if they include such
characters).
| run_sql "update test.t set col = 55 where id = 5;" | ||
| run_sql "alter table test.t add index (col);" | ||
| run_sql "update test.t set col = 66 where id = 6;" | ||
| run_sql "alter table test.t add index (col);" | ||
| run_sql "update test.t set col = 77 where id = 7;" | ||
| sleep 10 | ||
| cleanup_process $CDC_BINARY | ||
| run_cdc_server --workdir $WORK_DIR --binary $CDC_BINARY | ||
| # make sure all tables are equal in upstream and downstream | ||
| check_sync_diff $WORK_DIR $CUR/conf/diff_config.toml 300 |
There was a problem hiding this comment.
Make the retry window deterministic.
This block only waits sleep 10 before restarting CDC. On a fast run, both anonymous ADD INDEX jobs can finish before Line 71, so the restart never overlaps the async DDL and this test passes without covering the new retry path. Reuse the earlier ddl_jobs polling here and restart only after the downstream add-index job is running or queueing.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/integration_tests/ddl_wait/run.sh` around lines 65 - 74, The test
currently uses a fixed sleep before restarting CDC which can miss overlapping
async ADD INDEX jobs; replace the sleep 10 with the same ddl_jobs polling used
earlier to wait until the downstream add-index job for the table is in state
"running" or "queueing" (poll the ddl_jobs helper or equivalent until it reports
the add-index job is running/queueing), then call cleanup_process and
run_cdc_server to ensure the restart occurs while the DDL job is active; keep
the surrounding calls (run_sql, cleanup_process, run_cdc_server,
check_sync_diff) unchanged.
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
pkg/sink/mysql/ddl_index_rewrite.go (1)
75-79: Wrap the error fromRestorewitherrors.Trace.Per coding guidelines, errors from library/third-party calls should be wrapped with
errors.Traceto attach a stack trace. The error on line 77 is returned directly without wrapping.🔧 Proposed fix
restoredQuery, err := commonEvent.Restore(stmt) if err != nil { - return query, false, err + return query, false, errors.Trace(err) }As per coding guidelines: "When an error comes from a third-party or library call in Go, wrap it immediately with
errors.Trace(err)"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/sink/mysql/ddl_index_rewrite.go` around lines 75 - 79, The error returned from the call commonEvent.Restore(stmt) should be wrapped with errors.Trace before returning to attach a stack trace; update the error return in the function that calls commonEvent.Restore (the block that currently does "restoredQuery, err := commonEvent.Restore(stmt) ... return query, false, err") to return errors.Trace(err) instead of err and keep the existing boolean/return structure so callers receive the traced error.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@logservice/schemastore/types.go`:
- Around line 91-94: PersistedDDLEvent's new field IndexIDs (msg:"index_ids")
isn't handled by the generated serializer: update the msgp-generated code so
UnmarshalMsg in types_gen.go includes the case for "index_ids"; run the msgp
generation (e.g., go generate ./...) to regenerate types_gen.go so the
UnmarshalMsg and MarshalMsg implementations correctly serialize/deserialize
IndexIDs for the PersistedDDLEvent type.
In `@logservice/schemastore/utils.go`:
- Around line 83-102: In getIndexIDs, avoid a nil pointer dereference by
checking job and job.MultiSchemaInfo before iterating SubJobs: when idxArgs==nil
|| err!=nil, first return early or guard with if job==nil ||
job.MultiSchemaInfo==nil { return res } (or check
len(job.MultiSchemaInfo.SubJobs)==0) before the for loop that ranges over
job.MultiSchemaInfo.SubJobs; then continue to call subJob.ToProxyJob and
model.GetModifyIndexArgs as before. Ensure the same nil-guard covers access to
SubJobs throughout getIndexIDs.
In `@pkg/common/event/ddl_event.go`:
- Around line 108-112: The DDLEvent struct's IndexIDs field lacks a json tag so
json.Marshal/json.Unmarshal used in encodeV1 won't include it; update the
IndexIDs field (in the DDLEvent definition) to include a matching json tag
(e.g., json:"index_ids") so JSON encoding/decoding preserves the field (ensure
the tag name matches the existing msg tag and any expectations in encodeV1).
---
Nitpick comments:
In `@pkg/sink/mysql/ddl_index_rewrite.go`:
- Around line 75-79: The error returned from the call commonEvent.Restore(stmt)
should be wrapped with errors.Trace before returning to attach a stack trace;
update the error return in the function that calls commonEvent.Restore (the
block that currently does "restoredQuery, err := commonEvent.Restore(stmt) ...
return query, false, err") to return errors.Trace(err) instead of err and keep
the existing boolean/return structure so callers receive the traced error.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: aa53af07-74a6-4e7e-a47c-e6b4c6c34ca5
📒 Files selected for processing (7)
logservice/schemastore/persist_storage_ddl_handlers.gologservice/schemastore/types.gologservice/schemastore/utils.gopkg/common/event/ddl_event.gopkg/sink/mysql/ddl_index_rewrite.gopkg/sink/mysql/mysql_writer_ddl.gopkg/sink/mysql/mysql_writer_ddl_index_rewrite_test.go
✅ Files skipped from review due to trivial changes (1)
- logservice/schemastore/persist_storage_ddl_handlers.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/sink/mysql/mysql_writer_ddl.go
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
tests/integration_tests/ddl_wait/run.sh (1)
80-89:⚠️ Potential issue | 🟠 MajorReplace the fixed restart sleep with DDL-state polling.
sleep 10still makes this branch timing-dependent: on a fast run, both anonymousADD INDEXjobs can finish before the restart, so the retry/rewrite path never gets exercised. Reuse the earlierddl_jobspolling and restart only after one of those downstream jobs isrunningorqueueing.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration_tests/ddl_wait/run.sh` around lines 80 - 89, The fixed sleep is making the restart timing-dependent; replace the sleep 10 in run.sh with a poll loop that checks the DDL job states (reuse the earlier ddl_jobs polling helper) after issuing the two ALTER TABLE ... ADD INDEX commands and before calling cleanup_process/$CDC_BINARY and run_cdc_server, and only proceed to restart once at least one downstream DDL job reports a state of "running" or "queueing" (or timeout/fail after the same deadline used elsewhere) so the retry/rewrite path is exercised reliably.
🧹 Nitpick comments (1)
logservice/schemastore/utils.go (1)
45-55: PropagatecommonEvent.Restoreerrors directly.
commonEvent.Restorealready returns a traced error, so wrapping it again here just duplicates the stack. Returnerrdirectly in both branches.Suggested cleanup
for _, stmt := range stmts { query, err := commonEvent.Restore(stmt) if err != nil { - return "", errors.Trace(err) + return "", err } results = append(results, query) } result = strings.Join(results, ";") } else { result, err = commonEvent.Restore(stmts[0]) if err != nil { - return "", errors.Trace(err) + return "", err } }As per coding guidelines, "When an error comes from a third-party or library call in Go, wrap it immediately with
errors.Trace(err)orerrors.WrapError(...)to attach a stack trace; upstream callers should propagate wrapped errors without wrapping again".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@logservice/schemastore/utils.go` around lines 45 - 55, The code is re-wrapping an already-traced error from commonEvent.Restore; instead of returning errors.Trace(err) in both branches, propagate the original err directly. Update the error returns in the branch that handles multiple stmts (inside the loop after commonEvent.Restore) and in the single-stmt branch (after commonEvent.Restore(stmts[0])) to return err, leaving commonEvent.Restore, err, results and result handling unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/common/event/ddl_event.go`:
- Around line 109-111: The V1 payload was changed incompatibly by appending an
index_ids trailer; revert to a backward-compatible V1 by removing the extra
trailer and its parsing: stop appending the index-count trailer in encodeV1 and
remove the corresponding read logic in decodeV1 so both rely solely on the JSON
field IndexIDs (and keep DDLEventVersion1 unchanged); if you instead intend a
new payload layout, bump the version constant DDLEventVersion1 and implement a
decode branch that uses the new trailer, otherwise also remove any logic that
treats multipleTableInfosDataSize as the index-count footer to avoid mis-parsing
older V1 payloads.
---
Duplicate comments:
In `@tests/integration_tests/ddl_wait/run.sh`:
- Around line 80-89: The fixed sleep is making the restart timing-dependent;
replace the sleep 10 in run.sh with a poll loop that checks the DDL job states
(reuse the earlier ddl_jobs polling helper) after issuing the two ALTER TABLE
... ADD INDEX commands and before calling cleanup_process/$CDC_BINARY and
run_cdc_server, and only proceed to restart once at least one downstream DDL job
reports a state of "running" or "queueing" (or timeout/fail after the same
deadline used elsewhere) so the retry/rewrite path is exercised reliably.
---
Nitpick comments:
In `@logservice/schemastore/utils.go`:
- Around line 45-55: The code is re-wrapping an already-traced error from
commonEvent.Restore; instead of returning errors.Trace(err) in both branches,
propagate the original err directly. Update the error returns in the branch that
handles multiple stmts (inside the loop after commonEvent.Restore) and in the
single-stmt branch (after commonEvent.Restore(stmts[0])) to return err, leaving
commonEvent.Restore, err, results and result handling unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 367ac3d3-0264-4150-8aed-1d7b70629540
📒 Files selected for processing (4)
logservice/schemastore/types_gen.gologservice/schemastore/utils.gopkg/common/event/ddl_event.gotests/integration_tests/ddl_wait/run.sh
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
pkg/common/event/ddl_event.go (2)
353-363:⚠️ Potential issue | 🟠 MajorDecode
MultipleTableInfoswithout reversing it.
encodeV1appendsMultipleTableInfosfrom first to last, but this loop reads them from the tail andappends each item. The decoded slice becomes reversed, soGetEvents()can pairqueries[i]with the wrong table info forCREATE TABLES/RENAME TABLES.Suggested fix
- for i := 0; i < int(multipleTableInfosDataSize); i++ { + count := int(multipleTableInfosDataSize) + t.MultipleTableInfos = make([]*common.TableInfo, count) + for i := count - 1; i >= 0; i-- { tableInfoDataSize := binary.BigEndian.Uint64(data[end-8 : end]) tableInfoData := data[end-8-int(tableInfoDataSize) : end-8] info, err := common.UnmarshalJSONToTableInfo(tableInfoData) if err != nil { return err } - t.MultipleTableInfos = append(t.MultipleTableInfos, info) + t.MultipleTableInfos[i] = info end -= 8 + int(tableInfoDataSize) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/common/event/ddl_event.go` around lines 353 - 363, The decode loop for MultipleTableInfos in ddl_event.go reverses the order because it reads from the end and appends, so update the logic in the Unmarshal section that uses binary.BigEndian and UnmarshalJSONToTableInfo to preserve original order (matching encodeV1). Determine multipleTableInfosDataSize, preallocate t.MultipleTableInfos with that length, then as you iterate reading each tableInfoData fill the slice at index (i) or at (multipleTableInfosDataSize-1-i) appropriately so the final t.MultipleTableInfos matches the encodeV1 order (or alternatively collect into a temp slice and reverse it before assigning). Ensure GetEvents pairing logic continues to reference t.MultipleTableInfos in the original encode order.
95-107:⚠️ Potential issue | 🟠 MajorIsBootstrap will not survive serialization/deserialization roundtrips.
IsBootstrapis marked withjson:"-", which excludes it from the JSON payload inencodeV1()at line 306. When deserialized indecodeV1(), it will always befalse, even if it was set totruebefore serialization. This breaks callers likepkg/sink/codec/bootstraper.go:158andpkg/sink/codec/simple/encoder_test.go:1327that rely on this flag being preserved.
NotSyncis correctly tagged asjson:"not_sync"and will survive the round-trip, which is essential sincedownstreamadapter/dispatcher/basic_dispatcher.go:333reads this field post-deserialization to decide whether to skip syncing events.Either serialize
IsBootstrapby removing thejson:"-"tag or remove it as a serialized field and reconstruct it separately during deserialization.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/common/event/ddl_event.go` around lines 95 - 107, The IsBootstrap field is currently excluded from JSON (json:"-") so it is lost in encodeV1()/decodeV1() round-trips; either persist it or rehydrate it. Fix by making IsBootstrap a serialized field (e.g., change `IsBootstrap bool \`json:"-"\`` to `IsBootstrap bool \`json:"is_bootstrap"\``) and ensure encodeV1() and decodeV1() include/handle the new "is_bootstrap" key (functions to update: the IsBootstrap field declaration and the encodeV1()/decodeV1() implementations that serialize/deserialize ddl_event.go) so callers like pkg/sink/codec/bootstraper.go and tests relying on its value observe the original flag after decode.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@pkg/common/event/ddl_event.go`:
- Around line 353-363: The decode loop for MultipleTableInfos in ddl_event.go
reverses the order because it reads from the end and appends, so update the
logic in the Unmarshal section that uses binary.BigEndian and
UnmarshalJSONToTableInfo to preserve original order (matching encodeV1).
Determine multipleTableInfosDataSize, preallocate t.MultipleTableInfos with that
length, then as you iterate reading each tableInfoData fill the slice at index
(i) or at (multipleTableInfosDataSize-1-i) appropriately so the final
t.MultipleTableInfos matches the encodeV1 order (or alternatively collect into a
temp slice and reverse it before assigning). Ensure GetEvents pairing logic
continues to reference t.MultipleTableInfos in the original encode order.
- Around line 95-107: The IsBootstrap field is currently excluded from JSON
(json:"-") so it is lost in encodeV1()/decodeV1() round-trips; either persist it
or rehydrate it. Fix by making IsBootstrap a serialized field (e.g., change
`IsBootstrap bool \`json:"-"\`` to `IsBootstrap bool \`json:"is_bootstrap"\``)
and ensure encodeV1() and decodeV1() include/handle the new "is_bootstrap" key
(functions to update: the IsBootstrap field declaration and the
encodeV1()/decodeV1() implementations that serialize/deserialize ddl_event.go)
so callers like pkg/sink/codec/bootstraper.go and tests relying on its value
observe the original flag after decode.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c6ee7319-0d94-49e0-9b58-f10319bb2f9e
📒 Files selected for processing (1)
pkg/common/event/ddl_event.go
Signed-off-by: wk989898 <nhsmwk@gmail.com>
There was a problem hiding this comment.
♻️ Duplicate comments (2)
pkg/sink/mysql/mysql_writer_ddl_index_rewrite_test.go (2)
231-235:⚠️ Potential issue | 🟡 MinorSame sqlmock regex escaping issue here.
Apply
regexp.QuoteMetatorestoredQueryas in the earlier test.🔧 Proposed fix
- mock.ExpectExec(restoredQuery).WillReturnResult(sqlmock.NewResult(1, 1)) + mock.ExpectExec(regexp.QuoteMeta(restoredQuery)).WillReturnResult(sqlmock.NewResult(1, 1))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/sink/mysql/mysql_writer_ddl_index_rewrite_test.go` around lines 231 - 235, The test uses sqlmock.ExpectExec(restoredQuery) which treats restoredQuery as a regex and can fail on unescaped SQL; update the mock expectation to use regexp.QuoteMeta(restoredQuery) (i.e., call regexp.QuoteMeta on the restoredQuery passed into mock.ExpectExec) so the SQL string is matched literally; keep the other expectations (mock.ExpectBegin, mock.ExpectExec("USE `test`;"), mock.ExpectExec("SET TIMESTAMP = DEFAULT"), mock.ExpectCommit()) unchanged and only replace restoredQuery with regexp.QuoteMeta(restoredQuery).
120-124:⚠️ Potential issue | 🟡 MinorEscape the SQL before handing it to sqlmock.
ExpectExecinterprets its input as a regexp.restoredQuerycontains characters like backticks and parentheses that have special regex meanings, so the expectation may fail to match the literal statement.🔧 Proposed fix
import ( + "regexp" "testing" @@ - mock.ExpectExec(restoredQuery).WillReturnResult(sqlmock.NewResult(1, 1)) + mock.ExpectExec(regexp.QuoteMeta(restoredQuery)).WillReturnResult(sqlmock.NewResult(1, 1))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/sink/mysql/mysql_writer_ddl_index_rewrite_test.go` around lines 120 - 124, The test uses mock.ExpectExec(restoredQuery) but sqlmock treats the argument as a regexp, so special characters in restoredQuery (backticks, parentheses, etc.) can cause the match to fail; update the expectation to pass an escaped literal by wrapping restoredQuery with regexp.QuoteMeta (e.g., mock.ExpectExec(regexp.QuoteMeta(restoredQuery))). Ensure you import regexp if not already and apply the same escaping for any other ExpectExec/ExpectQuery calls that use raw SQL strings.
🧹 Nitpick comments (1)
pkg/sink/mysql/mysql_writer_ddl_index_rewrite_test.go (1)
190-193: Clarify variable naming for readability.The variable
unchangedactually holds thechangedreturn value fromrestoreAnonymousIndexToNamedIndex. Consider renaming for clarity.♻️ Suggested refactor
- unchangedQuery, unchanged, err := restoreAnonymousIndexToNamedIndex(mixedQuery, tableInfo, nil) + unchangedQuery, changed, err := restoreAnonymousIndexToNamedIndex(mixedQuery, tableInfo, nil) require.NoError(t, err) - require.False(t, unchanged) + require.False(t, changed) require.Equal(t, mixedQuery, unchangedQuery)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/sink/mysql/mysql_writer_ddl_index_rewrite_test.go` around lines 190 - 193, The boolean variable named `unchanged` is misleading because it actually receives the `changed` return value from restoreAnonymousIndexToNamedIndex; rename `unchanged` to `changed` in the test (line where you call restoreAnonymousIndexToNamedIndex and the subsequent require assertions) and update the assertions accordingly (e.g., require.False(t, changed) and any related messages) so variable names reflect their meaning while keeping the same expected outcomes; locate usages around unchangedQuery, unchanged, err and replace with unchangedQuery, changed, err and adjust the require calls to reference changed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@pkg/sink/mysql/mysql_writer_ddl_index_rewrite_test.go`:
- Around line 231-235: The test uses sqlmock.ExpectExec(restoredQuery) which
treats restoredQuery as a regex and can fail on unescaped SQL; update the mock
expectation to use regexp.QuoteMeta(restoredQuery) (i.e., call regexp.QuoteMeta
on the restoredQuery passed into mock.ExpectExec) so the SQL string is matched
literally; keep the other expectations (mock.ExpectBegin, mock.ExpectExec("USE
`test`;"), mock.ExpectExec("SET TIMESTAMP = DEFAULT"), mock.ExpectCommit())
unchanged and only replace restoredQuery with regexp.QuoteMeta(restoredQuery).
- Around line 120-124: The test uses mock.ExpectExec(restoredQuery) but sqlmock
treats the argument as a regexp, so special characters in restoredQuery
(backticks, parentheses, etc.) can cause the match to fail; update the
expectation to pass an escaped literal by wrapping restoredQuery with
regexp.QuoteMeta (e.g., mock.ExpectExec(regexp.QuoteMeta(restoredQuery))).
Ensure you import regexp if not already and apply the same escaping for any
other ExpectExec/ExpectQuery calls that use raw SQL strings.
---
Nitpick comments:
In `@pkg/sink/mysql/mysql_writer_ddl_index_rewrite_test.go`:
- Around line 190-193: The boolean variable named `unchanged` is misleading
because it actually receives the `changed` return value from
restoreAnonymousIndexToNamedIndex; rename `unchanged` to `changed` in the test
(line where you call restoreAnonymousIndexToNamedIndex and the subsequent
require assertions) and update the assertions accordingly (e.g.,
require.False(t, changed) and any related messages) so variable names reflect
their meaning while keeping the same expected outcomes; locate usages around
unchangedQuery, unchanged, err and replace with unchangedQuery, changed, err and
adjust the require calls to reference changed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0cd0a678-dba5-4e54-b567-b3eaace7797d
📒 Files selected for processing (8)
logservice/schemastore/persist_storage_ddl_handlers.gologservice/schemastore/types.gologservice/schemastore/utils_test.gopkg/common/event/ddl_event.gopkg/sink/mysql/ddl_index_rewrite.gopkg/sink/mysql/mysql_writer_ddl.gopkg/sink/mysql/mysql_writer_ddl_index_rewrite_test.gotests/integration_tests/ddl_wait/run.sh
🚧 Files skipped from review as they are similar to previous changes (3)
- logservice/schemastore/types.go
- pkg/sink/mysql/ddl_index_rewrite.go
- pkg/sink/mysql/mysql_writer_ddl.go
|
/test all |
Signed-off-by: wk989898 <nhsmwk@gmail.com>
|
/test all |
What problem does this PR solve?
Issue Number: close #2327
What is changed and how it works?
How does it fix anonymous ADD INDEX?
ALTER TABLE ... ADD INDEX (col)relies on the database to auto-generate an index name. Upstream TiDB may generate a name (especially when collisions happen, e.g. existing index names), but the downstream database may generate a different name if repeated execution.Check List
Tests
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note
Summary by CodeRabbit
New Features
Refactor
Tests