event: fix a bug of ddlEvent #4414

Open
asddongmen wants to merge 3 commits into pingcap:master from asddongmen:0310-fix-small-bug

Conversation

Collaborator

@asddongmen asddongmen commented Mar 10, 2026

What problem does this PR solve?

Issue Number: close #4415

What is changed and how it works?

This PR fixes a correctness bug in pkg/common/event/DDLEvent.decodeV1 that appears when MultipleTableInfos is non-empty.
The previous implementation decoded the tail layout with an incorrect offset sequence: after reading the final count field, it did not advance the cursor before reading each table-info size. As a result, the decoder could treat the count as a size, mis-parse payload boundaries, and potentially return errors or corrupted data in multi-table DDL scenarios.

What changed

  • Reworked tail parsing for decodeV1 to follow the real wire format order.
  • Added defensive bounds checks for every size read (multiple table info, table info, and dispatcher ID) to prevent
    invalid slicing.
  • Preserved MultipleTableInfos order during reverse tail parsing.
  • Reset MultipleTableInfos and TableInfo before decode to avoid stale data when reusing event objects.
  • Kept existing event semantics and post-unmarshal initialization behavior.

Check List

Tests

  • Unit test

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

None

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced validation and error handling for data event processing to prevent corrupted or malformed data from causing failures.
    • Added comprehensive boundary checks and size validations to ensure data integrity during decoding operations.
  • Tests

    • Added test coverage for multi-table event processing and round-trip encoding/decoding validation.
    • Added test coverage for data validation edge cases to ensure proper error handling.

Signed-off-by: dongmen <414110582@qq.com>

ti-chi-bot bot commented Mar 10, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot bot added the do-not-merge/needs-linked-issue, release-note-none (denotes a PR that doesn't merit a release note), and do-not-merge/work-in-progress (indicates that a PR should not merge because it is a work in progress) labels on Mar 10, 2026

coderabbitai bot commented Mar 10, 2026

📝 Walkthrough

This pull request fixes a bug in DDLEvent.decodeV1 where MultipleTableInfos were incorrectly parsed when count > 0. The fix adds comprehensive validation, corrects tail layout parsing with proper boundary management, and validates data integrity throughout the decoding process.

Changes

Cohort / File(s) — Summary

  • DDL Event Decoding Fix — pkg/common/event/ddl_event.go
    Reworked decodeV1 to correctly parse the MultipleTableInfos tail layout: reads the count from the end, iterates backwards through table infos, validates boundaries and sizes, pre-allocates slice storage, and ensures proper pointer management. Adds explicit error handling for invalid or missing data segments.
  • DDL Event Tests — pkg/common/event/ddl_event_test.go
    Added TestDDLEventDecodeV1WithMultipleTableInfos to validate round-trip encoding/decoding of DDLEvents with multiple table infos, verifying preservation of DispatcherID, TableInfo, and MultipleTableInfos entries across unmarshal cycles.
  • Table Info Validation — pkg/common/table_info.go, pkg/common/table_info_test.go
    Added validation in UnmarshalJSONToTableInfo to check minimum data length (8 bytes) and ensure columnSchemaDataSize does not exceed the remaining payload. Includes tests for error cases (nil data, short data, overflow) and round-trip validation of table metadata preservation.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

lgtm, approved, size/XL

Suggested reviewers

  • wk989898
  • wlwilliamx
  • lidezhu


🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 33.33%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (4 passed)

  • Title check — The title clearly identifies the main change: fixing a bug in DDLEvent decoding for multi-table scenarios.
  • Description check — The PR description adequately addresses the template requirements: identifies the bug, explains what changed, specifies that unit tests were added, and provides release notes.
  • Linked Issues check — The code changes directly implement the fixes required by issue #4415: reworked tail parsing, added bounds checks, preserved order, and reset fields to prevent stale data.
  • Out of Scope Changes check — All changes are directly scoped to fixing the DDLEvent.decodeV1 bug: validation in TableInfo, test coverage for the fix, and no unrelated modifications.


@ti-chi-bot ti-chi-bot bot added the size/L (denotes a PR that changes 100-499 lines, ignoring generated files) label on Mar 10, 2026
@gemini-code-assist
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug in the DDLEvent decoding logic, specifically for events that include multiple table information entries. The changes significantly enhance the robustness and correctness of the decoding process by implementing proper iteration, boundary checks, and data validation. This ensures that DDLEvents are reliably parsed, preventing potential data corruption or crashes that could arise from malformed event data.

Highlights

  • DDLEvent Decoding Fix: Refactored the decodeV1 method in DDLEvent to correctly handle the decoding of MultipleTableInfos by iterating through them in reverse order and properly managing data pointers.
  • Robustness and Validation: Added comprehensive validation checks during DDLEvent decoding to prevent out-of-bounds errors and ensure data integrity when processing potentially malformed event data.
  • New Unit Test: Introduced a new unit test, TestDDLEventDecodeV1WithMultipleTableInfos, to specifically verify the correct encoding and decoding of DDLEvents containing multiple table information entries, including a check for stale data on object reuse.


Changelog
  • pkg/common/event/ddl_event.go
    • Fixed a bug in decodeV1 where MultipleTableInfos were not correctly decoded from the raw byte data.
    • Added boundary checks and error handling for invalid event data lengths and sizes during the decoding process.
    • Improved the logic for parsing TableInfo and MultipleTableInfos to ensure accurate data extraction.
  • pkg/common/event/ddl_event_test.go
    • Added TestDDLEventDecodeV1WithMultipleTableInfos to validate the fix for decoding multiple table infos and ensure proper object reuse behavior.
Activity
  • No human activity has been recorded on this pull request since its creation.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The pull request refactors the decodeV1 method in pkg/common/event/ddl_event.go to improve the robustness of DDLEvent data decoding. This includes adding numerous bounds checks and size validations when parsing fields such as MultipleTableInfos, TableInfo, and DispatcherID from a byte slice, aiming to prevent out-of-bounds panics and ensure data integrity. A new test case, TestDDLEventDecodeV1WithMultipleTableInfos, was added in pkg/common/event/ddl_event_test.go to validate the correct decoding of events containing multiple table information entries.

The review comments highlight a high-severity denial-of-service vulnerability: common.UnmarshalJSONToTableInfo is called without ensuring its input tableInfoDataSize is at least 8 bytes, which could lead to a runtime panic. This issue is present in both the multiple-table-info decoding loop and the single-table-info decoding, and requires additional checks to prevent these panics.

@asddongmen asddongmen marked this pull request as ready for review March 10, 2026 08:25
@ti-chi-bot ti-chi-bot bot removed the do-not-merge/work-in-progress (indicates that a PR should not merge because it is a work in progress) label on Mar 10, 2026
Review comment from a Collaborator on pkg/common/event/ddl_event.go:

	}
	end -= 8

	t.MultipleTableInfos = t.MultipleTableInfos[:0]

Suggested change:

	t.MultipleTableInfos = t.MultipleTableInfos[:0]

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm (indicates a PR needs 1 more LGTM) label on Mar 10, 2026

ti-chi-bot bot commented Mar 10, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: wk989898

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details: needs approval from an approver in each of these files.

Approvers can indicate their approval by writing /approve in a comment.
Approvers can cancel approval by writing /approve cancel in a comment.


ti-chi-bot bot commented Mar 10, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-03-10 08:36:09.006814631 +0000 UTC m=+339200.518872292: ☑️ agreed by wk989898.

@ti-chi-bot ti-chi-bot bot added the approved label Mar 10, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
pkg/common/event/ddl_event_test.go (1)

165-170: Add a reuse case that shrinks the decoded payload.

Re-unmarshalling the same bytes only proves we do not append duplicates. It does not cover the new t.TableInfo = nil branch or shrinking MultipleTableInfos from N to 0/1, which are the other stale-state cases this change is trying to prevent.



📥 Commits

Reviewing files that changed from the base of the PR and between 6423bf1 and f5bee90.

📒 Files selected for processing (4)
  • pkg/common/event/ddl_event.go
  • pkg/common/event/ddl_event_test.go
  • pkg/common/table_info.go
  • pkg/common/table_info_test.go

Comment on lines 347 to +360 of pkg/common/event/ddl_event.go:

 	t.eventSize = int64(len(data))
 
 	end := len(data)
-	multipleTableInfosDataSize := binary.BigEndian.Uint64(data[end-8 : end])
-	for i := 0; i < int(multipleTableInfosDataSize); i++ {
-		tableInfoDataSize := binary.BigEndian.Uint64(data[end-8 : end])
-		tableInfoData := data[end-8-int(tableInfoDataSize) : end-8]
-		info, err := common.UnmarshalJSONToTableInfo(tableInfoData)
-		if err != nil {
-			return err
-		}
+	if end < 8 {
+		return fmt.Errorf("invalid DDLEvent data: length %d is too short", len(data))
+	}
+
+	multipleTableInfoCount := binary.BigEndian.Uint64(data[end-8 : end])
+	if multipleTableInfoCount > uint64((end-8)/8) {
+		return fmt.Errorf("invalid DDLEvent data: too many multiple table infos, count=%d", multipleTableInfoCount)
+	}
+	end -= 8
+
+	t.MultipleTableInfos = t.MultipleTableInfos[:0]

⚠️ Potential issue | 🟠 Major

Clear PostTxnFlushed on reused receivers.

This reset path handles TableInfo and MultipleTableInfos, but PostTxnFlushed is also excluded from JSON and will survive into the next decode. If the same DDLEvent instance is recycled, a later PostFlush() can execute callbacks from the previous event.

🧹 Proposed fix
 func (t *DDLEvent) decodeV1(data []byte) error {
 	// restData | dispatcherIDData | dispatcherIDDataSize | tableInfoData | tableInfoDataSize | multipleTableInfos | multipleTableInfosDataSize
 	t.eventSize = int64(len(data))
+	t.PostTxnFlushed = t.PostTxnFlushed[:0]
 
 	end := len(data)
📝 Committable suggestion

‼️ IMPORTANT: carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change:

	t.eventSize = int64(len(data))
	t.PostTxnFlushed = t.PostTxnFlushed[:0]

	end := len(data)
	if end < 8 {
		return fmt.Errorf("invalid DDLEvent data: length %d is too short", len(data))
	}

	multipleTableInfoCount := binary.BigEndian.Uint64(data[end-8 : end])
	if multipleTableInfoCount > uint64((end-8)/8) {
		return fmt.Errorf("invalid DDLEvent data: too many multiple table infos, count=%d", multipleTableInfoCount)
	}
	end -= 8

	t.MultipleTableInfos = t.MultipleTableInfos[:0]

Comment on lines +415 to 417 of pkg/common/event/ddl_event.go:

	err = json.Unmarshal(data[:restDataEnd], t)
	if err != nil {
		return err

⚠️ Potential issue | 🟡 Minor


Wrap the JSON decode error with a stack trace and add missing import.

Line 415-417 returns the json.Unmarshal error unwrapped. Import github.com/pingcap/errors and wrap with errors.Trace(err) to maintain consistency with error handling patterns in peer files like mounter.go in this directory and conform to the coding guideline for third-party library errors.


@asddongmen
Collaborator Author

/test all


Labels

approved · needs-1-more-lgtm (indicates a PR needs 1 more LGTM) · release-note-none (denotes a PR that doesn't merit a release note) · size/L (denotes a PR that changes 100-499 lines, ignoring generated files)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DDLEvent.decodeV1 mis-parses MultipleTableInfos when count > 0

2 participants