fix: surface state:modified nodes in summary lineage graph #1192
dtaniwaki wants to merge 1 commit into DataRecce:main
Signed-off-by: Daisuke Taniwaki <daisuketaniwaki@gmail.com>
Pull request overview
Fixes the PR summary lineage graph so it can surface state:modified nodes even when their SQL checksum hasn’t changed (e.g., macro/config-only changes), by plumbing the adapter-computed lineage_diff.diff into the summary graph builder and allowing that diff to override checksum-derived status.
Changes:
- Pass `lineage_diff.diff` into `_build_lineage_graph` from `generate_markdown_summary`.
- Add a `Node.apply_diff()` override mechanism (`_forced_change_status`) so externally computed diffs can drive `change_status`.
- Add unit tests covering `apply_diff` behavior and `_build_lineage_graph(..., diff=...)`.
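
The override mechanism in the second bullet can be sketched roughly as follows. This is a simplified illustration, not the code in recce/summary.py: the `Node` fields, the checksum comparison, and the diff value being a plain status string are all assumptions. Only the names `apply_diff`, `_forced_change_status`, and `change_status` come from the PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical, simplified stand-in for the summary graph node."""

    node_id: str
    base_checksum: Optional[str] = None      # None: node absent in base
    current_checksum: Optional[str] = None   # None: node absent in current
    _forced_change_status: Optional[str] = None

    def apply_diff(self, status: Optional[str]) -> None:
        # Record an externally computed status (e.g. from state:modified).
        self._forced_change_status = status

    @property
    def change_status(self) -> Optional[str]:
        # An external diff, when present, wins over the checksum comparison.
        if self._forced_change_status is not None:
            return self._forced_change_status
        if self.base_checksum is None:
            return "added"
        if self.current_checksum is None:
            return "removed"
        if self.base_checksum != self.current_checksum:
            return "modified"
        return None


# Macro-only change: the SQL body, and so the checksum, is identical.
node = Node("model.shop.orders", base_checksum="abc", current_checksum="abc")
before = node.change_status
node.apply_diff("modified")
after = node.change_status
```

With checksums alone the node reports no change; once the external status is applied, it reports as modified, which is the behavior the PR is after.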
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| recce/summary.py | Allows lineage graph node status to be overridden by adapter diff and wires diff into markdown summary generation. |
| tests/test_summary.py | Adds tests ensuring diff-driven “modified” nodes appear in the graph and in “What’s Changed”. |
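
A test along the lines of the second row might look like the sketch below. The `build_statuses` helper is hypothetical and much simpler than `_build_lineage_graph`. The lineage dicts mapping node ids to checksums, and the diff values being plain status strings, are assumptions made for the example.

```python
from typing import Dict, Optional


def build_statuses(
    base: Dict[str, str],
    current: Dict[str, str],
    diff: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Hypothetical helper: map node id -> change status."""
    statuses: Dict[str, str] = {}
    for node_id in base.keys() | current.keys():
        if node_id not in base:
            statuses[node_id] = "added"
        elif node_id not in current:
            statuses[node_id] = "removed"
        elif base[node_id] != current[node_id]:
            statuses[node_id] = "modified"
    # Entries from the externally computed diff override checksum results.
    for node_id, status in (diff or {}).items():
        if node_id in base or node_id in current:
            statuses[node_id] = status
    return statuses


def test_diff_surfaces_macro_only_change() -> None:
    base = {"model.a": "111", "model.b": "222"}
    current = {"model.a": "111", "model.b": "222"}  # same checksums
    diff = {"model.b": "modified"}  # flagged by state:modified

    assert build_statuses(base, current) == {}
    assert build_statuses(base, current, None) == {}
    assert build_statuses(base, current, diff) == {"model.b": "modified"}


test_diff_surfaces_macro_only_change()
```

Omitting the diff and passing `None` give the same result here, so callers that have no diff available keep the checksum-only behavior.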
In this new test, _build_lineage_graph’s signature is (base, current, ...), but the call passes curr_lineage as the first argument and base_lineage as the second. That reverses the meaning of added/removed nodes and can hide regressions because modified_set doesn’t distinguish added vs removed. Please swap the arguments to _build_lineage_graph(base_lineage, curr_lineage, ...) in both calls here so the test matches production usage (see generate_markdown_summary calling it with lineage_diff.base then lineage_diff.current).
Suggested change:

    - graph_no_diff = _build_lineage_graph(curr_lineage, base_lineage)
    - graph_with_none = _build_lineage_graph(curr_lineage, base_lineage, None)
    + graph_no_diff = _build_lineage_graph(base_lineage, curr_lineage)
    + graph_with_none = _build_lineage_graph(base_lineage, curr_lineage, None)
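
To see why the argument order matters, consider a toy comparison. The `classify` helper is hypothetical, not Recce's API, and lineage is assumed to be a dict of node id to checksum. Swapping the arguments flips added and removed while the combined set of changed nodes stays the same, which is why a test that only inspects the combined set would not notice the swap.

```python
from typing import Dict, Set, Tuple


def classify(
    base: Dict[str, str], current: Dict[str, str]
) -> Tuple[Set[str], Set[str]]:
    """Hypothetical helper returning (added, removed) node ids."""
    added = set(current) - set(base)
    removed = set(base) - set(current)
    return added, removed


base_lineage = {"model.old": "1", "model.kept": "2"}
curr_lineage = {"model.kept": "2", "model.new": "3"}

# Correct order: base first, then current.
added, removed = classify(base_lineage, curr_lineage)

# Reversed order, as in the test under review.
swapped_added, swapped_removed = classify(curr_lineage, base_lineage)
```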
Code Review: PR 1192
Summary
Fixes
Findings
[Warning] "Code" label is misleading for macro/config-only changes
File:
[Warning] Pre-existing shared mutable class-level defaults in
@dtaniwaki thanks for the PR! Can you please resolve the conflict in [...]? If you feel the issues are not valid, please say so, and address them as necessary.
Thanks for maintaining this project! I'd appreciate a review of this bug fix when you have a chance.
PR checklist
What type of PR is this?
fix
What this PR does / why we need it:
When a dbt macro or project config changes, `state:modified` correctly returns the affected downstream nodes, even if their SQL body (and thus file checksum) is unchanged. However, `generate_markdown_summary` was building the lineage graph from checksums alone, never passing the `lineage_diff.diff` that the adapter had already computed. As a result, nodes changed only by macros or configs were silently invisible in the PR summary comment.
The fix is a one-line change in `generate_markdown_summary`: pass `lineage_diff.diff` to `_build_lineage_graph`. A small supporting mechanism (`apply_diff` / `_forced_change_status`) lets the graph override the checksum-based status for nodes that `state:modified` flagged but whose SQL didn't change.
Which issue(s) this PR fixes:
N/A
Special notes for your reviewer:
The `diff` dict was already being populated correctly by the adapter (it calls `state:modified` and maps each node to a `NodeDiff`). The bug was purely in the summary layer not consuming it.
Does this PR introduce a user-facing change?: