Skip plotting benchmarks without data #9085

Open
bernhardmgruber wants to merge 2 commits into NVIDIA:main from bernhardmgruber:fix_sol

Conversation

@bernhardmgruber
Contributor

Fixes: #6729

@bernhardmgruber bernhardmgruber requested a review from a team as a code owner May 21, 2026 08:08
@bernhardmgruber bernhardmgruber requested a review from shwina May 21, 2026 08:08
@github-project-automation github-project-automation Bot moved this to Todo in CCCL May 21, 2026
@cccl-authenticator-app cccl-authenticator-app Bot moved this from Todo to In Review in CCCL May 21, 2026
@coderabbitai
Contributor

coderabbitai Bot commented May 21, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 595b0631-a5d4-492e-becc-0d55c728fd84

📥 Commits

Reviewing files that changed from the base of the PR and between 0fd8182 and 45e3daa.

📒 Files selected for processing (1)
  • benchmarks/scripts/sol.py

📝 Walkthrough

Summary by CodeRabbit

  • Bug Fixes
    • Improved robustness of benchmark data processing with stronger validation and conditional filtering to avoid empty results.
    • Added warnings and graceful handling when individual benchmarks or sub-benchmarks produce missing or invalid data.
    • Stops aggregation/plotting and reports an error when all benchmarks are skipped, preventing misleading outputs.

Walkthrough

benchmarks/scripts/sol.py adds defensive filtering and validation: offset-type filtering no longer collapses dfs to empty, alg_dfs skips sub-benchmarks with empty data or missing bw, and sol() returns early if all benchmarks are skipped.

Changes

Data Validation and Skipping

| Layer / File(s) | Summary |
|---|---|
| Offset-type filter and empty-result check (benchmarks/scripts/sol.py) | filter_by_offset_type filters into a temporary subset and only replaces df when that subset is non-empty (lines 30-34). |
| alg_dfs validation and skip logic (benchmarks/scripts/sol.py) | alg_dfs computes fused_algname early, warns and skips sub-benchmarks when the post-filter df is empty or when bw has no non-null values, and defers variant coercion and bw scaling until after checks (lines 63-76). |
| sol early-return on all skipped (benchmarks/scripts/sol.py) | sol() uses alg_dfs result and prints an error and returns early if no benchmarks remain (lines 174-178). |
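
The offset-type row above describes a "filter into a temporary subset, replace only if non-empty" pattern. A minimal sketch of that pattern follows. It is not the code from sol.py: the helper name `prefer_offset_type`, the column name `offset_type`, and the value `"I32"` are assumptions made for illustration.

```python
import pandas as pd


def prefer_offset_type(
    df: pd.DataFrame, column: str = "offset_type", wanted: str = "I32"
) -> pd.DataFrame:
    """Keep only rows with the wanted offset type, unless that would drop every row.

    Hypothetical helper: names and values are illustrative, not taken from sol.py.
    """
    subset = df[df[column] == wanted]
    # Replace df only when the filtered subset still has rows.
    return subset if not subset.empty else df


# One benchmark reports both offset types, the other only 64-bit offsets.
mixed = pd.DataFrame({"offset_type": ["I32", "I64"], "bw": [0.8, 0.7]})
only_64 = pd.DataFrame({"offset_type": ["I64", "I64"], "bw": [0.6, 0.5]})

print(len(prefer_offset_type(mixed)))    # filtered down to the I32 row
print(len(prefer_offset_type(only_64)))  # filter would be empty, so df is kept
```

The point of the temporary subset is that a benchmark without any matching rows keeps its original data instead of collapsing to an empty frame.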

Assessment against linked issues

| Objective | Addressed | Explanation |
|---|---|---|
| Skip benchmarks lacking bandwidth data [#6729] | | |

Out-of-scope changes

| Code Change | Explanation |
|---|---|
| 32-bit offset-type filtering refactor (benchmarks/scripts/sol.py, lines 30-34) | This change improves offset-type filtering behavior but does not address the missing bw data referenced in #6729. |

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ea645a47-557f-4498-8d12-c8cfb4ef03fe

📥 Commits

Reviewing files that changed from the base of the PR and between 589097d and 0fd8182.

📒 Files selected for processing (1)
  • benchmarks/scripts/sol.py

Comment thread benchmarks/scripts/sol.py
Comment on lines +63 to +74
fused_algname = algname.replace("bench.", "") + "." + subbench
if df.empty:
    print(
        f"WARNING: Skipped {fused_algname} because no data is present"
    )
    print(df)
    continue
if df["bw"].dropna().empty:
    print(
        f"WARNING: Skipped {fused_algname} because it does not report bandwidth"
    )
    continue
Contributor

@coderabbitai coderabbitai Bot May 21, 2026

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

important: Guard for missing bw column before indexing at Line 70. Right now df["bw"] can still raise KeyError if the benchmark output omits bw entirely (not just NaN values), which defeats the hardening in this PR. Use a schema check in the skip condition, e.g. treat ("bw" not in df.columns) the same as missing bandwidth data.

Contributor Author

I think a KeyError is appropriate, since then the input is wrong.

Contributor

@bernhardmgruber Understood — the five affected benchmarks have bw present but all-NaN, so dropna().empty is the right check for the actual issue. A fully absent bw column is structurally malformed input and a KeyError is a reasonable loud failure there. Fair enough.
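
The distinction discussed in this thread, an all-NaN bw column versus a bw column that is absent, can be reproduced with a small pandas sketch. The frames and the `elements` column are made up for illustration and are not benchmark output.

```python
import math

import pandas as pd

# Hypothetical data: bw is present but every value is NaN.
all_nan = pd.DataFrame({"elements": [1024, 2048], "bw": [math.nan, math.nan]})
# Hypothetical data: the bw column is missing entirely.
no_column = pd.DataFrame({"elements": [1024, 2048]})

# dropna() removes the NaN entries, leaving an empty Series,
# so the skip condition from the diff evaluates to True.
skip = all_nan["bw"].dropna().empty
print(skip)

# Indexing a column that does not exist raises KeyError,
# which is the loud failure the author prefers for malformed input.
try:
    no_column["bw"]
    raised = False
except KeyError:
    raised = True
print(raised)
```

With this behavior, benchmarks that report no bandwidth are skipped with a warning, while structurally wrong input still fails immediately.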

Comment thread benchmarks/scripts/sol.py Outdated
Contributor

Important: alg_dfs() can now return an empty dict when all selected benchmarks are skipped, which causes alg_bws() to crash because medians remains None. We should add an early return with an error or warning message.
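
One way to express such a guard is sketched below. This is an assumption about the shape of the fix, not the code merged in sol.py: `plot_all` is a hypothetical stand-in for the caller, and the plotting step is replaced by a print.

```python
import sys


def plot_all(frames: dict) -> bool:
    """Return False without aggregating when every benchmark was skipped.

    Hypothetical stand-in for the caller of alg_dfs(); `frames` plays the
    role of the dict that alg_dfs() returns.
    """
    if not frames:
        # Nothing survived the skip checks, so report it and stop early
        # instead of passing an empty dict to the aggregation step.
        print("ERROR: all benchmarks were skipped, nothing to plot", file=sys.stderr)
        return False
    for name in frames:
        print(f"plotting {name}")  # placeholder for aggregation and plotting
    return True


print(plot_all({}))
print(plot_all({"reduce.sum": [0.9, 0.8]}))
```

Checking the dict once at the call site keeps the downstream aggregation free of None handling.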

Labels

None yet

Projects

Status: In Review

Development

Successfully merging this pull request may close these issues.

sol.py fails to plot some benchmarks

2 participants