Conversation

@allkoow (Contributor) commented Feb 12, 2026

Summary

  1. New report generation strategy for OSU: parses OSU stdout into CSV with three output types (see the sketch after this list):
  • Latency – headers with Avg Latency(us) (full and short),
  • Bandwidth – Bandwidth (MB/s) or MB/s,
  • Multiple bandwidth – MB/s and Messages/s.
  2. Skip benchmarks with no table data (osu_init, osu_hello).
  3. Add comparison report with tables and charts.
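
For illustration, a minimal sketch of the header detection described in point 1. The BenchmarkType enum and its LATENCY/BANDWIDTH/MULTIPLE_BANDWIDTH members are named in this PR and its review thread; the helper function and the exact matching logic below are assumptions, not the PR's code.

    from enum import Enum


    class BenchmarkType(Enum):
        LATENCY = "latency"
        BANDWIDTH = "bandwidth"
        MULTIPLE_BANDWIDTH = "multiple_bandwidth"


    def detect_benchmark_type(header: str) -> BenchmarkType | None:
        """Classify an OSU table header line (hypothetical helper)."""
        if "Messages/s" in header:
            return BenchmarkType.MULTIPLE_BANDWIDTH  # MB/s plus Messages/s columns
        if "Avg Latency(us)" in header or "Latency (us)" in header:
            return BenchmarkType.LATENCY  # full and short latency headers
        if "Bandwidth (MB/s)" in header or "MB/s" in header:
            return BenchmarkType.BANDWIDTH
        return None  # no table header at all (e.g., osu_init, osu_hello)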

Test Plan

Tested on internal environment:

  • Report generation: verified CSV report generation for osu_latency, osu_bw, osu_get_bw, osu_put_bw, osu_bibw, osu_mbw_mr, osu_multi_lat.
  • Comparison report: verified the comparison report for scenarios with 2 runs (osu_bw, osu_latency).

@coderabbitai bot commented Feb 12, 2026

📝 Walkthrough

Adds OSU benchmark parsing and CSV emission per run, a report-generation strategy, a comparison report producing tables/charts, registers and exports the new report/strategy, and extends tests to cover parsing, exports, and registry state.

Changes

  • Report generation strategy (src/cloudai/workloads/osu_bench/report_generation_strategy.py): New module that parses stdout.txt for OSU benchmark outputs into a pandas DataFrame and writes osu_bench.csv; adds a BenchmarkType enum and parsing helpers; implements OSUBenchReportGenerationStrategy with can_handle_directory and generate_report (a rough sketch follows this list).
  • Comparison report (src/cloudai/workloads/osu_bench/osu_comparison_report.py): New OSUBenchComparisonReport, a ComparisonReport subclass that loads each run's osu_bench.csv, validates/normalizes the data, and builds metric-specific tables and Bokeh charts (latency, bandwidth, message rate).
  • Workload package exports (src/cloudai/workloads/osu_bench/__init__.py): __all__ now includes OSUBenchComparisonReport and OSUBenchReportGenerationStrategy.
  • Workload behavior tweak (src/cloudai/workloads/osu_bench/osu_bench.py): Adds an early-success path: the osu_hello and osu_init benchmarks short-circuit to a successful status before standard output validation.
  • Registration (src/cloudai/registration.py): Registers OSUBenchTestDefinition with OSUBenchReportGenerationStrategy and adds a scenario report osu_bench_comparison using OSUBenchComparisonReport with ComparisonReportConfig(enable=True, group_by=["benchmark"]).
  • Tests (tests/report_generation_strategy/test_osu_bench_report_generation_strategy.py, tests/test_init.py, tests/test_reporter.py, tests/test_test_scenario.py): Adds tests for extract_osu_bench_data (multiple OSU formats and edge cases); updates imports/expectations to include OSUBenchComparisonReport and OSUBenchReportGenerationStrategy; test fixtures now snapshot/restore reg.report_configs alongside reg.scenario_reports; adjusts the default reporter count.
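
Based on this walkthrough, the strategy's surface might look roughly as follows. Only the names extract_osu_bench_data, can_handle_directory, generate_report, and osu_bench.csv come from the PR; the method bodies and signatures are illustrative assumptions, not the actual implementation.

    from pathlib import Path

    from cloudai.workloads.osu_bench.report_generation_strategy import extract_osu_bench_data


    class OSUBenchReportGenerationStrategy:  # real base classes omitted in this sketch
        def can_handle_directory(self, directory: Path) -> bool:
            # Assumption: the strategy applies when the run directory has OSU stdout.
            return (directory / "stdout.txt").exists()

        def generate_report(self, directory: Path) -> None:
            df = extract_osu_bench_data(directory / "stdout.txt")
            if df.empty:
                return  # osu_init / osu_hello produce no table data and are skipped
            df.to_csv(directory / "osu_bench.csv", index=False)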

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐇 I nibbled stdout, counted bytes and time,

Rows became carrots, neat and prime.
CSV trails and Bokeh lights gleam,
Tables hop forward—one happy stream.
Hooray, the bench reports follow my dream! 🥕

🚥 Pre-merge checks: ✅ 3 passed
  • Title check ✅ Passed: The title clearly and specifically summarizes the main change: adding report generation functionality for OSU Benchmark.
  • Description check ✅ Passed: The description is directly related to the changeset, providing context about the report generation strategy, benchmark types handled, and test coverage for the new OSU Benchmark reporting feature.
  • Merge Conflict Detection ✅ Passed: No merge conflicts detected when merging into main.



No actionable comments were generated in the recent review. 🎉



@greptile-apps bot commented Feb 12, 2026

Greptile Summary

This PR implements comprehensive report generation for OSU benchmarks, parsing stdout into CSV format and creating comparison reports with tables and charts.

Key Changes:

  • Adds new parsing strategy that handles three OSU benchmark types: latency (with avg values), bandwidth (single metric), and multiple bandwidth (MB/s + Messages/s)
  • Implements comparison report functionality that groups test runs and generates visualizations for each metric type
  • Updates success validation to skip table data checks for special benchmarks (osu_hello, osu_init) that only produce summary output
  • Includes comprehensive test coverage for all parsing scenarios including edge cases (empty files, missing headers, headers without data)

Implementation Quality:

  • Edge case handling is thorough with proper validation for missing files, empty output, and malformed data
  • Previous review feedback has been addressed (column validation, empty dataframe handling, duplicate comments removed)
  • Test coverage is extensive with 7 test cases covering normal operation and edge cases
  • Code follows existing patterns in the codebase (similar to NCCL and NIXL bench implementations)

Confidence Score: 4/5

  • This PR is safe to merge with minor caveats around the @cache decorator
  • The implementation is solid, with comprehensive test coverage and proper edge-case handling. All previous review comments have been addressed. The main consideration is the @cache decorator on extract_osu_bench_data, which could return stale data if files change; the developer confirmed this is acceptable since benchmark result files don't change after creation (see the sketch after this list). The parsing logic is straightforward, with appropriate validation at each step.
  • No files require special attention - all changes follow established patterns and include proper test coverage
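
To make the @cache caveat concrete, here is a sketch of the pattern in question. The function name and Path argument come from the review thread; the body is illustrative only.

    from functools import cache
    from pathlib import Path

    import pandas as pd


    @cache
    def extract_osu_bench_data(stdout_path: Path) -> pd.DataFrame:
        # The cache key is the Path argument, not the file contents: if
        # stdout.txt were rewritten, a second call would still return the first
        # parse (stale data). Acceptable here because result files are write-once.
        if not stdout_path.exists():
            return pd.DataFrame()
        text = stdout_path.read_text()  # parsing of `text` is elided in this sketch
        return pd.DataFrame()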

Important Files Changed

  • src/cloudai/workloads/osu_bench/report_generation_strategy.py: New report generation strategy that parses OSU stdout into CSV with three benchmark types (latency, bandwidth, multiple bandwidth). Handles edge cases like missing files, empty output, and benchmarks without table data.
  • src/cloudai/workloads/osu_bench/osu_comparison_report.py: Adds comparison report functionality for OSU benchmarks with tables and charts. Properly validates metric presence across all dataframes before generating visualizations.
  • src/cloudai/workloads/osu_bench/osu_bench.py: Updated success validation to handle special benchmarks (osu_hello, osu_init) that don't produce table output, skipping the "# Size" marker check for these cases (sketched after this list).
  • tests/report_generation_strategy/test_osu_bench_report_generation_strategy.py: Comprehensive test coverage for OSU parsing, including all three benchmark types, edge cases (missing file, empty file, no header, header without data), and proper data validation.
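
The early-success path for osu_hello/osu_init could be as simple as the following sketch. Every name here except the two benchmark names and the "# Size" marker is an assumption; the PR's actual method and attribute names may differ.

    TABLELESS_BENCHMARKS = {"osu_hello", "osu_init"}


    def is_successful(benchmark_name: str, stdout_text: str) -> bool:
        if benchmark_name in TABLELESS_BENCHMARKS:
            return True  # summary-only output: skip the "# Size" table check
        return "# Size" in stdout_text  # normal benchmarks must emit a data table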

Sequence Diagram

sequenceDiagram
    participant Test as OSU Benchmark Test
    participant RGS as OSUBenchReportGenerationStrategy
    participant Parser as extract_osu_bench_data
    participant CR as OSUBenchComparisonReport
    
    Test->>Test: Run benchmark & write stdout.txt
    Test->>RGS: generate_report()
    RGS->>RGS: can_handle_directory()
    RGS->>Parser: extract_osu_bench_data(stdout.txt)
    Parser->>Parser: Detect benchmark type from header
    Parser->>Parser: Parse data rows (size + metrics)
    Parser-->>RGS: Return DataFrame
    RGS->>RGS: Write osu_bench.csv
    
    Note over CR: Comparison across multiple runs
    CR->>CR: load_test_runs()
    loop For each test run
        CR->>CR: extract_data_as_df()
        CR->>CR: Read osu_bench.csv
    end
    CR->>CR: _has_metric() check
    CR->>CR: create_tables() for each metric
    CR->>CR: create_charts() for each metric
    CR->>CR: Generate HTML report

Last reviewed commit: 7ee6d6c

@greptile-apps bot left a comment

8 files reviewed, no comments

@coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@src/cloudai/workloads/osu_bench/osu_comparison_report.py`:
- Around lines 65-142: The code reads each group's CSVs twice because create_tables and create_charts each call extract_data_as_df independently. Add a simple per-report cache to avoid the duplicate reads: implement a helper (e.g., _get_group_dfs) that checks/sets a dict on self (e.g., self._dfs_cache) keyed by a stable identifier for the group (like tuple(item.tr.path) or id(group)) and returns the list produced by [self.extract_data_as_df(item.tr) for item in group.items]. Then replace the direct list comprehensions in create_tables and create_charts with calls to _get_group_dfs, and clear/init self._dfs_cache appropriately (e.g., in the report constructor or before processing) so extract_data_as_df is invoked only once per group.
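
A possible shape for that helper, intended as a method of OSUBenchComparisonReport; the attribute names and the group/item fields follow the comment above and are otherwise unverified assumptions.

    def _get_group_dfs(self, group) -> list:
        """Memoize per-group DataFrames so tables and charts share one CSV read."""
        cache = getattr(self, "_dfs_cache", None)
        if cache is None:
            cache = self._dfs_cache = {}
        key = id(group)  # stable within a single report run, per the suggestion
        if key not in cache:
            cache[key] = [self.extract_data_as_df(item.tr) for item in group.items]
        return cache[key]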

In `@src/cloudai/workloads/osu_bench/report_generation_strategy.py`:
- Around line 64-85: The two branches in _parse_data_row that handle
BenchmarkType.BANDWIDTH and the LATENCY fallback both return [parts[0],
parts[1]]; simplify by removing the explicit BANDWIDTH branch and collapsing
them into a single fallback that returns [parts[0], parts[1]] for any
non-MULTIPLE_BANDWIDTH case (after validating parts and the message-size int),
keeping the MULTIPLE_BANDWIDTH branch unchanged; this reduces duplication while
preserving behavior tied to _columns_for_type.
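
In code, the collapsed version might read as follows. The BenchmarkType import is real per the walkthrough; the signature and validation details are assumptions that the PR may handle differently.

    from cloudai.workloads.osu_bench.report_generation_strategy import BenchmarkType


    def _parse_data_row(parts: list[str], bench_type: BenchmarkType) -> list[str] | None:
        if len(parts) < 2 or not parts[0].isdigit():
            return None  # malformed row or non-integer message size
        if bench_type is BenchmarkType.MULTIPLE_BANDWIDTH:
            return parts[:3] if len(parts) >= 3 else None  # size, MB/s, Messages/s
        return [parts[0], parts[1]]  # single fallback covers LATENCY and BANDWIDTH alike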

In `@tests/report_generation_strategy/test_osu_bench_report_generation_strategy.py`:
- Around lines 63-112: Add three edge-case tests for extract_osu_bench_data to cover its early-return branches: (1) "file not found": call extract_osu_bench_data with a non-existent Path and assert it returns an empty DataFrame (no columns, shape (0, 0), or however empty is represented in your project); (2) "empty file": create an empty stdout file and assert extract_osu_bench_data returns an empty DataFrame; (3) "unrecognized header": write a stdout file containing non-OSU output (e.g., "osu_hello" text) and assert extract_osu_bench_data returns an empty DataFrame. Reference the function extract_osu_bench_data and the module osu_bench.py so the tests exercise the special-case handling added there.
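
A possible shape for those three tests; the exact assertions should match how the project represents an empty DataFrame, and the osu_hello output text is just an example.

    from pathlib import Path

    from cloudai.workloads.osu_bench.report_generation_strategy import extract_osu_bench_data


    def test_missing_file(tmp_path: Path) -> None:
        assert extract_osu_bench_data(tmp_path / "missing_stdout.txt").empty


    def test_empty_file(tmp_path: Path) -> None:
        stdout = tmp_path / "stdout.txt"
        stdout.write_text("")
        assert extract_osu_bench_data(stdout).empty


    def test_unrecognized_header(tmp_path: Path) -> None:
        stdout = tmp_path / "stdout.txt"
        # osu_hello-style output: no OSU table header, so nothing to parse.
        stdout.write_text("# OSU MPI Hello World Test\nThis is a test with 2 processes\n")
        assert extract_osu_bench_data(stdout).empty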

@greptile-apps bot left a comment

9 files reviewed, no comments

@greptile-apps bot left a comment

9 files reviewed, no comments

@greptile-apps bot left a comment

9 files reviewed, 2 comments

@coderabbitai bot left a comment

Caution: Some comments are outside the diff and can't be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/test_test_scenario.py (1)

528-556: 🧹 Nitpick | 🔵 Trivial

Count updated correctly, but OSU bench mapping is not tested in test_custom_reporters.

The total count of 18 accounts for the new OSUBenchTestDefinition registration. However, the test_custom_reporters parametrize list (lines 531–553) only has 16 entries and doesn't include the (OSUBenchTestDefinition, {OSUBenchReportGenerationStrategy}) pair. Consider adding it for completeness.

♻️ Suggested addition to parametrize

Add the import at the top of the file:

from cloudai.workloads.osu_bench import OSUBenchReportGenerationStrategy, OSUBenchTestDefinition

Then add to the parametrize list:

             (AiconfiguratorTestDefinition, {AiconfiguratorReportGenerationStrategy}),
+            (OSUBenchTestDefinition, {OSUBenchReportGenerationStrategy}),

@coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/cloudai/workloads/osu_bench/osu_comparison_report.py`:
- Around lines 71-148: The create_tables and create_charts methods repeat the same check-and-append logic for three metrics. Introduce a class-level metric descriptor, e.g. _METRICS = (("avg_lat", "Latency", "Time (us)"), ("mb_sec", "Bandwidth", "Bandwidth (MB/s)"), ("messages_sec", "Message Rate", "Messages/s")), and refactor both methods to loop over _METRICS: for each (col, title, y_label), build dfs via extract_data_as_df, call self._has_metric(dfs, col), and then call self.create_table(group, dfs=dfs, title=title, info_columns=list(self.INFO_COLUMNS), data_columns=[col]) in create_tables and self.create_chart(group, dfs, title, list(self.INFO_COLUMNS), [col], y_label) in create_charts. Keep the references to _has_metric, create_table, create_chart, and INFO_COLUMNS so behavior is unchanged.
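
Sketched out, the create_charts half of that refactor could look like this. The descriptor tuples are copied from the comment above; the class skeleton and INFO_COLUMNS placeholder are assumptions, and the helper methods referenced exist on the real class per the comment.

    class OSUBenchComparisonReport:  # sketch; the real class subclasses ComparisonReport
        INFO_COLUMNS = ("name",)  # placeholder; real values come from the PR
        _METRICS = (
            ("avg_lat", "Latency", "Time (us)"),
            ("mb_sec", "Bandwidth", "Bandwidth (MB/s)"),
            ("messages_sec", "Message Rate", "Messages/s"),
        )

        def create_charts(self, group) -> None:
            # extract_data_as_df, _has_metric, create_chart are elided here.
            for col, title, y_label in self._METRICS:
                dfs = [self.extract_data_as_df(item.tr) for item in group.items]
                if not self._has_metric(dfs, col):
                    continue  # skip metrics absent from this group's runs
                self.create_chart(group, dfs, title, list(self.INFO_COLUMNS), [col], y_label)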
- Around lines 52-64: In extract_data_as_df, avoid calling df["size"].astype(int) directly, because non-numeric or NaN values will raise IntCastingNaNError. Instead, coerce the column to numeric and drop invalid rows before casting: use pd.to_numeric(df["size"], errors="coerce") (or the lazy.pd equivalent) to convert, drop the rows where the result is NaN, then safely cast to int (or to a nullable integer dtype) and return the cleaned df. Reference df, csv_path, TestRun, and the extract_data_as_df method when making the change.
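
A self-contained illustration of the suggested coercion; the stand-in DataFrame below replaces the CSV read that extract_data_as_df performs.

    import pandas as pd

    df = pd.DataFrame({"size": ["1", "4096", "oops", None]})  # stand-in for the CSV read
    df["size"] = pd.to_numeric(df["size"], errors="coerce")  # non-numeric becomes NaN
    df = df.dropna(subset=["size"])  # drop invalid rows before the cast
    df["size"] = df["size"].astype(int)  # safe now: no NaN to raise IntCastingNaNError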

@greptile-apps bot left a comment

9 files reviewed, 2 comments

@greptile-apps bot left a comment

9 files reviewed, 1 comment

allkoow and others added 2 commits February 13, 2026 08:59
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
@greptile-apps bot left a comment

9 files reviewed, 1 comment

@greptile-apps bot left a comment

9 files reviewed, 1 comment

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
@greptile-apps bot left a comment

9 files reviewed, no comments

@podkidyshev (Contributor) commented Feb 13, 2026

@allkoow there's a merge conflict, please resolve

I misclicked!! Sorry, at least it resolved automatically.

@greptile-apps bot left a comment

9 files reviewed, no comments

@amaslenn requested a review from Bohatchuk, February 13, 2026 15:40