Display kernel stats and framework op stats in GB-25 CI #2316

Open: gbaraldi wants to merge 1 commit into EnzymeAD:main from gbaraldi:gb25-pm-counters

@gbaraldi

Summary

Enhance the GB-25 CI "Display profile results" step to show per-kernel profiling data from the xplane trace.

Changes

  • Display kernel stats (register count, occupancy) for each profiled kernel
  • Display framework op stats (memory bandwidth, operational intensity, bottleneck classification) per op
  • Remove stale mst3 reactant_commit matrix entry (branch no longer exists)

These Reactant APIs (get_kernel_stats, get_framework_op_stats) are already on Reactant main (merged in EnzymeAD/Reactant.jl#2703 and #2705). Once PM counters are enabled in the GB-25 with_profiler call (PRONTOLab/GB-25#267), the framework op stats will also include measured memory bandwidth and operational intensity from CUPTI PM sampling.
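
For readers unfamiliar with how operational intensity relates to a bottleneck label, here is a minimal roofline-style sketch in Python. It is not the logic Reactant or XLA uses (this PR does not show that); the function names and the peak FLOP/s and bandwidth values are hypothetical placeholders, and operational intensity is assumed to be in FLOP/byte.

```python
# Illustrative roofline classification. All names and hardware peaks here are
# hypothetical; they are not taken from Reactant, XLA, or any GPU datasheet.

def attainable_flops(oi, peak_flops, peak_bw):
    """Roofline ceiling: min(compute peak, oi * memory bandwidth)."""
    return min(peak_flops, oi * peak_bw)

def classify_bound(oi, peak_flops, peak_bw):
    """Label an op 'memory' if its operational intensity lies below the
    ridge point (peak_flops / peak_bw), else 'compute'."""
    ridge = peak_flops / peak_bw
    return "memory" if oi < ridge else "compute"

# Placeholder peaks, chosen only to make the arithmetic easy to follow.
PEAK_FLOPS = 60e12  # FLOP/s (assumed)
PEAK_BW = 3e12      # bytes/s (assumed)

# Operational intensities copied from the example output in this PR.
for name, oi in [("gemm_fusion_dot.1", 2.55), ("loop_add_fusion", 0.14)]:
    label = classify_bound(oi, PEAK_FLOPS, PEAK_BW)
    ceiling = attainable_flops(oi, PEAK_FLOPS, PEAK_BW)
    print(f"{name}: oi={oi} bound={label} ceiling={ceiling:.3g} FLOP/s")
```

With these placeholder peaks the ridge point is 20 FLOP/byte, so both ops from the example fall on the memory-bound side, which is at least consistent with the bound=HBM labels reported there.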

Example output (from local H100 run)

=== Kernel Stats ===
  gemm_fusion_dot_1: regs=32 occ=59.375%
  loop_add_fusion: regs=16 occ=31.25%

=== Framework Op Stats ===
  gemm_fusion_dot.1: bw=2.2 GB/s, oi=2.55, bound=HBM
  loop_add_fusion: bw=0.0 GB/s, oi=0.14, bound=HBM
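
If the CI log needs to be consumed by a script, the output above could be parsed along these lines. This is a sketch that assumes the line format stays exactly as in the example; the regexes and the parse_profile_output helper are hypothetical and not part of this PR.

```python
import re

# Patterns assume the exact line shapes shown in the example output.
KERNEL_RE = re.compile(r"^(?P<name>[\w.]+): regs=(?P<regs>\d+) occ=(?P<occ>[\d.]+)%$")
OP_RE = re.compile(
    r"^(?P<name>[\w.]+): bw=(?P<bw>[\d.]+) GB/s, oi=(?P<oi>[\d.]+), bound=(?P<bound>\w+)$"
)

def parse_profile_output(text):
    """Split the displayed profile text into kernel stats and op stats dicts."""
    kernels, ops = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        m = KERNEL_RE.match(line)
        if m:
            kernels[m["name"]] = {"regs": int(m["regs"]), "occ_pct": float(m["occ"])}
            continue
        m = OP_RE.match(line)
        if m:
            ops[m["name"]] = {
                "bw_gbps": float(m["bw"]),
                "oi": float(m["oi"]),
                "bound": m["bound"],
            }
    return kernels, ops

SAMPLE = """\
=== Kernel Stats ===
  gemm_fusion_dot_1: regs=32 occ=59.375%
  loop_add_fusion: regs=16 occ=31.25%

=== Framework Op Stats ===
  gemm_fusion_dot.1: bw=2.2 GB/s, oi=2.55, bound=HBM
  loop_add_fusion: bw=0.0 GB/s, oi=0.14, bound=HBM
"""

kernels, ops = parse_profile_output(SAMPLE)
```

Header lines and blank lines match neither pattern and are skipped, so the same function works on a larger log as long as the stat lines keep this shape.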

🤖 Generated with Claude Code

Enhance the "Display profile results" step to show per-kernel
hardware counter data from the profiled loop2 run:
- Kernel stats: register count and occupancy per kernel
- Framework op stats: memory bandwidth, operational intensity,
  and bottleneck classification per op

These APIs are available on Reactant main (merged in #2703/#2705).
When PM counters are collected (via GB-25 add-pm-counters branch),
the framework op stats will include measured_memory_bw and
operational_intensity from CUPTI PM sampling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>