Bench/context sweep snapshot #627

Open
huangzhenhua111 wants to merge 4 commits into UbiquitousLearning:main from huangzhenhua111:bench/context-sweep-snapshot

@huangzhenhua111

What

  • Add env-gated matmul shape aggregation (MLLM_MATMUL_SHAPE_LOG) to report top GEMM/SGEMM shapes by estimated FLOPs.

  • Add sweep_context_v2.sh to run context sweep (prefill TTFT + decode-heavy).

  • Add snapshot generator script (make_snapshot_nopandas.py) and ship a ready-to-share snapshot bundle:

    • raw CSV (bench_artifacts/data/*.csv)
    • plots (bench_artifacts/plots/*.png)
    • 1-page summary (bench_artifacts/snapshot.md)
    • reproduction guide (bench_artifacts/README.md)
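
For illustration, the shape aggregation described in the first bullet could look roughly like the sketch below. This is not the code in this PR: it assumes each matmul call is recorded as an (M, N, K) triple and that the cost of a call is estimated as 2*M*N*K FLOPs. The function name `top_shapes_by_flops` is hypothetical.

```python
from collections import Counter


def top_shapes_by_flops(calls, top_n=5):
    """Aggregate (M, N, K) matmul calls and rank shapes by estimated FLOPs.

    Assumption: one call costs about 2*M*N*K FLOPs
    (one multiply and one add per inner-product term).
    """
    counts = Counter()  # number of calls seen per shape
    flops = Counter()   # accumulated estimated FLOPs per shape
    for m, n, k in calls:
        shape = (m, n, k)
        counts[shape] += 1
        flops[shape] += 2 * m * n * k
    # Highest accumulated FLOPs first.
    ranked = sorted(flops.items(), key=lambda item: item[1], reverse=True)
    return [(shape, counts[shape], total) for shape, total in ranked[:top_n]]
```

Ranking by accumulated FLOPs rather than by call count keeps a single large prefill GEMM from being hidden behind many small decode-time calls.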

Why

  • Provide a minimal “one-click bench + stable CSV + at least one memory metric (peak RSS + KV estimate)” deliverable for the monthly package.
  • Make bottlenecks measurable (tinyBLAS small-tile GEMM dominates; shapes are now visible).
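
As a rough sketch of what the memory metric could involve (an assumption, not necessarily how the scripts in this PR compute it): a KV-cache estimate commonly counts one K and one V tensor per layer, and peak RSS can be read through `resource.getrusage` on Unix-like systems. The helper names and the `bytes_per_elem=2` default (16-bit cache entries) are hypothetical.

```python
import resource
import sys


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Estimate KV-cache size in bytes for one sequence.

    Assumption: K and V each store n_kv_heads * head_dim values
    per token per layer, hence the leading factor of 2.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


def peak_rss_bytes(who=resource.RUSAGE_SELF):
    """Return peak resident set size in bytes (Unix only).

    ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    Pass resource.RUSAGE_CHILDREN to cover child processes that have
    already been waited for, such as a finished benchmark binary.
    """
    peak = resource.getrusage(who).ru_maxrss
    return peak if sys.platform == "darwin" else peak * 1024
```

Reporting both numbers side by side shows how much of the measured peak can be attributed to the cache as the context length grows.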

How to reproduce

  • Tested on x86_64 (WSL/Ubuntu), CPU backend
  • See bench_artifacts/README.md.
