Skip to content

Optimize CP sequence KL communication for GSPO/OPSM#1948

Open
zzdeae86 wants to merge 2 commits into
THUDM:mainfrom
zzdeae86:codex/optimize-cp-gspo-kl
Open

Optimize CP sequence KL communication for GSPO/OPSM#1948
zzdeae86 wants to merge 2 commits into
THUDM:mainfrom
zzdeae86:codex/optimize-cp-gspo-kl

Conversation

@zzdeae86
Copy link
Copy Markdown

Summary

This optimizes the context-parallel sequence-KL path used by GSPO/OPSM.

Previously, GSPO/OPSM gathered full response-length log-prob tensors across CP ranks before computing sequence-level KL. This PR changes the path to compute local KL numerators on each CP rank and all-reduce only per-sample scalars, reducing CP communication for long responses.

It also skips entropy computation when entropy_coef == 0.

Changes

  • Add get_local_response_mask_with_cp
  • Compute GSPO sequence KL from local CP chunks plus scalar all-reduce
  • Update OPSM sequence KL/mask computation to avoid full response all-gather
  • Add CP=2/4 distributed unit coverage for the local sequence-KL path

Validation

  • python3 -m py_compile ...
  • git diff --check
  • .venv/bin/python -m pytest --import-mode=importlib

Local pytest result on Mac:

  • 262 passed, 11 failed
  • Related tests passed:
    • tests/test_gspo_cp_sequence_kl.py
    • tests/test_loss_cp_invariance.py
    • tests/test_metric_report_dist.py

Remaining local failures appear environment-related on this Mac setup, involving Megatron/Ray/SGLang router stubs and tokenizer-dependent tests.

@zzdeae86
Copy link
Copy Markdown
Author

This PR addresses #1062 by reducing CP communication in the GSPO/OPSM sequence-KL path. The local CP correctness tests pass, and I would appreciate feedback on whether you want an additional benchmark or integration test.

@zzdeae86
Copy link
Copy Markdown
Author

Hi maintainers, I do not see CI checks on this PR yet. Could you please help trigger the default checks or let me know if any label is needed? For this change, cpu-unittest and possibly run-ci-changed should be relevant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant