[train] Add "sum" loss reduction mode#1260

Draft
tyler-griggs wants to merge 1 commit into main from tgriggs/sum-loss-reduction

@tyler-griggs
Member

Summary

  • Adds loss_reduction: "sum" option to reduce_loss()
  • Computes a raw sum over valid tokens, so gradient magnitude scales with batch size and sequence length
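
To illustrate the scaling behavior described above, here is a minimal sketch in plain Python. It is not the code from this PR: `reduce_loss_sketch` is a hypothetical stand-in for `reduce_loss()`, its signature is assumed, and the `"token_mean"` mode name is an assumption used only for comparison. Only the `"sum"` option comes from this PR.

```python
from typing import List


def reduce_loss_sketch(
    per_token_loss: List[List[float]],
    mask: List[List[int]],
    loss_reduction: str = "token_mean",
) -> float:
    """Hypothetical stand-in for reduce_loss(); the signature is assumed."""
    total = 0.0
    count = 0
    for row_loss, row_mask in zip(per_token_loss, mask):
        for loss, valid in zip(row_loss, row_mask):
            if valid:  # skip padded / masked-out tokens
                total += loss
                count += 1
    if loss_reduction == "sum":
        # Raw sum: grows with the number of valid tokens.
        return total
    if loss_reduction == "token_mean":
        # Assumed comparison mode: average over valid tokens.
        return total / max(count, 1)
    raise ValueError(f"unknown loss_reduction: {loss_reduction!r}")


# Two sequences; the second has two padded positions.
losses = [[0.5, 1.0, 2.0], [1.5, 0.0, 0.0]]
mask = [[1, 1, 1], [1, 0, 0]]

sum_small = reduce_loss_sketch(losses, mask, "sum")
mean_small = reduce_loss_sketch(losses, mask, "token_mean")

# Duplicating the batch doubles the summed loss but leaves the mean unchanged.
sum_large = reduce_loss_sketch(losses * 2, mask * 2, "sum")
mean_large = reduce_loss_sketch(losses * 2, mask * 2, "token_mean")
```

Under these assumptions, doubling the batch doubles the `"sum"` loss (and therefore its gradient), while the mean stays fixed. That is why the sum mode is likely most appropriate when the learning rate is chosen with batch size in mind.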

Test plan

  • Existing loss tests pass

🤖 Generated with Claude Code

Adds raw sum reduction to reduce_loss(), where gradient magnitude
scales with batch size and sequence length. Useful when LR is
explicitly scaled with batch size.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
