
fix(sft): sanitize generation config to prevent save_pretrained crash #24

Merged
Neonkraft merged 4 commits into main from fix/olmo3-generation-config on May 1, 2026

Conversation

@Neonkraft (Collaborator)

Summary

OLMo-3 Think models ship a generation_config.json with temperature/top_p set but do_sample=False. This is harmless at training time (we never call model.generate), but transformers >= 5.x runs strict validation inside GenerationConfig.save_pretrained and rejects the inconsistency:

ValueError: GenerationConfig is invalid:
  - `temperature` is set to 0.6 -- this flag is only used in sample-based
    generation modes. You should set `do_sample=True` or unset `temperature`.

Since every checkpoint save calls model.save_pretrained, this crashes the job at the very first checkpoint. The fix sets do_sample=True in-memory on the trainer's model immediately after construction; the change is local to the run and leaves the upstream Hub files untouched.

AllenAI's open-instruct solves the same issue by stripping the sampling params instead; we prefer setting do_sample=True to preserve the model's recommended inference settings in saved checkpoints.

Type of change

  • [x] Bug fix
  • [ ] New feature
  • [ ] Refactor
  • [ ] Performance
  • [ ] Documentation
  • [ ] Maintenance

…LMo-3 Think

OLMo-3 Think models ship temperature/top_p with do_sample=False.
transformers >= 5.x strict validation rejects this in
GenerationConfig.save_pretrained, crashing every checkpoint save.
Set do_sample=True in-memory on the trainer's model after construction.
The upstream Hub files are unmodified; saved checkpoints preserve the
model's recommended inference settings.
@Neonkraft requested a review from KonstiNik on April 29, 2026 at 14:38
@KonstiNik (Collaborator) left a comment

Thanks for the fix. One question:

Is it worth applying this to dpo as well? dpo.py:59-66 constructs DPOTrainer(model=name_or_path) the same way and saves checkpoints via the same path. Suggest moving _sanitize_generation_config to common.py and calling it from both build_sft_trainer and build_dpo_trainer. What do you think?

@Neonkraft (Collaborator, Author)

Yes, makes sense. Fixed.

@KonstiNik (Collaborator) left a comment

Great that DPO is covered as well!
As a forward-looking note: HF's strict validator actually checks eight sampling-mode parameters, not just three — min_p, top_h, typical_p, epsilon_cutoff and eta_cutoff aren't covered by the current heuristic (configuration_utils.py:626-654). Fine for OLMo-3 Think specifically, but worth keeping in mind for future models that ship with those params set.

@Neonkraft changed the title from "fix(sft): sanitize generation config to prevent save_pretrained crash on OLMo-3 Think" to "fix(sft): sanitize generation config to prevent save_pretrained crash" on Apr 30, 2026
@Neonkraft (Collaborator, Author)

Thanks for pointing this out. Might as well deal with it now, per LeBlanc's Law ("later equals never"). Fixed.

@KonstiNik (Collaborator)

Great addition! Approving from my side.

@Neonkraft merged commit 7045e45 into main on May 1, 2026
2 checks passed