Fix FLUX2 Klein load-time VRAM spikes on low-memory GPUs. by mhofer1976 · Pull Request #726 · ostris/ai-toolkit

mhofer1976 · 2026-02-25T05:59:03Z

Keep the transformer and Qwen text encoder off CUDA during initial load/quantization in low-VRAM mode so model startup avoids full-model OOM before offloading and quantization can take effect.
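To see why the load order matters, here is a rough back-of-the-envelope sketch. It is not code from this PR. The parameter count, the 16-bit load precision, and the 8-bit quantized precision are illustrative assumptions, and it counts weight storage only (no activations, quantization scales, or allocator overhead).

```python
# Toy estimate of peak GPU weight memory for two load orders.
# All numbers are illustrative assumptions, not measurements from this PR.

def weight_bytes(num_params: int, bits_per_param: int) -> int:
    """Bytes needed to store num_params weights at the given precision."""
    return num_params * bits_per_param // 8


def peak_eager_cuda(num_params: int, load_bits: int, quant_bits: int) -> int:
    """Model is moved to the GPU at load precision, then quantized there.

    The GPU must hold at least the larger of the two footprints.
    """
    return max(
        weight_bytes(num_params, load_bits),
        weight_bytes(num_params, quant_bits),
    )


def peak_quantize_before_move(num_params: int, quant_bits: int) -> int:
    """Model is quantized off-GPU; only quantized weights reach the GPU."""
    return weight_bytes(num_params, quant_bits)


# Assumed: a 9-billion-parameter model loaded in 16-bit, quantized to 8-bit.
params = 9_000_000_000
eager = peak_eager_cuda(params, load_bits=16, quant_bits=8)
deferred = peak_quantize_before_move(params, quant_bits=8)

print(f"eager move to GPU:    {eager / 1e9:.0f} GB of weights")
print(f"quantize, then move:  {deferred / 1e9:.0f} GB of weights")
```

Under these assumptions the eager path needs twice the weight memory of the deferred path, which is the kind of gap that can decide whether a low-memory GPU survives startup.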

Summary

avoid moving the full FLUX2 transformer to CUDA before quantization, which caused startup OOM on low-memory GPUs
keep the transformer/text-encoder on CPU during low-VRAM model preparation and only move as needed
use qtype_te for Klein Qwen text-encoder quantization instead of the transformer qtype
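The last point can be sketched as a small selection helper. This is a hypothetical illustration, not the toolkit's code: `QuantConfig` and `pick_qtype` are made-up names, the `qtype` field name for the transformer setting is an assumption, and the string values are placeholders.

```python
from dataclasses import dataclass


@dataclass
class QuantConfig:
    """Hypothetical stand-in for a model config with two quantization settings."""
    qtype: str     # assumed name for the transformer quantization type
    qtype_te: str  # quantization type for the text encoder


def pick_qtype(cfg: QuantConfig, component: str) -> str:
    """Return the quantization type that applies to the given component."""
    if component == "transformer":
        return cfg.qtype
    if component == "text_encoder":
        # The bug class described above: using cfg.qtype here would silently
        # ignore the user's text-encoder setting.
        return cfg.qtype_te
    raise ValueError(f"unknown component: {component!r}")


# Placeholder values chosen only to make the two settings distinguishable.
cfg = QuantConfig(qtype="transformer-setting", qtype_te="text-encoder-setting")
print(pick_qtype(cfg, "text_encoder"))
```

Keeping the choice in one place makes it harder for a new model path to reuse the transformer setting for the text encoder by accident.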

Test plan

Reproduce the prior failure on FLUX2 Klein 9B at the "Loading transformer" stage with low-VRAM settings
Verify loader no longer calls full-model CUDA move before quantization in the FLUX2 path
Verify Klein TE path no longer eagerly loads full TE to CUDA and uses qtype_te
Run a full training smoke test on a low-VRAM GPU and confirm model loads and begins training
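The two "verify" items could be checked without a GPU by recording the order of operations on a stand-in model. The sketch below is one possible approach, not the test used for this PR: `RecordingModel`, `eager_load`, `low_vram_load`, and `no_cuda_move_before_quantize` are all hypothetical names.

```python
class RecordingModel:
    """Stand-in model that records device moves and quantization calls."""

    def __init__(self):
        self.events = []

    def to(self, device):
        kind = "to_cuda" if str(device).startswith("cuda") else "to_cpu"
        self.events.append(kind)
        return self

    def quantize(self):
        self.events.append("quantize")
        return self


def eager_load(model):
    """Old-style order: whole model to the GPU first, then quantize."""
    return model.to("cuda").quantize()


def low_vram_load(model):
    """Deferred order: stay on CPU, quantize, then move to the GPU."""
    return model.to("cpu").quantize().to("cuda")


def no_cuda_move_before_quantize(events):
    """True if no 'to_cuda' event appears before the first 'quantize' event."""
    for event in events:
        if event == "quantize":
            return True
        if event == "to_cuda":
            return False
    return True  # nothing was moved to the GPU at all


old = eager_load(RecordingModel())
new = low_vram_load(RecordingModel())
print(old.events, no_cuda_move_before_quantize(old.events))
print(new.events, no_cuda_move_before_quantize(new.events))
```

A check like this only confirms call order. It does not replace the smoke test on a real low-VRAM GPU, since actual peak memory depends on the allocator and on how quantization is implemented.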

Keep the transformer and Qwen text encoder off CUDA during initial load/quantization in low-VRAM mode so model startup avoids full-model OOM before offloading and quantization can take effect.

Co-authored-by: Cursor <cursoragent@cursor.com>

inflamously · 2026-03-04T23:33:47Z

Omg yes please, anything that fixes these annoying VRAM spikes between sampling and at random steps. I have an RTX 5090 that can run FLUX2 Klein 9B without any problems, but the longer the run, the more it spikes.

jaretburkett · 2026-04-01T15:37:15Z

Thank you @mhofer1976

mhofer1976 force-pushed the fix/flux2-klein-low-vram-loading branch from 280fb32 to 02dd161 · February 25, 2026 06:01

Merge branch 'main' into fix/flux2-klein-low-vram-loading

181428e

jaretburkett merged commit f213e3b into ostris:main Apr 1, 2026

jaretburkett merged 2 commits into ostris:main from mhofer1976:fix/flux2-klein-low-vram-loading
