Goal
Add LoRA support to the FSDP trainer path and provide a minimal math LoRA recipe to verify the end-to-end workflow.
Context
We want AstraFlow to support parameter-efficient fine-tuning with FSDP. The main thing to be careful about is weight transfer: when LoRA is enabled, the trainer updates only the adapter weights, while rollout workers still need to load the correct base model plus the latest LoRA adapter. No delta transfer is needed for LoRA training.
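To make the weight-transfer concern concrete, here is a minimal sketch of selecting only adapter tensors before sending them to rollout workers. It assumes adapter parameter names contain `lora_` (as in the PEFT naming scheme, e.g. `lora_A` / `lora_B`); `select_adapter_weights` and the example keys are hypothetical and not part of AstraFlow.

```python
from typing import Any, Dict

# Assumption: adapter parameters carry "lora_" in their names, following the
# PEFT convention (e.g. "...q_proj.lora_A.default.weight").
ADAPTER_MARKER = "lora_"


def select_adapter_weights(state_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the entries that belong to the LoRA adapter."""
    return {name: w for name, w in state_dict.items() if ADAPTER_MARKER in name}


# Illustrative state dict; strings stand in for tensors.
example_state = {
    "model.layers.0.self_attn.q_proj.base_layer.weight": "frozen",
    "model.layers.0.self_attn.q_proj.lora_A.default.weight": "trainable",
    "model.layers.0.self_attn.q_proj.lora_B.default.weight": "trainable",
}

# Only the two adapter entries would be transferred; the base weight stays put.
payload = select_adapter_weights(example_state)
```

Under this assumption the rollout worker keeps its copy of the base model and only swaps in the transferred adapter tensors.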
Scope
This issue includes:
- Add LoRA config support for the FSDP trainer.
- Initialize the model with LoRA enabled during FSDP training.
- Handle LoRA weight loading in SGLang.
- Save LoRA adapter checkpoints with enough metadata for rollout workers.
- Make sure LoRA weights can be transferred from trainer to rollout worker.
- Add a minimal math LoRA recipe.
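As a starting point for the config and checkpoint-metadata items above, the following sketch shows one possible shape. Everything here is an assumption for discussion: the `LoRAConfig` fields, their defaults, the `adapter_meta.json` file name, and `write_adapter_metadata` are hypothetical, not an existing AstraFlow API.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import List


@dataclass
class LoRAConfig:
    """Hypothetical LoRA section of the FSDP trainer config."""
    enabled: bool = False
    rank: int = 16
    alpha: int = 32
    dropout: float = 0.0
    target_modules: List[str] = field(default_factory=lambda: ["q_proj", "v_proj"])


def write_adapter_metadata(ckpt_dir: str, base_model: str, step: int,
                           cfg: LoRAConfig) -> Path:
    """Write the metadata a rollout worker would need next to the adapter weights."""
    out = Path(ckpt_dir)
    out.mkdir(parents=True, exist_ok=True)
    meta = {
        "base_model": base_model,  # which base weights the adapter applies to
        "step": step,              # trainer step, so workers can detect stale adapters
        "lora": asdict(cfg),
    }
    path = out / "adapter_meta.json"  # hypothetical file name
    path.write_text(json.dumps(meta, indent=2))
    return path


# Usage: write metadata to a temporary directory and read it back.
with tempfile.TemporaryDirectory() as tmp:
    meta_path = write_adapter_metadata(
        tmp, base_model="placeholder/base-model", step=100,
        cfg=LoRAConfig(enabled=True),
    )
    loaded = json.loads(meta_path.read_text())
```

Recording the base model identifier and the trainer step alongside the adapter is one way to let a rollout worker confirm it is pairing the latest adapter with the matching base model.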