Look into the global-batch load-balancing loss that Qwen3 MoE uses for expert routing, and check whether any Megatron repo already implements it. TODO: run the CPT setup in both Megatron-LM and Megatron-Bridge and compare the resulting training loss and load-balancing loss.
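For reference when comparing the two setups, here is a minimal pure-Python sketch of how micro-batch and global-batch balancing differ. It assumes the common formulation LBL = N_E * sum_i f_i * p_i, where f_i is the fraction of routed token slots assigned to expert i and p_i is the mean router probability for expert i. In the global-batch variant, f_i is accumulated over all micro-batches while p_i stays local. All function names are hypothetical and not taken from Megatron or Qwen code, the normalization of f_i is one convention among several, and in real distributed training the counts would be summed across ranks rather than in a Python loop.

```python
from typing import List

Probs = List[List[float]]  # one row of router probabilities per token


def topk_indices(row: List[float], k: int) -> List[int]:
    """Indices of the k experts with the highest router probability."""
    return sorted(range(len(row)), key=lambda e: row[e], reverse=True)[:k]


def expert_counts(probs: Probs, k: int) -> List[int]:
    """Number of token slots routed to each expert under top-k routing."""
    counts = [0] * len(probs[0])
    for row in probs:
        for e in topk_indices(row, k):
            counts[e] += 1
    return counts


def mean_probs(probs: Probs) -> List[float]:
    """Mean router probability per expert over the tokens in one micro-batch."""
    n = len(probs)
    return [sum(row[e] for row in probs) / n for e in range(len(probs[0]))]


def lbl(f: List[float], p: List[float]) -> float:
    """Assumed form: N_E * sum_i f_i * p_i (equals 1.0 for perfectly uniform routing)."""
    return len(f) * sum(fi * pi for fi, pi in zip(f, p))


def micro_batch_lbl(micro_batches: List[Probs], k: int) -> float:
    """f_i computed inside each micro-batch, then the losses are averaged."""
    losses = []
    for mb in micro_batches:
        f = [c / (len(mb) * k) for c in expert_counts(mb, k)]
        losses.append(lbl(f, mean_probs(mb)))
    return sum(losses) / len(losses)


def global_batch_lbl(micro_batches: List[Probs], k: int) -> float:
    """f_i accumulated over all micro-batches; p_i stays per micro-batch."""
    total = [0] * len(micro_batches[0][0])
    tokens = 0
    for mb in micro_batches:
        for e, c in enumerate(expert_counts(mb, k)):
            total[e] += c
        tokens += len(mb)
    f_global = [c / (tokens * k) for c in total]
    losses = [lbl(f_global, mean_probs(mb)) for mb in micro_batches]
    return sum(losses) / len(losses)


# Two micro-batches that each prefer a different expert (e.g. two domains).
mb_a = [[0.9, 0.1], [0.8, 0.2]]
mb_b = [[0.1, 0.9], [0.2, 0.8]]

micro_loss = micro_batch_lbl([mb_a, mb_b], k=1)    # approximately 1.7
global_loss = global_batch_lbl([mb_a, mb_b], k=1)  # approximately 1.0
print(micro_loss, global_loss)
```

In this toy case each micro-batch is imbalanced on its own, so the micro-batch loss is penalized, while the global-batch loss sits at its balanced value because the experts are evenly used across the whole batch. That difference is what the Megatron-LM vs. Megatron-Bridge comparison should look for in the logged auxiliary loss.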