╔════════════════════════════════════════╗
║ research -- thinking, reasoning models ║
╚════════════════════════════════════════╝
I study how large language models perform multi-step reasoning and how training and post-training methods can improve their reliability, efficiency, and scalability.
My work focuses on the post-training stack for LLMs: supervised fine-tuning (SFT), preference optimization, reinforcement learning methods such as RLVR (reinforcement learning with verifiable rewards), and inference-time compute strategies that improve reasoning without requiring larger models.
I’m also interested in the interpretability of reasoning models: understanding the internal mechanisms that support multi-step reasoning and diagnosing failures such as shortcut reasoning, reward hacking, and unfaithful chain-of-thought.
I am currently building and open-sourcing implementations of reasoning-focused training pipelines, and contributing to LLM infrastructure and post-training frameworks.
* I love SpaceX rockets *
