LiteLLMBackend: mot._thinking not populated for vLLM thinking models #1070

@planetf1

Description

Observed behaviour

When using LiteLLMBackend to route to a vLLM-served thinking model (e.g. Qwen3 with --reasoning-parser qwen3), mot._thinking is never populated — the reasoning trace is silently discarded.

Expected behaviour

mot._thinking should contain the model's reasoning trace, consistent with the behaviour of OpenAIBackend after #1063.

Root cause

vLLM surfaces the reasoning trace under the raw key "reasoning" in the message dict, not "reasoning_content". LiteLLM's normalisation layer (the text_choices["reasoning_content"] = delta.get("reasoning_content") assignment in utils.py) only reads reasoning_content from the wire and never looks at reasoning, so the field is dropped before the response ever reaches Mellea's LiteLLMBackend.processing().

As a result, message.reasoning_content is None and mot._thinking stays "".

This differs from the OpenAIBackend fix in #1063: there the raw openai SDK object preserved the extra field in model_extra, so a fallback probe was possible. With LiteLLM the field is lost earlier in the stack.
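The mismatch can be seen by calling LiteLLM directly against the vLLM server, bypassing Mellea entirely. The sketch below is illustrative only; the model name, port, and prompt are placeholders, and the getattr probe simply reports whatever LiteLLM's normalised message carries:

```python
import litellm

# vLLM started with a reasoning parser, e.g.:
#   vllm serve Qwen/Qwen3-8B --reasoning-parser qwen3
resp = litellm.completion(
    model="openai/Qwen/Qwen3-8B",         # openai/ prefix -> OpenAI-compatible vLLM endpoint
    api_base="http://localhost:8000/v1",
    api_key="EMPTY",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)

msg = resp.choices[0].message
# vLLM put the trace under the raw key "reasoning", but LiteLLM's
# normalisation only copies "reasoning_content", so nothing survives here:
print(getattr(msg, "reasoning_content", None))   # -> None
print(msg.content)                                # final answer is intact
```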

Reproducer

Route a vLLM thinking model through LiteLLM (e.g. using the openai/ provider prefix pointing at a vLLM server with --reasoning-parser qwen3) and observe that mot._thinking is "" after generation completes.
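A pseudocode-style sketch of those steps at the Mellea level is below. The import paths, constructor arguments, and instruct call are assumptions written from memory rather than copied from the current Mellea API, so adjust them to match the real session/backend setup:

```python
# Assumed Mellea API -- treat every name below as a placeholder.
from mellea import MelleaSession
from mellea.backends.litellm import LiteLLMBackend

# vLLM started separately with:
#   vllm serve Qwen/Qwen3-8B --reasoning-parser qwen3
backend = LiteLLMBackend(
    model_id="openai/Qwen/Qwen3-8B",      # openai/ provider prefix pointing at vLLM (assumed arg name)
    base_url="http://localhost:8000/v1",  # assumed arg name
)
m = MelleaSession(backend)

mot = m.instruct("Explain, step by step, why the sky is blue.")
print(repr(mot._thinking))  # observed: "" -- the reasoning trace never arrives
```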
