## Observed behaviour
When using `OpenAIBackend` with a vLLM-served thinking model (Qwen3 with `--reasoning-parser qwen3`), the thinking trace is silently discarded. `mot._thinking` remains `None` even when the model has produced reasoning content.
## Root cause
In `mellea/backends/openai.py`, `processing()` probes for the thinking trace as:

```python
if hasattr(message, "reasoning_content"):
    thinking_chunk = message.reasoning_content
```
However, vLLM 0.20.2 does not expose this as a Python attribute on the openai SDK message object. The thinking trace is present in the raw response dict under the key `"reasoning"`, only visible via `model_dump()`:

```python
resp.choices[0].model_dump()
# {
#   "message": {
#     "content": null,
#     "reasoning": "2 + 2 equals 4.",  # ← here, not 'reasoning_content'
#     ...
#   }
# }
```
`hasattr(message, "reasoning_content")` returns `False`, so `mot._thinking` is never set. The thinking trace is present in `mot._meta["oai_chat_response"]` but never surfaced.
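As a stopgap, the trace can be dug out of the stored raw response by hand. The exact shape of `mot._meta["oai_chat_response"]` is an assumption here; the simulated dict below just mirrors the `model_dump()` output shown above:

```python
# Simulated mot._meta, mirroring the model_dump() shape above;
# the precise structure of the stored response is an assumption.
meta = {
    "oai_chat_response": {
        "choices": [
            {"message": {"content": None, "reasoning": "2 + 2 equals 4."}}
        ]
    }
}

message = meta["oai_chat_response"]["choices"][0]["message"]
thinking = message.get("reasoning")  # None if the server sent no trace
print(thinking)  # 2 + 2 equals 4.
```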
## Expected behaviour
`processing()` should also check the raw response dict for a `"reasoning"` key (vLLM's field name) when `reasoning_content` is absent, and populate `mot._thinking` accordingly.
This is independent of the silent-empty-string issue (#1060) — even when a model produces both text and reasoning content, the thinking trace is currently lost for vLLM-served models.
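A minimal sketch of that fallback logic. The helper name and the `SimpleNamespace` stand-in are illustrative only, not the actual `processing()` code:

```python
from types import SimpleNamespace


def extract_thinking(message, raw_message: dict):
    """Prefer the SDK attribute; fall back to vLLM's raw-dict key."""
    # OpenAI-style field, when the SDK exposes it as an attribute
    thinking = getattr(message, "reasoning_content", None)
    if thinking is None:
        # vLLM 0.20.2 puts the trace under "reasoning" in the raw dict
        thinking = raw_message.get("reasoning")
    return thinking


# Simulated vLLM case: no reasoning_content attribute, key only in the raw dict
msg = SimpleNamespace(content=None)
raw = {"content": None, "reasoning": "2 + 2 equals 4."}
assert extract_thinking(msg, raw) == "2 + 2 equals 4."
```

Checking the attribute first keeps the current behaviour for servers that do expose `reasoning_content`, so the dict lookup only ever runs as a fallback.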
## Environment
- Backend: `OpenAIBackend` (OpenAI-compatible)
- Model: `Qwen/Qwen3-Coder-Next-FP8`
- Inference server: vLLM 0.20.2, flags: `--reasoning-parser qwen3 --enable-auto-tool-choice --tool-call-parser hermes --tensor-parallel-size 2 --quantization fp8 --max-model-len 262144`
- Hardware: 2× GPU (tensor parallel), IBM LSF cluster