
Jinja Exception: System message must be at the beginning with custom provider, docker model runner and qwen 3.5 family models #2327

@k33g

Description

docker-agent version 1.42.0

In theory, when sending a set of messages with the OpenAI API to a local model, there should be only one message with a system role, and it must be the first message in the list.

Until now, inserting multiple messages with a system role hadn't really caused issues (most chat templates are fairly permissive).

However, Qwen3.5's Jinja chat template now rejects a message list of this type.

Qwen3.5's official chat_template.jinja contains logic like:

{%- for message in messages %}
    {%- set content = render_content(message.content, true)|trim %}
    {%- if message.role == "system" %}
        {%- if not loop.first %}
            {{- raise_exception('System message must be at the beginning.') }}
        {%- endif %}

https://huggingface.co/unsloth/Qwen3.5-2B/blob/main/chat_template.jinja#L85

This has an impact (error) on certain use cases of docker-agent with Docker Model Runner when using a custom provider.

1. docker-agent + dmr provider (it works)

If I use the model huggingface.co/unsloth/qwen3.5-2b-gguf:Q4_K_M with the dmr provider, I don't have any issues — every system-type message (such as those used for skill detection) is concatenated with the first system message:

docker-agent provider:

models:

  brain:
    provider: dmr
    model: huggingface.co/unsloth/qwen3.5-2b-gguf:Q4_K_M
    temperature: 0.0
    top_p: 0.95
    presence_penalty: 1.5 
    max_tokens: 65536

Request:

{
  "max_tokens": 65536,
  "messages": [
    {
      "content": "You are Bob, a coding expert\n\n## Custom Shell Tools\n\n### execute_command\nExecute a shell command and return its stdout and stderr output.\n- `command`: The shell command to execute.\n\n\nSkills provide specialized instructions for specific tasks. When a user's request matches a skill's description, use read_skill to load its instructions.\n\n<available_skills>\n  <skill>\n    <name>what-time-is-it</name>\n    <description>display the current date and time</description>\n  </skill>\n  <skill>\n    <name>greetings</name>\n    <description>when the user writes \"node greetings\" to somebody, the agent will run this skill with the appropriate parameter.</description>\n  </skill>\n  <skill>\n    <name>vulcan-salute</name>\n    <description>when the user writes \"vulcan salute\" or \"vulcan-salute\" to somebody, the agent will run this skill with the appropriate parameter.</description>\n  </skill>\n</available_skills>",
      "role": "system"
    },
    {
      "content": "what is your quest?",
      "role": "user"
    }
  ],
  "model": "huggingface.co/unsloth/qwen3.5-2b-gguf:Q4_K_M",
  "parallel_tool_calls": true,
  "presence_penalty": 1.5,
  "stream": true,
  "stream_options": {
    "include_usage": true
  },
  "temperature": 0,
  "tools": [
    {
      "function": {
        "description": "Execute a shell command and return its stdout and stderr output.",
        "name": "execute_command",
        "parameters": {
          "properties": {
            "command": {
              "description": "The shell command to execute.",
              "type": "string"
            }
          },
          "type": "object"
        }
      },
      "type": "function"
    },
    {
      "function": {
        "description": "Read the content of a skill by name. Use this when a user's request matches an available skill.",
        "name": "read_skill",
        "parameters": {
          "properties": {
            "name": {
              "description": "The name of the skill to read",
              "type": "string"
            }
          },
          "required": [
            "name"
          ],
          "type": "object"
        }
      },
      "type": "function"
    }
  ],
  "top_p": 0.95
}

2. docker-agent + custom provider for dmr (error)

If I use the model huggingface.co/unsloth/qwen3.5-2b-gguf:Q4_K_M with a custom provider, I'm getting this error:

all models failed: error receiving from stream: HTTP 500: POST
"http://host.docker.internal:12434/engines/v1/chat/completions": 500 Internal Server Error
{"code":500,"message":"\n------------\nWhile executing CallExpression at line 85, column 32 in
source:\n...first %}↵            {{- raise_exception('System message must be at the beginnin...\n
^\nError: Jinja Exception: System message must be at the beginning.","type":"server_error"}

But why do I need a custom provider for Docker Model Runner? Because I need to connect to Docker Model Runner from inside a Docker Sandbox.

Upon examining the request content, I noticed that the custom provider created 3 messages with a system role:

  • the one I defined in the docker-agent configuration file
  • another one for a custom shell tool I defined
  • another one related to skills
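A client-side workaround would be to fold the extra system messages into the first one before the request is sent, which is essentially what the built-in dmr provider appears to do. Here is a minimal sketch in Python (this is not docker-agent's actual code; the function name and the merging separator are assumptions):

```python
# Sketch: collapse all system-role messages into a single leading one,
# so chat templates that enforce "system message must be first" accept
# the list. Mimics the behaviour observed with the built-in dmr provider;
# not docker-agent's actual implementation.

def merge_system_messages(messages):
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    if not system_parts:
        return others
    # Concatenate every system message into one, placed first.
    merged = {"role": "system", "content": "\n\n".join(system_parts)}
    return [merged] + others

messages = [
    {"role": "system", "content": "You are Bob, a coding expert"},
    {"role": "system", "content": "## Custom Shell Tools"},
    {"role": "system", "content": "Skills provide specialized instructions..."},
    {"role": "user", "content": "what is your favourite colour?"},
]

merged = merge_system_messages(messages)
# The result has exactly one system message, and it comes first.
```

With this transformation applied, the request in section 2 would look like the request in section 1 and pass Qwen3.5's template check.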

docker-agent provider:

providers:
  host_dmr_provider:
    api_type: openai_chatcompletions
    base_url: http://host.docker.internal:12434/engines/v1

models:

  brain:
    provider: host_dmr_provider
    #provider: dmr
    model: huggingface.co/unsloth/qwen3.5-2b-gguf:Q4_K_M
    temperature: 0.0
    top_p: 0.95
    presence_penalty: 1.5
    max_tokens: 65536

Request:

{
  "max_tokens": 65536,
  "messages": [
    {
      "content": "You are Bob, a coding expert\n",
      "role": "system"
    },
    {
      "content": "## Custom Shell Tools\n\n### execute_command\nExecute a shell command and return its stdout and stderr output.\n- `command`: The shell command to execute.\n\n",
      "role": "system"
    },
    {
      "content": "Skills provide specialized instructions for specific tasks. When a user's request matches a skill's description, use read_skill to load its instructions.\n\n<available_skills>\n  <skill>\n    <name>greetings</name>\n    <description>when the user writes \"node greetings\" to somebody, the agent will run this skill with the appropriate parameter.</description>\n  </skill>\n  <skill>\n    <name>vulcan-salute</name>\n    <description>when the user writes \"vulcan salute\" or \"vulcan-salute\" to somebody, the agent will run this skill with the appropriate parameter.</description>\n  </skill>\n  <skill>\n    <name>what-time-is-it</name>\n    <description>display the current date and time</description>\n  </skill>\n</available_skills>",
      "role": "system"
    },
    {
      "content": "what is your favourite colour?",
      "role": "user"
    }
  ],
  "model": "huggingface.co/unsloth/qwen3.5-2b-gguf:Q4_K_M",
  "parallel_tool_calls": true,
  "presence_penalty": 1.5,
  "stream": true,
  "stream_options": {
    "include_usage": true
  },
  "temperature": 0,
  "tools": [
    {
      "function": {
        "description": "Execute a shell command and return its stdout and stderr output.",
        "name": "execute_command",
        "parameters": {
          "additionalProperties": false,
          "properties": {
            "command": {
              "description": "The shell command to execute.",
              "type": [
                "string",
                "null"
              ]
            }
          },
          "required": [
            "command"
          ],
          "type": "object"
        }
      },
      "type": "function"
    },
    {
      "function": {
        "description": "Read the content of a skill by name. Use this when a user's request matches an available skill.",
        "name": "read_skill",
        "parameters": {
          "additionalProperties": false,
          "properties": {
            "name": {
              "description": "The name of the skill to read",
              "type": "string"
            }
          },
          "required": [
            "name"
          ],
          "type": "object"
        }
      },
      "type": "function"
    }
  ],
  "top_p": 0.95
}

3. docker-agent + custom provider for ollama (it works)

docker-agent ollama custom provider:

providers:
  host_ollama_provider:
    api_type: openai_chatcompletions
    base_url: http://host.docker.internal:11434/v1

models:

  brain:
    provider: host_ollama_provider
    #provider: dmr
    model: qwen3.5:2b
    temperature: 0.0
    top_p: 0.95
    presence_penalty: 1.5
    max_tokens: 65536

I took a look at Ollama's code: there is a dedicated renderer for Qwen3.5 models that relaxes the single-system-message rule and assumes there can be multiple system messages in a conversation: https://github.com/ollama/ollama/blob/main/model/renderers/qwen35.go#L138

Ollama doesn't execute the Jinja template embedded in the GGUF. Instead, it generates the token string directly in Go, and even when it finds system messages at a position other than 0, it injects them into the middle of the conversation:

if message.Role == "user" || (message.Role == "system" && i != 0) {
	sb.WriteString(imStartTag + message.Role + "\n" + content + imEndTag + "\n")
}
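For comparison, the same idea can be sketched in Python: a ChatML-style renderer that, like Ollama's Go code, simply emits system messages wherever they appear instead of raising an exception. The tag constants follow the usual Qwen/ChatML format; this is an illustration, not Ollama's actual renderer:

```python
# Illustration only: a ChatML-style renderer that accepts system messages
# at any position, instead of raising like Qwen3.5's Jinja template does.
# Tag constants follow the usual Qwen/ChatML prompt format.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"

def render_chatml(messages):
    parts = []
    for msg in messages:
        # Every message is rendered the same way, regardless of position.
        parts.append(f"{IM_START}{msg['role']}\n{msg['content']}{IM_END}\n")
    parts.append(f"{IM_START}assistant\n")  # generation prompt
    return "".join(parts)

prompt = render_chatml([
    {"role": "system", "content": "You are Bob"},
    {"role": "user", "content": "hello"},
    {"role": "system", "content": "extra instructions"},  # mid-conversation system message
])
```

A renderer like this never sees the template's `raise_exception` path, which is why the same conversation works through Ollama.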

DMR related issue: docker/model-runner#827

Conclusion

In my opinion, in a multi-agent system that shares its history, this issue is likely to occur again when using DMR.

I have several workarounds to continue preparing my demos with docker-agent + DMR + sbx:

  • don't use models from the Qwen 3.5 family
  • don't use sbx in order to use the dmr provider
  • patch the model (that's my next plan 🤓)

I'm not sure what the best strategy is to fix this:

  • do it on the docker-agent side, but the issue will arise with other agents using DMR (e.g. with shared history)
  • do it like Ollama, and create a specific renderer on the DMR side
  • document how to patch the model
  • provide our own version of the model

I'm going to work on the last two points.

Here is the source code of my experiments if you need to reproduce them: https://codeberg.org/docker-agents/custom-provider-tests
