docker-agent version 1.42.0
In theory, when sending a set of messages to a local model through the OpenAI API, there should be only one message with the system role, and it must be the first message in the list.
Until now, inserting multiple messages with a system role hadn't really caused any issues (most chat templates are fairly permissive).
However, now, Qwen3.5's Jinja chat template rejects a message list of this type.
Qwen3.5's official chat_template.jinja contains logic like:
{%- for message in messages %}
    {%- set content = render_content(message.content, true)|trim %}
    {%- if message.role == "system" %}
        {%- if not loop.first %}
            {{- raise_exception('System message must be at the beginning.') }}
        {%- endif %}
https://huggingface.co/unsloth/Qwen3.5-2B/blob/main/chat_template.jinja#L85
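This guard is easy to reproduce outside the model runner. Here is a minimal sketch using the jinja2 Python package, with a stripped-down template that contains the same check (this is not the full Qwen3.5 template, and `raise_exception` must be provided by the host, as llama.cpp-style runtimes do):

```python
from jinja2 import Environment


def raise_exception(message):
    # Host-provided helper, as chat-template runtimes expose it.
    raise ValueError(message)


# Simplified template with the same guard as Qwen3.5's chat_template.jinja.
template_src = (
    "{% for message in messages %}"
    "{% if message.role == 'system' and not loop.first %}"
    "{{ raise_exception('System message must be at the beginning.') }}"
    "{% endif %}"
    "{{ message.role }}\n"
    "{% endfor %}"
)

env = Environment()
env.globals["raise_exception"] = raise_exception
template = env.from_string(template_src)

# A single leading system message renders fine.
ok = template.render(messages=[
    {"role": "system", "content": "You are Bob"},
    {"role": "user", "content": "what is your quest?"},
])

# A second system message triggers the exception.
err = None
try:
    template.render(messages=[
        {"role": "system", "content": "You are Bob"},
        {"role": "user", "content": "what is your quest?"},
        {"role": "system", "content": "Skills..."},
    ])
except ValueError as exc:
    err = str(exc)

print(err)  # System message must be at the beginning.
```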
This breaks certain use cases of docker-agent with Docker Model Runner when a custom provider is used.
1. docker-agent + dmr provider (it works)
If I use the model huggingface.co/unsloth/qwen3.5-2b-gguf:Q4_K_M with the dmr provider, I don't have any issues — every system-type message (such as those used for skill detection) is concatenated with the first system message:
docker-agent provider:
models:
  brain:
    provider: dmr
    model: huggingface.co/unsloth/qwen3.5-2b-gguf:Q4_K_M
    temperature: 0.0
    top_p: 0.95
    presence_penalty: 1.5
    max_tokens: 65536
Request:
{
  "max_tokens": 65536,
  "messages": [
    {
      "content": "You are Bob, a coding expert\n\n## Custom Shell Tools\n\n### execute_command\nExecute a shell command and return its stdout and stderr output.\n- `command`: The shell command to execute.\n\n\nSkills provide specialized instructions for specific tasks. When a user's request matches a skill's description, use read_skill to load its instructions.\n\n<available_skills>\n <skill>\n <name>what-time-is-it</name>\n <description>display the current date and time</description>\n </skill>\n <skill>\n <name>greetings</name>\n <description>when the user writes \"node greetings\" to somebody, the agent will run this skill with the appropriate parameter.</description>\n </skill>\n <skill>\n <name>vulcan-salute</name>\n <description>when the user writes \"vulcan salute\" or \"vulcan-salute\" to somebody, the agent will run this skill with the appropriate parameter.</description>\n </skill>\n</available_skills>",
      "role": "system"
    },
    {
      "content": "what is your quest?",
      "role": "user"
    }
  ],
  "model": "huggingface.co/unsloth/qwen3.5-2b-gguf:Q4_K_M",
  "parallel_tool_calls": true,
  "presence_penalty": 1.5,
  "stream": true,
  "stream_options": {
    "include_usage": true
  },
  "temperature": 0,
  "tools": [
    {
      "function": {
        "description": "Execute a shell command and return its stdout and stderr output.",
        "name": "execute_command",
        "parameters": {
          "properties": {
            "command": {
              "description": "The shell command to execute.",
              "type": "string"
            }
          },
          "type": "object"
        }
      },
      "type": "function"
    },
    {
      "function": {
        "description": "Read the content of a skill by name. Use this when a user's request matches an available skill.",
        "name": "read_skill",
        "parameters": {
          "properties": {
            "name": {
              "description": "The name of the skill to read",
              "type": "string"
            }
          },
          "required": [
            "name"
          ],
          "type": "object"
        }
      },
      "type": "function"
    }
  ],
  "top_p": 0.95
}
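The concatenation behavior visible in the request above can be sketched like this. The helper below is hypothetical (it is not docker-agent's actual code, and the exact separator used between the merged parts is an assumption); it only illustrates folding every system message into a single leading one:

```python
def merge_system_messages(messages):
    """Fold the content of every system message into one leading system
    message, keeping the remaining messages in their original order.

    Sketch of the behavior observed with the dmr provider; the real
    implementation and the separator ("\n\n" here) may differ.
    """
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    if not system_parts:
        return others
    merged = {"role": "system", "content": "\n\n".join(system_parts)}
    return [merged] + others


messages = [
    {"role": "system", "content": "You are Bob, a coding expert"},
    {"role": "system", "content": "## Custom Shell Tools"},
    {"role": "user", "content": "what is your quest?"},
]
merged = merge_system_messages(messages)
# merged now contains exactly one system message, placed first
```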
2. docker-agent + custom provider for dmr (error)
If I use the model huggingface.co/unsloth/qwen3.5-2b-gguf:Q4_K_M with a custom provider, I'm getting this error:
all models failed: error receiving from stream: HTTP 500: POST
"http://host.docker.internal:12434/engines/v1/chat/completions": 500 Internal Server Error
{"code":500,"message":"\n------------\nWhile executing CallExpression at line 85, column 32 in
source:\n...first %}↵ {{- raise_exception('System message must be at the beginnin...\n
^\nError: Jinja Exception: System message must be at the beginning.","type":"server_error"}
But why do I need a custom provider for Docker Model Runner? Because I need to connect to Docker Model Runner from inside a Docker Sandbox.
Upon examining the request content, I noticed that the custom provider created 3 messages with a system role:
- the one I defined in the docker-agent configuration file
- another one for a custom shell tool I defined
- another one related to skills
docker-agent provider:
providers:
  host_dmr_provider:
    api_type: openai_chatcompletions
    base_url: http://host.docker.internal:12434/engines/v1

models:
  brain:
    provider: host_dmr_provider
    #provider: dmr
    model: huggingface.co/unsloth/qwen3.5-2b-gguf:Q4_K_M
    temperature: 0.0
    top_p: 0.95
    presence_penalty: 1.5
    max_tokens: 65536
Request:
{
  "max_tokens": 65536,
  "messages": [
    {
      "content": "You are Bob, a coding expert\n",
      "role": "system"
    },
    {
      "content": "## Custom Shell Tools\n\n### execute_command\nExecute a shell command and return its stdout and stderr output.\n- `command`: The shell command to execute.\n\n",
      "role": "system"
    },
    {
      "content": "Skills provide specialized instructions for specific tasks. When a user's request matches a skill's description, use read_skill to load its instructions.\n\n<available_skills>\n <skill>\n <name>greetings</name>\n <description>when the user writes \"node greetings\" to somebody, the agent will run this skill with the appropriate parameter.</description>\n </skill>\n <skill>\n <name>vulcan-salute</name>\n <description>when the user writes \"vulcan salute\" or \"vulcan-salute\" to somebody, the agent will run this skill with the appropriate parameter.</description>\n </skill>\n <skill>\n <name>what-time-is-it</name>\n <description>display the current date and time</description>\n </skill>\n</available_skills>",
      "role": "system"
    },
    {
      "content": "what is your favourite colour?",
      "role": "user"
    }
  ],
  "model": "huggingface.co/unsloth/qwen3.5-2b-gguf:Q4_K_M",
  "parallel_tool_calls": true,
  "presence_penalty": 1.5,
  "stream": true,
  "stream_options": {
    "include_usage": true
  },
  "temperature": 0,
  "tools": [
    {
      "function": {
        "description": "Execute a shell command and return its stdout and stderr output.",
        "name": "execute_command",
        "parameters": {
          "additionalProperties": false,
          "properties": {
            "command": {
              "description": "The shell command to execute.",
              "type": [
                "string",
                "null"
              ]
            }
          },
          "required": [
            "command"
          ],
          "type": "object"
        }
      },
      "type": "function"
    },
    {
      "function": {
        "description": "Read the content of a skill by name. Use this when a user's request matches an available skill.",
        "name": "read_skill",
        "parameters": {
          "additionalProperties": false,
          "properties": {
            "name": {
              "description": "The name of the skill to read",
              "type": "string"
            }
          },
          "required": [
            "name"
          ],
          "type": "object"
        }
      },
      "type": "function"
    }
  ],
  "top_p": 0.95
}
3. docker-agent + custom provider for ollama (it works)
docker-agent ollama custom provider:
providers:
  host_ollama_provider:
    api_type: openai_chatcompletions
    base_url: http://host.docker.internal:11434/v1

models:
  brain:
    provider: host_ollama_provider
    #provider: dmr
    model: qwen3.5:2b
    temperature: 0.0
    top_p: 0.95
    presence_penalty: 1.5
    max_tokens: 65536
I took a look at Ollama's code. There is a renderer for Qwen3.5 models that sidesteps this rule and assumes that a conversation can contain multiple system messages: https://github.com/ollama/ollama/blob/main/model/renderers/qwen35.go#L138
Ollama doesn't execute the Jinja template embedded in the GGUF. Instead, it generates the prompt string directly in Go, and when it finds system messages at positions other than 0, it injects them into the middle of the conversation:
if message.Role == "user" || (message.Role == "system" && i != 0) {
    sb.WriteString(imStartTag + message.Role + "\n" + content + imEndTag + "\n")
}
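Transposed to Python, the same tolerant rendering looks roughly like this. This is only a sketch of the idea; `render_chatml` and the tag constants are illustrative (Ollama's actual renderer also handles tools, thinking blocks, etc.):

```python
IM_START, IM_END = "<|im_start|>", "<|im_end|>"


def render_chatml(messages):
    """Render a ChatML-style prompt that tolerates system messages at any
    position, mirroring the Go snippet above (illustrative sketch only)."""
    sb = []
    for i, message in enumerate(messages):
        role, content = message["role"], message["content"]
        if role == "user" or (role == "system" and i != 0):
            # A system message that is not first takes the same path as a
            # user message, instead of raising an exception.
            sb.append(f"{IM_START}{role}\n{content}{IM_END}\n")
        elif role == "system":
            # Leading system message (the real renderer also folds tool
            # definitions in here).
            sb.append(f"{IM_START}system\n{content}{IM_END}\n")
        elif role == "assistant":
            sb.append(f"{IM_START}assistant\n{content}{IM_END}\n")
    sb.append(f"{IM_START}assistant\n")  # generation prompt
    return "".join(sb)
```

With this approach the mid-conversation system message simply becomes another turn in the prompt, which is why the same request succeeds against Ollama.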
DMR-related issue: docker/model-runner#827
Conclusion
In my opinion, in a multi-agent system that shares its history, this issue is likely to occur again when using DMR.
I have several workarounds to continue preparing my demos with docker-agent + DMR + sbx:
- don't use models from the Qwen 3.5 family
- don't use sbx, so that the dmr provider can be used
- patch the model (that's my next plan 🤓)
I'm not sure what the best strategy is to fix this:
- do it on the docker-agent side, but the issue will arise with other agents using DMR (e.g. with shared history)
- do it like Ollama, and create a specific renderer on the DMR side
- document how to patch the model
- provide our own version of the model
I'm going to work on the last two points.
Here is the source code of my experiments if you need to reproduce them: https://codeberg.org/docker-agents/custom-provider-tests