Labels: core (anything pertaining to core functionality of the application, i.e. opencode server stuff), docs
Description
Question
On my 5090, I've tested two models so far with locally hosted vLLM:
Qwen3-32B-AWQ
- Will actually call `bash` commands and CRUD files
- Spends 20k tokens trying to install Flutter dependencies; it's 50/50 whether it will even build a test project

Qwen2.5-Coder-32B-Instruct-AWQ
- Does not run any `bash` or CRUD
- Will only output JSON
1. Tool-Use Capability
Which open-weight models reliably support tool calling (bash execution, filesystem CRUD, API calls) when served through vLLM?
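For context, tool calling through vLLM generally needs to be enabled explicitly on the server side. A minimal sketch of how I'm launching the server (flag names from the vLLM OpenAI-compatible server docs; the right `--tool-call-parser` value for a given Qwen variant is an assumption and may differ):

```shell
# Serve a local AWQ model with automatic tool-call parsing enabled.
# --enable-auto-tool-choice lets the model decide when to call tools;
# --tool-call-parser converts the model's raw output into structured
# tool_calls ("hermes" is commonly used for Qwen-family models).
vllm serve Qwen/Qwen3-32B-AWQ \
  --enable-auto-tool-choice \
  --tool-call-parser hermes \
  --max-model-len 32768
```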
2. Model Behavior Differences
Why would Qwen3-32B-AWQ execute shell commands while Qwen2.5-Coder-32B-Instruct-AWQ only outputs structured JSON responses? Is there a config change that would let Qwen Coder work properly with opencode?
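One way I've been distinguishing the two behaviors is by inspecting the assistant message in the OpenAI-compatible response: a model that really supports tool use populates `tool_calls`, while one that only mimics it dumps JSON into `content`. A minimal sketch (mock response shapes assumed, following the OpenAI chat message format):

```python
import json

def classify_reply(message: dict) -> str:
    """Classify an OpenAI-style assistant message from a vLLM server.

    Returns "tool_call" if the model emitted structured tool calls,
    "json_text" if it only wrote JSON into the content string,
    and "plain" otherwise.
    """
    if message.get("tool_calls"):
        return "tool_call"
    content = (message.get("content") or "").strip()
    try:
        json.loads(content)
        return "json_text"
    except json.JSONDecodeError:
        return "plain"

# What Qwen3-32B-AWQ reportedly does: structured tool calls.
print(classify_reply({"tool_calls": [{"function": {"name": "bash"}}],
                      "content": None}))          # tool_call
# What Qwen2.5-Coder reportedly does: JSON dumped into content.
print(classify_reply({"content": '{"command": "flutter build"}'}))  # json_text
```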
3. Best Models for Local Coding Agents
Which locally runnable models under roughly 40B parameters (to fit a 5090's 32 GB) currently perform best with opencode for coding agents that must plan tasks, edit files, and execute shell commands?
4. Quantization Impact
Does AWQ quantization significantly degrade planning, reasoning, or tool-calling ability in coding models compared with FP16 or GPTQ?