Configuration files and shell functions for running local LLMs with llama.cpp, including MCP tool support and a web search server powered by Gemini.
## Prerequisites

- llama.cpp installed and on PATH (`llama-server`, `llama-cli`)
- uv (for running the MCP search server and tools)
- Claude Code (optional, for `llama-claude`)
- Python 3.11+
- GGUF model files (see Models)
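Before continuing, you can sanity-check the prerequisites from any POSIX shell; this is just a convenience, not part of the repo:

```sh
# Report any required tool that is not on PATH.
for tool in llama-server llama-cli uv python3; do
  command -v "$tool" >/dev/null || echo "missing: $tool"
done
```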
## Setup (macOS)

- Clone the repo:

  ```sh
  git clone <repo-url> ~/Documents/llama.cpp-config
  ```

- Create your `mcp.json` from the example and add your Gemini API key (a sketch of its likely shape follows this list):

  ```sh
  cp mcp.json.example mcp.json
  ```

  Update the path to `web-search-mcp.py` to match your setup.

- Source the profile script by adding this line to `~/.zshrc`:

  ```sh
  source "$HOME/Documents/llama.cpp-config/llama-profile.sh"
  ```

- Download models; llama.cpp stores them in `~/Library/Caches/llama.cpp/`.
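For orientation, MCP client configs conventionally follow the shape sketched below. The server name, path, and `env` placement here are assumptions following the common MCP convention; `mcp.json.example` in the repo is authoritative:

```json
{
  "mcpServers": {
    "web-search": {
      "command": "uv",
      "args": ["run", "/Users/you/Documents/llama.cpp-config/web-search-mcp.py"],
      "env": { "GEMINI_API_KEY": "<your-key>" }
    }
  }
}
```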
## Setup (Windows)

- Clone the repo:

  ```powershell
  git clone <repo-url> "$HOME\Documents\llamacppconfig"
  ```

- Create your `mcp.json` from the example and add your Gemini API key:

  ```powershell
  cp mcp.json.example mcp.json
  ```

  Update the path to `web-search-mcp.py` to match your username.

- Source the profile script by adding this line to your PowerShell `$PROFILE` (see the note after this list if the file does not exist yet):

  ```powershell
  . "$HOME\Documents\llamacppconfig\llama-profile.ps1"
  ```

- Download models to `~/AppData/Local/llama.cpp/`.
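On a fresh Windows machine `$PROFILE` often does not exist yet; creating it first is a standard PowerShell step, not something specific to this repo:

```powershell
# Create the profile file (and parent directories) if missing, then append
# the dot-source line so the llama-* functions load in every new session.
if (-not (Test-Path $PROFILE)) {
    New-Item -ItemType File -Path $PROFILE -Force | Out-Null
}
Add-Content $PROFILE '. "$HOME\Documents\llamacppconfig\llama-profile.ps1"'
```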
## Commands

| Command | Platform | Description |
|---|---|---|
| `llama-chat` | Both | Starts llama-server with the chat model, MCP proxy, and vision support, and opens the web UI |
| `llama-code` | Windows | Starts llama-server with the code model (thinking enabled) and opens the web UI |
| `llama-claude` | Windows | Points Claude Code at your local llama-server as an OpenAI-compatible backend |
| `llama-test` | Windows | Quick CLI test of a model with optional context-size and reasoning-budget parameters |
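Once `llama-chat` or `llama-code` is running, you can confirm the server is answering on its OpenAI-compatible API. This assumes llama-server's default port 8080; adjust if the profile scripts override it:

```sh
# Minimal smoke test against llama-server's chat completions endpoint.
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Say hi"}]}'
```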
## Models

Edit the model paths at the top of the profile scripts to match your own models.

- macOS (`llama-profile.sh`): Qwen 3.5-4B Q8_0 + mmproj
- Windows (`llama-profile.ps1`): Qwen 3.5-9B Q8_0 + mmproj, OmniCoder-9B
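llama.cpp can also pull GGUF models straight from Hugging Face into the cache directories mentioned in Setup; the repo and quant below are placeholders, not the models this config uses:

```sh
# Download (if not already cached) and run a GGUF model from Hugging Face.
llama-cli -hf <user>/<repo>:<quant> -p "hello"
```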
## Files

| File | Description |
|---|---|
| `llama-profile.sh` | Zsh functions for macOS, sourced from `~/.zshrc` |
| `llama-profile.ps1` | PowerShell functions for Windows, dot-sourced from `$PROFILE` |
| `mcp.json.example` | Template for the MCP server config (copy to `mcp.json`) |
| `web-search-mcp.py` | MCP server providing `web_search` and `web_fetch` tools via Gemini |
| `webui-config.json` | System prompt for the llama.cpp web UI |
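To make the moving parts concrete, here is a minimal sketch of what an MCP server like `web-search-mcp.py` could look like. It is not the repo's actual implementation: it assumes the official `mcp` and `google-genai` packages, the Gemini model name is an assumption, and `web_fetch` is sketched with plain urllib rather than Gemini:

```python
# Minimal sketch (NOT the repo's web-search-mcp.py) of an MCP server
# exposing web_search and web_fetch tools. Assumes the `mcp` and
# `google-genai` packages, e.g.:
#   uv run --with mcp --with google-genai sketch.py
import os
import urllib.request

from google import genai
from google.genai import types
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("web-search")
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])


@mcp.tool()
def web_search(query: str) -> str:
    """Answer a query with Gemini grounded by Google Search."""
    response = client.models.generate_content(
        model="gemini-2.0-flash",  # assumed model name
        contents=query,
        config=types.GenerateContentConfig(
            tools=[types.Tool(google_search=types.GoogleSearch())]
        ),
    )
    return response.text or ""


@mcp.tool()
def web_fetch(url: str) -> str:
    """Fetch a URL and return its raw text."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    mcp.run()  # stdio transport by default, matching mcp.json-style configs
```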