

---
layout: docs
title: "Model Management"
description: "How to register, configure, version, and manage models in AI Gateway Ollama."
nav_order: 5
---

Model Management

AI Gateway Ollama exposes Ollama models as stable, named services so that versioning and access can be governed in one place.
This page explains how to register models, pin their versions, and control access to them.

📁 Defining Models in config.yml

Models are declared under the `models:` section of `config.yml`:

```yaml
models:
  llama3:
    model: "llama3"
    version: "latest"
    max_tokens: 4096

  coder:
    model: "codellama"
    version: "13b"
    max_tokens: 8192
```

Each key under `models:` (here `llama3` and `coder`) becomes a named service that clients reference in the `model` field of API requests.

🔖 Model Fields

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `model` | string | yes | Name of the Ollama model |
| `version` | string | no | Version tag or alias |
| `max_tokens` | number | no | Hard limit per request |
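
The field table can be read as a small schema. Below is a minimal Python sketch of how one entry might be validated, assuming the config has already been parsed into a dictionary. `ModelEntry` and `parse_model_entry` are hypothetical names used for illustration, not the gateway's actual code, and treating `max_tokens` as a positive integer is an assumption (the table only says "number").

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ModelEntry:
    """One entry under `models:` (hypothetical representation)."""
    model: str                        # required: name of the Ollama model
    version: Optional[str] = None     # optional: version tag or alias
    max_tokens: Optional[int] = None  # optional: hard limit per request


def parse_model_entry(name: str, raw: dict[str, Any]) -> ModelEntry:
    """Validate a parsed config mapping against the field table.

    Fields not listed in the table are ignored by this sketch.
    """
    model = raw.get("model")
    if not isinstance(model, str) or not model:
        raise ValueError(f"{name}: 'model' is required and must be a string")

    version = raw.get("version")
    if version is not None and not isinstance(version, str):
        raise ValueError(f"{name}: 'version' must be a string")

    max_tokens = raw.get("max_tokens")
    if max_tokens is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        is_int = isinstance(max_tokens, int) and not isinstance(max_tokens, bool)
        if not is_int or max_tokens <= 0:
            raise ValueError(f"{name}: 'max_tokens' must be a positive number")

    return ModelEntry(model=model, version=version, max_tokens=max_tokens)


entry = parse_model_entry(
    "coder", {"model": "codellama", "version": "13b", "max_tokens": 8192}
)
print(entry)
```

An entry without `model` raises `ValueError`, which mirrors the "Required: yes" column.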

🧩 Why Use Named Services?

Instead of exposing raw Ollama model names, the gateway lets you:

- Create stable service names (`llama3`, `coder`)
- Swap backend models without breaking clients
- Pin versions for reproducibility
- Apply per-model access rules
- Enforce per-model token limits

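Because clients depend only on the service name, backend changes and access policy stay under the operator's control, which matters for governance and long-term maintainability.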

🔄 Updating Models

To update a model:

1. Pull or create the new model in Ollama
2. Update the gateway config
3. Restart the gateway

Example:

```yaml
models:
  llama3:
    model: "llama3"
    version: "8b"
```

Clients continue using:

model: "llama3"

…without needing to change anything.
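
One way to picture why clients are unaffected is the lookup below. It is a sketch only: `resolve_backend` is a hypothetical helper, and it assumes the gateway joins `model` and `version` into the `name:tag` form that Ollama uses for tagged models. How the real gateway combines the two fields is not stated on this page.

```python
def resolve_backend(models: dict[str, dict], service: str) -> str:
    """Map a client-facing service name to an Ollama model reference.

    Assumption: `model` and `version` are joined as "name:tag".
    """
    if service not in models:
        raise KeyError(f"unknown service: {service}")
    entry = models[service]
    version = entry.get("version")
    return f"{entry['model']}:{version}" if version else entry["model"]


before = {"llama3": {"model": "llama3", "version": "latest"}}
after = {"llama3": {"model": "llama3", "version": "8b"}}

# The client sends the same service name in both cases;
# only the backend reference changes.
print(resolve_backend(before, "llama3"))
print(resolve_backend(after, "llama3"))
```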

🧪 Testing a Model

```bash
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3",
    "messages": [{"role": "user", "content": "Test"}]
  }'
```
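
For scripted checks, the same request can be built with Python's standard library. This is a minimal sketch: `build_chat_request` is a hypothetical helper, and the URL and payload are copied from the curl command. The response format is not described on this page, so the sketch only constructs the request and leaves sending it as a commented-out step.

```python
import json
import urllib.request


def build_chat_request(base_url: str, service: str, prompt: str) -> urllib.request.Request:
    """Build the same POST request as the curl command, without sending it."""
    payload = {
        "model": service,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request("http://localhost:8080", "llama3", "Test")
print(request.get_method(), request.full_url)

# To actually send it (requires a running gateway):
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(json.load(response))
```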

🔐 Per‑Model Access Control

If authentication is enabled, you can restrict models by scope:

```yaml
auth:
  enabled: true
  keys:
    - id: "developer"
      key: "DEV_KEY"
      scopes: ["chat:write", "models:llama3"]
```

This ensures sensitive or expensive models are protected.
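
The scope check implied by this config can be sketched as follows. `can_use_model` is a hypothetical function, and exact string matching on `models:<service>` is an assumption; this page does not describe wildcard scopes or how the gateway matches them.

```python
def can_use_model(scopes: list[str], service: str) -> bool:
    """Return True if the key's scopes include `models:<service>`.

    Assumption: scopes are matched exactly; wildcards are not handled.
    """
    return f"models:{service}" in scopes


developer_scopes = ["chat:write", "models:llama3"]
print(can_use_model(developer_scopes, "llama3"))  # True
print(can_use_model(developer_scopes, "coder"))   # False
```

Under this reading, the `developer` key can call the `llama3` service but not `coder`.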

🧱 Model Version Pinning

For reproducibility:

```yaml
models:
  llama3:
    model: "llama3"
    version: "3.1.0"
```

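Pinning an explicit version prevents accidental upgrades: the service keeps resolving to the same backend model until the config is changed.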
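
A small audit can flag services that are not pinned. The sketch below is illustrative only: `unpinned_services` is a hypothetical helper, and treating a missing version or the `latest` tag as "unpinned" is an assumption rather than documented gateway behavior.

```python
def unpinned_services(models: dict[str, dict]) -> list[str]:
    """List services whose version is missing or set to a floating tag."""
    floating = {"latest"}  # assumption: only "latest" is treated as floating
    return sorted(
        name
        for name, entry in models.items()
        if entry.get("version") is None or entry["version"] in floating
    )


models = {
    "llama3": {"model": "llama3", "version": "3.1.0"},
    "coder": {"model": "codellama", "version": "latest"},
    "general": {"model": "llama3"},
}
print(unpinned_services(models))
```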

🧰 Model Aliasing

Expose multiple names for the same backend model:

```yaml
models:
  general:
    model: "llama3"
  creative:
    model: "llama3"
    temperature: 1.2
```

Useful for UX‑focused or persona‑based endpoints.
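
The aliasing idea can be sketched as a request rewrite. `build_backend_payload` is a hypothetical helper, and it assumes that a service-level `temperature` acts as a default that an explicit client value overrides; this page does not say which of the two wins.

```python
def build_backend_payload(models: dict[str, dict], request: dict) -> dict:
    """Rewrite a client request for the backend model behind an alias.

    Assumption: a service-level `temperature` is only a default.
    """
    entry = models[request["model"]]
    payload = dict(request)            # shallow copy; the client request is left untouched
    payload["model"] = entry["model"]  # swap the alias for the backend model name
    if "temperature" in entry:
        payload.setdefault("temperature", entry["temperature"])
    return payload


models = {
    "general": {"model": "llama3"},
    "creative": {"model": "llama3", "temperature": 1.2},
}
messages = [{"role": "user", "content": "Test"}]
print(build_backend_payload(models, {"model": "creative", "messages": messages}))
```

Both aliases resolve to `llama3`; only `creative` carries the higher default temperature.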

🧮 Token Limits

Override per‑model limits:

```yaml
models:
  coder:
    model: "codellama"
    max_tokens: 16384
```

The gateway enforces this even if Ollama does not.
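
Enforcement could look roughly like the sketch below. `apply_token_limit` is a hypothetical helper, and two points are assumptions: that clients send a `max_tokens` field in the request, and that over-limit requests are clamped rather than rejected. The page only states that the limit is enforced.

```python
def apply_token_limit(models: dict[str, dict], request: dict) -> dict:
    """Cap a request's `max_tokens` at the service's configured limit.

    Assumption: over-limit requests are clamped, not rejected.
    """
    limit = models[request["model"]].get("max_tokens")
    capped = dict(request)  # shallow copy; the client request is left untouched
    if limit is None:
        return capped       # no limit configured for this service
    requested = request.get("max_tokens", limit)
    capped["max_tokens"] = min(requested, limit)
    return capped


models = {"coder": {"model": "codellama", "max_tokens": 16384}}
print(apply_token_limit(models, {"model": "coder", "max_tokens": 50000})["max_tokens"])
```

A gateway that prefers strictness could return an error for over-limit requests instead of clamping; the choice affects whether clients notice that their request was altered.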
