| 1 | +# Run a Local LLM in Your .NET 10 API with Ollama |
| 2 | + |
| 3 | +## How Microsoft.Extensions.AI Makes Your API AI-Ready Without Locking You Into One Provider |
| 4 | + |
| 5 | +Every developer wants AI in their app. The problem is getting started: API keys, cloud costs, rate limits, and the fear of betting your architecture on one vendor. What if you could add a working AI endpoint to your .NET 10 API in under an hour — for free, running entirely on your laptop? |
| 6 | + |
| 7 | +This article shows you exactly how, using [Ollama](https://ollama.com) for a local LLM and `Microsoft.Extensions.AI` as a provider-agnostic abstraction. |
| 8 | + |
| 9 | +📖 **Tutorial Repository:** [AngularNetTutorial on GitHub](https://github.com/workcontrolgit/AngularNetTutorial) |
| 10 | + |
| 11 | +--- |
| 12 | + |
| 13 | +This article is part of the **AngularNetTutorial** series. The full-stack tutorial — covering Angular 20, .NET 10 Web API, and OAuth 2.0 with Duende IdentityServer — has been published at [Building Modern Web Applications with Angular, .NET, and OAuth 2.0](https://medium.com/scrum-and-coke/building-modern-web-applications-with-angular-net-and-oauth-2-0-complete-tutorial-series-7ea97ed3fc56). **This article kicks off Series 6 by adding AI capabilities to the existing TalentManagement API — without breaking any existing functionality for developers who don't have Ollama installed.** |
| 14 | + |
| 15 | +--- |
| 16 | + |
| 17 | +## 🎓 What You'll Learn |
| 18 | + |
| 19 | +* **Microsoft.Extensions.AI abstraction** — How `IChatClient` lets you swap LLM providers by changing one line |
| 20 | +* **Ollama integration** — Pull a free local model and connect it to your .NET API in minutes |
| 21 | +* **Feature flag gating** — Why `[FeatureGate("AiEnabled")]` is the safest way to ship AI without breaking existing users |
| 22 | +* **Clean Architecture placement** — Where AI interfaces, implementations, and controllers belong in the layer structure |
| 23 | +* **Provider-agnostic DI** — How to register `AddOllamaChatClient()` so the rest of the app never knows which provider you're using |
| 24 | + |
| 25 | +--- |
| 26 | + |
| 27 | +## 📋 Prerequisites |
| 28 | + |
| 29 | +**Before following this article, you should have:** |
| 30 | + |
| 31 | +* **TalentManagement stack running** — Complete [Series 0–5](https://medium.com/scrum-and-coke/building-modern-web-applications-with-angular-net-and-oauth-2-0-complete-tutorial-series-7ea97ed3fc56) or clone the tutorial repo |
| 32 | +* **.NET 10 SDK** — `dotnet --version` should show `10.x` |
| 33 | +* **Ollama installed** — Download from [ollama.com](https://ollama.com/download) (free, no account required) |
| 34 | +* **llama3.2 model pulled** — `ollama pull llama3.2` (~2 GB download) |
| 35 | +* **Basic C# and Clean Architecture familiarity** — Understanding of interfaces, DI, and MediatR helps |
| 36 | + |
| 37 | +**Not set up yet?** Follow the [AngularNetTutorial setup guide](https://github.com/workcontrolgit/AngularNetTutorial) first. |
| 38 | + |
| 39 | +--- |
| 40 | + |
| 41 | +## 🎯 The Problem |
| 42 | + |
| 43 | +Adding AI to a production .NET API sounds daunting. Most tutorials show you how to call OpenAI with an API key — which is fine until you hit a rate limit, get an unexpected bill, or need to demo the app offline. Developers following a tutorial shouldn't need a credit card. |
| 44 | + |
| 45 | +Beyond getting started, there's an architectural risk: if your AI code reaches directly into the OpenAI SDK, switching providers later means touching every file that calls it. You've created a tight dependency on one vendor. |
| 46 | + |
| 47 | +**Common pain points:** |
| 48 | + |
| 49 | +* **Vendor lock-in** — Switching from OpenAI to Azure OpenAI (or Ollama) requires rewriting service code |
| 50 | +* **Cost barrier** — Cloud LLMs come with API keys, rate limits, and billing setup to deal with before you can write a single test |
| 51 | +* **Feature flag complexity** — Without proper gating, enabling AI affects every user — even those on machines without Ollama installed |
| 52 | + |
| 53 | +--- |
| 54 | + |
| 55 | +## 💡 The Solution |
| 56 | + |
| 57 | +[Microsoft.Extensions.AI](https://learn.microsoft.com/en-us/dotnet/ai/microsoft-extensions-ai) provides a single `IChatClient` interface that works identically across providers. Register `AddOllamaChatClient()` in development, `AddAzureOpenAIChatClient()` in production — your `IAiChatService` implementation doesn't change. |
| 58 | + |
| 59 | +[Ollama](https://ollama.com) runs open-weight models like `llama3.2` locally. No API key. No cloud. Works offline. Perfect for tutorials and development. |
| 60 | + |
| 61 | +We gate the entire `AiController` behind a `[FeatureGate("AiEnabled")]` attribute. When `"AiEnabled": false` (the default), the controller's routes return 404, no Ollama connection is attempted, and the rest of the API is unaffected. |
| 62 | + |
| 63 | +**Key benefits:** |
| 64 | + |
| 65 | +* ✅ **Zero cost** — Ollama is free; no API key, no credit card, no rate limits |
| 66 | +* ✅ **Provider-agnostic** — Swap Ollama → Azure OpenAI → Anthropic by changing one DI registration line |
| 67 | +* ✅ **Safe coexistence** — The feature flag defaults to `false`, so the original tutorial (Series 0–5) works unchanged |
| 68 | +* ✅ **Clean Architecture** — Interface in Application, implementation in Infrastructure.Shared, controller in WebApi |
| 69 | + |
| 70 | +--- |
| 71 | + |
| 72 | +## 🚀 How It Works |
| 73 | + |
| 74 | +### Step 1: Install Ollama and Pull a Model |
| 75 | + |
| 76 | +Download Ollama from [ollama.com/download](https://ollama.com/download) for your OS. After installation: |
| 77 | + |
| 78 | +```bash |
| 79 | +# Pull the llama3.2 model (~2 GB — fast, capable, great for tutorials) |
| 80 | +ollama pull llama3.2 |
| 81 | + |
| 82 | +# Start the Ollama server if it isn't already running (it listens on http://localhost:11434) |
| 83 | +ollama serve |
| 84 | + |
| 85 | +# Verify it's running |
| 86 | +curl http://localhost:11434/api/tags |
| 87 | +``` |
| 88 | + |
| 89 | +**What this does:** Ollama downloads model weights and runs a local HTTP server that accepts chat requests. Our .NET API will call this endpoint internally — no external network traffic. |
| 90 | + |
| 91 | +### Step 2: Add NuGet Packages |
| 92 | + |
| 93 | +The AI packages split across two projects to maintain Clean Architecture separation: |
| 94 | + |
| 95 | +**`TalentManagementAPI.WebApi.csproj`** — The Ollama provider lives here: |
| 96 | + |
| 97 | +```xml |
| 98 | +<PackageReference Include="Microsoft.Extensions.AI.Ollama" Version="9.5.0" /> |
| 99 | +``` |
| 100 | + |
| 101 | +**`TalentManagementAPI.Infrastructure.Shared.csproj`** — The abstraction lives here: |
| 102 | + |
| 103 | +```xml |
| 104 | +<PackageReference Include="Microsoft.Extensions.AI" Version="9.5.0" /> |
| 105 | +``` |
| 106 | + |
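| 107 | +**Why split?** The Application and Infrastructure layers must never reference provider-specific packages (`Microsoft.Extensions.AI.Ollama`). Only WebApi knows which provider is registered. Infrastructure.Shared only knows about `IChatClient` from `Microsoft.Extensions.AI`. Check NuGet for the versions that match your setup: the `Microsoft.Extensions.AI.Ollama` provider has so far been published as a prerelease package, so its exact version string may carry a `-preview` suffix. |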
| 108 | + |
| 109 | +### Step 3: Add Feature Flag and Ollama Config |
| 110 | + |
| 111 | +In `TalentManagementAPI.WebApi/appsettings.json`, add `AiEnabled` to the existing `FeatureManagement` section and a new `Ollama` section: |
| 112 | + |
| 113 | +```json |
| 114 | +"FeatureManagement": { |
| 115 | + "AuthEnabled": true, |
| 116 | + "CacheEnabled": true, |
| 117 | + "AiEnabled": false |
| 118 | +}, |
| 119 | +"Ollama": { |
| 120 | + "BaseUrl": "http://localhost:11434", |
| 121 | + "Model": "llama3.2" |
| 122 | +} |
| 123 | +``` |
| 124 | + |
| 125 | +**Key point:** `"AiEnabled": false` is the default. Developers who haven't installed Ollama can still clone and run the full stack — the AI endpoint simply returns 404. To activate AI features, change this to `true` and ensure Ollama is running. |
| 126 | + |
| 127 | +### Step 4: Define the Application Interface |
| 128 | + |
| 129 | +Create `TalentManagementAPI.Application/Interfaces/IAiChatService.cs`: |
| 130 | + |
| 131 | +```csharp |
| 132 | +namespace TalentManagementAPI.Application.Interfaces |
| 133 | +{ |
| 134 | + public interface IAiChatService |
| 135 | + { |
| 136 | + Task<string> ChatAsync(string message, string? systemPrompt = null, |
| 137 | + CancellationToken cancellationToken = default); |
| 138 | + } |
| 139 | +} |
| 140 | +``` |
| 141 | + |
| 142 | +**Why an interface?** The Application layer defines *what* the service does — not *how*. This follows the Dependency Inversion Principle: high-level modules (Application) don't depend on low-level details (Ollama SDK). Tests can inject a mock `IAiChatService` without needing Ollama running. |
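
To illustrate the testing point, here is a minimal sketch of a hand-written test double. `FakeAiChatService` and its canned reply are hypothetical names used only for illustration and are not part of the tutorial repository; the interface is repeated so the snippet compiles on its own.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Usage: the caller only sees IAiChatService, so no Ollama server is needed.
IAiChatService service = new FakeAiChatService("stubbed reply");
string reply = await service.ChatAsync("Hello");
Console.WriteLine(reply);

// Same shape as the interface from Step 4 (repeated so the snippet stands alone).
public interface IAiChatService
{
    Task<string> ChatAsync(string message, string? systemPrompt = null,
        CancellationToken cancellationToken = default);
}

// Hypothetical fake: returns a canned reply and records the last call it received.
public sealed class FakeAiChatService : IAiChatService
{
    private readonly string _cannedReply;

    public string? LastMessage { get; private set; }
    public string? LastSystemPrompt { get; private set; }

    public FakeAiChatService(string cannedReply) => _cannedReply = cannedReply;

    public Task<string> ChatAsync(string message, string? systemPrompt = null,
        CancellationToken cancellationToken = default)
    {
        // Honor cancellation the same way a real implementation would.
        cancellationToken.ThrowIfCancellationRequested();
        LastMessage = message;
        LastSystemPrompt = systemPrompt;
        return Task.FromResult(_cannedReply);
    }
}
```

A mocking library would work just as well; the point is that anything depending on `IAiChatService` can be exercised without a model running.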
| 143 | + |
| 144 | +### Step 5: Implement in Infrastructure.Shared |
| 145 | + |
| 146 | +Create `TalentManagementAPI.Infrastructure.Shared/Services/OllamaAiService.cs`: |
| 147 | + |
| 148 | +```csharp |
| 149 | +using Microsoft.Extensions.AI; |
| 150 | +using TalentManagementAPI.Application.Interfaces; |
| 151 | + |
| 152 | +namespace TalentManagementAPI.Infrastructure.Shared.Services |
| 153 | +{ |
| 154 | + public class OllamaAiService : IAiChatService |
| 155 | + { |
| 156 | + private readonly IChatClient _chatClient; |
| 157 | + |
| 158 | + public OllamaAiService(IChatClient chatClient) |
| 159 | + { |
| 160 | + _chatClient = chatClient; |
| 161 | + } |
| 162 | + |
| 163 | + public async Task<string> ChatAsync(string message, string? systemPrompt = null, |
| 164 | + CancellationToken cancellationToken = default) |
| 165 | + { |
| 166 | + var messages = new List<ChatMessage>(); |
| 167 | + |
| 168 | + if (!string.IsNullOrWhiteSpace(systemPrompt)) |
| 169 | + messages.Add(new ChatMessage(ChatRole.System, systemPrompt)); |
| 170 | + |
| 171 | + messages.Add(new ChatMessage(ChatRole.User, message)); |
| 172 | + |
| 173 | + var response = await _chatClient.GetResponseAsync(messages, cancellationToken: cancellationToken); |
| 174 | + return response.Text; |
| 175 | + } |
| 176 | + } |
| 177 | +} |
| 178 | +``` |
| 179 | + |
| 180 | +**What this does:** `OllamaAiService` takes `IChatClient` from DI, so it has no idea it's talking to Ollama specifically. `GetResponseAsync` (named `CompleteAsync` in early preview releases of the library) sends the message list and returns a response whose `Text` property holds the model's reply. An optional system prompt lets callers control the AI's persona or constraints. |
| 181 | + |
| 182 | +### Step 6: Register Services |
| 183 | + |
| 184 | +In `Infrastructure.Shared/ServiceRegistration.cs`, add the `IAiChatService` → `OllamaAiService` binding: |
| 185 | + |
| 186 | +```csharp |
| 187 | +using TalentManagementAPI.Application.Interfaces; |
| 188 | +using TalentManagementAPI.Infrastructure.Shared.Services; |
| 189 | + |
| 190 | +public static void AddSharedInfrastructure(this IServiceCollection services, IConfiguration _config) |
| 191 | +{ |
| 192 | + services.Configure<MailSettings>(_config.GetSection("MailSettings")); |
| 193 | + services.AddTransient<IDateTimeService, DateTimeService>(); |
| 194 | + services.AddTransient<IEmailService, EmailService>(); |
| 195 | + services.AddTransient<IMockService, MockService>(); |
| 196 | + services.AddTransient<IAiChatService, OllamaAiService>(); |
| 197 | +} |
| 198 | +``` |
| 199 | + |
| 200 | +In `WebApi/Program.cs`, register the Ollama provider for `IChatClient`: |
| 201 | + |
| 202 | +```csharp |
| 203 | +// Register application services |
| 204 | +builder.Services.AddApplicationLayer(); |
| 205 | +builder.Services.AddPersistenceInfrastructure(builder.Configuration); |
| 206 | +builder.Services.AddSharedInfrastructure(builder.Configuration); |
| 207 | + |
| 208 | +// Register Ollama chat client (IChatClient) — used by OllamaAiService |
| 209 | +// AiController is gated by [FeatureGate("AiEnabled")], so no calls are made when AI is disabled |
| 210 | +var ollamaBaseUrl = builder.Configuration["Ollama:BaseUrl"] ?? "http://localhost:11434"; |
| 211 | +var ollamaModel = builder.Configuration["Ollama:Model"] ?? "llama3.2"; |
| 212 | +builder.Services.AddOllamaChatClient(ollamaModel, new Uri(ollamaBaseUrl)); |
| 213 | +``` |
| 214 | + |
| 215 | +**What this does:** `AddOllamaChatClient()` registers `IChatClient` in the DI container pointing to Ollama. `OllamaAiService` receives this via constructor injection. If you later want to use Azure OpenAI, you'd replace `AddOllamaChatClient()` with `AddAzureOpenAIChatClient()` — and nothing else changes. |
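
If the registration helper shown above is not available in the package version you installed, the same wiring can be written out explicitly. The sketch below assumes `OllamaChatClient` from `Microsoft.Extensions.AI.Ollama` and the `AddChatClient` extension from `Microsoft.Extensions.AI`; treat it as one possible equivalent rather than the tutorial repository's actual code.

```csharp
using Microsoft.Extensions.AI;

var builder = WebApplication.CreateBuilder(args);

// Same configuration keys and fallbacks as in the Program.cs excerpt above.
var ollamaBaseUrl = builder.Configuration["Ollama:BaseUrl"] ?? "http://localhost:11434";
var ollamaModel = builder.Configuration["Ollama:Model"] ?? "llama3.2";

// Assumption: construct the provider client directly and register it as IChatClient.
// Swapping providers means replacing only this one expression.
builder.Services.AddChatClient(new OllamaChatClient(new Uri(ollamaBaseUrl), ollamaModel));

var app = builder.Build();
app.Run();
```

Either way, `OllamaAiService` only ever sees `IChatClient`, so the choice of registration style stays confined to `Program.cs`.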
| 216 | + |
| 217 | +### Step 7: Create the AI Controller |
| 218 | + |
| 219 | +Create `TalentManagementAPI.WebApi/Controllers/v1/AiController.cs`: |
| 220 | + |
| 221 | +```csharp |
| 222 | +using Asp.Versioning; |
| 223 | +using Microsoft.AspNetCore.Authorization; |
| 224 | +using Microsoft.AspNetCore.Mvc; |
| 225 | +using Microsoft.FeatureManagement.Mvc; |
| 226 | +using TalentManagementAPI.Application.Interfaces; |
| 227 | + |
| 228 | +namespace TalentManagementAPI.WebApi.Controllers.v1 |
| 229 | +{ |
| 230 | + [FeatureGate("AiEnabled")] |
| 231 | + [ApiVersion("1.0")] |
| 232 | + [AllowAnonymous] |
| 233 | + [Route("api/v{version:apiVersion}/ai")] |
| 234 | + public sealed class AiController : BaseApiController |
| 235 | + { |
| 236 | + private readonly IAiChatService _aiChatService; |
| 237 | + |
| 238 | + public AiController(IAiChatService aiChatService) |
| 239 | + { |
| 240 | + _aiChatService = aiChatService; |
| 241 | + } |
| 242 | + |
| 243 | + /// <summary> |
| 244 | + /// Send a message to the AI assistant and receive a reply. |
| 245 | + /// </summary> |
| 246 | + [HttpPost("chat")] |
| 247 | + public async Task<IActionResult> Chat([FromBody] AiChatRequest request, |
| 248 | + CancellationToken cancellationToken) |
| 249 | + { |
| 250 | + var reply = await _aiChatService.ChatAsync( |
| 251 | + request.Message, request.SystemPrompt, cancellationToken); |
| 252 | + return Ok(new AiChatResponse(reply)); |
| 253 | + } |
| 254 | + } |
| 255 | + |
| 256 | + public record AiChatRequest(string Message, string? SystemPrompt = null); |
| 257 | + public record AiChatResponse(string Reply); |
| 258 | +} |
| 259 | +``` |
| 260 | + |
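| 261 | +**What `[FeatureGate("AiEnabled")]` does:** When `AiEnabled` is `false` in `appsettings.json`, the feature-management filter short-circuits every action on this controller and, by default, returns `404 Not Found`, so Ollama is never called. To API clients, the controller doesn't exist. Note that the gate runs as an MVC filter at request time, so the routes may still be listed in Swagger unless the project filters them out separately. |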
| 262 | + |
| 263 | +When `AiEnabled` is `true`, the endpoint becomes fully active. No other code changes needed. |
| 264 | + |
| 265 | +--- |
| 266 | + |
| 267 | +## 💻 Try It Yourself |
| 268 | + |
| 269 | +**Enable AI features** by setting `"AiEnabled": true` in `appsettings.json` and starting Ollama: |
| 270 | + |
| 271 | +```bash |
| 272 | +# Terminal 1: Start Ollama (if not already running) |
| 273 | +ollama serve |
| 274 | + |
| 275 | +# Terminal 2: Start the .NET API (from the ApiResources submodule) |
| 276 | +cd ApiResources/TalentManagement-API |
| 277 | +dotnet run |
| 278 | +``` |
| 279 | + |
| 280 | +Open Swagger at `https://localhost:44378/swagger` and find the **Ai** section. |
| 281 | + |
| 282 | + |
| 283 | + |
| 284 | +Click **POST /api/v1/ai/chat**, then **Try it out**, and send: |
| 285 | + |
| 286 | +```json |
| 287 | +{ |
| 288 | + "message": "What is the difference between OAuth 2.0 and OIDC?", |
| 289 | + "systemPrompt": "You are a helpful assistant specializing in identity and security." |
| 290 | +} |
| 291 | +``` |
| 292 | + |
| 293 | +You'll see Ollama's reply in the response body within a few seconds. |
| 294 | + |
| 295 | +**To test without Swagger** — use curl: |
| 296 | + |
| 297 | +```bash |
| 298 | +curl -X POST https://localhost:44378/api/v1/ai/chat \ |
| 299 | + -H "Content-Type: application/json" \ |
| 300 | + -k \ |
| 301 | + -d '{"message": "Explain JWT tokens in one paragraph."}' |
| 302 | +``` |
| 303 | + |
| 304 | +**To verify the feature flag** — set `"AiEnabled": false`, restart the API, and try the same curl. You'll get a `404` — the controller is invisible. |
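
The same check can be scripted from C#. This is a minimal console sketch that assumes the URL and route shown above; the `ChatPayload` helper is a hypothetical name added only to make the request body visible. It prints the status code, so it also shows the `404` when the flag is off.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Json;
using System.Text.Json;

// Accept the self-signed development certificate (the equivalent of curl's -k). Local use only.
var handler = new HttpClientHandler
{
    ServerCertificateCustomValidationCallback =
        HttpClientHandler.DangerousAcceptAnyServerCertificateValidator
};
using var http = new HttpClient(handler) { BaseAddress = new Uri("https://localhost:44378") };

var request = new AiChatRequest("Explain JWT tokens in one paragraph.");
Console.WriteLine(ChatPayload.ToJson(request)); // the JSON body that will be sent

try
{
    using var response = await http.PostAsJsonAsync("/api/v1/ai/chat", request);
    Console.WriteLine($"Status: {(int)response.StatusCode}");

    if (response.IsSuccessStatusCode)
    {
        var body = await response.Content.ReadFromJsonAsync<AiChatResponse>();
        Console.WriteLine(body?.Reply);
    }
}
catch (HttpRequestException ex)
{
    // Thrown when the API is not running or not reachable.
    Console.WriteLine($"API not reachable: {ex.Message}");
}

// Same shapes as the records declared next to AiController.
public record AiChatRequest(string Message, string? SystemPrompt = null);
public record AiChatResponse(string Reply);

// Hypothetical helper: serializes with the camelCase web defaults that PostAsJsonAsync also uses.
public static class ChatPayload
{
    public static string ToJson(AiChatRequest request) =>
        JsonSerializer.Serialize(request, new JsonSerializerOptions(JsonSerializerDefaults.Web));
}
```

Local models can take a while on the first request while the weights load, so a slow first response is not necessarily a failure.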
| 305 | + |
| 306 | +--- |
| 307 | + |
| 308 | +## 📊 Real-World Impact |
| 309 | + |
| 310 | +**Before this approach:** |
| 311 | + |
| 312 | +* ❌ AI code is tightly coupled to OpenAI SDK — migrating to another provider requires rewriting service code |
| 313 | +* ❌ Tutorial readers need an API key and billing account just to run the demo |
| 314 | +* ❌ Enabling AI in the `develop` branch breaks the running API for developers without Ollama |
| 315 | + |
| 316 | +**After this approach:** |
| 317 | + |
| 318 | +* ✅ Swap Ollama for Azure OpenAI in production by changing one DI line — application code unchanged |
| 319 | +* ✅ Zero-cost, zero-signup AI during development — every tutorial reader can follow along |
| 320 | +* ✅ Feature flag default `false` means the full Series 0–5 stack runs unchanged — AI is opt-in |
| 321 | + |
| 322 | +--- |
| 323 | + |
| 324 | +## 🌟 Why This Matters |
| 325 | + |
| 326 | +The `IChatClient` abstraction from Microsoft aims to be for AI what `HttpClient` is for HTTP: a standard interface the ecosystem is aligning around. By building on it now, your code stays compatible with whichever provider turns out to be the best choice in 12 months. |
| 327 | + |
| 328 | +For tutorial purposes, Ollama removes the biggest barrier to learning: access. Developers on Windows, macOS, or Linux can pull `llama3.2`, run `ollama serve`, and have a working LLM in their local environment. No billing, no account setup, no waiting for API access. |
| 329 | + |
| 330 | +The feature flag pattern ensures this is safe to ship: the codebase always builds, always runs, and the original Series 0–5 experience is completely unchanged. AI features activate on demand. |
| 331 | + |
| 332 | +**Transferable skills:** |
| 333 | + |
| 334 | +* **Provider-agnostic AI abstractions** — The `IChatClient` pattern applies equally to Azure OpenAI, Anthropic, Google, and Hugging Face endpoints |
| 335 | +* **Feature flag architecture** — The `[FeatureGate]` pattern applies to any experimental or optional feature |
| 336 | +* **Clean Architecture for external services** — Interface in Application, implementation in Infrastructure, provider registration in WebApi |
| 337 | + |
| 338 | +--- |
| 339 | + |
| 340 | +## 🤝 Community & Support |
| 341 | + |
| 342 | +**Questions or feedback?** The tutorial repository welcomes: |
| 343 | + |
| 344 | +* ⭐ **GitHub stars** — Help others discover it! |
| 345 | +* 🐛 **Issue reports** — Found a bug or have a suggestion? |
| 346 | +* 💬 **Discussions** — Ask questions, share your use cases |
| 347 | +* 🚀 **Pull requests** — Improvements always appreciated |
| 348 | + |
| 349 | +**Found this helpful?** Share it with your team and follow for more full-stack development content! |
| 350 | + |
| 351 | +--- |
| 352 | + |
| 353 | +## 📖 Series Navigation |
| 354 | + |
| 355 | +**AngularNetTutorial Blog Series:** |
| 356 | + |
| 357 | +* [Building Modern Web Applications with Angular, .NET, and OAuth 2.0](https://medium.com/scrum-and-coke/building-modern-web-applications-with-angular-net-and-oauth-2-0-complete-tutorial-series-7ea97ed3fc56) — Main tutorial |
| 358 | +* [Stop Juggling Multiple Repos: Manage Your Full-Stack App Like a Workspace](../series-0-architecture/0.1-git-submodule-workspace.md) — Git Submodules |
| 359 | +* [End-to-End Testing Made Simple: How Playwright Transforms Testing](../series-0-architecture/0.2-playwright-testing.md) — Playwright Overview |
| 360 | +* [Speed Up Your Dashboard: Easy Response Caching in .NET 10 With EasyCaching](../series-2-dotnet-api/2.5-dotnet-easycaching.md) — Response Caching (Series 2.5) |
| 361 | +* **This Article** — Run a Local LLM in Your .NET 10 API with Ollama (Series 6.1) |
| 362 | +* [Build an HR AI Assistant That Knows Your Data](6.2-dotnet-ai-hr-assistant.md) — HR AI Assistant (Series 6.2) |
| 363 | +* [Add an AI Chat Widget to Angular with Streaming](6.3-angular-ai-chat-widget.md) — Angular Chat Widget (Series 6.3) |
| 364 | + |
| 365 | +--- |
| 366 | + |
| 367 | +**📌 Tags:** #dotnet #csharp #ai #ollama #llm #microsoftextensionsai #cleanarchitecture #aspnetcore #webapi #featureflags #fullstack #angular #oauth2 #locallm #generativeai |