Add article 6.1: Run a Local LLM in Your .NET 10 API with Ollama

workcontrolgit · workcontrolgit · commit 4640c61bb06d · 2026-03-15T23:14:51.000-04:00
diff --git a/ApiResources/TalentManagement-API b/ApiResources/TalentManagement-API
@@ -1 +1 @@
-Subproject commit a10079105016eb078c8a4fffb3063c48de0e0b00
+Subproject commit 010144395c4e19e824edab0154ccf8c581ad624c
diff --git a/blogs/series-6-ai-app-features/6.1-dotnet-ai-foundation.md b/blogs/series-6-ai-app-features/6.1-dotnet-ai-foundation.md
@@ -0,0 +1,367 @@
+# Run a Local LLM in Your .NET 10 API with Ollama
+
+## How Microsoft.Extensions.AI Makes Your API AI-Ready Without Locking You Into One Provider
+
+Every developer wants AI in their app. The problem is getting started: API keys, cloud costs, rate limits, and the fear of betting your architecture on one vendor. What if you could add a working AI endpoint to your .NET 10 API in under an hour — for free, running entirely on your laptop?
+
+This article shows you exactly how, using [Ollama](https://ollama.com) for a local LLM and `Microsoft.Extensions.AI` as a provider-agnostic abstraction.
+
+📖 **Tutorial Repository:** [AngularNetTutorial on GitHub](https://github.com/workcontrolgit/AngularNetTutorial)
+
+---
+
+This article is part of the **AngularNetTutorial** series. The full-stack tutorial — covering Angular 20, .NET 10 Web API, and OAuth 2.0 with Duende IdentityServer — has been published at [Building Modern Web Applications with Angular, .NET, and OAuth 2.0](https://medium.com/scrum-and-coke/building-modern-web-applications-with-angular-net-and-oauth-2-0-complete-tutorial-series-7ea97ed3fc56). **This article kicks off Series 6 by adding AI capabilities to the existing TalentManagement API — without breaking any existing functionality for developers who don't have Ollama installed.**
+
+---
+
+## 🎓 What You'll Learn
+
+* **Microsoft.Extensions.AI abstraction** — How `IChatClient` lets you swap LLM providers by changing one line
+* **Ollama integration** — Pull a free local model and connect it to your .NET API in minutes
+* **Feature flag gating** — Why `[FeatureGate("AiEnabled")]` is the safest way to ship AI without breaking existing users
+* **Clean Architecture placement** — Where AI interfaces, implementations, and controllers belong in the layer structure
+* **Provider-agnostic DI** — How to register `AddOllamaChatClient()` so the rest of the app never knows which provider you're using
+
+---
+
+## 📋 Prerequisites
+
+**Before following this article, you should have:**
+
+* **TalentManagement stack running** — Complete [Series 0–5](https://medium.com/scrum-and-coke/building-modern-web-applications-with-angular-net-and-oauth-2-0-complete-tutorial-series-7ea97ed3fc56) or clone the tutorial repo
+* **.NET 10 SDK** — `dotnet --version` should show `10.x`
+* **Ollama installed** — Download from [ollama.com](https://ollama.com/download) (free, no account required)
+* **llama3.2 model pulled** — `ollama pull llama3.2` (~2 GB download)
+* **Basic C# and Clean Architecture familiarity** — Understanding of interfaces, DI, and MediatR helps
+
+**Not set up yet?** Follow the [AngularNetTutorial setup guide](https://github.com/workcontrolgit/AngularNetTutorial) first.
+
+---
+
+## 🎯 The Problem
+
+Adding AI to a production .NET API sounds daunting. Most tutorials show you how to call OpenAI with an API key — which is fine until you hit a rate limit, get an unexpected bill, or need to demo the app offline. Developers following a tutorial shouldn't need a credit card.
+
+Beyond getting started, there's an architectural risk: if your AI code reaches directly into the OpenAI SDK, switching providers later means touching every file that calls it. You've created a tight dependency on one vendor.
+
+**Common pain points:**
+
+* **Vendor lock-in** — Switching from OpenAI to Azure OpenAI (or Ollama) requires rewriting service code
+* **Cost barrier** — Cloud LLMs require API keys, rate limits, and billing setup before you can write a single test
+* **Feature flag complexity** — Without proper gating, enabling AI affects every user — even those on machines without Ollama installed
+
+---
+
+## 💡 The Solution
+
+[Microsoft.Extensions.AI](https://learn.microsoft.com/en-us/dotnet/ai/microsoft-extensions-ai) provides a single `IChatClient` interface that works identically across providers. Register `AddOllamaChatClient()` in development, `AddAzureOpenAIChatClient()` in production — your `IAiChatService` implementation doesn't change.
+
+[Ollama](https://ollama.com) runs open-weight models like `llama3.2` locally. No API key. No cloud. Works offline. Perfect for tutorials and development.
+
+We gate the entire `AiController` behind a `[FeatureGate("AiEnabled")]` attribute. When `"AiEnabled": false` (the default), every route on the controller returns `404 Not Found`, no Ollama connection is attempted, and the rest of the API is unaffected.
+
+**Key benefits:**
+
+* ✅ **Zero cost** — Ollama is free; no API key, no credit card, no rate limits
+* ✅ **Provider-agnostic** — Swap Ollama → Azure OpenAI → Anthropic by changing one DI registration line
+* ✅ **Safe coexistence** — Feature flag default `false` means original tutorial (Series 0–5) works unchanged
+* ✅ **Clean Architecture** — Interface in Application, implementation in Infrastructure.Shared, controller in WebApi
+
+---
+
+## 🚀 How It Works
+
+### Step 1: Install Ollama and Pull a Model
+
+Download Ollama from [ollama.com/download](https://ollama.com/download) for your OS. After installation:
+
+```bash
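+# Start the Ollama server (listens on http://localhost:11434).
+# Skip this if the desktop app already started it; `ollama pull` needs a running server.
+ollama serve
+
+# In a second terminal: pull the llama3.2 model (~2 GB; fast, capable, great for tutorials)
+ollama pull llama3.2
+
+# Verify the server is up and the model is listed
+curl http://localhost:11434/api/tags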
+```
+
+**What this does:** Ollama downloads model weights and runs a local HTTP server that accepts chat requests. Our .NET API will call this endpoint internally — no external network traffic.
+
+### Step 2: Add NuGet Packages
+
+The AI packages split across two projects to maintain Clean Architecture separation:
+
+**`TalentManagementAPI.WebApi.csproj`** — The Ollama provider lives here:
+
+```xml
+<PackageReference Include="Microsoft.Extensions.AI.Ollama" Version="9.5.0" />
+```
+
+**`TalentManagementAPI.Infrastructure.Shared.csproj`** — The abstraction lives here:
+
+```xml
+<PackageReference Include="Microsoft.Extensions.AI" Version="9.5.0" />
+```
+
+**Why split?** The Application and Infrastructure layers must never reference provider-specific packages (`Microsoft.Extensions.AI.Ollama`). Only WebApi knows which provider is registered. Infrastructure.Shared only knows about `IChatClient` from `Microsoft.Extensions.AI`.
+
+### Step 3: Add Feature Flag and Ollama Config
+
+In `TalentManagementAPI.WebApi/appsettings.json`, add `AiEnabled` to the existing `FeatureManagement` section and a new `Ollama` section:
+
+```json
+"FeatureManagement": {
+  "AuthEnabled": true,
+  "CacheEnabled": true,
+  "AiEnabled": false
+},
+"Ollama": {
+  "BaseUrl": "http://localhost:11434",
+  "Model": "llama3.2"
+}
+```
+
+**Key point:** `"AiEnabled": false` is the default. Developers who haven't installed Ollama can still clone and run the full stack — the AI endpoint simply returns 404. To activate AI features, change this to `true` and ensure Ollama is running.
+
+### Step 4: Define the Application Interface
+
+Create `TalentManagementAPI.Application/Interfaces/IAiChatService.cs`:
+
+```csharp
+namespace TalentManagementAPI.Application.Interfaces
+{
+    public interface IAiChatService
+    {
+        Task<string> ChatAsync(string message, string? systemPrompt = null,
+            CancellationToken cancellationToken = default);
+    }
+}
+```
+
+**Why an interface?** The Application layer defines *what* the service does — not *how*. This follows the Dependency Inversion Principle: high-level modules (Application) don't depend on low-level details (Ollama SDK). Tests can inject a mock `IAiChatService` without needing Ollama running.
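+
+To make the testing point concrete, here is a minimal sketch of a test double. It repeats the interface so the snippet compiles on its own; `FakeAiChatService` and `Demo` are hypothetical names used for illustration and are not part of the tutorial repo.
+
+```csharp
+#nullable enable
+using System;
+using System.Threading;
+using System.Threading.Tasks;
+
+// Same shape as the IAiChatService interface defined above.
+public interface IAiChatService
+{
+    Task<string> ChatAsync(string message, string? systemPrompt = null,
+        CancellationToken cancellationToken = default);
+}
+
+// Hypothetical test double: returns a canned reply and records what it was asked.
+public sealed class FakeAiChatService : IAiChatService
+{
+    public string? LastMessage { get; private set; }
+    public string? LastSystemPrompt { get; private set; }
+
+    public Task<string> ChatAsync(string message, string? systemPrompt = null,
+        CancellationToken cancellationToken = default)
+    {
+        cancellationToken.ThrowIfCancellationRequested();
+        LastMessage = message;
+        LastSystemPrompt = systemPrompt;
+        return Task.FromResult($"echo: {message}");
+    }
+}
+
+public static class Demo
+{
+    public static async Task Main()
+    {
+        var fake = new FakeAiChatService();
+        var reply = await fake.ChatAsync("ping", "Be brief.");
+        Console.WriteLine(reply);            // echo: ping
+        Console.WriteLine(fake.LastMessage); // ping
+    }
+}
+```
+
+Any class that depends on `IAiChatService` can take this fake in a unit test, so the test suite runs without Ollama or network access.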
+
+### Step 5: Implement in Infrastructure.Shared
+
+Create `TalentManagementAPI.Infrastructure.Shared/Services/OllamaAiService.cs`:
+
+```csharp
+using Microsoft.Extensions.AI;
+using TalentManagementAPI.Application.Interfaces;
+
+namespace TalentManagementAPI.Infrastructure.Shared.Services
+{
+    public class OllamaAiService : IAiChatService
+    {
+        private readonly IChatClient _chatClient;
+
+        public OllamaAiService(IChatClient chatClient)
+        {
+            _chatClient = chatClient;
+        }
+
+        public async Task<string> ChatAsync(string message, string? systemPrompt = null,
+            CancellationToken cancellationToken = default)
+        {
+            var messages = new List<ChatMessage>();
+
+            if (!string.IsNullOrWhiteSpace(systemPrompt))
+                messages.Add(new ChatMessage(ChatRole.System, systemPrompt));
+
+            messages.Add(new ChatMessage(ChatRole.User, message));
+
+            // IChatClient.GetResponseAsync sends the whole message list in one request.
+            var response = await _chatClient.GetResponseAsync(messages, cancellationToken: cancellationToken);
+            return response.Text;
+        }
+    }
+}
+```
+
+**What this does:** `OllamaAiService` takes `IChatClient` from DI; it has no idea it's talking to Ollama specifically. The `GetResponseAsync` method sends the message list and returns the model's reply. An optional system prompt lets callers control the AI's persona or constraints.
+
+### Step 6: Register Services
+
+In `Infrastructure.Shared/ServiceRegistration.cs`, add the `IAiChatService` → `OllamaAiService` binding:
+
+```csharp
+using TalentManagementAPI.Application.Interfaces;
+using TalentManagementAPI.Infrastructure.Shared.Services;
+
+public static void AddSharedInfrastructure(this IServiceCollection services, IConfiguration _config)
+{
+    services.Configure<MailSettings>(_config.GetSection("MailSettings"));
+    services.AddTransient<IDateTimeService, DateTimeService>();
+    services.AddTransient<IEmailService, EmailService>();
+    services.AddTransient<IMockService, MockService>();
+    services.AddTransient<IAiChatService, OllamaAiService>();
+}
+```
+
+In `WebApi/Program.cs`, register the Ollama provider for `IChatClient`:
+
+```csharp
+// Register application services
+builder.Services.AddApplicationLayer();
+builder.Services.AddPersistenceInfrastructure(builder.Configuration);
+builder.Services.AddSharedInfrastructure(builder.Configuration);
+
+// Register Ollama chat client (IChatClient) — used by OllamaAiService
+// AiController is gated by [FeatureGate("AiEnabled")], so no calls are made when AI is disabled
+var ollamaBaseUrl = builder.Configuration["Ollama:BaseUrl"] ?? "http://localhost:11434";
+var ollamaModel = builder.Configuration["Ollama:Model"] ?? "llama3.2";
+builder.Services.AddOllamaChatClient(ollamaModel, new Uri(ollamaBaseUrl));
+```
+
+**What this does:** `AddOllamaChatClient()` registers `IChatClient` in the DI container pointing to Ollama. `OllamaAiService` receives this via constructor injection. If you later want to use Azure OpenAI, you'd replace `AddOllamaChatClient()` with `AddAzureOpenAIChatClient()` — and nothing else changes.
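+
+The same kind of registration can also be written out by hand, which makes the one-line swap concrete. This is a minimal sketch, not the tutorial's code: it assumes the `Microsoft.Extensions.AI`, `Microsoft.Extensions.AI.Ollama`, and `Microsoft.Extensions.DependencyInjection` packages at a 9.5.x level, where `AddChatClient`, `OllamaChatClient`, and `GetResponseAsync` are available. The console host and the prompt are illustrative only, and Ollama must be running locally.
+
+```csharp
+using System;
+using Microsoft.Extensions.AI;
+using Microsoft.Extensions.DependencyInjection;
+
+var services = new ServiceCollection();
+
+// Register a concrete IChatClient; this is the only line that names a provider.
+services.AddChatClient(new OllamaChatClient(new Uri("http://localhost:11434"), "llama3.2"));
+
+using var provider = services.BuildServiceProvider();
+
+// Consumers resolve the abstraction and never see the Ollama type.
+var client = provider.GetRequiredService<IChatClient>();
+var response = await client.GetResponseAsync("Say hello in five words.");
+Console.WriteLine(response.Text);
+```
+
+Swapping providers then means replacing the object passed to `AddChatClient`; code that depends on `IChatClient` stays the same.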
+
+### Step 7: Create the AI Controller
+
+Create `TalentManagementAPI.WebApi/Controllers/v1/AiController.cs`:
+
+```csharp
+using Asp.Versioning;
+using Microsoft.AspNetCore.Authorization;
+using Microsoft.AspNetCore.Mvc;
+using Microsoft.FeatureManagement.Mvc;
+using TalentManagementAPI.Application.Interfaces;
+
+namespace TalentManagementAPI.WebApi.Controllers.v1
+{
+    [FeatureGate("AiEnabled")]
+    [ApiVersion("1.0")]
+    [AllowAnonymous]
+    [Route("api/v{version:apiVersion}/ai")]
+    public sealed class AiController : BaseApiController
+    {
+        private readonly IAiChatService _aiChatService;
+
+        public AiController(IAiChatService aiChatService)
+        {
+            _aiChatService = aiChatService;
+        }
+
+        /// <summary>
+        /// Send a message to the AI assistant and receive a reply.
+        /// </summary>
+        [HttpPost("chat")]
+        public async Task<IActionResult> Chat([FromBody] AiChatRequest request,
+            CancellationToken cancellationToken)
+        {
+            var reply = await _aiChatService.ChatAsync(
+                request.Message, request.SystemPrompt, cancellationToken);
+            return Ok(new AiChatResponse(reply));
+        }
+    }
+
+    public record AiChatRequest(string Message, string? SystemPrompt = null);
+    public record AiChatResponse(string Reply);
+}
+```
+
+**What `[FeatureGate("AiEnabled")]` does:** When `AiEnabled` is `false` in `appsettings.json`, the feature-management filter short-circuits every action on this controller with a `404 Not Found`, so Ollama is never called. Because the gate is evaluated per request, the routes can still be listed in Swagger unless you filter them out separately. To callers, the endpoint doesn't exist.
+
+When `AiEnabled` is `true`, the endpoint becomes fully active. No other code changes needed.
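+
+The attribute covers controller actions only. For code paths outside MVC (a background job, say), the same flag can be read through `IFeatureManager`. A minimal sketch, assuming the `Microsoft.FeatureManagement` package is referenced and `AddFeatureManagement()` is already registered in `Program.cs`; `AiAvailability` is a hypothetical helper name, not part of the tutorial repo.
+
+```csharp
+#nullable enable
+using System.Threading.Tasks;
+using Microsoft.FeatureManagement;
+
+// Hypothetical helper: lets non-controller code ask whether AI is switched on.
+public sealed class AiAvailability
+{
+    private readonly IFeatureManager _featureManager;
+
+    public AiAvailability(IFeatureManager featureManager)
+    {
+        _featureManager = featureManager;
+    }
+
+    // Reads the same "AiEnabled" flag from the FeatureManagement section.
+    public Task<bool> IsAiEnabledAsync() => _featureManager.IsEnabledAsync("AiEnabled");
+}
+```
+
+Keeping the flag name in one helper avoids scattering the `"AiEnabled"` string across services.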
+
+---
+
+## 💻 Try It Yourself
+
+**Enable AI features** by setting `"AiEnabled": true` in `appsettings.json` and starting Ollama:
+
+```bash
+# Terminal 1: Start Ollama (if not already running)
+ollama serve
+
+# Terminal 2: Start the .NET API (from the ApiResources submodule)
+cd ApiResources/TalentManagement-API
+dotnet run
+```
+
+Open Swagger at `https://localhost:44378/swagger` and find the **Ai** section.
+
+![Swagger AI Chat endpoint showing POST /api/v1/ai/chat](../../docs/images/ai/swagger-ai-chat-endpoint.png)
+
+Click **POST /api/v1/ai/chat**, then **Try it out**, and send:
+
+```json
+{
+  "message": "What is the difference between OAuth 2.0 and OIDC?",
+  "systemPrompt": "You are a helpful assistant specializing in identity and security."
+}
+```
+
+You'll see Ollama's reply in the response body, usually within a few seconds; the first request can take longer while Ollama loads the model into memory.
+
+**To test without Swagger** — use curl:
+
+```bash
+curl -X POST https://localhost:44378/api/v1/ai/chat \
+  -H "Content-Type: application/json" \
+  -k \
+  -d '{"message": "Explain JWT tokens in one paragraph."}'
+```
+
+**To verify the feature flag** — set `"AiEnabled": false`, restart the API, and try the same curl. You'll get a `404` — the controller is invisible.
+
+---
+
+## 📊 Real-World Impact
+
+**Before this approach:**
+
+* ❌ AI code is tightly coupled to OpenAI SDK — migrating to another provider requires rewriting service code
+* ❌ Tutorial readers need an API key and billing account just to run the demo
+* ❌ Enabling AI on the `develop` branch breaks local runs for developers without Ollama
+
+**After this approach:**
+
+* ✅ Swap Ollama for Azure OpenAI in production by changing one DI line — application code unchanged
+* ✅ Zero-cost, zero-signup AI during development — every tutorial reader can follow along
+* ✅ Feature flag default `false` means the full Series 0–5 stack runs unchanged — AI is opt-in
+
+---
+
+## 🌟 Why This Matters
+
+The `IChatClient` abstraction from Microsoft is the `HttpClient` of AI: a standard interface the ecosystem is aligning around. By building on it now, your code stays compatible with whichever provider turns out to be the best choice in 12 months.
+
+For tutorial purposes, Ollama removes the biggest barrier to learning: access. Developers on macOS, Linux, or Windows can pull `llama3.2`, run `ollama serve`, and have a working LLM in their local environment. No billing, no account, no waiting for API access.
+
+The feature flag pattern ensures this is safe to ship: the codebase always builds, always runs, and the original Series 0–5 experience is completely unchanged. AI features activate on demand.
+
+**Transferable skills:**
+
+* **Provider-agnostic AI abstractions** — The `IChatClient` pattern applies equally to Azure OpenAI, Anthropic, Google, and Hugging Face endpoints
+* **Feature flag architecture** — The `[FeatureGate]` pattern applies to any experimental or optional feature
+* **Clean Architecture for external services** — Interface in Application, implementation in Infrastructure, provider registration in WebApi
+
+---
+
+## 🤝 Community & Support
+
+**Questions or feedback?** The tutorial repository welcomes:
+
+* ⭐ **GitHub stars** — Help others discover it!
+* 🐛 **Issue reports** — Found a bug or have a suggestion?
+* 💬 **Discussions** — Ask questions, share your use cases
+* 🚀 **Pull requests** — Improvements always appreciated
+
+**Found this helpful?** Share it with your team and follow for more full-stack development content!
+
+---
+
+## 📖 Series Navigation
+
+**AngularNetTutorial Blog Series:**
+
+* [Building Modern Web Applications with Angular, .NET, and OAuth 2.0](https://medium.com/scrum-and-coke/building-modern-web-applications-with-angular-net-and-oauth-2-0-complete-tutorial-series-7ea97ed3fc56) — Main tutorial
+* [Stop Juggling Multiple Repos: Manage Your Full-Stack App Like a Workspace](../series-0-architecture/0.1-git-submodule-workspace.md) — Git Submodules
+* [End-to-End Testing Made Simple: How Playwright Transforms Testing](../series-0-architecture/0.2-playwright-testing.md) — Playwright Overview
+* [Speed Up Your Dashboard: Easy Response Caching in .NET 10 With EasyCaching](../series-2-dotnet-api/2.5-dotnet-easycaching.md) — Response Caching (Series 2.5)
+* **This Article** — Run a Local LLM in Your .NET 10 API with Ollama (Series 6.1)
+* [Build an HR AI Assistant That Knows Your Data](6.2-dotnet-ai-hr-assistant.md) — HR AI Assistant (Series 6.2)
+* [Add an AI Chat Widget to Angular with Streaming](6.3-angular-ai-chat-widget.md) — Angular Chat Widget (Series 6.3)
+
+---
+
+**📌 Tags:** #dotnet #csharp #ai #ollama #llm #microsoftextensionsai #cleanarchitecture #aspnetcore #webapi #featureflags #fullstack #angular #oauth2 #locallm #generativeai