**Why split?** The Application and Infrastructure layers must never reference provider-specific packages (`Microsoft.Extensions.AI.Ollama`). Only WebApi knows which provider is registered. Infrastructure.Shared only knows about `IChatClient` from `Microsoft.Extensions.AI`.
101
+
**Why OllamaSharp instead of `Microsoft.Extensions.AI`?** OllamaSharp provides native streaming support via `IAsyncEnumerable<>` (`await foreach`), so tokens stream out of Ollama as they are generated — important when local model responses take several seconds. `Microsoft.Extensions.AI` is the right abstraction when you need to swap between Azure OpenAI, OpenAI, and Ollama without touching service code. For this tutorial, OllamaSharp keeps the dependency footprint minimal: one package, only in Infrastructure.Shared, no provider registration boilerplate in Program.cs.
108
102
109
103
### Step 3: Add Feature Flag and Ollama Config
110
104
@@ -118,10 +112,19 @@ In `TalentManagementAPI.WebApi/appsettings.json`, add `AiEnabled` to the existin
118
112
},
119
113
"Ollama": {
120
114
"BaseUrl": "http://localhost:11434",
121
-
"Model": "llama3.2"
115
+
"Model": "llama3.2",
116
+
"EmbeddingModel": "nomic-embed-text",
117
+
"CacheTtlMinutes": 60
122
118
}
123
119
```
124
120
121
+
**What each field does:**
122
+
123
+
* **`BaseUrl`**: where Ollama is listening (`ollama serve` defaults to port 11434)
124
+
* **`Model`**: the chat model to use; `llama3.2` is pulled in Step 1
125
+
* **`EmbeddingModel`**: used in later articles (6.5+) for semantic search; `nomic-embed-text` is a compact, high-quality embedding model
126
+
* **`CacheTtlMinutes`**: how long AI responses are cached in memory; identical questions within this window return instantly without hitting Ollama again (handled by the `CachingAiChatService` introduced below)
127
+
125
128
**Key point:** `"AiEnabled": false` is the default. Developers who haven't installed Ollama can still clone and run the full stack; the AI endpoint simply returns 404. To activate AI features, change this to `true` and ensure Ollama is running.
**What this does:** `OllamaAiService` takes `IChatClient` from DI, so it has no idea it's talking to Ollama specifically. The `CompleteAsync` method sends the message list and returns the model's reply. An optional system prompt lets callers control the AI's persona or constraints.
199
+
**What this does:** `OllamaAiService` takes `IOllamaApiClient` from DI (registered in Step 6). OllamaSharp streams tokens back using `IAsyncEnumerable<>`: the `await foreach` loop accumulates each chunk into a `MessageBuilder`, then returns the fully assembled reply. An optional system prompt lets callers control the AI's persona or constraints without the service knowing anything about the caller's intent.
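The accumulate-while-streaming pattern described here can be sketched without any Ollama dependency. This is a minimal illustration, not the article's `OllamaAiService`: `FakeTokenStreamAsync` is a hypothetical stand-in for the streaming chat call, and a plain `StringBuilder` replaces OllamaSharp's `MessageBuilder`.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public static class StreamingSketch
{
    // Hypothetical stand-in for a streaming chat call: yields the reply chunk by chunk.
    public static async IAsyncEnumerable<string> FakeTokenStreamAsync(
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        foreach (var chunk in new[] { "Hello", ", ", "world", "!" })
        {
            await Task.Delay(10, ct); // simulate per-token generation latency
            yield return chunk;
        }
    }

    // Consumes the stream with await foreach and returns the fully assembled reply.
    public static async Task<string> CollectAsync(
        IAsyncEnumerable<string> stream, CancellationToken ct = default)
    {
        var reply = new StringBuilder();
        await foreach (var chunk in stream.WithCancellation(ct))
        {
            reply.Append(chunk);
        }
        return reply.ToString();
    }

    public static async Task Main()
    {
        var text = await CollectAsync(FakeTokenStreamAsync());
        Console.WriteLine(text); // prints: Hello, world!
    }
}
```

Because the caller only sees the assembled string, the service can stream internally while still exposing a simple request/response method.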
181
200
182
201
### Step 6: Register Services
183
202
184
-
In `Infrastructure.Shared/ServiceRegistration.cs`, add the `IAiChatService` → `OllamaAiService` binding:
203
+
In `Infrastructure.Shared/ServiceRegistration.cs`, register `IOllamaApiClient` and wire `IAiChatService` to a caching decorator that wraps `OllamaAiService`:
**What this does:** `AddOllamaChatClient()` registers `IChatClient` in the DI container pointing to Ollama. `OllamaAiService` receives this via constructor injection. If you later want to use Azure OpenAI, you'd replace `AddOllamaChatClient()` with `AddAzureOpenAIChatClient()`, and nothing else changes.
247
+
**What the caching decorator does:** `CachingAiChatService` wraps `OllamaAiService`. On the first call for a given `(message, systemPrompt)` pair, it calls Ollama and stores the reply. On subsequent identical calls within the TTL window, it returns the cached reply, skipping the 1–4 second Ollama inference. The `IAiResponseMetadata` flag tells the controller whether the response was a cache hit, which is surfaced as the `X-AI-Cache: HIT/MISS` response header.
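A decorator along these lines might look like the sketch below. The interface shapes are assumptions made for illustration: the `ChatAsync` signature, the `IsCacheHit` property, and the use of `IMemoryCache` with a tuple key are not taken from the article's code, which may differ.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Memory;

// Hypothetical interface shapes; the tutorial's real definitions may differ.
public interface IAiChatService
{
    Task<string> ChatAsync(string message, string? systemPrompt = null,
        CancellationToken ct = default);
}

public interface IAiResponseMetadata
{
    bool IsCacheHit { get; set; }
}

public sealed class CachingAiChatService : IAiChatService
{
    private readonly IAiChatService _inner;          // the wrapped service, e.g. OllamaAiService
    private readonly IMemoryCache _cache;
    private readonly IAiResponseMetadata _metadata;  // scoped flag read later by the controller
    private readonly TimeSpan _ttl;                  // e.g. built from CacheTtlMinutes

    public CachingAiChatService(IAiChatService inner, IMemoryCache cache,
        IAiResponseMetadata metadata, TimeSpan ttl)
    {
        _inner = inner;
        _cache = cache;
        _metadata = metadata;
        _ttl = ttl;
    }

    public async Task<string> ChatAsync(string message, string? systemPrompt = null,
        CancellationToken ct = default)
    {
        // Value tuples compare by component, so identical pairs map to the same entry.
        var key = (message, systemPrompt);

        if (_cache.TryGetValue(key, out string? cached) && cached is not null)
        {
            _metadata.IsCacheHit = true;
            return cached;
        }

        // Cache miss: call the inner service and store the reply for the TTL window.
        var reply = await _inner.ChatAsync(message, systemPrompt, ct);
        _cache.Set(key, reply, _ttl);
        _metadata.IsCacheHit = false;
        return reply;
    }
}
```

Keeping the cache in a wrapper leaves the inner service unaware of caching, so either piece can be swapped or removed through DI registration alone.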
**What `[FeatureGate("AiEnabled")]` does:** When `AiEnabled` is `false` in `appsettings.json`, ASP.NET Core returns a `404 Not Found` for all routes under this controller. Ollama is never called. The controller doesn't appear in Swagger. To the rest of the app, it doesn't exist.
310
+
**Why per-method checks instead of `[FeatureGate]` on the class?**
311
+
312
+
The `[FeatureGate("AiEnabled")]` attribute returns `404 Not Found` when the feature is disabled — a misleading status for a known endpoint. The per-method check returns `503 Service Unavailable` with a clear `detail` message explaining exactly what to enable and where. This is far more helpful to developers hitting the endpoint for the first time.
313
+
314
+
**`IFeatureManagerSnapshot`** — the snapshot variant reads the feature flags once per request and caches the result for the request lifetime. This avoids multiple config reads per action.
315
+
316
+
**`IAiResponseMetadata`** — a scoped flag (set by `CachingAiChatService` in Step 6) that records whether the response came from the cache. `SetAiCacheHeader()` surfaces this as `X-AI-Cache: HIT` or `MISS` in every response — visible in the browser Network tab and Swagger, making it easy to see when caching is working.
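The per-method check and the cache header can be sketched as follows. This is an illustrative outline rather than the article's controller: the route, the `ChatRequest` record, the `ChatAsync` signature, the `IsCacheHit` property, and the wording of the `detail` message are all assumptions.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.FeatureManagement;

// Hypothetical shapes, declared here only to keep the sketch self-contained.
public interface IAiChatService
{
    Task<string> ChatAsync(string message, string? systemPrompt = null,
        CancellationToken ct = default);
}
public interface IAiResponseMetadata { bool IsCacheHit { get; } }
public record ChatRequest(string Message);

[ApiController]
[Route("api/v1/ai")] // assumed route
public class AiSketchController : ControllerBase
{
    private readonly IAiChatService _chat;
    private readonly IFeatureManagerSnapshot _featureManager;
    private readonly IAiResponseMetadata _aiMetadata;

    public AiSketchController(IAiChatService chat,
        IFeatureManagerSnapshot featureManager, IAiResponseMetadata aiMetadata)
    {
        _chat = chat;
        _featureManager = featureManager;
        _aiMetadata = aiMetadata;
    }

    [HttpPost("chat")]
    public async Task<IActionResult> Chat([FromBody] ChatRequest request, CancellationToken ct)
    {
        // Per-method check: answer 503 with an explanation instead of a bare 404.
        if (!await _featureManager.IsEnabledAsync("AiEnabled"))
        {
            return Problem(
                detail: "AI features are disabled. Set AiEnabled to true in appsettings.json and make sure Ollama is running.",
                statusCode: StatusCodes.Status503ServiceUnavailable);
        }

        var reply = await _chat.ChatAsync(request.Message, ct: ct);
        SetAiCacheHeader();
        return Ok(new { reply });
    }

    // Surfaces the scoped cache flag as a response header.
    private void SetAiCacheHeader() =>
        Response.Headers["X-AI-Cache"] = _aiMetadata.IsCacheHit ? "HIT" : "MISS";
}
```

`Problem(...)` produces an RFC 7807 problem-details body, which is where the `detail` message ends up for a client that hits the disabled endpoint.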
262
317
263
-
When `AiEnabled` is `true`, the endpoint becomes fully active. No other code changes needed.
318
+
When `AiEnabled` is `true`, the endpoint is fully active. No other code changes needed.
In `TalentManagementAPI.WebApi/Controllers/v1/AiController.cs`, add the `hr-insight` endpoint and request record:
220
+
In `TalentManagementAPI.WebApi/Controllers/v1/AiController.cs`, add the `hr-insight` action. This builds on the controller created in Article 6.1: `_featureManager` (`IFeatureManagerSnapshot`) and `_aiMetadata` (`IAiResponseMetadata`) are already injected in the constructor alongside `IAiChatService`. The `hr-insight` action follows the exact same per-method feature flag pattern as `chat`.
File: blogs/series-6-ai-app-features/6.3-angular-ai-chat-widget.md (4 additions, 0 deletions)
@@ -82,6 +82,8 @@ export const environment = {
82
82
83
83
Add the same `aiEnabled: false` line under `// Feature Flags`. Production defaults to off.
84
84
85
+
> **Note for repo cloners:** If you cloned the tutorial repo, `environment.ts` may already have `aiEnabled: true` (set during development of later articles). To follow this article from scratch — seeing the disabled state first — flip it to `false`, then back to `true` when you're ready to test the chat UI.
86
+
85
87
**Why default to false?** Readers who are following the original Series 0–5 tutorial don't have Ollama running. If the chat widget made API calls with `AiEnabled: false` in the API, every request would return `503 Service Unavailable` — a broken experience. Defaulting the flag to `false` in the Angular environment means the chat UI is never shown to those readers, so their app continues to work exactly as it did before.
86
88
87
89
---
@@ -150,6 +152,8 @@ export * from './base-api.service';
150
152
// ... existing exports
151
153
```
152
154
155
+
> **Note:** The `ai.service.ts` file shown here covers the two methods needed for Articles 6.3 and 6.4 (`chat` and `hrInsight`). Later articles (6.5+) add `nlEmployeeSearch()` and `semanticPositionSearch()` to the same service file. If you clone the repo, you will see those additional methods — they are safe to ignore until you reach those articles.