From 81d094c30679c88932d4ba1326ba2ea2f2273424 Mon Sep 17 00:00:00 2001 From: Chris Baker <67105654+zacnaloen@users.noreply.github.com> Date: Wed, 13 May 2026 08:46:47 +0100 Subject: [PATCH 1/6] Document agentless web platform direction --- docs/agentless-web-platform.md | 142 +++++++++++++++++++++++++++++++++ 1 file changed, 142 insertions(+) create mode 100644 docs/agentless-web-platform.md diff --git a/docs/agentless-web-platform.md b/docs/agentless-web-platform.md new file mode 100644 index 0000000..6d9ee57 --- /dev/null +++ b/docs/agentless-web-platform.md @@ -0,0 +1,142 @@ +# Agentless Web Platform Direction + +## Product Shape + +Run Performance Monitor as a central service for a SQL Server estate: + +- install one collector/web host on an application server +- register many SQL Server instances to monitor +- connect remotely with least-privilege SQL credentials +- collect DMV, Query Store, Extended Events, configuration, capacity, and job telemetry +- store estate-wide history centrally +- serve the data through a web API and browser UI + +The monitored SQL Servers should not require SQL Agent jobs for Performance Monitor collection. They may still expose SQL Agent metadata for the running-jobs collector where permissions allow it. + +## Why This Fits The Existing Code + +The repository already contains both halves of this design: + +- Full Edition has a rich SQL schema, retention logic, reporting views, alerting, analysis, and dashboard query surface. +- Lite Edition already proves agentless remote collection from a desktop process into local DuckDB. 
+ +The web platform should reuse those strengths rather than starting over: + +- reuse Lite's remote collector query logic and scheduler behavior +- reuse Dashboard's analysis, alerting, MCP, plan parsing, and reporting concepts +- keep the install scripts for creating a central repository database +- replace per-target SQL Agent scheduling with a service-hosted scheduler + +## Target Architecture + +```mermaid +flowchart LR + Web["Web UI"] --> Api["ASP.NET Core API"] + Api --> Repo["Central PerformanceMonitor repository"] + Api --> Analysis["Analysis and alert services"] + + Worker["Collector Worker Service"] --> Estate["SQL Server estate"] + Worker --> Repo + Worker --> Scheduler["Collector scheduler"] + + Admin["Admin UI"] --> Api + Scheduler --> Repo +``` + +### Collector Worker + +The collector worker runs on the monitoring server, not on every SQL Server. It owns: + +- server inventory +- connection credentials +- per-server schedule state +- collection leases so only one worker collects a given server/collector at a time +- retry/backoff for unreachable servers +- permission gating when a collector is not allowed on a target +- retention jobs that run centrally + +The first implementation should be a .NET Worker Service because the repo is already .NET 8 and the collector code is C#. + +### Central Repository + +Use a central SQL Server database first, because the Full Edition already assumes the `PerformanceMonitor` schema and the Dashboard reads that shape today. + +Needed changes: + +- add an estate dimension to collection tables, likely `server_id` +- add inventory/config tables for monitored instances +- keep collector logs per server and collector +- run retention centrally without SQL Agent +- preserve view names where possible so existing Dashboard query logic can be ported gradually + +DuckDB remains useful for Lite, local demos, or an embedded mode, but the estate server should default to SQL Server storage. 
+ +### Web API + +The API should expose stable endpoints over the central repository: + +- server inventory and health summary +- time-series resource data +- query performance drilldowns +- wait, blocking, deadlock, memory, tempdb, file I/O, and capacity views +- alert history and mute rules +- analysis findings +- collection health and last-run status + +Dashboard's existing `DatabaseService.*` query methods are the best starting point for API handlers, but they should move behind service interfaces so WPF and web can share query semantics while the web UI gets HTTP-friendly DTOs. + +### Web UI + +The first website should be an operational console, not a marketing surface: + +- estate overview with health cards and recent changes +- server detail page with tabs for workload, waits, resources, memory, locking, system events, and capacity +- query detail pages with plan viewing and history +- collector health page showing stale/erroring collectors +- admin area for servers, credentials, schedules, alerting, and retention + +The UI should be dense, searchable, and comparative. This is a dev-estate command center, so the priority is scanability and fast drilldown. + +## Permission Model + +The monitored servers should only need read-style permissions: + +- `VIEW SERVER STATE` or `VIEW SERVER PERFORMANCE STATE` where available +- `VIEW ANY DATABASE` if database inventory is required +- Query Store read access per database where query store views are collected +- `msdb` read roles only if job monitoring is enabled +- optional permissions for Extended Events and external community procedures + +The service account owns collection scheduling and storage writes only in the central repository. + +## Migration Phases + +1. Extract shared collector contracts. + Move collector names, schedule definitions, server models, health state, and result DTOs into shared non-WPF projects. + +2. Create a central repository model. 
+ Add server identity to collection storage and create central inventory/configuration tables. + +3. Build the headless collector worker. + Port the Lite remote collector loop to a .NET Worker Service that writes to the central repository. + +4. Add API endpoints over existing dashboard queries. + Start with overview, waits, CPU, memory, query stats, blocking, and collector health. + +5. Build the first web console. + Use the API to serve an estate overview and server detail pages. + +6. De-emphasize SQL Agent installation. + Make SQL Agent jobs optional/legacy for Full Edition, with the agentless service as the preferred deployment. + +## First Thin Slice + +The safest first build target is: + +- create `PerformanceMonitor.Collector.Service` +- create `PerformanceMonitor.Shared` +- copy the Lite schedule model and remote collector dispatcher behind shared interfaces +- implement one collector end-to-end against the central repository, starting with server properties or wait stats +- expose collector health through a minimal API endpoint + +That proves the hardest architectural question without needing to port the whole dashboard at once. 
From 54184dc2c344dc8c5b94d6965f09a14db96aa8d3 Mon Sep 17 00:00:00 2001 From: Chris Baker <67105654+zacnaloen@users.noreply.github.com> Date: Wed, 13 May 2026 09:24:31 +0100 Subject: [PATCH 2/6] Add headless estate monitor prototype --- .gitignore | 10 +- Headless/Models/CollectorScheduleOptions.cs | 8 + Headless/Models/MonitorOptions.cs | 29 + Headless/Models/MonitoredServerOptions.cs | 26 + Headless/Models/TelemetryModels.cs | 69 ++ Headless/PerformanceMonitor.Headless.csproj | 33 + Headless/Program.cs | 57 ++ .../Services/SqlEstateCollectorService.cs | 409 +++++++++++ Headless/Storage/HeadlessStore.cs | 657 ++++++++++++++++++ Headless/appsettings.example.json | 37 + Headless/wwwroot/app.js | 313 +++++++++ Headless/wwwroot/index.html | 99 +++ Headless/wwwroot/styles.css | 450 ++++++++++++ PerformanceMonitor.sln | 6 + docs/headless-monitor.md | 157 +++++ 15 files changed, 2358 insertions(+), 2 deletions(-) create mode 100644 Headless/Models/CollectorScheduleOptions.cs create mode 100644 Headless/Models/MonitorOptions.cs create mode 100644 Headless/Models/MonitoredServerOptions.cs create mode 100644 Headless/Models/TelemetryModels.cs create mode 100644 Headless/PerformanceMonitor.Headless.csproj create mode 100644 Headless/Program.cs create mode 100644 Headless/Services/SqlEstateCollectorService.cs create mode 100644 Headless/Storage/HeadlessStore.cs create mode 100644 Headless/appsettings.example.json create mode 100644 Headless/wwwroot/app.js create mode 100644 Headless/wwwroot/index.html create mode 100644 Headless/wwwroot/styles.css create mode 100644 docs/headless-monitor.md diff --git a/.gitignore b/.gitignore index 5af2ea7..77f4050 100644 --- a/.gitignore +++ b/.gitignore @@ -18,8 +18,14 @@ releases/ packages/ *.nupkg -# SQLite databases -*.sqlite +# SQLite databases +*.sqlite + +# Headless runtime data +Headless/data/ +*.duckdb +*.duckdb.wal +*.parquet # Lock files *.lock diff --git a/Headless/Models/CollectorScheduleOptions.cs 
b/Headless/Models/CollectorScheduleOptions.cs new file mode 100644 index 0000000..010c37b --- /dev/null +++ b/Headless/Models/CollectorScheduleOptions.cs @@ -0,0 +1,8 @@ +namespace PerformanceMonitor.Headless.Models; + +public sealed class CollectorScheduleOptions +{ + public string Name { get; set; } = ""; + public bool Enabled { get; set; } = true; + public int FrequencySeconds { get; set; } = 60; +} diff --git a/Headless/Models/MonitorOptions.cs b/Headless/Models/MonitorOptions.cs new file mode 100644 index 0000000..c19f4a4 --- /dev/null +++ b/Headless/Models/MonitorOptions.cs @@ -0,0 +1,29 @@ +namespace PerformanceMonitor.Headless.Models; + +public sealed class MonitorOptions +{ + public string StoragePath { get; set; } = "data\\headless\\performance-monitor.duckdb"; + public string ArchiveDirectory { get; set; } = "data\\headless\\parquet"; + public int CollectionIntervalSeconds { get; set; } = 60; + public int MaxConcurrentServers { get; set; } = 8; + public int CommandTimeoutSeconds { get; set; } = 30; + public int ArchiveIntervalMinutes { get; set; } = 60; + public int HotDataDays { get; set; } = 7; + public List<CollectorScheduleOptions> Collectors { get; set; } = []; + public List<MonitoredServerOptions> Servers { get; set; } = []; + + public IReadOnlyList<CollectorScheduleOptions> GetEffectiveCollectors() + { + if (Collectors.Count > 0) + { + return Collectors; + } + + return + [ + new() { Name = "server_properties", FrequencySeconds = 3600 }, + new() { Name = "wait_stats", FrequencySeconds = 60 }, + new() { Name = "cpu_utilization", FrequencySeconds = 60 } + ]; + } +} diff --git a/Headless/Models/MonitoredServerOptions.cs b/Headless/Models/MonitoredServerOptions.cs new file mode 100644 index 0000000..7500c10 --- /dev/null +++ b/Headless/Models/MonitoredServerOptions.cs @@ -0,0 +1,26 @@ +namespace PerformanceMonitor.Headless.Models; + +public sealed class MonitoredServerOptions +{ + public string Id { get; set; } = ""; + public string DisplayName { get; set; } = ""; + public string? 
ConnectionString { get; set; } + public string? ConnectionStringEnvironmentVariable { get; set; } + public bool Enabled { get; set; } = true; + + public string ServerNameForStorage => string.IsNullOrWhiteSpace(DisplayName) ? Id : DisplayName; + + public string ResolveConnectionString() + { + if (!string.IsNullOrWhiteSpace(ConnectionStringEnvironmentVariable)) + { + var fromEnvironment = Environment.GetEnvironmentVariable(ConnectionStringEnvironmentVariable); + if (!string.IsNullOrWhiteSpace(fromEnvironment)) + { + return Environment.ExpandEnvironmentVariables(fromEnvironment); + } + } + + return Environment.ExpandEnvironmentVariables(ConnectionString ?? ""); + } +} diff --git a/Headless/Models/TelemetryModels.cs b/Headless/Models/TelemetryModels.cs new file mode 100644 index 0000000..7fd5b04 --- /dev/null +++ b/Headless/Models/TelemetryModels.cs @@ -0,0 +1,69 @@ +namespace PerformanceMonitor.Headless.Models; + +public sealed record ServerHealthDto( + string ServerId, + string DisplayName, + bool IsEnabled, + DateTime? LastSeenTime, + string LastStatus, + string? LastError, + string? ProductVersion, + string? Edition, + int? SqlMajorVersion, + string HealthState, + string HealthReason, + int ActiveAlertCount); + +public sealed record CollectionLogDto( + DateTime CollectionTime, + string ServerId, + string ServerName, + string CollectorName, + string Status, + int RowsCollected, + int DurationMs, + string? ErrorMessage); + +public sealed record TopWaitDto( + string WaitType, + long WaitTimeDeltaMs, + long SignalWaitTimeDeltaMs, + long WaitingTasksDelta); + +public sealed record CpuSampleDto( + DateTime SampleTime, + int SqlServerCpuUtilization, + int OtherProcessCpuUtilization); + +public sealed record EstateSummaryDto( + int ServerCount, + int GreenCount, + int YellowCount, + int RedCount, + int ErrorCount, + int DisabledCount, + DateTime GeneratedAt, + IReadOnlyList<ServerHealthDto> Servers); + +public sealed record ServerPropertiesSnapshot( + string MachineName, + string? 
InstanceName, + string ProductVersion, + string ProductLevel, + string Edition, + int EngineEdition, + int SqlMajorVersion, + int CpuCount, + long PhysicalMemoryMb, + DateTime SqlServerStartTime); + +public sealed record WaitStatSnapshot( + string WaitType, + long WaitingTasksCount, + long WaitTimeMs, + long SignalWaitTimeMs); + +public sealed record CpuSample( + DateTime SampleTime, + int SqlServerCpuUtilization, + int OtherProcessCpuUtilization); diff --git a/Headless/PerformanceMonitor.Headless.csproj b/Headless/PerformanceMonitor.Headless.csproj new file mode 100644 index 0000000..9e0a7c0 --- /dev/null +++ b/Headless/PerformanceMonitor.Headless.csproj @@ -0,0 +1,33 @@ + + + net8.0 + enable + enable + PerformanceMonitor.Headless + PerformanceMonitor.Headless + SQL Server Performance Monitor Headless + 2.10.0 + 2.10.0.0 + 2.10.0.0 + 2.10.0-headless + Darling Data, LLC + Copyright (c) 2026 Darling Data, LLC + true + latest-recommended + CA1001;CA1305;CA1845;CA1848;CA1861;CA2007;CA2100 + + + + + + + + + + + + + PreserveNewest + + + diff --git a/Headless/Program.cs b/Headless/Program.cs new file mode 100644 index 0000000..b083641 --- /dev/null +++ b/Headless/Program.cs @@ -0,0 +1,57 @@ +using PerformanceMonitor.Headless.Models; +using PerformanceMonitor.Headless.Services; +using PerformanceMonitor.Headless.Storage; + +var builder = WebApplication.CreateBuilder(args); + +builder.Host.UseWindowsService(); +builder.Services.Configure<MonitorOptions>(builder.Configuration.GetSection("Monitor")); +builder.Services.AddSingleton<HeadlessStore>(); +builder.Services.AddHostedService<SqlEstateCollectorService>(); + +var app = builder.Build(); + +app.UseDefaultFiles(); +app.UseStaticFiles(); + +app.MapGet("/api/health", () => Results.Ok(new { status = "ok", generated_at = DateTime.UtcNow })); + +app.MapGet("/api/storage", (HeadlessStore store) => Results.Ok(new +{ + duckdb = store.DatabasePath, + parquet = store.ArchiveDirectory +})); + +app.MapGet("/api/summary", async (HeadlessStore store, CancellationToken cancellationToken) + => 
Results.Ok(await store.GetEstateSummaryAsync(cancellationToken))); + +app.MapGet("/api/servers", async (HeadlessStore store, CancellationToken cancellationToken) + => Results.Ok(await store.GetServersAsync(cancellationToken))); + +app.MapGet("/api/collection-log", async (HeadlessStore store, int? limit, CancellationToken cancellationToken) + => Results.Ok(await store.GetCollectionLogAsync(limit ?? 200, cancellationToken))); + +app.MapGet("/api/servers/{serverId}/waits", async ( + string serverId, + HeadlessStore store, + int? hours, + int? limit, + CancellationToken cancellationToken) => +{ + var rows = await store.GetTopWaitsAsync(serverId, hours ?? 1, limit ?? 20, cancellationToken); + return Results.Ok(rows); +}); + +app.MapGet("/api/servers/{serverId}/cpu", async ( + string serverId, + HeadlessStore store, + int? hours, + CancellationToken cancellationToken) => +{ + var rows = await store.GetCpuSamplesAsync(serverId, hours ?? 1, cancellationToken); + return Results.Ok(rows); +}); + +app.MapFallbackToFile("index.html"); + +app.Run(); diff --git a/Headless/Services/SqlEstateCollectorService.cs b/Headless/Services/SqlEstateCollectorService.cs new file mode 100644 index 0000000..cbec1b6 --- /dev/null +++ b/Headless/Services/SqlEstateCollectorService.cs @@ -0,0 +1,409 @@ +using System.Data; +using System.Diagnostics; +using Microsoft.Data.SqlClient; +using Microsoft.Extensions.Options; +using PerformanceMonitor.Headless.Models; +using PerformanceMonitor.Headless.Storage; + +namespace PerformanceMonitor.Headless.Services; + +public sealed class SqlEstateCollectorService : BackgroundService +{ + private static readonly HashSet<string> IgnoredWaitTypes = new(StringComparer.OrdinalIgnoreCase) + { + "BROKER_EVENTHANDLER", "BROKER_RECEIVE_WAITFOR", "BROKER_TASK_STOP", "BROKER_TO_FLUSH", + "BROKER_TRANSMITTER", "CHECKPOINT_QUEUE", "CHKPT", "CLR_AUTO_EVENT", "CLR_MANUAL_EVENT", + "DIRTY_PAGE_POLL", "DISPATCHER_QUEUE_SEMAPHORE", "EXECSYNC", "FSAGENT", "FT_IFTS_SCHEDULER_IDLE_WAIT", 
+ "HADR_FILESTREAM_IOMGR_IOCOMPLETION", "KSOURCE_WAKEUP", "LAZYWRITER_SLEEP", "LOGMGR_QUEUE", + "ONDEMAND_TASK_QUEUE", "PARALLEL_REDO_DRAIN_WORKER", "PARALLEL_REDO_LOG_CACHE", + "PARALLEL_REDO_TRAN_LIST", "PARALLEL_REDO_WORKER_SYNC", "PARALLEL_REDO_WORKER_WAIT_WORK", + "PREEMPTIVE_XE_GETTARGETSTATE", "PWAIT_ALL_COMPONENTS_INITIALIZED", "PWAIT_DIRECTLOGCONSUMER_GETNEXT", + "QDS_PERSIST_TASK_MAIN_LOOP_SLEEP", "QDS_ASYNC_QUEUE", "QDS_CLEANUP_STALE_QUERIES_TASK_MAIN_LOOP_SLEEP", + "REQUEST_FOR_DEADLOCK_SEARCH", "RESOURCE_QUEUE", "SERVER_IDLE_CHECK", "SLEEP_BPOOL_FLUSH", + "SLEEP_DBSTARTUP", "SLEEP_DCOMSTARTUP", "SLEEP_MASTERDBREADY", "SLEEP_MASTERMDREADY", + "SLEEP_MASTERUPGRADED", "SLEEP_MSDBSTARTUP", "SLEEP_SYSTEMTASK", "SLEEP_TASK", + "SLEEP_TEMPDBSTARTUP", "SNI_HTTP_ACCEPT", "SOS_WORK_DISPATCHER", "SP_SERVER_DIAGNOSTICS_SLEEP", + "SQLTRACE_BUFFER_FLUSH", "SQLTRACE_INCREMENTAL_FLUSH_SLEEP", "SQLTRACE_WAIT_ENTRIES", + "WAIT_FOR_RESULTS", "WAITFOR", "WAITFOR_TASKSHUTDOWN", "XE_DISPATCHER_JOIN", + "XE_DISPATCHER_WAIT", "XE_TIMER_EVENT" + }; + + private readonly MonitorOptions _options; + private readonly HeadlessStore _store; + private readonly ILogger<SqlEstateCollectorService> _logger; + private readonly System.Collections.Concurrent.ConcurrentDictionary<(string ServerId, string CollectorName), DateTime> _lastRuns = new(); + private DateTime _lastArchiveTime = DateTime.UtcNow; + + public SqlEstateCollectorService( + IOptions<MonitorOptions> options, + HeadlessStore store, + ILogger<SqlEstateCollectorService> logger) + { + _options = options.Value; + _store = store; + _logger = logger; + } + + protected override async Task ExecuteAsync(CancellationToken stoppingToken) + { + await _store.InitializeAsync(stoppingToken); + await _store.UpsertConfiguredServersAsync(_options.Servers, stoppingToken); + + _logger.LogInformation( + "Headless monitor started. 
DuckDB={DatabasePath}; Parquet={ArchiveDirectory}", + _store.DatabasePath, + _store.ArchiveDirectory); + + await RunCollectionCycleAsync(stoppingToken); + + using var timer = new PeriodicTimer(TimeSpan.FromSeconds(Math.Max(10, _options.CollectionIntervalSeconds))); + while (await timer.WaitForNextTickAsync(stoppingToken)) + { + await RunCollectionCycleAsync(stoppingToken); + } + } + + private async Task RunCollectionCycleAsync(CancellationToken cancellationToken) + { + var enabledServers = _options.Servers + .Where(s => s.Enabled) + .Where(s => !string.IsNullOrWhiteSpace(s.Id)) + .ToList(); + + if (enabledServers.Count == 0) + { + _logger.LogDebug("No enabled servers configured"); + return; + } + + using var throttle = new SemaphoreSlim(Math.Max(1, _options.MaxConcurrentServers)); + var tasks = enabledServers.Select(async server => + { + await throttle.WaitAsync(cancellationToken); + try + { + await CollectServerAsync(server, cancellationToken); + } + finally + { + throttle.Release(); + } + }); + + await Task.WhenAll(tasks); + await ArchiveIfDueAsync(cancellationToken); + } + + private async Task CollectServerAsync(MonitoredServerOptions server, CancellationToken cancellationToken) + { + var connectionString = server.ResolveConnectionString(); + if (string.IsNullOrWhiteSpace(connectionString)) + { + await _store.SetServerStatusAsync(server, "ERROR", "No connection string configured", null, cancellationToken); + return; + } + + try + { + await using var connection = new SqlConnection(connectionString); + await connection.OpenAsync(cancellationToken); + + await _store.SetServerStatusAsync(server, "ONLINE", null, null, cancellationToken); + + foreach (var collector in _options.GetEffectiveCollectors().Where(c => c.Enabled)) + { + if (!IsDue(server.Id, collector)) + { + continue; + } + + await RunCollectorAsync(server, connection, collector.Name, cancellationToken); + MarkRun(server.Id, collector.Name); + } + } + catch (Exception ex) when (ex is SqlException or 
InvalidOperationException) + { + _logger.LogWarning(ex, "Connection failed for server {Server}", server.ServerNameForStorage); + await _store.SetServerStatusAsync(server, "ERROR", ex.Message, null, cancellationToken); + } + } + + private async Task RunCollectorAsync( + MonitoredServerOptions server, + SqlConnection connection, + string collectorName, + CancellationToken cancellationToken) + { + var startTime = DateTime.UtcNow; + var totalWatch = Stopwatch.StartNew(); + var sqlWatch = new Stopwatch(); + var storageWatch = new Stopwatch(); + var rowsCollected = 0; + var status = "SUCCESS"; + string? errorMessage = null; + + try + { + switch (collectorName) + { + case "server_properties": + sqlWatch.Start(); + var properties = await CollectServerPropertiesAsync(connection, cancellationToken); + sqlWatch.Stop(); + + storageWatch.Start(); + await _store.InsertServerPropertiesAsync(server, startTime, properties, cancellationToken); + await _store.SetServerStatusAsync(server, "ONLINE", null, properties, cancellationToken); + storageWatch.Stop(); + rowsCollected = 1; + break; + + case "wait_stats": + sqlWatch.Start(); + var waitStats = await CollectWaitStatsAsync(connection, cancellationToken); + sqlWatch.Stop(); + + storageWatch.Start(); + await _store.InsertWaitStatsAsync(server, startTime, waitStats, cancellationToken); + storageWatch.Stop(); + rowsCollected = waitStats.Count; + break; + + case "cpu_utilization": + var lastSampleTime = await _store.GetLastCpuSampleTimeAsync(server.Id, cancellationToken); + sqlWatch.Start(); + var cpuSamples = await CollectCpuUtilizationAsync(connection, lastSampleTime, cancellationToken); + sqlWatch.Stop(); + + storageWatch.Start(); + await _store.InsertCpuSamplesAsync(server, startTime, cpuSamples, cancellationToken); + storageWatch.Stop(); + rowsCollected = cpuSamples.Count; + break; + + default: + status = "SKIPPED"; + errorMessage = $"Unknown collector '{collectorName}'"; + _logger.LogWarning("Unknown collector {Collector}", 
collectorName); + break; + } + } + catch (SqlException ex) when (IsPermissionError(ex)) + { + status = "PERMISSIONS"; + errorMessage = $"SQL Error #{ex.Number}: {ex.Message}"; + _logger.LogWarning("Collector {Collector} permission denied for {Server}: {Message}", + collectorName, server.ServerNameForStorage, ex.Message); + } + catch (Exception ex) when (ex is SqlException or InvalidOperationException or DataException) + { + status = "ERROR"; + errorMessage = ex.Message; + _logger.LogWarning(ex, "Collector {Collector} failed for {Server}", + collectorName, server.ServerNameForStorage); + } + finally + { + totalWatch.Stop(); + await _store.InsertCollectionLogAsync( + server, + collectorName, + startTime, + (int)totalWatch.ElapsedMilliseconds, + status, + errorMessage, + rowsCollected, + sqlWatch.ElapsedMilliseconds, + storageWatch.ElapsedMilliseconds, + cancellationToken); + } + } + + private bool IsDue(string serverId, CollectorScheduleOptions collector) + { + var frequencySeconds = Math.Max(1, collector.FrequencySeconds); + return !_lastRuns.TryGetValue((serverId, collector.Name), out var lastRun) + || DateTime.UtcNow - lastRun >= TimeSpan.FromSeconds(frequencySeconds); + } + + private void MarkRun(string serverId, string collectorName) + => _lastRuns[(serverId, collectorName)] = DateTime.UtcNow; + + private async Task ArchiveIfDueAsync(CancellationToken cancellationToken) + { + if (_options.ArchiveIntervalMinutes <= 0) + { + return; + } + + if (DateTime.UtcNow - _lastArchiveTime < TimeSpan.FromMinutes(_options.ArchiveIntervalMinutes)) + { + return; + } + + try + { + await _store.ArchiveOldDataAsync(cancellationToken); + _lastArchiveTime = DateTime.UtcNow; + } + catch (Exception ex) + { + _logger.LogWarning(ex, "Parquet archival failed"); + } + } + + private async Task<ServerPropertiesSnapshot> CollectServerPropertiesAsync( + SqlConnection connection, + CancellationToken cancellationToken) + { + const string query = """ +SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; + +SELECT + 
machine_name = CONVERT(nvarchar(128), SERVERPROPERTY(N'MachineName')), + instance_name = CONVERT(nvarchar(128), SERVERPROPERTY(N'InstanceName')), + product_version = CONVERT(nvarchar(128), SERVERPROPERTY(N'ProductVersion')), + product_level = CONVERT(nvarchar(128), SERVERPROPERTY(N'ProductLevel')), + edition = CONVERT(nvarchar(256), SERVERPROPERTY(N'Edition')), + engine_edition = CONVERT(integer, SERVERPROPERTY(N'EngineEdition')), + sql_major_version = CONVERT(integer, SERVERPROPERTY(N'ProductMajorVersion')), + cpu_count = CONVERT(integer, dosi.cpu_count), + physical_memory_mb = CONVERT(bigint, dosi.physical_memory_kb / 1024), + sqlserver_start_time = dosi.sqlserver_start_time +FROM sys.dm_os_sys_info AS dosi +OPTION(RECOMPILE); +"""; + + await using var command = new SqlCommand(query, connection); + command.CommandTimeout = Math.Max(1, _options.CommandTimeoutSeconds); + await using var reader = await command.ExecuteReaderAsync(cancellationToken); + if (!await reader.ReadAsync(cancellationToken)) + { + throw new DataException("Server properties query returned no rows"); + } + + return new ServerPropertiesSnapshot( + reader.GetString(0), + reader.IsDBNull(1) ? 
null : reader.GetString(1), + reader.GetString(2), + reader.GetString(3), + reader.GetString(4), + reader.GetInt32(5), + reader.GetInt32(6), + reader.GetInt32(7), + reader.GetInt64(8), + reader.GetDateTime(9)); + } + + private async Task> CollectWaitStatsAsync( + SqlConnection connection, + CancellationToken cancellationToken) + { + const string query = """ +SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; + +SELECT + wait_type = ws.wait_type, + waiting_tasks_count = ws.waiting_tasks_count, + wait_time_ms = ws.wait_time_ms, + signal_wait_time_ms = ws.signal_wait_time_ms +FROM sys.dm_os_wait_stats AS ws +WHERE ws.wait_time_ms > 0 +ORDER BY ws.wait_time_ms DESC +OPTION(RECOMPILE); +"""; + + var rows = new List(); + await using var command = new SqlCommand(query, connection); + command.CommandTimeout = Math.Max(1, _options.CommandTimeoutSeconds); + await using var reader = await command.ExecuteReaderAsync(cancellationToken); + while (await reader.ReadAsync(cancellationToken)) + { + var waitType = reader.GetString(0); + if (IgnoredWaitTypes.Contains(waitType)) + { + continue; + } + + rows.Add(new WaitStatSnapshot( + waitType, + reader.GetInt64(1), + reader.GetInt64(2), + reader.GetInt64(3))); + } + + return rows; + } + + private async Task> CollectCpuUtilizationAsync( + SqlConnection connection, + DateTime? 
lastSampleTime, + CancellationToken cancellationToken) + { + const string query = """ +SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; + +DECLARE + @ms_ticks bigint; + +SELECT + @ms_ticks = dosi.ms_ticks +FROM sys.dm_os_sys_info AS dosi; + +SELECT TOP (60) + sample_time = DATEADD(SECOND, -((@ms_ticks - t.timestamp) / 1000), SYSDATETIME()), + sqlserver_cpu_utilization = t.record.value('(Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'integer'), + other_process_cpu_utilization = + CASE + WHEN (100 - t.record.value('(Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'integer') + - t.record.value('(Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'integer')) < 0 + THEN 0 + ELSE 100 - t.record.value('(Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'integer') + - t.record.value('(Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'integer') + END +FROM +( + SELECT + dorb.timestamp, + record = CONVERT(xml, dorb.record) + FROM sys.dm_os_ring_buffers AS dorb + WHERE dorb.ring_buffer_type = N'RING_BUFFER_SCHEDULER_MONITOR' +) AS t +ORDER BY t.timestamp DESC +OPTION(RECOMPILE); +"""; + + var rows = new List(); + await using var command = new SqlCommand(query, connection); + command.CommandTimeout = Math.Max(1, _options.CommandTimeoutSeconds); + await using var reader = await command.ExecuteReaderAsync(cancellationToken); + while (await reader.ReadAsync(cancellationToken)) + { + var sampleTime = reader.GetDateTime(0); + if (lastSampleTime.HasValue && sampleTime <= lastSampleTime.Value) + { + continue; + } + + rows.Add(new CpuSample( + sampleTime, + reader.IsDBNull(1) ? 0 : reader.GetInt32(1), + reader.IsDBNull(2) ? 
0 : reader.GetInt32(2))); + } + + return rows; + } + + private static bool IsPermissionError(SqlException ex) + { + foreach (SqlError error in ex.Errors) + { + if (error.Number is 229 or 297 or 300) + { + return true; + } + } + + return false; + } +} diff --git a/Headless/Storage/HeadlessStore.cs b/Headless/Storage/HeadlessStore.cs new file mode 100644 index 0000000..0a59c0d --- /dev/null +++ b/Headless/Storage/HeadlessStore.cs @@ -0,0 +1,657 @@ +using DuckDB.NET.Data; +using Microsoft.Extensions.Options; +using PerformanceMonitor.Headless.Models; + +namespace PerformanceMonitor.Headless.Storage; + +public sealed class HeadlessStore +{ + private readonly MonitorOptions _options; + private readonly IHostEnvironment _environment; + private readonly ILogger<HeadlessStore> _logger; + private readonly SemaphoreSlim _writeLock = new(1, 1); + private static long s_idCounter = DateTime.UtcNow.Ticks; + + public HeadlessStore( + IOptions<MonitorOptions> options, + IHostEnvironment environment, + ILogger<HeadlessStore> logger) + { + _options = options.Value; + _environment = environment; + _logger = logger; + } + + public string DatabasePath => ResolvePath(_options.StoragePath); + public string ArchiveDirectory => ResolvePath(_options.ArchiveDirectory); + + public async Task InitializeAsync(CancellationToken cancellationToken) + { + Directory.CreateDirectory(Path.GetDirectoryName(DatabasePath)!); + Directory.CreateDirectory(ArchiveDirectory); + + await using var connection = CreateConnection(); + await connection.OpenAsync(cancellationToken); + + foreach (var sql in SchemaStatements) + { + await using var command = connection.CreateCommand(); + command.CommandText = sql; + await command.ExecuteNonQueryAsync(cancellationToken); + } + } + + public DuckDBConnection CreateConnection() + => new($"Data Source={DatabasePath}"); + + public async Task UpsertConfiguredServersAsync(IEnumerable<MonitoredServerOptions> servers, CancellationToken cancellationToken) + { + await _writeLock.WaitAsync(cancellationToken); + try + { + await using var connection = 
CreateConnection(); + await connection.OpenAsync(cancellationToken); + + foreach (var server in servers) + { + await using var insert = connection.CreateCommand(); + insert.CommandText = @" +INSERT INTO servers (server_id, server_name, display_name, is_enabled, last_status) +VALUES ($1, $2, $3, $4, 'UNKNOWN') +ON CONFLICT(server_id) DO UPDATE +SET server_name = excluded.server_name, + display_name = excluded.display_name, + is_enabled = excluded.is_enabled"; + insert.Parameters.Add(new DuckDBParameter { Value = server.Id }); + insert.Parameters.Add(new DuckDBParameter { Value = server.ServerNameForStorage }); + insert.Parameters.Add(new DuckDBParameter { Value = server.DisplayName }); + insert.Parameters.Add(new DuckDBParameter { Value = server.Enabled }); + await insert.ExecuteNonQueryAsync(cancellationToken); + } + } + finally + { + _writeLock.Release(); + } + } + + public async Task SetServerStatusAsync( + MonitoredServerOptions server, + string status, + string? errorMessage, + ServerPropertiesSnapshot? properties, + CancellationToken cancellationToken) + { + await _writeLock.WaitAsync(cancellationToken); + try + { + await using var connection = CreateConnection(); + await connection.OpenAsync(cancellationToken); + await using var command = connection.CreateCommand(); + command.CommandText = @" +UPDATE servers +SET last_seen_time = $2, + last_status = $3, + last_error = $4, + product_version = COALESCE($5, product_version), + edition = COALESCE($6, edition), + sql_engine_edition = COALESCE($7, sql_engine_edition), + sql_major_version = COALESCE($8, sql_major_version) +WHERE server_id = $1"; + command.Parameters.Add(new DuckDBParameter { Value = server.Id }); + command.Parameters.Add(new DuckDBParameter { Value = DateTime.UtcNow }); + command.Parameters.Add(new DuckDBParameter { Value = status }); + command.Parameters.Add(new DuckDBParameter { Value = errorMessage ?? 
(object)DBNull.Value }); + command.Parameters.Add(new DuckDBParameter { Value = properties?.ProductVersion ?? (object)DBNull.Value }); + command.Parameters.Add(new DuckDBParameter { Value = properties?.Edition ?? (object)DBNull.Value }); + command.Parameters.Add(new DuckDBParameter { Value = properties?.EngineEdition ?? (object)DBNull.Value }); + command.Parameters.Add(new DuckDBParameter { Value = properties?.SqlMajorVersion ?? (object)DBNull.Value }); + await command.ExecuteNonQueryAsync(cancellationToken); + } + finally + { + _writeLock.Release(); + } + } + + public async Task InsertServerPropertiesAsync( + MonitoredServerOptions server, + DateTime collectionTime, + ServerPropertiesSnapshot properties, + CancellationToken cancellationToken) + { + await _writeLock.WaitAsync(cancellationToken); + try + { + await using var connection = CreateConnection(); + await connection.OpenAsync(cancellationToken); + using var appender = connection.CreateAppender("server_properties"); + var row = appender.CreateRow() + .AppendValue(NextId()) + .AppendValue(collectionTime) + .AppendValue(server.Id) + .AppendValue(server.ServerNameForStorage) + .AppendValue(properties.MachineName); + + if (properties.InstanceName is null) + { + row.AppendNullValue(); + } + else + { + row.AppendValue(properties.InstanceName); + } + + row + .AppendValue(properties.ProductVersion) + .AppendValue(properties.ProductLevel) + .AppendValue(properties.Edition) + .AppendValue(properties.EngineEdition) + .AppendValue(properties.SqlMajorVersion) + .AppendValue(properties.CpuCount) + .AppendValue(properties.PhysicalMemoryMb) + .AppendValue(properties.SqlServerStartTime) + .EndRow(); + } + finally + { + _writeLock.Release(); + } + } + + public async Task InsertWaitStatsAsync( + MonitoredServerOptions server, + DateTime collectionTime, + IReadOnlyList rows, + CancellationToken cancellationToken) + { + if (rows.Count == 0) + { + return; + } + + await _writeLock.WaitAsync(cancellationToken); + try + { + await 
using var connection = CreateConnection(); + await connection.OpenAsync(cancellationToken); + using var appender = connection.CreateAppender("wait_stats"); + foreach (var row in rows) + { + appender.CreateRow() + .AppendValue(NextId()) + .AppendValue(collectionTime) + .AppendValue(server.Id) + .AppendValue(server.ServerNameForStorage) + .AppendValue(row.WaitType) + .AppendValue(row.WaitingTasksCount) + .AppendValue(row.WaitTimeMs) + .AppendValue(row.SignalWaitTimeMs) + .EndRow(); + } + } + finally + { + _writeLock.Release(); + } + } + + public async Task<DateTime?> GetLastCpuSampleTimeAsync(string serverId, CancellationToken cancellationToken) + { + await using var connection = CreateConnection(); + await connection.OpenAsync(cancellationToken); + await using var command = connection.CreateCommand(); + command.CommandText = "SELECT MAX(sample_time) FROM cpu_utilization_stats WHERE server_id = $1"; + command.Parameters.Add(new DuckDBParameter { Value = serverId }); + var result = await command.ExecuteScalarAsync(cancellationToken); + return result is DateTime dateTime ? 
dateTime : null; + } + + public async Task InsertCpuSamplesAsync( + MonitoredServerOptions server, + DateTime collectionTime, + IReadOnlyList rows, + CancellationToken cancellationToken) + { + if (rows.Count == 0) + { + return; + } + + await _writeLock.WaitAsync(cancellationToken); + try + { + await using var connection = CreateConnection(); + await connection.OpenAsync(cancellationToken); + using var appender = connection.CreateAppender("cpu_utilization_stats"); + foreach (var row in rows) + { + appender.CreateRow() + .AppendValue(NextId()) + .AppendValue(collectionTime) + .AppendValue(server.Id) + .AppendValue(server.ServerNameForStorage) + .AppendValue(row.SampleTime) + .AppendValue(row.SqlServerCpuUtilization) + .AppendValue(row.OtherProcessCpuUtilization) + .EndRow(); + } + } + finally + { + _writeLock.Release(); + } + } + + public async Task InsertCollectionLogAsync( + MonitoredServerOptions server, + string collectorName, + DateTime collectionTime, + int durationMs, + string status, + string? 
errorMessage, + int rowsCollected, + long sqlDurationMs, + long storageDurationMs, + CancellationToken cancellationToken) + { + await _writeLock.WaitAsync(cancellationToken); + try + { + await using var connection = CreateConnection(); + await connection.OpenAsync(cancellationToken); + await using var command = connection.CreateCommand(); + command.CommandText = @" +INSERT INTO collection_log + (log_id, server_id, server_name, collector_name, collection_time, duration_ms, status, error_message, rows_collected, sql_duration_ms, storage_duration_ms) +VALUES + ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11)"; + command.Parameters.Add(new DuckDBParameter { Value = NextId() }); + command.Parameters.Add(new DuckDBParameter { Value = server.Id }); + command.Parameters.Add(new DuckDBParameter { Value = server.ServerNameForStorage }); + command.Parameters.Add(new DuckDBParameter { Value = collectorName }); + command.Parameters.Add(new DuckDBParameter { Value = collectionTime }); + command.Parameters.Add(new DuckDBParameter { Value = durationMs }); + command.Parameters.Add(new DuckDBParameter { Value = status }); + command.Parameters.Add(new DuckDBParameter { Value = errorMessage ?? 
(object)DBNull.Value }); + command.Parameters.Add(new DuckDBParameter { Value = rowsCollected }); + command.Parameters.Add(new DuckDBParameter { Value = (int)sqlDurationMs }); + command.Parameters.Add(new DuckDBParameter { Value = (int)storageDurationMs }); + await command.ExecuteNonQueryAsync(cancellationToken); + } + finally + { + _writeLock.Release(); + } + } + + public async Task<List<ServerHealthDto>> GetServersAsync(CancellationToken cancellationToken) + { + var servers = new List<ServerHealthDto>(); + await using var connection = CreateConnection(); + await connection.OpenAsync(cancellationToken); + await using var command = connection.CreateCommand(); + command.CommandText = @" +SELECT + s.server_id, + s.display_name, + s.is_enabled, + s.last_seen_time, + s.last_status, + s.last_error, + s.product_version, + s.edition, + s.sql_major_version, + ( + SELECT COUNT(*) + FROM collection_log AS cl + WHERE cl.server_id = s.server_id + AND cl.collection_time >= $1 + AND cl.status IN ('ERROR', 'PERMISSIONS') + ) AS active_alert_count, + ( + SELECT cl.error_message + FROM collection_log AS cl + WHERE cl.server_id = s.server_id + AND cl.collection_time >= $1 + AND cl.status IN ('ERROR', 'PERMISSIONS') + ORDER BY cl.collection_time DESC + LIMIT 1 + ) AS recent_alert +FROM servers AS s +ORDER BY s.is_enabled DESC, s.display_name"; + command.Parameters.Add(new DuckDBParameter { Value = DateTime.UtcNow.AddMinutes(-15) }); + await using var reader = await command.ExecuteReaderAsync(cancellationToken); + while (await reader.ReadAsync(cancellationToken)) + { + var serverId = reader.GetString(0); + var displayName = reader.IsDBNull(1) ? serverId : reader.GetString(1); + var isEnabled = reader.GetBoolean(2); + var lastSeenTime = reader.IsDBNull(3) ? (DateTime?)null : reader.GetDateTime(3); + var lastStatus = reader.IsDBNull(4) ? "UNKNOWN" : reader.GetString(4); + var lastError = reader.IsDBNull(5) ? null : reader.GetString(5); + var productVersion = reader.IsDBNull(6) ? 
null : reader.GetString(6); + var edition = reader.IsDBNull(7) ? null : reader.GetString(7); + var sqlMajorVersion = reader.IsDBNull(8) ? (int?)null : reader.GetInt32(8); + var activeAlertCount = reader.IsDBNull(9) ? 0 : Convert.ToInt32(reader.GetInt64(9)); + var recentAlert = reader.IsDBNull(10) ? null : reader.GetString(10); + var (healthState, healthReason) = ComputeHealth(isEnabled, lastSeenTime, lastStatus, lastError, activeAlertCount, recentAlert); + + servers.Add(new ServerHealthDto( + serverId, + displayName, + isEnabled, + lastSeenTime, + lastStatus, + recentAlert ?? lastError, + productVersion, + edition, + sqlMajorVersion, + healthState, + healthReason, + activeAlertCount)); + } + + return servers; + } + + public async Task<EstateSummaryDto> GetEstateSummaryAsync(CancellationToken cancellationToken) + { + var servers = await GetServersAsync(cancellationToken); + return new EstateSummaryDto( + servers.Count, + servers.Count(s => string.Equals(s.HealthState, "green", StringComparison.OrdinalIgnoreCase)), + servers.Count(s => string.Equals(s.HealthState, "yellow", StringComparison.OrdinalIgnoreCase)), + servers.Count(s => string.Equals(s.HealthState, "red", StringComparison.OrdinalIgnoreCase)), + servers.Count(s => s.IsEnabled && string.Equals(s.LastStatus, "ERROR", StringComparison.OrdinalIgnoreCase)), + servers.Count(s => !s.IsEnabled), + DateTime.UtcNow, + servers); + } + + private (string HealthState, string HealthReason) ComputeHealth( + bool isEnabled, + DateTime? lastSeenTime, + string lastStatus, + string? lastError, + int activeAlertCount, + string? recentAlert) + { + if (!isEnabled) + { + return ("disabled", "Monitoring disabled"); + } + + if (string.Equals(lastStatus, "ERROR", StringComparison.OrdinalIgnoreCase)) + { + return ("red", lastError ?? "Connection failed"); + } + + if (activeAlertCount > 0) + { + return ("red", recentAlert ?? 
$"{activeAlertCount} collector alert(s) in the last 15 minutes"); + } + + if (!lastSeenTime.HasValue) + { + return ("yellow", "No successful collection yet"); + } + + var staleAfter = TimeSpan.FromSeconds(Math.Max(180, _options.CollectionIntervalSeconds * 3)); + if (DateTime.UtcNow - lastSeenTime.Value > staleAfter) + { + return ("yellow", $"No server contact for {DateTime.UtcNow - lastSeenTime.Value:g}"); + } + + return ("green", "All good"); + } + + public async Task<List<CollectionLogDto>> GetCollectionLogAsync(int limit, CancellationToken cancellationToken) + { + var logs = new List<CollectionLogDto>(); + await using var connection = CreateConnection(); + await connection.OpenAsync(cancellationToken); + await using var command = connection.CreateCommand(); + command.CommandText = @" +SELECT collection_time, server_id, server_name, collector_name, status, rows_collected, duration_ms, error_message +FROM collection_log +ORDER BY collection_time DESC +LIMIT $1"; + command.Parameters.Add(new DuckDBParameter { Value = Math.Clamp(limit, 1, 1000) }); + await using var reader = await command.ExecuteReaderAsync(cancellationToken); + while (await reader.ReadAsync(cancellationToken)) + { + logs.Add(new CollectionLogDto( + reader.GetDateTime(0), + reader.GetString(1), + reader.GetString(2), + reader.GetString(3), + reader.GetString(4), + reader.IsDBNull(5) ? 0 : reader.GetInt32(5), + reader.IsDBNull(6) ? 0 : reader.GetInt32(6), + reader.IsDBNull(7) ? 
null : reader.GetString(7))); + } + + return logs; + } + + public async Task<List<TopWaitDto>> GetTopWaitsAsync(string serverId, int hoursBack, int limit, CancellationToken cancellationToken) + { + var waits = new List<TopWaitDto>(); + await using var connection = CreateConnection(); + await connection.OpenAsync(cancellationToken); + await using var command = connection.CreateCommand(); + command.CommandText = @" +WITH wait_window AS +( + SELECT + wait_type, + MAX(wait_time_ms) - MIN(wait_time_ms) AS wait_time_delta_ms, + MAX(signal_wait_time_ms) - MIN(signal_wait_time_ms) AS signal_wait_time_delta_ms, + MAX(waiting_tasks_count) - MIN(waiting_tasks_count) AS waiting_tasks_delta + FROM wait_stats + WHERE server_id = $1 + AND collection_time >= $2 + GROUP BY wait_type +) +SELECT wait_type, wait_time_delta_ms, signal_wait_time_delta_ms, waiting_tasks_delta +FROM wait_window +WHERE wait_time_delta_ms > 0 +ORDER BY wait_time_delta_ms DESC +LIMIT $3"; + command.Parameters.Add(new DuckDBParameter { Value = serverId }); + command.Parameters.Add(new DuckDBParameter { Value = DateTime.UtcNow.AddHours(-Math.Clamp(hoursBack, 1, 720)) }); + command.Parameters.Add(new DuckDBParameter { Value = Math.Clamp(limit, 1, 100) }); + await using var reader = await command.ExecuteReaderAsync(cancellationToken); + while (await reader.ReadAsync(cancellationToken)) + { + waits.Add(new TopWaitDto( + reader.GetString(0), + reader.GetInt64(1), + reader.GetInt64(2), + reader.GetInt64(3))); + } + + return waits; + } + + public async Task<List<CpuSampleDto>> GetCpuSamplesAsync(string serverId, int hoursBack, CancellationToken cancellationToken) + { + var samples = new List<CpuSampleDto>(); + await using var connection = CreateConnection(); + await connection.OpenAsync(cancellationToken); + await using var command = connection.CreateCommand(); + command.CommandText = @" +SELECT sample_time, sqlserver_cpu_utilization, other_process_cpu_utilization +FROM cpu_utilization_stats +WHERE server_id = $1 +AND sample_time >= $2 +ORDER BY sample_time"; + 
command.Parameters.Add(new DuckDBParameter { Value = serverId }); + command.Parameters.Add(new DuckDBParameter { Value = DateTime.UtcNow.AddHours(-Math.Clamp(hoursBack, 1, 720)) }); + await using var reader = await command.ExecuteReaderAsync(cancellationToken); + while (await reader.ReadAsync(cancellationToken)) + { + samples.Add(new CpuSampleDto( + reader.GetDateTime(0), + reader.IsDBNull(1) ? 0 : reader.GetInt32(1), + reader.IsDBNull(2) ? 0 : reader.GetInt32(2))); + } + + return samples; + } + + public async Task ArchiveOldDataAsync(CancellationToken cancellationToken) + { + if (_options.HotDataDays <= 0) + { + return; + } + + var cutoff = DateTime.UtcNow.AddDays(-_options.HotDataDays); + var tables = new[] { "wait_stats", "cpu_utilization_stats", "server_properties", "collection_log" }; + + await _writeLock.WaitAsync(cancellationToken); + try + { + await using var connection = CreateConnection(); + await connection.OpenAsync(cancellationToken); + + foreach (var table in tables) + { + var timeColumn = table == "cpu_utilization_stats" ? 
"sample_time" : "collection_time"; + await using var countCommand = connection.CreateCommand(); + countCommand.CommandText = $"SELECT COUNT(*) FROM {table} WHERE {timeColumn} < $1"; + countCommand.Parameters.Add(new DuckDBParameter { Value = cutoff }); + var count = Convert.ToInt64(await countCommand.ExecuteScalarAsync(cancellationToken)); + if (count == 0) + { + continue; + } + + var tableArchiveDirectory = Path.Combine(ArchiveDirectory, table); + Directory.CreateDirectory(tableArchiveDirectory); + var archiveFile = Path.Combine(tableArchiveDirectory, $"{table}_{DateTime.UtcNow:yyyyMMddTHHmmss}.parquet"); + var archiveFileSql = archiveFile.Replace("\\", "/").Replace("'", "''"); + var cutoffSql = cutoff.ToString("yyyy-MM-dd HH:mm:ss.fffffff"); + + await using var copyCommand = connection.CreateCommand(); + copyCommand.CommandText = $@" +COPY +( + SELECT * + FROM {table} + WHERE {timeColumn} < TIMESTAMP '{cutoffSql}' +) +TO '{archiveFileSql}' +(FORMAT PARQUET)"; + await copyCommand.ExecuteNonQueryAsync(cancellationToken); + + await using var deleteCommand = connection.CreateCommand(); + deleteCommand.CommandText = $"DELETE FROM {table} WHERE {timeColumn} < $1"; + deleteCommand.Parameters.Add(new DuckDBParameter { Value = cutoff }); + await deleteCommand.ExecuteNonQueryAsync(cancellationToken); + + _logger.LogInformation("Archived {RowCount} rows from {Table} to {File}", count, table, archiveFile); + } + } + finally + { + _writeLock.Release(); + } + } + + private string ResolvePath(string configuredPath) + { + var expanded = Environment.ExpandEnvironmentVariables(configuredPath); + if (!Path.IsPathRooted(expanded)) + { + expanded = Path.Combine(_environment.ContentRootPath, expanded); + } + + return Path.GetFullPath(expanded); + } + + private static long NextId() => Interlocked.Increment(ref s_idCounter); + + private static readonly string[] SchemaStatements = + [ + """ + CREATE TABLE IF NOT EXISTS servers ( + server_id VARCHAR PRIMARY KEY, + server_name VARCHAR NOT 
NULL, + display_name VARCHAR, + is_enabled BOOLEAN NOT NULL DEFAULT TRUE, + last_seen_time TIMESTAMP, + last_status VARCHAR NOT NULL DEFAULT 'UNKNOWN', + last_error VARCHAR, + product_version VARCHAR, + edition VARCHAR, + sql_engine_edition INTEGER, + sql_major_version INTEGER, + created_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP + ) + """, + """ + CREATE TABLE IF NOT EXISTS collection_log ( + log_id BIGINT PRIMARY KEY, + server_id VARCHAR NOT NULL, + server_name VARCHAR NOT NULL, + collector_name VARCHAR NOT NULL, + collection_time TIMESTAMP NOT NULL, + duration_ms INTEGER, + status VARCHAR NOT NULL, + error_message VARCHAR, + rows_collected INTEGER, + sql_duration_ms INTEGER, + storage_duration_ms INTEGER + ) + """, + """ + CREATE TABLE IF NOT EXISTS server_properties ( + collection_id BIGINT PRIMARY KEY, + collection_time TIMESTAMP NOT NULL, + server_id VARCHAR NOT NULL, + server_name VARCHAR NOT NULL, + machine_name VARCHAR, + instance_name VARCHAR, + product_version VARCHAR, + product_level VARCHAR, + edition VARCHAR, + engine_edition INTEGER, + sql_major_version INTEGER, + cpu_count INTEGER, + physical_memory_mb BIGINT, + sqlserver_start_time TIMESTAMP + ) + """, + """ + CREATE TABLE IF NOT EXISTS wait_stats ( + collection_id BIGINT PRIMARY KEY, + collection_time TIMESTAMP NOT NULL, + server_id VARCHAR NOT NULL, + server_name VARCHAR NOT NULL, + wait_type VARCHAR NOT NULL, + waiting_tasks_count BIGINT, + wait_time_ms BIGINT, + signal_wait_time_ms BIGINT + ) + """, + """ + CREATE TABLE IF NOT EXISTS cpu_utilization_stats ( + collection_id BIGINT PRIMARY KEY, + collection_time TIMESTAMP NOT NULL, + server_id VARCHAR NOT NULL, + server_name VARCHAR NOT NULL, + sample_time TIMESTAMP NOT NULL, + sqlserver_cpu_utilization INTEGER, + other_process_cpu_utilization INTEGER + ) + """, + "CREATE INDEX IF NOT EXISTS idx_servers_status ON servers(is_enabled, last_status)", + "CREATE INDEX IF NOT EXISTS idx_collection_log_time ON collection_log(collection_time)", + "CREATE 
INDEX IF NOT EXISTS idx_wait_stats_time ON wait_stats(server_id, collection_time)", + "CREATE INDEX IF NOT EXISTS idx_cpu_time ON cpu_utilization_stats(server_id, sample_time)", + "CREATE INDEX IF NOT EXISTS idx_server_properties_time ON server_properties(server_id, collection_time)" + ]; +} diff --git a/Headless/appsettings.example.json b/Headless/appsettings.example.json new file mode 100644 index 0000000..9d3ee73 --- /dev/null +++ b/Headless/appsettings.example.json @@ -0,0 +1,37 @@ +{ + "Urls": "http://localhost:5155", + "Monitor": { + "StoragePath": "data\\headless\\performance-monitor.duckdb", + "ArchiveDirectory": "data\\headless\\parquet", + "CollectionIntervalSeconds": 60, + "MaxConcurrentServers": 8, + "CommandTimeoutSeconds": 30, + "ArchiveIntervalMinutes": 60, + "HotDataDays": 7, + "Collectors": [ + { + "Name": "server_properties", + "Enabled": true, + "FrequencySeconds": 3600 + }, + { + "Name": "wait_stats", + "Enabled": true, + "FrequencySeconds": 60 + }, + { + "Name": "cpu_utilization", + "Enabled": true, + "FrequencySeconds": 60 + } + ], + "Servers": [ + { + "Id": "sample-dev", + "DisplayName": "Sample Dev SQL", + "ConnectionStringEnvironmentVariable": "PM_SAMPLE_DEV_CONNECTION", + "Enabled": false + } + ] + } +} diff --git a/Headless/wwwroot/app.js b/Headless/wwwroot/app.js new file mode 100644 index 0000000..7566487 --- /dev/null +++ b/Headless/wwwroot/app.js @@ -0,0 +1,313 @@ +const state = { + selectedServerId: null, + servers: [], + healthSnapshot: JSON.parse(localStorage.getItem("pm-headless-health") || "{}"), + loadedOnce: false +}; + +const els = { + green: document.getElementById("metric-green"), + yellow: document.getElementById("metric-yellow"), + red: document.getElementById("metric-red"), + disabled: document.getElementById("metric-disabled"), + generatedAt: document.getElementById("generated-at"), + serverCount: document.getElementById("server-count"), + storagePaths: document.getElementById("storage-paths"), + serverCardGrid: 
document.getElementById("server-card-grid"), + serverRows: document.getElementById("server-rows"), + collectorLog: document.getElementById("collector-log"), + selectedTitle: document.getElementById("selected-server-title"), + selectedSubtitle: document.getElementById("selected-server-subtitle"), + waitList: document.getElementById("wait-list"), + cpuCanvas: document.getElementById("cpu-chart"), + refresh: document.getElementById("refresh-button"), + notify: document.getElementById("notify-button"), + toastRegion: document.getElementById("toast-region") +}; + +els.refresh.addEventListener("click", () => loadAll()); +els.notify.addEventListener("click", async () => { + if (!("Notification" in window)) { + showToast("Browser notifications are not supported here.", "yellow"); + return; + } + + const permission = await Notification.requestPermission(); + updateNotifyButton(); + showToast(permission === "granted" ? "Browser notifications enabled." : "Browser notifications not enabled.", permission === "granted" ? 
"green" : "yellow"); +}); + +async function fetchJson(url) { + const response = await fetch(url, { headers: { "Accept": "application/json" } }); + if (!response.ok) throw new Error(`${response.status} ${response.statusText}`); + return response.json(); +} + +async function loadAll() { + const [summary, storage, logs] = await Promise.all([ + fetchJson("/api/summary"), + fetchJson("/api/storage"), + fetchJson("/api/collection-log?limit=50") + ]); + + state.servers = summary.servers || []; + if (!state.selectedServerId && state.servers.length > 0) { + const firstOnline = state.servers.find(s => s.isEnabled) || state.servers[0]; + state.selectedServerId = firstOnline.serverId; + } + + renderSummary(summary); + renderStorage(storage); + renderServerCards(state.servers); + renderServers(state.servers); + renderLog(logs); + handleHealthNotifications(state.servers); + state.loadedOnce = true; + await loadSelectedServer(); + updateNotifyButton(); +} + +function renderSummary(summary) { + els.green.textContent = summary.greenCount; + els.yellow.textContent = summary.yellowCount; + els.red.textContent = summary.redCount; + els.disabled.textContent = summary.disabledCount; + els.generatedAt.textContent = `Updated ${formatDate(summary.generatedAt)}`; + els.serverCount.textContent = `${summary.serverCount} configured`; +} + +function renderStorage(storage) { + els.storagePaths.textContent = `DuckDB ${storage.duckdb} | Parquet ${storage.parquet}`; +} + +function renderServerCards(servers) { + els.serverCardGrid.innerHTML = ""; + for (const server of servers) { + const card = document.createElement("button"); + card.type = "button"; + card.className = `server-card health-${server.healthState || "yellow"} ${server.serverId === state.selectedServerId ? 
"selected" : ""}`; + card.addEventListener("click", async () => { + state.selectedServerId = server.serverId; + renderServerCards(state.servers); + renderServers(state.servers); + await loadSelectedServer(); + }); + + card.innerHTML = ` + ${escapeHtml((server.healthState || "yellow").toUpperCase())} + ${escapeHtml(server.displayName || server.serverId)} + ${escapeHtml(server.healthReason || "No status yet")} + ${server.activeAlertCount ? `${server.activeAlertCount} active alert(s)` : formatDate(server.lastSeenTime) || "No contact yet"} + `; + els.serverCardGrid.appendChild(card); + } +} + +function renderServers(servers) { + els.serverRows.innerHTML = ""; + for (const server of servers) { + const tr = document.createElement("tr"); + tr.className = server.serverId === state.selectedServerId ? "selected" : ""; + tr.addEventListener("click", async () => { + state.selectedServerId = server.serverId; + renderServers(state.servers); + await loadSelectedServer(); + }); + + const statusClass = server.healthState || "yellow"; + + tr.innerHTML = ` + ${escapeHtml((server.healthState || "yellow").toUpperCase())} + ${escapeHtml(server.displayName || server.serverId)} + ${escapeHtml(server.edition || "")} + ${escapeHtml(server.productVersion || "")} + ${formatDate(server.lastSeenTime)} + ${escapeHtml(server.healthReason || server.lastError || "")} + `; + + els.serverRows.appendChild(tr); + } +} + +function handleHealthNotifications(servers) { + const nextSnapshot = {}; + for (const server of servers) { + const health = server.healthState || "yellow"; + nextSnapshot[server.serverId] = health; + + const isAlert = health === "red" || health === "yellow"; + const prior = state.healthSnapshot[server.serverId]; + const isNewOrChanged = prior !== health; + const shouldNotifyInitial = !state.loadedOnce && health === "red"; + + if (isAlert && (isNewOrChanged || shouldNotifyInitial)) { + const title = `${server.displayName || server.serverId} is ${health.toUpperCase()}`; + const body = 
server.healthReason || "Server needs attention"; + showToast(`${title}: ${body}`, health); + if ("Notification" in window && Notification.permission === "granted") { + new Notification(title, { body }); + } + } + } + + state.healthSnapshot = nextSnapshot; + localStorage.setItem("pm-headless-health", JSON.stringify(nextSnapshot)); +} + +function showToast(message, health) { + const toast = document.createElement("div"); + toast.className = `toast ${health}`; + toast.textContent = message; + els.toastRegion.appendChild(toast); + window.setTimeout(() => toast.remove(), 9000); +} + +function updateNotifyButton() { + if (!("Notification" in window)) { + els.notify.textContent = "Notifications Unavailable"; + els.notify.disabled = true; + return; + } + + els.notify.textContent = Notification.permission === "granted" + ? "Notifications Enabled" + : "Enable Notifications"; +} + +async function loadSelectedServer() { + const server = state.servers.find(s => s.serverId === state.selectedServerId); + if (!server) { + els.selectedTitle.textContent = "Server Detail"; + els.selectedSubtitle.textContent = "Select a server row"; + els.waitList.innerHTML = ""; + drawCpuChart([]); + return; + } + + els.selectedTitle.textContent = server.displayName || server.serverId; + els.selectedSubtitle.textContent = server.edition || server.productVersion || server.serverId; + + const [waits, cpu] = await Promise.all([ + fetchJson(`/api/servers/${encodeURIComponent(server.serverId)}/waits?hours=1&limit=12`), + fetchJson(`/api/servers/${encodeURIComponent(server.serverId)}/cpu?hours=1`) + ]); + + renderWaits(waits); + drawCpuChart(cpu); +} + +function renderWaits(waits) { + els.waitList.innerHTML = ""; + if (!waits.length) { + els.waitList.innerHTML = `
No wait deltas yet
`; + return; + } + + const max = Math.max(...waits.map(w => w.waitTimeDeltaMs), 1); + for (const wait of waits) { + const row = document.createElement("div"); + row.className = "wait-row"; + const width = Math.max(2, Math.round(wait.waitTimeDeltaMs / max * 100)); + row.innerHTML = ` + ${escapeHtml(wait.waitType)} + + ${formatMs(wait.waitTimeDeltaMs)} + `; + els.waitList.appendChild(row); + } +} + +function renderLog(logs) { + els.collectorLog.innerHTML = ""; + for (const log of logs) { + const row = document.createElement("div"); + const statusClass = (log.status || "").toLowerCase(); + row.className = "log-row"; + row.title = log.errorMessage || ""; + row.innerHTML = ` + ${formatTime(log.collectionTime)} + ${escapeHtml(log.serverName)} / ${escapeHtml(log.collectorName)} + ${escapeHtml(log.status)} + ${log.rowsCollected ?? 0} + `; + els.collectorLog.appendChild(row); + } +} + +function drawCpuChart(samples) { + const canvas = els.cpuCanvas; + const rect = canvas.getBoundingClientRect(); + const ratio = window.devicePixelRatio || 1; + canvas.width = Math.max(1, Math.round(rect.width * ratio)); + canvas.height = Math.max(1, Math.round(rect.height * ratio)); + + const ctx = canvas.getContext("2d"); + ctx.scale(ratio, ratio); + ctx.clearRect(0, 0, rect.width, rect.height); + + ctx.strokeStyle = "#d9e1ea"; + ctx.lineWidth = 1; + for (let i = 0; i <= 4; i++) { + const y = 12 + (rect.height - 24) * i / 4; + ctx.beginPath(); + ctx.moveTo(0, y); + ctx.lineTo(rect.width, y); + ctx.stroke(); + } + + if (!samples.length) { + ctx.fillStyle = "#64748b"; + ctx.font = "13px Segoe UI, sans-serif"; + ctx.fillText("No CPU samples yet", 12, 28); + return; + } + + const points = samples.map((sample, index) => ({ + x: samples.length === 1 ? 
rect.width - 10 : 10 + index * (rect.width - 20) / (samples.length - 1), + y: 10 + (100 - Math.min(100, Math.max(0, sample.sqlServerCpuUtilization))) * (rect.height - 20) / 100 + })); + + ctx.strokeStyle = "#2563eb"; + ctx.lineWidth = 2; + ctx.beginPath(); + points.forEach((point, index) => { + if (index === 0) ctx.moveTo(point.x, point.y); + else ctx.lineTo(point.x, point.y); + }); + ctx.stroke(); + + const latest = samples[samples.length - 1]; + ctx.fillStyle = "#111827"; + ctx.font = "700 13px Segoe UI, sans-serif"; + ctx.fillText(`${latest.sqlServerCpuUtilization}% SQL CPU`, 12, 22); +} + +function formatDate(value) { + if (!value) return ""; + return new Date(value).toLocaleString(); +} + +function formatTime(value) { + if (!value) return ""; + return new Date(value).toLocaleTimeString(); +} + +function formatMs(ms) { + if (ms > 3600000) return `${(ms / 3600000).toFixed(1)}h`; + if (ms > 60000) return `${(ms / 60000).toFixed(1)}m`; + return `${(ms / 1000).toFixed(1)}s`; +} + +function escapeHtml(value) { + return String(value) + .replaceAll("&", "&amp;") + .replaceAll("<", "&lt;") + .replaceAll(">", "&gt;") + .replaceAll('"', "&quot;") + .replaceAll("'", "&#039;"); +} + +window.addEventListener("resize", () => loadSelectedServer()); +loadAll().catch(error => { + els.generatedAt.textContent = error.message; +}); diff --git a/Headless/wwwroot/index.html b/Headless/wwwroot/index.html new file mode 100644 index 0000000..78e246a --- /dev/null +++ b/Headless/wwwroot/index.html @@ -0,0 +1,99 @@ + + + + + + Performance Monitor Estate + + + +
+ Performance Monitor Estate
+ Green 0
+ Yellow 0
+ Red 0
+ Disabled 0
+ Estate Traffic Lights
+ Server Inventory
+ Status | Server | Edition | Version | Last Seen | Last Error
+ Server Detail
+ Select a server row
+ CPU
+ Collector Log
+ Latest 50
+ + + diff --git a/Headless/wwwroot/styles.css b/Headless/wwwroot/styles.css new file mode 100644 index 0000000..33649a6 --- /dev/null +++ b/Headless/wwwroot/styles.css @@ -0,0 +1,450 @@ +:root { + color-scheme: light; + --bg: #f6f8fb; + --panel: #ffffff; + --panel-2: #f9fbfd; + --text: #111827; + --muted: #64748b; + --border: #d9e1ea; + --online: #169b62; + --warning: #c2410c; + --offline: #9f1239; + --accent: #2563eb; + --shadow: 0 12px 30px rgba(15, 23, 42, .08); +} + +* { + box-sizing: border-box; +} + +body { + margin: 0; + background: var(--bg); + color: var(--text); + font-family: "Segoe UI", system-ui, -apple-system, BlinkMacSystemFont, sans-serif; +} + +.app-shell { + width: min(1500px, calc(100vw - 36px)); + margin: 0 auto; + padding: 22px 0 36px; +} + +.topbar { + display: flex; + align-items: flex-start; + justify-content: space-between; + gap: 18px; + padding: 8px 0 18px; +} + +h1, h2, p { + margin: 0; +} + +h1 { + font-size: 27px; + line-height: 1.15; + font-weight: 700; + letter-spacing: 0; +} + +h2 { + font-size: 16px; + line-height: 1.2; + font-weight: 700; + letter-spacing: 0; +} + +#storage-paths, +.panel-header span, +td, +.log-row, +.wait-row, +.metric span { + font-size: 13px; +} + +#storage-paths { + color: var(--muted); + margin-top: 7px; +} + +button { + border: 1px solid var(--border); + background: var(--panel); + color: var(--text); + border-radius: 6px; + min-height: 34px; + padding: 0 14px; + font-size: 13px; + font-weight: 600; + cursor: pointer; + box-shadow: 0 2px 8px rgba(15, 23, 42, .06); +} + +.header-actions { + display: flex; + gap: 8px; + align-items: center; +} + +button:hover { + border-color: #b9c5d4; +} + +.status-grid { + display: grid; + grid-template-columns: repeat(4, minmax(0, 1fr)); + gap: 12px; + margin-bottom: 14px; +} + +.metric { + background: var(--panel); + border: 1px solid var(--border); + border-radius: 8px; + padding: 14px; + box-shadow: var(--shadow); +} + +.metric span { + color: var(--muted); + display: 
block; + margin-bottom: 8px; +} + +.metric strong { + display: block; + font-size: 32px; + line-height: 1; +} + +.metric.green strong { + color: var(--online); +} + +.metric.yellow strong { + color: #b7791f; +} + +.metric.red strong { + color: var(--warning); +} + +.metric.muted strong { + color: var(--muted); +} + +.panel { + background: var(--panel); + border: 1px solid var(--border); + border-radius: 8px; + box-shadow: var(--shadow); + overflow: hidden; +} + +.panel-header { + min-height: 48px; + padding: 14px 16px; + display: flex; + align-items: center; + justify-content: space-between; + gap: 14px; + border-bottom: 1px solid var(--border); +} + +.panel-header span { + color: var(--muted); +} + +.overview-panel { + margin-bottom: 14px; +} + +.server-card-grid { + display: grid; + grid-template-columns: repeat(auto-fill, minmax(220px, 1fr)); + gap: 12px; + padding: 14px; +} + +.server-card { + appearance: none; + text-align: left; + border-radius: 8px; + border: 1px solid var(--border); + min-height: 124px; + padding: 13px; + display: grid; + gap: 8px; + box-shadow: none; + background: var(--panel-2); +} + +.server-card:hover, +.server-card.selected { + border-color: #94a3b8; + box-shadow: 0 0 0 3px rgba(37, 99, 235, .10); +} + +.server-card strong { + font-size: 15px; + line-height: 1.25; +} + +.server-card span, +.server-card small { + font-size: 12px; + line-height: 1.25; +} + +.server-card small { + color: rgba(17, 24, 39, .72); +} + +.card-status { + font-weight: 800; +} + +.server-card.health-green { + background: #eaf8f1; + border-color: #a8dec2; +} + +.server-card.health-yellow { + background: #fff7df; + border-color: #f3d27a; +} + +.server-card.health-red { + background: #fff0ed; + border-color: #f3a38f; +} + +.server-card.health-disabled { + background: #f1f5f9; + border-color: #cbd5e1; + color: #64748b; +} + +.table-wrap { + overflow: auto; +} + +table { + width: 100%; + border-collapse: collapse; + table-layout: fixed; +} + +th { + text-align: left; 
+ font-size: 12px; + line-height: 1.2; + color: var(--muted); + font-weight: 700; + padding: 10px 14px; + background: var(--panel-2); + border-bottom: 1px solid var(--border); +} + +td { + padding: 11px 14px; + border-bottom: 1px solid var(--border); + vertical-align: middle; + color: #243044; + white-space: nowrap; + overflow: hidden; + text-overflow: ellipsis; +} + +tbody tr { + cursor: pointer; +} + +tbody tr:hover { + background: #f4f8ff; +} + +tbody tr.selected { + background: #eef5ff; +} + +.traffic { + display: inline-flex; + align-items: center; + gap: 8px; + font-weight: 700; +} + +.traffic::before { + content: ""; + width: 10px; + height: 10px; + border-radius: 999px; + background: var(--muted); + flex: 0 0 auto; +} + +.traffic.green::before { + background: var(--online); + box-shadow: 0 0 0 4px rgba(22, 155, 98, .12); +} + +.traffic.yellow::before { + background: #d69e2e; + box-shadow: 0 0 0 4px rgba(214, 158, 46, .14); +} + +.traffic.red::before { + background: var(--offline); + box-shadow: 0 0 0 4px rgba(159, 18, 57, .12); +} + +.traffic.disabled::before { + background: var(--muted); +} + +.detail-grid { + display: grid; + grid-template-columns: minmax(0, 1.4fr) minmax(340px, .8fr); + gap: 14px; + margin-top: 14px; +} + +.chart-card { + margin: 16px; + padding: 12px; + background: var(--panel-2); + border: 1px solid var(--border); + border-radius: 8px; +} + +.chart-title { + color: var(--muted); + font-size: 12px; + font-weight: 700; + margin-bottom: 8px; +} + +canvas { + width: 100%; + height: 180px; + display: block; +} + +.wait-list { + display: grid; + gap: 8px; + padding: 0 16px 16px; +} + +.wait-row { + display: grid; + grid-template-columns: minmax(120px, .8fr) minmax(0, 1fr) 96px; + gap: 12px; + align-items: center; +} + +.bar-track { + height: 8px; + border-radius: 999px; + background: #e6edf5; + overflow: hidden; +} + +.bar-fill { + height: 100%; + border-radius: inherit; + background: var(--accent); +} + +.log-list { + max-height: 430px; + 
overflow: auto; +} + +.log-row { + display: grid; + grid-template-columns: 92px 1fr 95px 58px; + gap: 8px; + padding: 10px 12px; + border-bottom: 1px solid var(--border); + color: #243044; +} + +.log-row .status { + font-weight: 700; +} + +.log-row .status.success { + color: var(--online); +} + +.log-row .status.error, +.log-row .status.permissions { + color: var(--warning); +} + +.toast-region { + position: fixed; + right: 18px; + bottom: 18px; + display: grid; + gap: 10px; + z-index: 20; + width: min(420px, calc(100vw - 36px)); +} + +.toast { + border-radius: 8px; + border: 1px solid var(--border); + background: var(--panel); + box-shadow: var(--shadow); + padding: 12px 14px; + font-size: 13px; + line-height: 1.35; + font-weight: 650; +} + +.toast.green { + border-color: #a8dec2; +} + +.toast.yellow { + border-color: #f3d27a; + background: #fff7df; +} + +.toast.red { + border-color: #f3a38f; + background: #fff0ed; +} + +@media (max-width: 900px) { + .app-shell { + width: min(100vw - 24px, 760px); + } + + .topbar { + flex-direction: column; + } + + .header-actions { + width: 100%; + } + + .header-actions button { + flex: 1; + } + + .status-grid, + .detail-grid { + grid-template-columns: 1fr; + } + + table { + min-width: 820px; + } +} diff --git a/PerformanceMonitor.sln b/PerformanceMonitor.sln index 8140d76..6b7ea0b 100644 --- a/PerformanceMonitor.sln +++ b/PerformanceMonitor.sln @@ -15,6 +15,8 @@ Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "Installer.Tests", "Installe EndProject Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "Installer.Core", "Installer.Core\Installer.Core.csproj", "{AFA0BA1D-42D6-4E01-9D6C-B7E327E1B7D3}" EndProject +Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "PerformanceMonitor.Headless", "Headless\PerformanceMonitor.Headless.csproj", "{63C17106-FAE5-4417-974B-3C11AA8404B4}" +EndProject Global GlobalSection(SolutionConfigurationPlatforms) = preSolution Debug|Any CPU = Debug|Any CPU @@ -45,6 +47,10 @@ Global 
{AFA0BA1D-42D6-4E01-9D6C-B7E327E1B7D3}.Debug|Any CPU.Build.0 = Debug|Any CPU {AFA0BA1D-42D6-4E01-9D6C-B7E327E1B7D3}.Release|Any CPU.ActiveCfg = Release|Any CPU {AFA0BA1D-42D6-4E01-9D6C-B7E327E1B7D3}.Release|Any CPU.Build.0 = Release|Any CPU + {63C17106-FAE5-4417-974B-3C11AA8404B4}.Debug|Any CPU.ActiveCfg = Debug|Any CPU + {63C17106-FAE5-4417-974B-3C11AA8404B4}.Debug|Any CPU.Build.0 = Debug|Any CPU + {63C17106-FAE5-4417-974B-3C11AA8404B4}.Release|Any CPU.ActiveCfg = Release|Any CPU + {63C17106-FAE5-4417-974B-3C11AA8404B4}.Release|Any CPU.Build.0 = Release|Any CPU EndGlobalSection GlobalSection(SolutionProperties) = preSolution HideSolutionNode = FALSE diff --git a/docs/headless-monitor.md b/docs/headless-monitor.md new file mode 100644 index 0000000..1a86ab5 --- /dev/null +++ b/docs/headless-monitor.md @@ -0,0 +1,157 @@ +# Headless Estate Monitor + +The headless host is the central-server version of Performance Monitor. It runs on one monitoring server, connects remotely to SQL Server instances, stores hot data in DuckDB, archives older data to Parquet, and serves the website from the same process. + +## Current Thin Slice + +The first implementation includes: + +- central ASP.NET Core host +- background collector loop +- server inventory from configuration +- Windows auth or SQL auth via normal SQL Server connection strings +- DuckDB hot store on the monitoring server +- Parquet archival for old hot data +- HTTP API +- estate overview website with traffic-light server panels, collector log, CPU chart, and top waits +- in-page alert toasts +- optional browser notifications for red/yellow state changes +- initial collectors: + - `server_properties` + - `wait_stats` + - `cpu_utilization` + +It does not install SQL Agent jobs on monitored servers. 
+ +## Project + +```powershell +D:\gitbhub\PerformanceMonitor\Headless\PerformanceMonitor.Headless.csproj +``` + +Default URL: + +```text +http://localhost:5155 +``` + +## Configuration + +Create a local config from the example, then edit the local file: + +```powershell +Copy-Item D:\gitbhub\PerformanceMonitor\Headless\appsettings.example.json D:\gitbhub\PerformanceMonitor\Headless\appsettings.json +D:\gitbhub\PerformanceMonitor\Headless\appsettings.json +``` + +Recommended pattern: keep secrets out of JSON and point each server at an environment variable. + +```json +{ + "Monitor": { + "StoragePath": "data\\headless\\performance-monitor.duckdb", + "ArchiveDirectory": "data\\headless\\parquet", + "CollectionIntervalSeconds": 60, + "MaxConcurrentServers": 8, + "CommandTimeoutSeconds": 30, + "ArchiveIntervalMinutes": 60, + "HotDataDays": 7, + "Servers": [ + { + "Id": "dev-sql-01", + "DisplayName": "DEV-SQL-01", + "ConnectionStringEnvironmentVariable": "PM_DEV_SQL_01", + "Enabled": true + } + ] + } +} +``` + +Windows auth example: + +```powershell +$env:PM_DEV_SQL_01 = "Server=DEV-SQL-01;Database=master;Integrated Security=true;Encrypt=Optional;TrustServerCertificate=true" +``` + +SQL auth example: + +```powershell +$env:PM_DEV_SQL_02 = "Server=DEV-SQL-02;Database=master;User ID=pm_reader;Password=;Encrypt=Mandatory;TrustServerCertificate=true" +``` + +For dozens of servers, use stable `Id` values. Those ids become the partition key in DuckDB and API URLs. 
+ +## Run Locally + +Use the workspace-local SDK if the machine does not have a .NET SDK on `PATH`: + +```powershell +$env:TEMP = "D:\gitbhub\.tmp" +$env:TMP = "D:\gitbhub\.tmp" +$env:NUGET_PACKAGES = "D:\gitbhub\.nuget" +$env:DOTNET_CLI_HOME = "D:\gitbhub\.dotnet-home" +$env:DOTNET_CLI_TELEMETRY_OPTOUT = "1" + +D:\gitbhub\.dotnet-sdk\dotnet.exe run --project D:\gitbhub\PerformanceMonitor\Headless\PerformanceMonitor.Headless.csproj --source https://api.nuget.org/v3/index.json +``` + +Open: + +```text +http://localhost:5155 +``` + +## API + +```text +GET /api/summary +GET /api/servers +GET /api/storage +GET /api/collection-log?limit=200 +GET /api/servers/{serverId}/waits?hours=1&limit=20 +GET /api/servers/{serverId}/cpu?hours=1 +``` + +## Traffic Lights And Alerts + +The overview cards are intended to work like an estate traffic-light board: + +- green: enabled, recently contacted, and no recent collector alerts +- yellow: enabled but not fully healthy yet, for example no successful collection or stale contact +- red: connection failure or any recent alert-worthy collector failure +- disabled: configured but not being monitored + +The browser page raises an in-page toast when a server enters red or yellow. If browser notifications are enabled with the button in the header, the same state change also raises a native browser notification. + +For the current thin slice, "alert-worthy" means connection failures or collector statuses of `ERROR` or `PERMISSIONS` in the last 15 minutes. As more collectors are ported, SQL performance alerts should feed the same red/yellow state so the panel color changes whenever something needs checking. + +## Storage + +Hot data: + +```powershell +D:\gitbhub\PerformanceMonitor\Headless\data\headless\performance-monitor.duckdb +``` + +Archived Parquet: + +```powershell +D:\gitbhub\PerformanceMonitor\Headless\data\headless\parquet +``` + +Archival runs in-process. 
Rows older than `HotDataDays` are copied to Parquet and deleted from the hot DuckDB tables. + +## Where This Goes Next + +The next ports should come from Lite's existing remote collectors: + +- query stats +- query store +- file I/O +- memory stats +- blocking/deadlocks +- database size/capacity +- running SQL Agent jobs as optional `msdb` read telemetry + +The website should then grow from a status console into the Redgate-style estate overview: traffic lights, stale data warnings, top pain by server, recent regressions, blocking hotspots, and capacity risk. From ef2b39c56ecb8dfc72e7bbe06f6ddc4868542e57 Mon Sep 17 00:00:00 2001 From: Chris Baker <67105654+zacnaloen@users.noreply.github.com> Date: Wed, 13 May 2026 09:47:31 +0100 Subject: [PATCH 3/6] Darken headless monitor theme --- Headless/wwwroot/app.js | 12 ++++-- Headless/wwwroot/index.html | 5 ++- Headless/wwwroot/styles.css | 83 +++++++++++++++++++------------------ 3 files changed, 54 insertions(+), 46 deletions(-) diff --git a/Headless/wwwroot/app.js b/Headless/wwwroot/app.js index 7566487..87dd710 100644 --- a/Headless/wwwroot/app.js +++ b/Headless/wwwroot/app.js @@ -245,7 +245,7 @@ function drawCpuChart(samples) { ctx.scale(ratio, ratio); ctx.clearRect(0, 0, rect.width, rect.height); - ctx.strokeStyle = "#d9e1ea"; + ctx.strokeStyle = cssVar("--border", "#2a313c"); ctx.lineWidth = 1; for (let i = 0; i <= 4; i++) { const y = 12 + (rect.height - 24) * i / 4; @@ -256,7 +256,7 @@ function drawCpuChart(samples) { } if (!samples.length) { - ctx.fillStyle = "#64748b"; + ctx.fillStyle = cssVar("--muted", "#9ca8b8"); ctx.font = "13px Segoe UI, sans-serif"; ctx.fillText("No CPU samples yet", 12, 28); return; @@ -267,7 +267,7 @@ function drawCpuChart(samples) { y: 10 + (100 - Math.min(100, Math.max(0, sample.sqlServerCpuUtilization))) * (rect.height - 20) / 100 })); - ctx.strokeStyle = "#2563eb"; + ctx.strokeStyle = cssVar("--accent", "#62b6ff"); ctx.lineWidth = 2; ctx.beginPath(); points.forEach((point, index) => { 
@@ -277,11 +277,15 @@ function drawCpuChart(samples) { ctx.stroke(); const latest = samples[samples.length - 1]; - ctx.fillStyle = "#111827"; + ctx.fillStyle = cssVar("--text", "#edf2f7"); ctx.font = "700 13px Segoe UI, sans-serif"; ctx.fillText(`${latest.sqlServerCpuUtilization}% SQL CPU`, 12, 22); } +function cssVar(name, fallback) { + return getComputedStyle(document.documentElement).getPropertyValue(name).trim() || fallback; +} + function formatDate(value) { if (!value) return ""; return new Date(value).toLocaleString(); diff --git a/Headless/wwwroot/index.html b/Headless/wwwroot/index.html index 78e246a..9b27b88 100644 --- a/Headless/wwwroot/index.html +++ b/Headless/wwwroot/index.html @@ -3,8 +3,9 @@ + Performance Monitor Estate - +
@@ -94,6 +95,6 @@

Collector Log

- + diff --git a/Headless/wwwroot/styles.css b/Headless/wwwroot/styles.css index 33649a6..5b18558 100644 --- a/Headless/wwwroot/styles.css +++ b/Headless/wwwroot/styles.css @@ -1,16 +1,18 @@ :root { - color-scheme: light; - --bg: #f6f8fb; - --panel: #ffffff; - --panel-2: #f9fbfd; - --text: #111827; - --muted: #64748b; - --border: #d9e1ea; - --online: #169b62; - --warning: #c2410c; - --offline: #9f1239; - --accent: #2563eb; - --shadow: 0 12px 30px rgba(15, 23, 42, .08); + color-scheme: dark; + --bg: #0d1014; + --panel: #171b22; + --panel-2: #11161c; + --panel-3: #1d232c; + --text: #edf2f7; + --muted: #9ca8b8; + --border: #2a313c; + --online: #2dd681; + --yellow: #eab949; + --warning: #ff7a57; + --offline: #ff5f72; + --accent: #62b6ff; + --shadow: 0 16px 40px rgba(0, 0, 0, .34); } * { @@ -72,7 +74,7 @@ td, button { border: 1px solid var(--border); - background: var(--panel); + background: var(--panel-3); color: var(--text); border-radius: 6px; min-height: 34px; @@ -80,7 +82,7 @@ button { font-size: 13px; font-weight: 600; cursor: pointer; - box-shadow: 0 2px 8px rgba(15, 23, 42, .06); + box-shadow: 0 2px 10px rgba(0, 0, 0, .24); } .header-actions { @@ -90,7 +92,8 @@ button { } button:hover { - border-color: #b9c5d4; + border-color: #465364; + background: #242b35; } .status-grid { @@ -125,7 +128,7 @@ button:hover { } .metric.yellow strong { - color: #b7791f; + color: var(--yellow); } .metric.red strong { @@ -184,8 +187,8 @@ button:hover { .server-card:hover, .server-card.selected { - border-color: #94a3b8; - box-shadow: 0 0 0 3px rgba(37, 99, 235, .10); + border-color: #5e6d80; + box-shadow: 0 0 0 3px rgba(98, 182, 255, .14); } .server-card strong { @@ -200,7 +203,7 @@ button:hover { } .server-card small { - color: rgba(17, 24, 39, .72); + color: rgba(237, 242, 247, .72); } .card-status { @@ -208,24 +211,24 @@ button:hover { } .server-card.health-green { - background: #eaf8f1; - border-color: #a8dec2; + background: #10251b; + border-color: #276746; } 
.server-card.health-yellow { - background: #fff7df; - border-color: #f3d27a; + background: #29220f; + border-color: #876922; } .server-card.health-red { - background: #fff0ed; - border-color: #f3a38f; + background: #31171a; + border-color: #8e3440; } .server-card.health-disabled { - background: #f1f5f9; - border-color: #cbd5e1; - color: #64748b; + background: #151a21; + border-color: #303846; + color: #808b9a; } .table-wrap { @@ -253,7 +256,7 @@ td { padding: 11px 14px; border-bottom: 1px solid var(--border); vertical-align: middle; - color: #243044; + color: #d8e0ea; white-space: nowrap; overflow: hidden; text-overflow: ellipsis; @@ -264,11 +267,11 @@ tbody tr { } tbody tr:hover { - background: #f4f8ff; + background: #1d2530; } tbody tr.selected { - background: #eef5ff; + background: #202b39; } .traffic { @@ -293,8 +296,8 @@ tbody tr.selected { } .traffic.yellow::before { - background: #d69e2e; - box-shadow: 0 0 0 4px rgba(214, 158, 46, .14); + background: var(--yellow); + box-shadow: 0 0 0 4px rgba(234, 185, 73, .16); } .traffic.red::before { @@ -350,7 +353,7 @@ canvas { .bar-track { height: 8px; border-radius: 999px; - background: #e6edf5; + background: #25303d; overflow: hidden; } @@ -371,7 +374,7 @@ canvas { gap: 8px; padding: 10px 12px; border-bottom: 1px solid var(--border); - color: #243044; + color: #d8e0ea; } .log-row .status { @@ -409,17 +412,17 @@ canvas { } .toast.green { - border-color: #a8dec2; + border-color: #276746; } .toast.yellow { - border-color: #f3d27a; - background: #fff7df; + border-color: #876922; + background: #29220f; } .toast.red { - border-color: #f3a38f; - background: #fff0ed; + border-color: #8e3440; + background: #31171a; } @media (max-width: 900px) { From 60b387be43f8b237b77c484ccf43ec79a502e0f9 Mon Sep 17 00:00:00 2001 From: Chris Baker <67105654+zacnaloen@users.noreply.github.com> Date: Wed, 13 May 2026 10:03:11 +0100 Subject: [PATCH 4/6] Reshape overview into server cards and alerts rail --- Headless/Models/TelemetryModels.cs | 
4 +- Headless/Storage/HeadlessStore.cs | 22 +- Headless/wwwroot/app.js | 320 +++++++++++++++++++++-------- Headless/wwwroot/index.html | 161 +++++++++------ Headless/wwwroot/styles.css | 314 ++++++++++++++++++++++++++-- 5 files changed, 658 insertions(+), 163 deletions(-) diff --git a/Headless/Models/TelemetryModels.cs b/Headless/Models/TelemetryModels.cs index 7fd5b04..5a06dbc 100644 --- a/Headless/Models/TelemetryModels.cs +++ b/Headless/Models/TelemetryModels.cs @@ -12,7 +12,9 @@ public sealed record ServerHealthDto( int? SqlMajorVersion, string HealthState, string HealthReason, - int ActiveAlertCount); + int ActiveAlertCount, + int? LatestSqlCpuUtilization, + string? TopWaitType); public sealed record CollectionLogDto( DateTime CollectionTime, diff --git a/Headless/Storage/HeadlessStore.cs b/Headless/Storage/HeadlessStore.cs index 0a59c0d..76388c5 100644 --- a/Headless/Storage/HeadlessStore.cs +++ b/Headless/Storage/HeadlessStore.cs @@ -317,7 +317,21 @@ FROM collection_log AS cl AND cl.status IN ('ERROR', 'PERMISSIONS') ORDER BY cl.collection_time DESC LIMIT 1 - ) AS recent_alert + ) AS recent_alert, + ( + SELECT cu.sqlserver_cpu_utilization + FROM cpu_utilization_stats AS cu + WHERE cu.server_id = s.server_id + ORDER BY cu.sample_time DESC + LIMIT 1 + ) AS latest_sql_cpu, + ( + SELECT ws.wait_type + FROM wait_stats AS ws + WHERE ws.server_id = s.server_id + ORDER BY ws.collection_time DESC, ws.wait_time_ms DESC + LIMIT 1 + ) AS top_wait_type FROM servers AS s ORDER BY s.is_enabled DESC, s.display_name"; command.Parameters.Add(new DuckDBParameter { Value = DateTime.UtcNow.AddMinutes(-15) }); @@ -335,6 +349,8 @@ FROM servers AS s var sqlMajorVersion = reader.IsDBNull(8) ? (int?)null : reader.GetInt32(8); var activeAlertCount = reader.IsDBNull(9) ? 0 : Convert.ToInt32(reader.GetInt64(9)); var recentAlert = reader.IsDBNull(10) ? null : reader.GetString(10); + var latestSqlCpu = reader.IsDBNull(11) ? 
(int?)null : reader.GetInt32(11); + var topWaitType = reader.IsDBNull(12) ? null : reader.GetString(12); var (healthState, healthReason) = ComputeHealth(isEnabled, lastSeenTime, lastStatus, lastError, activeAlertCount, recentAlert); servers.Add(new ServerHealthDto( @@ -349,7 +365,9 @@ FROM servers AS s sqlMajorVersion, healthState, healthReason, - activeAlertCount)); + activeAlertCount, + latestSqlCpu, + topWaitType)); } return servers; diff --git a/Headless/wwwroot/app.js b/Headless/wwwroot/app.js index 87dd710..0131a1f 100644 --- a/Headless/wwwroot/app.js +++ b/Headless/wwwroot/app.js @@ -1,31 +1,46 @@ const state = { selectedServerId: null, + activeTab: "stats", servers: [], + logs: [], + alerts: [], healthSnapshot: JSON.parse(localStorage.getItem("pm-headless-health") || "{}"), loadedOnce: false }; const els = { + overviewView: document.getElementById("overview-view"), + serverView: document.getElementById("server-view"), green: document.getElementById("metric-green"), yellow: document.getElementById("metric-yellow"), red: document.getElementById("metric-red"), disabled: document.getElementById("metric-disabled"), generatedAt: document.getElementById("generated-at"), - serverCount: document.getElementById("server-count"), + alertCount: document.getElementById("alert-count"), storagePaths: document.getElementById("storage-paths"), serverCardGrid: document.getElementById("server-card-grid"), - serverRows: document.getElementById("server-rows"), + alertList: document.getElementById("alert-list"), collectorLog: document.getElementById("collector-log"), selectedTitle: document.getElementById("selected-server-title"), selectedSubtitle: document.getElementById("selected-server-subtitle"), + serverStatsGrid: document.getElementById("server-stats-grid"), waitList: document.getElementById("wait-list"), cpuCanvas: document.getElementById("cpu-chart"), refresh: document.getElementById("refresh-button"), notify: document.getElementById("notify-button"), + back: 
document.getElementById("back-button"), toastRegion: document.getElementById("toast-region") }; els.refresh.addEventListener("click", () => loadAll()); +els.back.addEventListener("click", () => navigateOverview()); +document.querySelectorAll(".server-menu button").forEach(button => { + button.addEventListener("click", () => { + if (!state.selectedServerId) return; + navigateServer(state.selectedServerId, button.dataset.tab || "stats"); + }); +}); + els.notify.addEventListener("click", async () => { if (!("Notification" in window)) { showToast("Browser notifications are not supported here.", "yellow"); @@ -37,6 +52,13 @@ els.notify.addEventListener("click", async () => { showToast(permission === "granted" ? "Browser notifications enabled." : "Browser notifications not enabled.", permission === "granted" ? "green" : "yellow"); }); +window.addEventListener("hashchange", () => applyRoute()); +window.addEventListener("resize", () => { + if (state.activeTab === "cpu") { + loadSelectedServer(); + } +}); + async function fetchJson(url) { const response = await fetch(url, { headers: { "Accept": "application/json" } }); if (!response.ok) throw new Error(`${response.status} ${response.statusText}`); @@ -51,19 +73,21 @@ async function loadAll() { ]); state.servers = summary.servers || []; + state.logs = logs || []; + state.alerts = buildAlerts(state.servers, state.logs); + if (!state.selectedServerId && state.servers.length > 0) { - const firstOnline = state.servers.find(s => s.isEnabled) || state.servers[0]; - state.selectedServerId = firstOnline.serverId; + const firstActive = state.servers.find(s => s.isEnabled) || state.servers[0]; + state.selectedServerId = firstActive.serverId; } renderSummary(summary); renderStorage(storage); renderServerCards(state.servers); - renderServers(state.servers); - renderLog(logs); + renderAlerts(state.alerts); handleHealthNotifications(state.servers); state.loadedOnce = true; - await loadSelectedServer(); + applyRoute(); 
updateNotifyButton(); } @@ -73,7 +97,6 @@ function renderSummary(summary) { els.red.textContent = summary.redCount; els.disabled.textContent = summary.disabledCount; els.generatedAt.textContent = `Updated ${formatDate(summary.generatedAt)}`; - els.serverCount.textContent = `${summary.serverCount} configured`; } function renderStorage(storage) { @@ -82,50 +105,108 @@ function renderStorage(storage) { function renderServerCards(servers) { els.serverCardGrid.innerHTML = ""; + if (!servers.length) { + els.serverCardGrid.innerHTML = `
No servers configured yet.
`; + return; + } + for (const server of servers) { const card = document.createElement("button"); card.type = "button"; card.className = `server-card health-${server.healthState || "yellow"} ${server.serverId === state.selectedServerId ? "selected" : ""}`; - card.addEventListener("click", async () => { - state.selectedServerId = server.serverId; - renderServerCards(state.servers); - renderServers(state.servers); - await loadSelectedServer(); - }); + card.addEventListener("click", () => navigateServer(server.serverId, "stats")); + + const title = server.displayName || server.serverId; + const platform = server.edition ? "SQL Server" : "SQL Server"; + const os = server.productVersion ? `v${server.productVersion}` : "Windows"; + const health = server.healthState || "yellow"; card.innerHTML = ` - ${escapeHtml((server.healthState || "yellow").toUpperCase())} - ${escapeHtml(server.displayName || server.serverId)} - ${escapeHtml(server.healthReason || "No status yet")} - ${server.activeAlertCount ? `${server.activeAlertCount} active alert(s)` : formatDate(server.lastSeenTime) || "No contact yet"} +
+ +
+ ${escapeHtml(title)} + ${escapeHtml(platform)} / ${escapeHtml(os)} +
+ +
+
+ ${server.latestSqlCpuUtilization ?? "--"}${server.latestSqlCpuUtilization == null ? "" : "%"}CPU + ${escapeHtml(compactWait(server.topWaitType))}Wait + ${server.activeAlertCount ?? 0}Alerts + ${compactSeen(server.lastSeenTime)}Seen +
+
+ + ${escapeHtml(server.healthReason || "All good")} +
`; els.serverCardGrid.appendChild(card); } } -function renderServers(servers) { - els.serverRows.innerHTML = ""; +function buildAlerts(servers, logs) { + const alerts = []; + for (const server of servers) { - const tr = document.createElement("tr"); - tr.className = server.serverId === state.selectedServerId ? "selected" : ""; - tr.addEventListener("click", async () => { - state.selectedServerId = server.serverId; - renderServers(state.servers); - await loadSelectedServer(); - }); - - const statusClass = server.healthState || "yellow"; - - tr.innerHTML = ` - ${escapeHtml((server.healthState || "yellow").toUpperCase())} - ${escapeHtml(server.displayName || server.serverId)} - ${escapeHtml(server.edition || "")} - ${escapeHtml(server.productVersion || "")} - ${formatDate(server.lastSeenTime)} - ${escapeHtml(server.healthReason || server.lastError || "")} + const health = server.healthState || "yellow"; + if (health === "red" || health === "yellow") { + alerts.push({ + serverId: server.serverId, + serverName: server.displayName || server.serverId, + state: health, + title: `${server.displayName || server.serverId} is ${health.toUpperCase()}`, + body: server.healthReason || "Server needs attention", + targetTab: "stats", + time: server.lastSeenTime || null + }); + } + } + + for (const log of logs) { + const status = (log.status || "").toUpperCase(); + if (status === "ERROR" || status === "PERMISSIONS") { + alerts.push({ + serverId: log.serverId, + serverName: log.serverName, + state: status === "PERMISSIONS" ? "yellow" : "red", + title: `${log.serverName} / ${log.collectorName}`, + body: log.errorMessage || status, + targetTab: "logs", + time: log.collectionTime + }); + } + } + + return alerts.slice(0, 30); +} + +function renderAlerts(alerts) { + els.alertCount.textContent = alerts.length === 1 ? "1 active" : `${alerts.length} active`; + els.alertList.innerHTML = ""; + + if (!alerts.length) { + els.alertList.innerHTML = ` +
+ No active alerts + Green panels stay quiet here. +
`; + return; + } - els.serverRows.appendChild(tr); + for (const alert of alerts) { + const item = document.createElement("button"); + item.type = "button"; + item.className = `alert-item ${alert.state}`; + item.addEventListener("click", () => navigateServer(alert.serverId, alert.targetTab || "stats")); + item.innerHTML = ` + ${escapeHtml(alert.state.toUpperCase())} + ${escapeHtml(alert.title)} + ${escapeHtml(alert.body)} + ${formatDate(alert.time) || "Needs attention"} + `; + els.alertList.appendChild(item); } } @@ -154,52 +235,113 @@ function handleHealthNotifications(servers) { localStorage.setItem("pm-headless-health", JSON.stringify(nextSnapshot)); } -function showToast(message, health) { - const toast = document.createElement("div"); - toast.className = `toast ${health}`; - toast.textContent = message; - els.toastRegion.appendChild(toast); - window.setTimeout(() => toast.remove(), 9000); -} +function applyRoute() { + const hash = window.location.hash || "#/overview"; + const serverMatch = hash.match(/^#\/servers\/([^/]+)(?:\/([^/]+))?/); -function updateNotifyButton() { - if (!("Notification" in window)) { - els.notify.textContent = "Notifications Unavailable"; - els.notify.disabled = true; + if (serverMatch) { + state.selectedServerId = decodeURIComponent(serverMatch[1]); + state.activeTab = serverMatch[2] || "stats"; + showServerView(); + loadSelectedServer(); return; } - els.notify.textContent = Notification.permission === "granted" - ? 
"Notifications Enabled" - : "Enable Notifications"; + state.activeTab = "overview"; + els.overviewView.classList.remove("hidden"); + els.serverView.classList.add("hidden"); + renderServerCards(state.servers); +} + +function navigateOverview() { + window.location.hash = "#/overview"; +} + +function navigateServer(serverId, tab) { + window.location.hash = `#/servers/${encodeURIComponent(serverId)}/${tab}`; +} + +function showServerView() { + els.overviewView.classList.add("hidden"); + els.serverView.classList.remove("hidden"); + + document.querySelectorAll(".server-menu button").forEach(button => { + button.classList.toggle("active", button.dataset.tab === state.activeTab); + }); + + document.querySelectorAll(".server-tab").forEach(tab => tab.classList.add("hidden")); + const activePanel = document.getElementById(`tab-${state.activeTab}`); + (activePanel || document.getElementById("tab-stats")).classList.remove("hidden"); } async function loadSelectedServer() { const server = state.servers.find(s => s.serverId === state.selectedServerId); if (!server) { els.selectedTitle.textContent = "Server Detail"; - els.selectedSubtitle.textContent = "Select a server row"; - els.waitList.innerHTML = ""; - drawCpuChart([]); + els.selectedSubtitle.textContent = "Select a server"; return; } els.selectedTitle.textContent = server.displayName || server.serverId; - els.selectedSubtitle.textContent = server.edition || server.productVersion || server.serverId; + els.selectedSubtitle.textContent = `${(server.healthState || "yellow").toUpperCase()} / ${server.healthReason || server.serverId}`; + renderServerStats(server); + renderServerLog(server.serverId); + + if (state.activeTab === "stats" || state.activeTab === "cpu") { + const [waits, cpu] = await Promise.all([ + fetchJson(`/api/servers/${encodeURIComponent(server.serverId)}/waits?hours=1&limit=12`), + fetchJson(`/api/servers/${encodeURIComponent(server.serverId)}/cpu?hours=1`) + ]); + + if (state.activeTab === "stats") { + 
renderWaits(waits); + } - const [waits, cpu] = await Promise.all([ - fetchJson(`/api/servers/${encodeURIComponent(server.serverId)}/waits?hours=1&limit=12`), - fetchJson(`/api/servers/${encodeURIComponent(server.serverId)}/cpu?hours=1`) - ]); + if (state.activeTab === "cpu") { + drawCpuChart(cpu); + } + } +} - renderWaits(waits); - drawCpuChart(cpu); +function renderServerStats(server) { + els.serverStatsGrid.innerHTML = ` +
Status${escapeHtml((server.healthState || "yellow").toUpperCase())}${escapeHtml(server.healthReason || "")}
+
SQL CPU${server.latestSqlCpuUtilization ?? "--"}${server.latestSqlCpuUtilization == null ? "" : "%"}Latest sample
+
Top Wait${escapeHtml(compactWait(server.topWaitType))}Latest snapshot
+
Alerts${server.activeAlertCount ?? 0}Last 15 minutes
+
Edition${escapeHtml(server.edition || "Unknown")}${escapeHtml(server.productVersion || "No version collected")}
+
Last Contact${formatDate(server.lastSeenTime) || "Never"}${escapeHtml(server.serverId)}
+ `; +} + +function renderServerLog(serverId) { + const rows = state.logs.filter(log => log.serverId === serverId).slice(0, 50); + els.collectorLog.innerHTML = ""; + + if (!rows.length) { + els.collectorLog.innerHTML = `
No collector log entries for this server yet.
`; + return; + } + + for (const log of rows) { + const row = document.createElement("div"); + const statusClass = (log.status || "").toLowerCase(); + row.className = "log-row"; + row.title = log.errorMessage || ""; + row.innerHTML = ` + ${formatTime(log.collectionTime)} + ${escapeHtml(log.collectorName)} + ${escapeHtml(log.status)} + ${log.rowsCollected ?? 0} + `; + els.collectorLog.appendChild(row); + } } function renderWaits(waits) { els.waitList.innerHTML = ""; if (!waits.length) { - els.waitList.innerHTML = `
No wait deltas yet
`; + els.waitList.innerHTML = `
No wait deltas yet.
`; return; } @@ -217,23 +359,6 @@ function renderWaits(waits) { } } -function renderLog(logs) { - els.collectorLog.innerHTML = ""; - for (const log of logs) { - const row = document.createElement("div"); - const statusClass = (log.status || "").toLowerCase(); - row.className = "log-row"; - row.title = log.errorMessage || ""; - row.innerHTML = ` - ${formatTime(log.collectionTime)} - ${escapeHtml(log.serverName)} / ${escapeHtml(log.collectorName)} - ${escapeHtml(log.status)} - ${log.rowsCollected ?? 0} - `; - els.collectorLog.appendChild(row); - } -} - function drawCpuChart(samples) { const canvas = els.cpuCanvas; const rect = canvas.getBoundingClientRect(); @@ -282,10 +407,44 @@ function drawCpuChart(samples) { ctx.fillText(`${latest.sqlServerCpuUtilization}% SQL CPU`, 12, 22); } +function showToast(message, health) { + const toast = document.createElement("div"); + toast.className = `toast ${health}`; + toast.textContent = message; + els.toastRegion.appendChild(toast); + window.setTimeout(() => toast.remove(), 9000); +} + +function updateNotifyButton() { + if (!("Notification" in window)) { + els.notify.textContent = "Notifications Unavailable"; + els.notify.disabled = true; + return; + } + + els.notify.textContent = Notification.permission === "granted" + ? "Notifications Enabled" + : "Enable Notifications"; +} + function cssVar(name, fallback) { return getComputedStyle(document.documentElement).getPropertyValue(name).trim() || fallback; } +function compactWait(waitType) { + if (!waitType) return "--"; + return waitType.length > 8 ? 
`${waitType.slice(0, 8)}...` : waitType; +} + +function compactSeen(value) { + if (!value) return "--"; + const seconds = Math.max(0, (Date.now() - new Date(value).getTime()) / 1000); + if (seconds < 60) return `${Math.floor(seconds)}s`; + if (seconds < 3600) return `${Math.floor(seconds / 60)}m`; + if (seconds < 86400) return `${Math.floor(seconds / 3600)}h`; + return `${Math.floor(seconds / 86400)}d`; +} + function formatDate(value) { if (!value) return ""; return new Date(value).toLocaleString(); @@ -311,7 +470,6 @@ function escapeHtml(value) { .replaceAll("'", "&#39;"); } -window.addEventListener("resize", () => loadSelectedServer()); loadAll().catch(error => { els.generatedAt.textContent = error.message; }); diff --git a/Headless/wwwroot/index.html b/Headless/wwwroot/index.html index 9b27b88..7280d7e 100644 --- a/Headless/wwwroot/index.html +++ b/Headless/wwwroot/index.html @@ -5,7 +5,7 @@ Performance Monitor Estate - +
@@ -21,80 +21,115 @@

Performance Monitor Estate

-
-
- Green - 0 -
-
- Yellow - 0 -
-
- Red - 0 -
-
- Disabled - 0 -
-
+
+
+
+
+ Green + 0 +
+
+ Yellow + 0 +
+
+ Red + 0 +
+
+ Disabled + 0 +
+
-
-
-

Estate Traffic Lights

- +
+
+

Estate Traffic Lights

+ +
+
+
-
-
-
-
-

Server Inventory

- -
-
- - - - - - - - - - - - -
StatusServerEditionVersionLast SeenLast Error
-
+
-
-
-
+
+
-
-
-

Collector Log

- Latest 50 -
-
-
+ + +
+
+
+
+

Top Waits

+ Last hour +
+
+
+
+
+

Query Workload

+ Coming from query collectors +
+
Query detail will appear here when the query stats and query store collectors are ported.
+
+
+ + + + + +
- + diff --git a/Headless/wwwroot/styles.css b/Headless/wwwroot/styles.css index 5b18558..26bfe22 100644 --- a/Headless/wwwroot/styles.css +++ b/Headless/wwwroot/styles.css @@ -103,6 +103,21 @@ button:hover { margin-bottom: 14px; } +.hidden { + display: none !important; +} + +.overview-layout { + display: grid; + grid-template-columns: minmax(0, 1fr) 360px; + gap: 14px; + align-items: start; +} + +.overview-main { + min-width: 0; +} + .metric { background: var(--panel); border: 1px solid var(--border); @@ -167,22 +182,24 @@ button:hover { .server-card-grid { display: grid; - grid-template-columns: repeat(auto-fill, minmax(220px, 1fr)); - gap: 12px; + grid-template-columns: repeat(auto-fill, minmax(250px, 1fr)); + gap: 14px; padding: 14px; } .server-card { appearance: none; text-align: left; - border-radius: 8px; + border-radius: 5px; border: 1px solid var(--border); - min-height: 124px; - padding: 13px; - display: grid; - gap: 8px; + min-height: 140px; + padding: 0; + display: flex; + flex-direction: column; + gap: 0; box-shadow: none; background: var(--panel-2); + overflow: hidden; } .server-card:hover, @@ -191,23 +208,135 @@ button:hover { box-shadow: 0 0 0 3px rgba(98, 182, 255, .14); } +.server-card-top { + display: grid; + grid-template-columns: 18px minmax(0, 1fr) 18px; + align-items: start; + gap: 8px; + padding: 10px 11px 6px; +} + +.server-icon { + width: 15px; + height: 15px; + border: 1px solid var(--text); + border-radius: 2px; + position: relative; + margin-top: 1px; +} + +.server-icon::before, +.server-icon::after { + content: ""; + position: absolute; + left: 3px; + right: 3px; + height: 1px; + background: currentColor; +} + +.server-icon::before { + top: 4px; +} + +.server-icon::after { + top: 8px; +} + .server-card strong { - font-size: 15px; - line-height: 1.25; + display: block; + font-size: 14px; + line-height: 1.2; + overflow-wrap: anywhere; } -.server-card span, .server-card small { + display: block; font-size: 12px; line-height: 1.25; + 
color: var(--muted); } -.server-card small { - color: rgba(237, 242, 247, .72); +.card-menu { + color: var(--muted); + font-weight: 800; + text-align: right; + line-height: 1; +} + +.mini-stats { + display: grid; + grid-template-columns: repeat(4, minmax(0, 1fr)); + padding: 8px 10px 9px; + gap: 7px; + flex: 1; +} + +.mini-stats span { + display: grid; + align-content: center; + min-width: 0; + text-align: center; +} + +.mini-stats b { + font-size: 13px; + line-height: 1.2; + overflow: hidden; + text-overflow: ellipsis; + white-space: nowrap; +} + +.mini-stats small { + font-size: 10px; + color: var(--muted); } -.card-status { +.server-card-ribbon { + min-height: 30px; + display: flex; + align-items: center; + gap: 7px; + padding: 6px 10px; + font-size: 12px; font-weight: 800; + color: #06120d; +} + +.server-card-ribbon.green { + background: #16c466; +} + +.server-card-ribbon.yellow { + background: #eab949; + color: #171201; +} + +.server-card-ribbon.red { + background: #ff5f72; + color: #1b0508; +} + +.server-card-ribbon.disabled { + background: #566273; + color: #f2f5f8; +} + +.ribbon-dot { + width: 15px; + height: 15px; + border-radius: 999px; + background: rgba(255, 255, 255, .9); + position: relative; + flex: 0 0 auto; +} + +.ribbon-dot::after { + content: ""; + position: absolute; + inset: 4px; + border-radius: inherit; + background: currentColor; } .server-card.health-green { @@ -231,6 +360,74 @@ button:hover { color: #808b9a; } +.alerts-panel { + position: sticky; + top: 14px; +} + +.alert-list { + display: grid; + gap: 8px; + padding: 12px; + max-height: calc(100vh - 160px); + overflow: auto; +} + +.alert-item { + appearance: none; + width: 100%; + min-height: 88px; + display: grid; + gap: 5px; + text-align: left; + border-radius: 6px; + padding: 10px; + box-shadow: none; +} + +.alert-item.red { + border-color: #8e3440; + background: #251418; +} + +.alert-item.yellow { + border-color: #876922; + background: #211d10; +} + +.alert-item strong, +.alert-item 
span, +.alert-item small { + overflow-wrap: anywhere; +} + +.alert-severity { + font-size: 11px; + font-weight: 900; + color: var(--muted); +} + +.alert-item strong { + font-size: 13px; + line-height: 1.25; +} + +.alert-item span, +.alert-item small, +.alert-empty span { + font-size: 12px; + line-height: 1.25; + color: var(--muted); +} + +.alert-empty { + min-height: 120px; + display: grid; + align-content: center; + gap: 6px; + color: var(--muted); +} + .table-wrap { overflow: auto; } @@ -316,6 +513,79 @@ tbody tr.selected { margin-top: 14px; } +.server-view { + display: grid; + gap: 14px; +} + +.server-view-header { + display: grid; + grid-template-columns: auto minmax(0, 1fr); + align-items: center; + gap: 14px; + background: var(--panel); + border: 1px solid var(--border); + border-radius: 8px; + padding: 12px; + box-shadow: var(--shadow); +} + +.server-view-header span { + display: block; + color: var(--muted); + font-size: 13px; + margin-top: 4px; +} + +.server-menu { + display: flex; + flex-wrap: wrap; + gap: 8px; +} + +.server-menu button.active { + border-color: var(--accent); + background: #14314b; +} + +.server-tab { + display: grid; + gap: 14px; +} + +.stats-grid { + display: grid; + grid-template-columns: repeat(4, minmax(0, 1fr)); + gap: 12px; +} + +.stat-tile { + background: var(--panel); + border: 1px solid var(--border); + border-radius: 8px; + padding: 13px; + box-shadow: var(--shadow); + min-height: 100px; +} + +.stat-tile.wide { + grid-column: span 2; +} + +.stat-tile span, +.stat-tile small { + color: var(--muted); + font-size: 12px; +} + +.stat-tile strong { + display: block; + font-size: 20px; + line-height: 1.2; + margin: 7px 0 5px; + overflow-wrap: anywhere; +} + .chart-card { margin: 16px; padding: 12px; @@ -324,6 +594,12 @@ tbody tr.selected { border-radius: 8px; } +.empty-state { + padding: 16px; + color: var(--muted); + font-size: 13px; +} + .chart-title { color: var(--muted); font-size: 12px; @@ -443,11 +719,17 @@ canvas { } 
.status-grid, - .detail-grid { + .detail-grid, + .overview-layout, + .stats-grid { grid-template-columns: 1fr; } - table { - min-width: 820px; + .alerts-panel { + position: static; + } + + .server-view-header { + grid-template-columns: 1fr; } } From 6248d0b01f2a5caf3b460312b7b8b8c5390fa89d Mon Sep 17 00:00:00 2001 From: Chris Baker <67105654+zacnaloen@users.noreply.github.com> Date: Wed, 13 May 2026 10:17:43 +0100 Subject: [PATCH 5/6] Simplify dashboard copy and group servers by purpose --- Headless/Models/MonitoredServerOptions.cs | 2 + Headless/Models/TelemetryModels.cs | 1 + Headless/Storage/HeadlessStore.cs | 48 +++-- Headless/appsettings.example.json | 3 +- Headless/wwwroot/app.js | 209 +++++++++++++++++----- Headless/wwwroot/index.html | 30 ++-- Headless/wwwroot/styles.css | 57 ++++-- docs/headless-monitor.md | 3 +- 8 files changed, 265 insertions(+), 88 deletions(-) diff --git a/Headless/Models/MonitoredServerOptions.cs b/Headless/Models/MonitoredServerOptions.cs index 7500c10..29b6800 100644 --- a/Headless/Models/MonitoredServerOptions.cs +++ b/Headless/Models/MonitoredServerOptions.cs @@ -4,11 +4,13 @@ public sealed class MonitoredServerOptions { public string Id { get; set; } = ""; public string DisplayName { get; set; } = ""; + public string Purpose { get; set; } = "Unassigned"; public string? ConnectionString { get; set; } public string? ConnectionStringEnvironmentVariable { get; set; } public bool Enabled { get; set; } = true; public string ServerNameForStorage => string.IsNullOrWhiteSpace(DisplayName) ? Id : DisplayName; + public string PurposeForDisplay => string.IsNullOrWhiteSpace(Purpose) ? 
"Unassigned" : Purpose.Trim(); public string ResolveConnectionString() { diff --git a/Headless/Models/TelemetryModels.cs b/Headless/Models/TelemetryModels.cs index 5a06dbc..e9b2f0c 100644 --- a/Headless/Models/TelemetryModels.cs +++ b/Headless/Models/TelemetryModels.cs @@ -3,6 +3,7 @@ namespace PerformanceMonitor.Headless.Models; public sealed record ServerHealthDto( string ServerId, string DisplayName, + string Purpose, bool IsEnabled, DateTime? LastSeenTime, string LastStatus, diff --git a/Headless/Storage/HeadlessStore.cs b/Headless/Storage/HeadlessStore.cs index 76388c5..ed61001 100644 --- a/Headless/Storage/HeadlessStore.cs +++ b/Headless/Storage/HeadlessStore.cs @@ -56,15 +56,17 @@ public async Task UpsertConfiguredServersAsync(IEnumerable> GetServersAsync(CancellationTo SELECT s.server_id, s.display_name, + COALESCE(NULLIF(TRIM(s.purpose), ''), 'Unassigned') AS purpose, s.is_enabled, s.last_seen_time, s.last_status, @@ -333,29 +336,43 @@ FROM wait_stats AS ws LIMIT 1 ) AS top_wait_type FROM servers AS s -ORDER BY s.is_enabled DESC, s.display_name"; +ORDER BY + s.is_enabled DESC, + CASE LOWER(COALESCE(NULLIF(TRIM(s.purpose), ''), 'unassigned')) + WHEN 'production' THEN 1 + WHEN 'prod' THEN 1 + WHEN 'staging' THEN 2 + WHEN 'stage' THEN 2 + WHEN 'development' THEN 3 + WHEN 'dev' THEN 3 + WHEN 'test' THEN 4 + ELSE 5 + END, + s.display_name"; command.Parameters.Add(new DuckDBParameter { Value = DateTime.UtcNow.AddMinutes(-15) }); await using var reader = await command.ExecuteReaderAsync(cancellationToken); while (await reader.ReadAsync(cancellationToken)) { var serverId = reader.GetString(0); var displayName = reader.IsDBNull(1) ? serverId : reader.GetString(1); - var isEnabled = reader.GetBoolean(2); - var lastSeenTime = reader.IsDBNull(3) ? (DateTime?)null : reader.GetDateTime(3); - var lastStatus = reader.IsDBNull(4) ? "UNKNOWN" : reader.GetString(4); - var lastError = reader.IsDBNull(5) ? null : reader.GetString(5); - var productVersion = reader.IsDBNull(6) ? 
null : reader.GetString(6); - var edition = reader.IsDBNull(7) ? null : reader.GetString(7); - var sqlMajorVersion = reader.IsDBNull(8) ? (int?)null : reader.GetInt32(8); - var activeAlertCount = reader.IsDBNull(9) ? 0 : Convert.ToInt32(reader.GetInt64(9)); - var recentAlert = reader.IsDBNull(10) ? null : reader.GetString(10); - var latestSqlCpu = reader.IsDBNull(11) ? (int?)null : reader.GetInt32(11); - var topWaitType = reader.IsDBNull(12) ? null : reader.GetString(12); + var purpose = reader.IsDBNull(2) ? "Unassigned" : reader.GetString(2); + var isEnabled = reader.GetBoolean(3); + var lastSeenTime = reader.IsDBNull(4) ? (DateTime?)null : reader.GetDateTime(4); + var lastStatus = reader.IsDBNull(5) ? "UNKNOWN" : reader.GetString(5); + var lastError = reader.IsDBNull(6) ? null : reader.GetString(6); + var productVersion = reader.IsDBNull(7) ? null : reader.GetString(7); + var edition = reader.IsDBNull(8) ? null : reader.GetString(8); + var sqlMajorVersion = reader.IsDBNull(9) ? (int?)null : reader.GetInt32(9); + var activeAlertCount = reader.IsDBNull(10) ? 0 : Convert.ToInt32(reader.GetInt64(10)); + var recentAlert = reader.IsDBNull(11) ? null : reader.GetString(11); + var latestSqlCpu = reader.IsDBNull(12) ? (int?)null : reader.GetInt32(12); + var topWaitType = reader.IsDBNull(13) ? 
null : reader.GetString(13); var (healthState, healthReason) = ComputeHealth(isEnabled, lastSeenTime, lastStatus, lastError, activeAlertCount, recentAlert); servers.Add(new ServerHealthDto( serverId, displayName, + purpose, isEnabled, lastSeenTime, lastStatus, @@ -599,6 +616,7 @@ CREATE TABLE IF NOT EXISTS servers ( server_id VARCHAR PRIMARY KEY, server_name VARCHAR NOT NULL, display_name VARCHAR, + purpose VARCHAR NOT NULL DEFAULT 'Unassigned', is_enabled BOOLEAN NOT NULL DEFAULT TRUE, last_seen_time TIMESTAMP, last_status VARCHAR NOT NULL DEFAULT 'UNKNOWN', @@ -610,6 +628,8 @@ CREATE TABLE IF NOT EXISTS servers ( created_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP ) """, + "ALTER TABLE servers ADD COLUMN IF NOT EXISTS purpose VARCHAR DEFAULT 'Unassigned'", + "UPDATE servers SET purpose = 'Unassigned' WHERE purpose IS NULL OR TRIM(purpose) = ''", """ CREATE TABLE IF NOT EXISTS collection_log ( log_id BIGINT PRIMARY KEY, diff --git a/Headless/appsettings.example.json b/Headless/appsettings.example.json index 9d3ee73..5b8748a 100644 --- a/Headless/appsettings.example.json +++ b/Headless/appsettings.example.json @@ -28,7 +28,8 @@ "Servers": [ { "Id": "sample-dev", - "DisplayName": "Sample Dev SQL", + "DisplayName": "EU4", + "Purpose": "Development", "ConnectionStringEnvironmentVariable": "PM_SAMPLE_DEV_CONNECTION", "Enabled": false } diff --git a/Headless/wwwroot/app.js b/Headless/wwwroot/app.js index 0131a1f..def6092 100644 --- a/Headless/wwwroot/app.js +++ b/Headless/wwwroot/app.js @@ -1,6 +1,7 @@ const state = { selectedServerId: null, activeTab: "stats", + purposeFilter: "all", servers: [], logs: [], alerts: [], @@ -17,7 +18,7 @@ const els = { disabled: document.getElementById("metric-disabled"), generatedAt: document.getElementById("generated-at"), alertCount: document.getElementById("alert-count"), - storagePaths: document.getElementById("storage-paths"), + purposeFilter: document.getElementById("purpose-filter"), serverCardGrid: 
document.getElementById("server-card-grid"), alertList: document.getElementById("alert-list"), collectorLog: document.getElementById("collector-log"), @@ -34,6 +35,10 @@ const els = { els.refresh.addEventListener("click", () => loadAll()); els.back.addEventListener("click", () => navigateOverview()); +els.purposeFilter.addEventListener("change", () => { + state.purposeFilter = els.purposeFilter.value; + renderServerCards(); +}); document.querySelectorAll(".server-menu button").forEach(button => { button.addEventListener("click", () => { if (!state.selectedServerId) return; @@ -66,9 +71,8 @@ async function fetchJson(url) { } async function loadAll() { - const [summary, storage, logs] = await Promise.all([ + const [summary, logs] = await Promise.all([ fetchJson("/api/summary"), - fetchJson("/api/storage"), fetchJson("/api/collection-log?limit=50") ]); @@ -82,8 +86,8 @@ async function loadAll() { } renderSummary(summary); - renderStorage(storage); - renderServerCards(state.servers); + renderPurposeOptions(state.servers); + renderServerCards(); renderAlerts(state.alerts); handleHealthNotifications(state.servers); state.loadedOnce = true; @@ -99,49 +103,159 @@ function renderSummary(summary) { els.generatedAt.textContent = `Updated ${formatDate(summary.generatedAt)}`; } -function renderStorage(storage) { - els.storagePaths.textContent = `DuckDB ${storage.duckdb} | Parquet ${storage.parquet}`; +function renderPurposeOptions(servers) { + const purposes = [...new Set(servers.map(server => normalizePurpose(server.purpose)))].sort(sortPurposes); + const options = ["all", ...purposes]; + const current = options.includes(state.purposeFilter) ? 
state.purposeFilter : "all"; + state.purposeFilter = current; + els.purposeFilter.innerHTML = options + .map(value => ``) + .join(""); + els.purposeFilter.value = current; } -function renderServerCards(servers) { +function renderServerCards() { + const visibleServers = state.servers.filter(server => { + return state.purposeFilter === "all" || normalizePurpose(server.purpose) === state.purposeFilter; + }); + els.serverCardGrid.innerHTML = ""; - if (!servers.length) { - els.serverCardGrid.innerHTML = `
No servers configured yet.
`; + if (!visibleServers.length) { + els.serverCardGrid.innerHTML = `
No servers.
`; return; } - for (const server of servers) { - const card = document.createElement("button"); - card.type = "button"; - card.className = `server-card health-${server.healthState || "yellow"} ${server.serverId === state.selectedServerId ? "selected" : ""}`; - card.addEventListener("click", () => navigateServer(server.serverId, "stats")); - - const title = server.displayName || server.serverId; - const platform = server.edition ? "SQL Server" : "SQL Server"; - const os = server.productVersion ? `v${server.productVersion}` : "Windows"; - const health = server.healthState || "yellow"; - - card.innerHTML = ` -
- -
- ${escapeHtml(title)} - ${escapeHtml(platform)} / ${escapeHtml(os)} -
- -
-
- ${server.latestSqlCpuUtilization ?? "--"}${server.latestSqlCpuUtilization == null ? "" : "%"}CPU - ${escapeHtml(compactWait(server.topWaitType))}Wait - ${server.activeAlertCount ?? 0}Alerts - ${compactSeen(server.lastSeenTime)}Seen -
-
- - ${escapeHtml(server.healthReason || "All good")} + for (const group of groupServersByPurpose(visibleServers)) { + const section = document.createElement("section"); + section.className = "purpose-section"; + section.innerHTML = ` +
+

${escapeHtml(group.purpose)}

+ ${group.servers.length}
+
`; - els.serverCardGrid.appendChild(card); + + const row = section.querySelector(".server-card-row"); + for (const server of group.servers) { + row.appendChild(createServerCard(server)); + } + + els.serverCardGrid.appendChild(section); + } +} + +function createServerCard(server) { + const card = document.createElement("button"); + card.type = "button"; + card.className = `server-card health-${server.healthState || "yellow"} ${server.serverId === state.selectedServerId ? "selected" : ""}`; + card.addEventListener("click", () => navigateServer(server.serverId, "stats")); + + const title = server.displayName || server.serverId; + const platform = "SQL Server"; + const os = server.productVersion ? `v${server.productVersion}` : "Windows"; + const health = server.healthState || "yellow"; + + card.innerHTML = ` +
+ +
+ ${escapeHtml(title)} + ${escapeHtml(platform)} / ${escapeHtml(os)} +
+ +
+
+ ${server.latestSqlCpuUtilization ?? "--"}${server.latestSqlCpuUtilization == null ? "" : "%"}CPU + ${escapeHtml(compactWait(server.topWaitType))}Wait + ${server.activeAlertCount ?? 0}Alerts + ${compactSeen(server.lastSeenTime)}Seen +
+
+ + ${escapeHtml(server.healthReason || "All good")} +
+ `; + + return card; +} + +function groupServersByPurpose(servers) { + const groups = new Map(); + for (const server of servers) { + const purpose = normalizePurpose(server.purpose); + if (!groups.has(purpose)) { + groups.set(purpose, []); + } + + groups.get(purpose).push(server); + } + + return [...groups.entries()] + .sort(([left], [right]) => sortPurposes(left, right)) + .map(([purpose, groupedServers]) => ({ + purpose, + servers: groupedServers.sort((left, right) => { + const leftHealth = healthRank(left.healthState); + const rightHealth = healthRank(right.healthState); + if (leftHealth !== rightHealth) return leftHealth - rightHealth; + return (left.displayName || left.serverId).localeCompare(right.displayName || right.serverId); + }) + })); +} + +function normalizePurpose(value) { + const purpose = String(value || "").trim(); + if (!purpose) return "Unassigned"; + + switch (purpose.toLowerCase()) { + case "prod": + return "Production"; + case "stage": + return "Staging"; + case "dev": + return "Development"; + default: + return purpose; + } +} + +function sortPurposes(left, right) { + const rank = purpose => { + switch (purpose.toLowerCase()) { + case "production": + return 1; + case "staging": + return 2; + case "development": + return 3; + case "test": + return 4; + case "unassigned": + return 99; + default: + return 20; + } + }; + + const leftRank = rank(left); + const rightRank = rank(right); + if (leftRank !== rightRank) return leftRank - rightRank; + return left.localeCompare(right); +} + +function healthRank(health) { + switch ((health || "").toLowerCase()) { + case "red": + return 1; + case "yellow": + return 2; + case "green": + return 3; + case "disabled": + return 4; + default: + return 5; } } @@ -188,8 +302,7 @@ function renderAlerts(alerts) { if (!alerts.length) { els.alertList.innerHTML = `
- No active alerts - Green panels stay quiet here. + No alerts
`; return; @@ -250,7 +363,7 @@ function applyRoute() { state.activeTab = "overview"; els.overviewView.classList.remove("hidden"); els.serverView.classList.add("hidden"); - renderServerCards(state.servers); + renderServerCards(); } function navigateOverview() { @@ -283,7 +396,7 @@ async function loadSelectedServer() { } els.selectedTitle.textContent = server.displayName || server.serverId; - els.selectedSubtitle.textContent = `${(server.healthState || "yellow").toUpperCase()} / ${server.healthReason || server.serverId}`; + els.selectedSubtitle.textContent = `${normalizePurpose(server.purpose)} / ${(server.healthState || "yellow").toUpperCase()} / ${server.healthReason || ""}`; renderServerStats(server); renderServerLog(server.serverId); @@ -310,7 +423,7 @@ function renderServerStats(server) {
Top Wait${escapeHtml(compactWait(server.topWaitType))}Latest snapshot
Alerts${server.activeAlertCount ?? 0}Last 15 minutes
Edition${escapeHtml(server.edition || "Unknown")}${escapeHtml(server.productVersion || "No version collected")}
-
Last Contact${formatDate(server.lastSeenTime) || "Never"}${escapeHtml(server.serverId)}
+
Last Contact${formatDate(server.lastSeenTime) || "Never"}${escapeHtml(normalizePurpose(server.purpose))}
`; } @@ -319,7 +432,7 @@ function renderServerLog(serverId) { els.collectorLog.innerHTML = ""; if (!rows.length) { - els.collectorLog.innerHTML = `
No collector log entries for this server yet.
`; + els.collectorLog.innerHTML = `
No log entries.
`; return; } @@ -341,7 +454,7 @@ function renderServerLog(serverId) { function renderWaits(waits) { els.waitList.innerHTML = ""; if (!waits.length) { - els.waitList.innerHTML = `
No wait deltas yet.
`; + els.waitList.innerHTML = `
No waits.
`; return; } @@ -383,7 +496,7 @@ function drawCpuChart(samples) { if (!samples.length) { ctx.fillStyle = cssVar("--muted", "#9ca8b8"); ctx.font = "13px Segoe UI, sans-serif"; - ctx.fillText("No CPU samples yet", 12, 28); + ctx.fillText("No CPU samples", 12, 28); return; } diff --git a/Headless/wwwroot/index.html b/Headless/wwwroot/index.html index 7280d7e..d02a457 100644 --- a/Headless/wwwroot/index.html +++ b/Headless/wwwroot/index.html @@ -4,15 +4,14 @@ - Performance Monitor Estate - + Dashboard +
-

Performance Monitor Estate

-

+

Dashboard

@@ -23,7 +22,7 @@

Performance Monitor Estate

-
+
Green 0 @@ -44,8 +43,11 @@

Performance Monitor Estate

-

Estate Traffic Lights

- +

Dashboard

+
+ + +
@@ -62,7 +64,7 @@

Alerts

-

Query Workload

- Coming from query collectors +

Queries

+ Not available
-
Query detail will appear here when the query stats and query store collectors are ported.
+
Not available yet.
@@ -110,9 +112,9 @@

CPU

Queries

- Not collected yet + Not available
-
This submenu is reserved for query stats, query store regressions, plans, and expensive query drilldowns.
+
Not available yet.
@@ -130,6 +132,6 @@

Collector Log

- + diff --git a/Headless/wwwroot/styles.css b/Headless/wwwroot/styles.css index 26bfe22..352df9c 100644 --- a/Headless/wwwroot/styles.css +++ b/Headless/wwwroot/styles.css @@ -40,7 +40,7 @@ body { padding: 8px 0 18px; } -h1, h2, p { +h1, h2, h3, p { margin: 0; } @@ -58,7 +58,6 @@ h2 { letter-spacing: 0; } -#storage-paths, .panel-header span, td, .log-row, @@ -67,12 +66,8 @@ td, font-size: 13px; } -#storage-paths { - color: var(--muted); - margin-top: 7px; -} - -button { +button, +select { border: 1px solid var(--border); background: var(--panel-3); color: var(--text); @@ -85,13 +80,19 @@ button { box-shadow: 0 2px 10px rgba(0, 0, 0, .24); } +select { + min-height: 34px; + padding: 0 30px 0 10px; +} + .header-actions { display: flex; gap: 8px; align-items: center; } -button:hover { +button:hover, +select:hover { border-color: #465364; background: #242b35; } @@ -176,17 +177,53 @@ button:hover { color: var(--muted); } +.panel-actions { + display: flex; + align-items: center; + gap: 10px; + min-width: 0; +} + .overview-panel { margin-bottom: 14px; } .server-card-grid { display: grid; - grid-template-columns: repeat(auto-fill, minmax(250px, 1fr)); gap: 14px; padding: 14px; } +.purpose-section { + display: grid; + gap: 10px; +} + +.purpose-heading { + display: flex; + align-items: center; + justify-content: space-between; + gap: 12px; + color: var(--muted); +} + +.purpose-heading h3 { + font-size: 12px; + line-height: 1.2; + font-weight: 800; + color: #d9e2ee; +} + +.purpose-heading span { + font-size: 12px; +} + +.server-card-row { + display: grid; + grid-template-columns: repeat(auto-fill, minmax(250px, 1fr)); + gap: 14px; +} + .server-card { appearance: none; text-align: left; diff --git a/docs/headless-monitor.md b/docs/headless-monitor.md index 1a86ab5..279a3ce 100644 --- a/docs/headless-monitor.md +++ b/docs/headless-monitor.md @@ -60,6 +60,7 @@ Recommended pattern: keep secrets out of JSON and point each server at an enviro { "Id": "dev-sql-01", "DisplayName": 
"DEV-SQL-01", + "Purpose": "Development", "ConnectionStringEnvironmentVariable": "PM_DEV_SQL_01", "Enabled": true } @@ -80,7 +81,7 @@ SQL auth example: $env:PM_DEV_SQL_02 = "Server=DEV-SQL-02;Database=master;User ID=pm_reader;Password=;Encrypt=Mandatory;TrustServerCertificate=true" ``` -For dozens of servers, use stable `Id` values. Those ids become the partition key in DuckDB and API URLs. +For dozens of servers, use stable `Id` values. Those ids become the partition key in DuckDB and API URLs. Set `Purpose` to values such as `Development`, `Staging`, or `Production` so the dashboard can group and filter the estate. ## Run Locally From 0a02a3f5d706f21ad292f1136b1181633ff7722a Mon Sep 17 00:00:00 2001 From: Chris Baker <67105654+zacnaloen@users.noreply.github.com> Date: Wed, 13 May 2026 10:49:44 +0100 Subject: [PATCH 6/6] Move severity counts into active alerts rail --- Headless/Models/TelemetryModels.cs | 12 ++- Headless/Program.cs | 3 + Headless/Storage/HeadlessStore.cs | 121 +++++++++++++++++++++++++---- Headless/wwwroot/app.js | 53 ++++++------- Headless/wwwroot/index.html | 23 +----- Headless/wwwroot/styles.css | 54 ++----------- docs/headless-monitor.md | 3 +- 7 files changed, 154 insertions(+), 115 deletions(-) diff --git a/Headless/Models/TelemetryModels.cs b/Headless/Models/TelemetryModels.cs index e9b2f0c..4294363 100644 --- a/Headless/Models/TelemetryModels.cs +++ b/Headless/Models/TelemetryModels.cs @@ -27,6 +27,15 @@ public sealed record CollectionLogDto( int DurationMs, string? 
ErrorMessage); +public sealed record ActiveAlertDto( + DateTime RaisedAt, + string ServerId, + string ServerName, + string Source, + string Severity, + string Message, + string TargetTab); + public sealed record TopWaitDto( string WaitType, long WaitTimeDeltaMs, @@ -46,7 +55,8 @@ public sealed record EstateSummaryDto( int ErrorCount, int DisabledCount, DateTime GeneratedAt, - IReadOnlyList Servers); + IReadOnlyList Servers, + IReadOnlyList ActiveAlerts); public sealed record ServerPropertiesSnapshot( string MachineName, diff --git a/Headless/Program.cs b/Headless/Program.cs index b083641..688c55f 100644 --- a/Headless/Program.cs +++ b/Headless/Program.cs @@ -28,6 +28,9 @@ app.MapGet("/api/servers", async (HeadlessStore store, CancellationToken cancellationToken) => Results.Ok(await store.GetServersAsync(cancellationToken))); +app.MapGet("/api/alerts", async (HeadlessStore store, CancellationToken cancellationToken) + => Results.Ok(await store.GetActiveAlertsAsync(cancellationToken))); + app.MapGet("/api/collection-log", async (HeadlessStore store, int? limit, CancellationToken cancellationToken) => Results.Ok(await store.GetCollectionLogAsync(limit ?? 
200, cancellationToken))); diff --git a/Headless/Storage/HeadlessStore.cs b/Headless/Storage/HeadlessStore.cs index ed61001..526956b 100644 --- a/Headless/Storage/HeadlessStore.cs +++ b/Headless/Storage/HeadlessStore.cs @@ -307,20 +307,50 @@ public async Task> GetServersAsync(CancellationTo s.sql_major_version, ( SELECT COUNT(*) - FROM collection_log AS cl - WHERE cl.server_id = s.server_id - AND cl.collection_time >= $1 - AND cl.status IN ('ERROR', 'PERMISSIONS') + FROM + ( + SELECT + cl.status, + ROW_NUMBER() OVER (PARTITION BY cl.collector_name ORDER BY cl.collection_time DESC, cl.log_id DESC) AS rn + FROM collection_log AS cl + WHERE cl.server_id = s.server_id + ) AS latest + WHERE latest.rn = 1 + AND latest.status IN ('ERROR', 'PERMISSIONS') ) AS active_alert_count, ( - SELECT cl.error_message - FROM collection_log AS cl - WHERE cl.server_id = s.server_id - AND cl.collection_time >= $1 - AND cl.status IN ('ERROR', 'PERMISSIONS') - ORDER BY cl.collection_time DESC + SELECT COALESCE(NULLIF(latest.error_message, ''), latest.status) + FROM + ( + SELECT + cl.error_message, + cl.status, + cl.collection_time, + ROW_NUMBER() OVER (PARTITION BY cl.collector_name ORDER BY cl.collection_time DESC, cl.log_id DESC) AS rn + FROM collection_log AS cl + WHERE cl.server_id = s.server_id + ) AS latest + WHERE latest.rn = 1 + AND latest.status IN ('ERROR', 'PERMISSIONS') + ORDER BY CASE WHEN latest.status = 'ERROR' THEN 1 ELSE 2 END, latest.collection_time DESC LIMIT 1 ) AS recent_alert, + ( + SELECT CASE WHEN latest.status = 'ERROR' THEN 'red' ELSE 'yellow' END + FROM + ( + SELECT + cl.status, + cl.collection_time, + ROW_NUMBER() OVER (PARTITION BY cl.collector_name ORDER BY cl.collection_time DESC, cl.log_id DESC) AS rn + FROM collection_log AS cl + WHERE cl.server_id = s.server_id + ) AS latest + WHERE latest.rn = 1 + AND latest.status IN ('ERROR', 'PERMISSIONS') + ORDER BY CASE WHEN latest.status = 'ERROR' THEN 1 ELSE 2 END, latest.collection_time DESC + LIMIT 1 + ) AS 
active_alert_severity,
     (
         SELECT cu.sqlserver_cpu_utilization
         FROM cpu_utilization_stats AS cu
@@ -349,7 +379,6 @@ CASE LOWER(COALESCE(NULLIF(TRIM(s.purpose), ''), 'unassigned'))
         ELSE 5
     END,
     s.display_name";
-        command.Parameters.Add(new DuckDBParameter { Value = DateTime.UtcNow.AddMinutes(-15) });
         await using var reader = await command.ExecuteReaderAsync(cancellationToken);
         while (await reader.ReadAsync(cancellationToken))
         {
@@ -365,9 +394,10 @@ ELSE 5
             var sqlMajorVersion = reader.IsDBNull(9) ? (int?)null : reader.GetInt32(9);
             var activeAlertCount = reader.IsDBNull(10) ? 0 : Convert.ToInt32(reader.GetInt64(10));
             var recentAlert = reader.IsDBNull(11) ? null : reader.GetString(11);
-            var latestSqlCpu = reader.IsDBNull(12) ? (int?)null : reader.GetInt32(12);
-            var topWaitType = reader.IsDBNull(13) ? null : reader.GetString(13);
-            var (healthState, healthReason) = ComputeHealth(isEnabled, lastSeenTime, lastStatus, lastError, activeAlertCount, recentAlert);
+            var activeAlertSeverity = reader.IsDBNull(12) ? null : reader.GetString(12);
+            var latestSqlCpu = reader.IsDBNull(13) ? (int?)null : reader.GetInt32(13);
+            var topWaitType = reader.IsDBNull(14) ? null : reader.GetString(14);
+            var (healthState, healthReason) = ComputeHealth(isEnabled, lastSeenTime, lastStatus, lastError, activeAlertCount, recentAlert, activeAlertSeverity);
 
             servers.Add(new ServerHealthDto(
                 serverId,
@@ -393,6 +423,7 @@ ELSE 5
     public async Task GetEstateSummaryAsync(CancellationToken cancellationToken)
     {
         var servers = await GetServersAsync(cancellationToken);
+        var activeAlerts = await GetActiveAlertsAsync(cancellationToken);
         return new EstateSummaryDto(
             servers.Count,
             servers.Count(s => string.Equals(s.HealthState, "green", StringComparison.OrdinalIgnoreCase)),
@@ -401,7 +432,8 @@ public async Task GetEstateSummaryAsync(CancellationToken canc
             servers.Count(s => s.IsEnabled && string.Equals(s.LastStatus, "ERROR", StringComparison.OrdinalIgnoreCase)),
             servers.Count(s => !s.IsEnabled),
             DateTime.UtcNow,
-            servers);
+            servers,
+            activeAlerts);
     }
 
     private (string HealthState, string HealthReason) ComputeHealth(
@@ -410,7 +442,8 @@ public async Task GetEstateSummaryAsync(CancellationToken canc
         string lastStatus,
         string? lastError,
         int activeAlertCount,
-        string? recentAlert)
+        string? recentAlert,
+        string? activeAlertSeverity)
     {
         if (!isEnabled)
         {
@@ -424,7 +457,8 @@ public async Task GetEstateSummaryAsync(CancellationToken canc
 
         if (activeAlertCount > 0)
         {
-            return ("red", recentAlert ?? $"{activeAlertCount} collector alert(s) in the last 15 minutes");
+            var severity = string.Equals(activeAlertSeverity, "yellow", StringComparison.OrdinalIgnoreCase) ? "yellow" : "red";
+            return (severity, recentAlert ?? $"{activeAlertCount} active collector alert(s)");
         }
 
         if (!lastSeenTime.HasValue)
@@ -441,6 +475,58 @@ public async Task GetEstateSummaryAsync(CancellationToken canc
         return ("green", "All good");
     }
 
+    public async Task> GetActiveAlertsAsync(CancellationToken cancellationToken)
+    {
+        var alerts = new List();
+        await using var connection = CreateConnection();
+        await connection.OpenAsync(cancellationToken);
+        await using var command = connection.CreateCommand();
+        command.CommandText = @"
+WITH latest_collector AS
+(
+    SELECT
+        cl.collection_time,
+        cl.server_id,
+        cl.server_name,
+        cl.collector_name,
+        cl.status,
+        cl.error_message,
+        ROW_NUMBER() OVER (PARTITION BY cl.server_id, cl.collector_name ORDER BY cl.collection_time DESC, cl.log_id DESC) AS rn
+    FROM collection_log AS cl
+)
+SELECT
+    lc.collection_time,
+    lc.server_id,
+    COALESCE(NULLIF(s.display_name, ''), lc.server_name) AS server_name,
+    lc.collector_name,
+    CASE WHEN lc.status = 'ERROR' THEN 'red' ELSE 'yellow' END AS severity,
+    COALESCE(NULLIF(lc.error_message, ''), lc.status) AS message,
+    CASE WHEN lc.status = 'PERMISSIONS' THEN 'stats' ELSE 'logs' END AS target_tab
+FROM latest_collector AS lc
+LEFT JOIN servers AS s
+  ON s.server_id = lc.server_id
+WHERE lc.rn = 1
+AND lc.status IN ('ERROR', 'PERMISSIONS')
+AND COALESCE(s.is_enabled, TRUE) = TRUE
+ORDER BY
+    CASE WHEN lc.status = 'ERROR' THEN 1 ELSE 2 END,
+    lc.collection_time DESC";
+        await using var reader = await command.ExecuteReaderAsync(cancellationToken);
+        while (await reader.ReadAsync(cancellationToken))
+        {
+            alerts.Add(new ActiveAlertDto(
+                reader.GetDateTime(0),
+                reader.GetString(1),
+                reader.GetString(2),
+                reader.GetString(3),
+                reader.GetString(4),
+                reader.GetString(5),
+                reader.GetString(6)));
+        }
+
+        return alerts;
+    }
+
     public async Task> GetCollectionLogAsync(int limit, CancellationToken cancellationToken)
     {
         var logs = new List();
@@ -688,6 +774,7 @@ other_process_cpu_utilization INTEGER
         """,
         "CREATE INDEX IF NOT EXISTS idx_servers_status ON servers(is_enabled, last_status)",
         "CREATE INDEX IF NOT EXISTS idx_collection_log_time ON collection_log(collection_time)",
+        "CREATE INDEX IF NOT EXISTS idx_collection_log_server_collector_time ON collection_log(server_id, collector_name, collection_time)",
         "CREATE INDEX IF NOT EXISTS idx_wait_stats_time ON wait_stats(server_id, collection_time)",
         "CREATE INDEX IF NOT EXISTS idx_cpu_time ON cpu_utilization_stats(server_id, sample_time)",
         "CREATE INDEX IF NOT EXISTS idx_server_properties_time ON server_properties(server_id, collection_time)"
diff --git a/Headless/wwwroot/app.js b/Headless/wwwroot/app.js
index def6092..31db352 100644
--- a/Headless/wwwroot/app.js
+++ b/Headless/wwwroot/app.js
@@ -12,10 +12,6 @@ const state = {
 const els = {
   overviewView: document.getElementById("overview-view"),
   serverView: document.getElementById("server-view"),
-  green: document.getElementById("metric-green"),
-  yellow: document.getElementById("metric-yellow"),
-  red: document.getElementById("metric-red"),
-  disabled: document.getElementById("metric-disabled"),
   generatedAt: document.getElementById("generated-at"),
   alertCount: document.getElementById("alert-count"),
   purposeFilter: document.getElementById("purpose-filter"),
@@ -78,7 +74,7 @@ async function loadAll() {
 
   state.servers = summary.servers || [];
   state.logs = logs || [];
-  state.alerts = buildAlerts(state.servers, state.logs);
+  state.alerts = buildAlerts(state.servers, summary.activeAlerts || []);
 
   if (!state.selectedServerId && state.servers.length > 0) {
     const firstActive = state.servers.find(s => s.isEnabled) || state.servers[0];
@@ -96,10 +92,6 @@ async function loadAll() {
 }
 
 function renderSummary(summary) {
-  els.green.textContent = summary.greenCount;
-  els.yellow.textContent = summary.yellowCount;
-  els.red.textContent = summary.redCount;
-  els.disabled.textContent = summary.disabledCount;
   els.generatedAt.textContent = `Updated ${formatDate(summary.generatedAt)}`;
 }
 
@@ -259,17 +251,33 @@ function healthRank(health) {
   }
 }
 
-function buildAlerts(servers, logs) {
+function buildAlerts(servers, activeAlerts) {
   const alerts = [];
 
+  for (const alert of activeAlerts) {
+    alerts.push({
+      serverId: alert.serverId,
+      serverName: alert.serverName,
+      state: alert.severity || "red",
+      title: `${alert.serverName} / ${alert.source}`,
+      body: alert.message || "Needs attention",
+      targetTab: alert.targetTab || "logs",
+      time: alert.raisedAt
+    });
+  }
+
   for (const server of servers) {
     const health = server.healthState || "yellow";
-    if (health === "red" || health === "yellow") {
+    const status = (server.lastStatus || "").toUpperCase();
+    const hasCollectorAlerts = (server.activeAlertCount || 0) > 0;
+    const isServerLevelAlert = server.isEnabled && (health === "red" || health === "yellow") && (!hasCollectorAlerts || status === "ERROR");
+
+    if (isServerLevelAlert) {
       alerts.push({
         serverId: server.serverId,
         serverName: server.displayName || server.serverId,
         state: health,
-        title: `${server.displayName || server.serverId} is ${health.toUpperCase()}`,
+        title: `${server.displayName || server.serverId} / Server`,
         body: server.healthReason || "Server needs attention",
         targetTab: "stats",
         time: server.lastSeenTime || null
@@ -277,22 +285,11 @@ function buildAlerts(servers, logs) {
     }
   }
 
-  for (const log of logs) {
-    const status = (log.status || "").toUpperCase();
-    if (status === "ERROR" || status === "PERMISSIONS") {
-      alerts.push({
-        serverId: log.serverId,
-        serverName: log.serverName,
-        state: status === "PERMISSIONS" ? "yellow" : "red",
-        title: `${log.serverName} / ${log.collectorName}`,
-        body: log.errorMessage || status,
-        targetTab: "logs",
-        time: log.collectionTime
-      });
-    }
-  }
-
-  return alerts.slice(0, 30);
+  return alerts.sort((left, right) => {
+    const severityDelta = healthRank(left.state) - healthRank(right.state);
+    if (severityDelta !== 0) return severityDelta;
+    return new Date(right.time || 0).getTime() - new Date(left.time || 0).getTime();
+  }).slice(0, 30);
 }
 
 function renderAlerts(alerts) {
diff --git a/Headless/wwwroot/index.html b/Headless/wwwroot/index.html
index d02a457..b061ef9 100644
--- a/Headless/wwwroot/index.html
+++ b/Headless/wwwroot/index.html
@@ -5,7 +5,7 @@
 Dashboard
-
+
@@ -22,25 +22,6 @@

Dashboard

-
-
- Green - 0 -
-
- Yellow - 0 -
-
- Red - 0 -
-
- Disabled - 0 -
-
-

Dashboard

@@ -132,6 +113,6 @@

Collector Log

-
+
diff --git a/Headless/wwwroot/styles.css b/Headless/wwwroot/styles.css
index 352df9c..a3c0fb2 100644
--- a/Headless/wwwroot/styles.css
+++ b/Headless/wwwroot/styles.css
@@ -61,8 +61,7 @@ h2 {
 .panel-header span,
 td,
 .log-row,
-.wait-row,
-.metric span {
+.wait-row {
   font-size: 13px;
 }
 
@@ -97,13 +96,6 @@ select:hover {
   background: #242b35;
 }
 
-.status-grid {
-  display: grid;
-  grid-template-columns: repeat(4, minmax(0, 1fr));
-  gap: 12px;
-  margin-bottom: 14px;
-}
-
 .hidden {
   display: none !important;
 }
@@ -119,42 +111,6 @@ select:hover {
   min-width: 0;
 }
 
-.metric {
-  background: var(--panel);
-  border: 1px solid var(--border);
-  border-radius: 8px;
-  padding: 14px;
-  box-shadow: var(--shadow);
-}
-
-.metric span {
-  color: var(--muted);
-  display: block;
-  margin-bottom: 8px;
-}
-
-.metric strong {
-  display: block;
-  font-size: 32px;
-  line-height: 1;
-}
-
-.metric.green strong {
-  color: var(--online);
-}
-
-.metric.yellow strong {
-  color: var(--yellow);
-}
-
-.metric.red strong {
-  color: var(--warning);
-}
-
-.metric.muted strong {
-  color: var(--muted);
-}
-
 .panel {
   background: var(--panel);
   border: 1px solid var(--border);
@@ -418,18 +374,23 @@ select:hover {
   gap: 5px;
   text-align: left;
   border-radius: 6px;
-  padding: 10px;
+  border-left-width: 5px;
+  padding: 10px 10px 10px 13px;
   box-shadow: none;
+  position: relative;
+  overflow: hidden;
 }
 
 .alert-item.red {
   border-color: #8e3440;
   background: #251418;
+  border-left-color: var(--offline);
 }
 
 .alert-item.yellow {
   border-color: #876922;
   background: #211d10;
+  border-left-color: var(--yellow);
 }
 
 .alert-item strong,
@@ -755,7 +716,6 @@ canvas {
     flex: 1;
   }
 
-  .status-grid,
   .detail-grid,
   .overview-layout,
   .stats-grid {
diff --git a/docs/headless-monitor.md b/docs/headless-monitor.md
index 279a3ce..8e76ba0 100644
--- a/docs/headless-monitor.md
+++ b/docs/headless-monitor.md
@@ -108,6 +108,7 @@ http://localhost:5155
 ```text
 GET /api/summary
 GET /api/servers
+GET /api/alerts
 GET /api/storage
 GET /api/collection-log?limit=200
 GET /api/servers/{serverId}/waits?hours=1&limit=20
@@ -125,7 +126,7 @@ The overview cards are intended to work like an estate traffic-light board:
 
 The browser page raises an in-page toast when a server enters red or yellow. If browser notifications are enabled with the button in the header, the same state change also raises a native browser notification.
 
-For the current thin slice, "alert-worthy" means connection failures or collector statuses of `ERROR` or `PERMISSIONS` in the last 15 minutes. As more collectors are ported, SQL performance alerts should feed the same red/yellow state so the panel color changes whenever something needs checking.
+For the current thin slice, "alert-worthy" means connection failures or collector statuses where the latest run for that server/collector is `ERROR` or `PERMISSIONS`. A later successful collector run clears that alert automatically, so the server panel colour returns to the next-worst current state instead of holding onto stale failures. As more collectors are ported, SQL performance alerts should feed the same red/yellow state so the panel color changes whenever something needs checking.
 
 ## Storage