fix: change azure metric names and give better alert descriptions #180

Merged
timtalbot merged 1 commit into main from improve-azure-alerts on Mar 12, 2026

@timtalbot
Contributor

Description

Improves the Azure Monitor Grafana alert definitions by correcting metric names and standardizing the alert description format.

Code Flow

Updated several Azure alert files to use metric names that include the aggregation and unit suffix:

Azure Storage:

  • availability → availability_average_percent
  • successe2elatency → successe2elatency_average_milliseconds

Azure NetApp Files:

  • volumeconsumedsizepercentage → volumeconsumedsizepercentage_average_percent
  • averagereadlatency → averagereadlatency_average_milliseconds
  • averagewritelatency → averagewritelatency_average_milliseconds

Azure PostgreSQL:

  • cpu_percent → cpu_percent_average_percent
  • storage_percent → storage_percent_average_percent
  • memory_percent → memory_percent_average_percent
  • active_connections → active_connections_average_count
  • connections_failed → connections_failed_total_count
  • Removed the azure_postgres_deadlocks alert because its metric name is not valid for Azure Monitor
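
As a rough illustration of the renames listed above, the sketch below applies the old-to-new mapping to an alert expression string. The mapping itself is copied from this PR. The helper name `rename_metrics` and the sample expression are hypothetical, and the real alert files may use a different format.

```python
import re

# Old -> new metric names, copied from the lists in this PR description.
METRIC_RENAMES = {
    "availability": "availability_average_percent",
    "successe2elatency": "successe2elatency_average_milliseconds",
    "volumeconsumedsizepercentage": "volumeconsumedsizepercentage_average_percent",
    "averagereadlatency": "averagereadlatency_average_milliseconds",
    "averagewritelatency": "averagewritelatency_average_milliseconds",
    "cpu_percent": "cpu_percent_average_percent",
    "storage_percent": "storage_percent_average_percent",
    "memory_percent": "memory_percent_average_percent",
    "active_connections": "active_connections_average_count",
    "connections_failed": "connections_failed_total_count",
}

# Match an old name only as a whole identifier, so names that already carry
# the new suffix (e.g. cpu_percent_average_percent) are left untouched.
_OLD_NAME = re.compile(
    r"(?<![A-Za-z0-9_])("
    + "|".join(re.escape(name) for name in sorted(METRIC_RENAMES, key=len, reverse=True))
    + r")(?![A-Za-z0-9_])"
)


def rename_metrics(expr: str) -> str:
    """Hypothetical helper: replace old metric names in an expression string."""
    return _OLD_NAME.sub(lambda m: METRIC_RENAMES[m.group(1)], expr)


if __name__ == "__main__":
    # The expression syntax here is an assumption, used only for illustration.
    print(rename_metrics("avg(cpu_percent) > 90"))
```

Matching on whole identifiers keeps the helper idempotent: running it again over already-renamed expressions should leave them unchanged.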

Alert Description Standardization

Reformatted all alert descriptions from a single line to a structured multi-line format:

  • Added emoji severity indicators (🔴 CRITICAL, 🟡 WARNING)
  • Organized into consistent sections: WHERE (tenant, cluster, resource, location) and DETAILS (metric, thresholds, duration)
  • Improved readability for on-call responders
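
The structure described above could be sketched as follows. This is only an illustration under stated assumptions: the PR does not show the actual description text, so the exact layout, the field labels, the function name `format_description`, and the sample values are all hypothetical. Only the emoji severity labels and the WHERE and DETAILS section names come from the description.

```python
# Severity labels as named in this PR description.
SEVERITY_LABELS = {"critical": "🔴 CRITICAL", "warning": "🟡 WARNING"}


def format_description(severity: str, where: dict, details: dict) -> str:
    """Hypothetical helper: build a multi-line alert description.

    `where` holds location fields (tenant, cluster, resource, location) and
    `details` holds metric fields (metric, thresholds, duration).
    """
    lines = [SEVERITY_LABELS[severity], "", "WHERE"]
    lines += [f"  {key}: {value}" for key, value in where.items()]
    lines += ["", "DETAILS"]
    lines += [f"  {key}: {value}" for key, value in details.items()]
    return "\n".join(lines)


if __name__ == "__main__":
    # All values below are placeholders, not taken from the real alerts.
    print(
        format_description(
            "critical",
            where={"tenant": "example-tenant", "resource": "example-db"},
            details={"metric": "cpu_percent_average_percent", "threshold": "> 90"},
        )
    )
```

Keeping the section order fixed means an on-call responder finds the affected resource and the threshold in the same place for every alert.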

Category of change

  • Bug fix (non-breaking change which fixes an issue)
  • Version upgrade (upgrading the version of a service or product)
  • New feature (non-breaking change which adds functionality)
  • Build: a code change that affects the build system or external dependencies
  • Performance: a code change that improves performance
  • Refactor: a code change that neither fixes a bug nor adds a feature
  • Documentation: documentation changes
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist

  • I have reviewed my own diff and added inline comments on lines I want reviewers to focus on or that I am uncertain about

@timtalbot timtalbot requested a review from a team as a code owner March 12, 2026 19:53
@timtalbot timtalbot requested a review from amdove March 12, 2026 19:53
@timtalbot timtalbot added this pull request to the merge queue Mar 12, 2026
Merged via the queue into main with commit b6cf6d9 Mar 12, 2026
4 checks passed
@timtalbot timtalbot deleted the improve-azure-alerts branch March 12, 2026 20:42

2 participants