[AKS] az aks update: Add --enable-high-log-scale-mode flag and fix monitoring addon key casing #33039

Open
carlotaarvela wants to merge 16 commits into Azure:dev from carlotaarvela:cnl-hlsm-update

Conversation

@carlotaarvela
Contributor

@carlotaarvela carlotaarvela commented Mar 24, 2026

Related command
az aks update, az aks enable-addons, az aks disable-addons

Description

This PR introduces the --enable-high-log-scale-mode parameter to az aks update and fixes a bug where the monitoring addon key returned by the API as omsAgent (camelCase) was not recognized, causing failures in addon enable/disable/update workflows.

Changes

1. New --enable-high-log-scale-mode flag (az aks update)

  • Adds a new three-state flag --enable-high-log-scale-mode that enables or disables High Log Scale Mode (HLSM) for Container Logs.
  • HLSM is auto-enabled when --enable-container-network-logs is specified.
  • When enabling HLSM standalone, validates that the monitoring addon is already enabled with MSI auth (useAADAuth=true).
  • When disabling HLSM, validates that container network logs (CNL) are not currently active on the cluster.
  • Triggers DCR (Data Collection Rule) update post-processing when HLSM or CNL flags change, so the correct data streams are configured.
  • Preserves existing enableRetinaNetworkFlags (CNL) config when re-enabling the monitoring addon.
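The validation rules in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `resolve_hlsm`, `ValidationError`, and all parameter names are hypothetical, and the handling of an explicit `false` combined with a CNL enable is an assumption, since the description does not cover that combination.

```python
from typing import Optional


class ValidationError(Exception):
    """Stand-in for the CLI's own error types (hypothetical)."""


def resolve_hlsm(
    enable_hlsm: Optional[bool],
    enable_cnl: Optional[bool],
    monitoring_enabled: bool,
    use_aad_auth: bool,
    cnl_active: bool,
) -> Optional[bool]:
    """Return the HLSM value to apply, or None to leave it unchanged."""
    if enable_hlsm is False:
        # Disabling HLSM is rejected while container network logs are active.
        if cnl_active:
            raise ValidationError(
                "Disable container network logs before disabling HLSM."
            )
        # Assumption: an explicit false wins even if CNL is being enabled.
        return False

    if enable_cnl:
        # Enabling container network logs turns HLSM on automatically.
        return True

    if enable_hlsm is True:
        # Standalone enable needs the monitoring addon with MSI auth.
        if not (monitoring_enabled and use_aad_auth):
            raise ValidationError(
                "HLSM requires the monitoring addon with useAADAuth=true."
            )
        return True

    # Flag omitted and CNL untouched: nothing to change.
    return None
```

Returning `None` for the "omitted" case keeps the three states distinct, so a caller could skip the DCR update entirely when nothing changed.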

2. Fix monitoring addon key casing (omsagent vs omsAgent)

  • The AKS API may return the monitoring addon key as either omsagent (lowercase) or omsAgent (camelCase). Previously, the CLI only checked for omsagent, causing KeyError or silent failures when the API returned omsAgent.
  • Adds a new constant CONST_MONITORING_ADDON_NAME_CAMELCASE and a helper function get_monitoring_addon_key() that checks both variants and returns whichever key is present in the addon profiles.
  • Updates all monitoring addon lookups in custom.py, managed_cluster_decorator.py, and addonconfiguration.py to use this helper.
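A helper of the kind described above might look like the sketch below. Only the constant `CONST_MONITORING_ADDON_NAME_CAMELCASE = "omsAgent"` and the helper name come from this PR; the signature, the lowercase constant name, and the fallback to the lowercase key when neither variant is present are assumptions made for illustration.

```python
from typing import Mapping, Optional

# The lowercase constant name is assumed; the camelCase one is added by the PR.
CONST_MONITORING_ADDON_NAME = "omsagent"
CONST_MONITORING_ADDON_NAME_CAMELCASE = "omsAgent"


def get_monitoring_addon_key(
    addon_profiles: Optional[Mapping[str, object]],
) -> str:
    """Return whichever monitoring addon key is present in addon_profiles."""
    if addon_profiles:
        for key in (
            CONST_MONITORING_ADDON_NAME,
            CONST_MONITORING_ADDON_NAME_CAMELCASE,
        ):
            if key in addon_profiles:
                return key
    # Neither variant found: fall back to the lowercase name (assumption).
    return CONST_MONITORING_ADDON_NAME


# Example lookup that works for either casing returned by the API.
profiles = {"omsAgent": {"enabled": True}}
monitoring_profile = profiles.get(get_monitoring_addon_key(profiles))
```

Resolving the key once and then indexing with it avoids the `KeyError` that a hard-coded `profiles["omsagent"]` lookup would raise on a camelCase response.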

3. Retry logic for Log Analytics workspace creation

  • Adds exponential backoff with jitter (up to 3 retries) for 409 Conflict / ResourceExistsError during Log Analytics workspace provisioning in ensure_default_log_analytics_workspace_for_monitoring.
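For readers unfamiliar with the pattern, exponential backoff with jitter can be sketched as below. This is a generic illustration and not the code in `ensure_default_log_analytics_workspace_for_monitoring`: `create_with_retry`, `ConflictError`, the base delay, and the jitter range are all hypothetical choices; only the limit of 3 retries comes from the description.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ConflictError(Exception):
    """Stand-in for a 409 Conflict / ResourceExistsError (hypothetical)."""


def create_with_retry(
    create: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call create(), retrying on ConflictError with backoff plus jitter."""
    for attempt in range(max_retries + 1):
        try:
            return create()
        except ConflictError:
            if attempt == max_retries:
                # Retries exhausted: surface the last conflict to the caller.
                raise
            # Delay doubles on each attempt; jitter spreads out concurrent callers.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so that unit tests can record the delays instead of actually waiting.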

Testing Guide

Enable HLSM on an existing cluster with monitoring addon:

az aks update -g myRG -n myCluster --enable-high-log-scale-mode

Enable HLSM together with container network logs:

az aks update -g myRG -n myCluster --enable-container-network-logs --enable-high-log-scale-mode

Disable HLSM (requires CNL to be disabled first):

az aks update -g myRG -n myCluster --enable-high-log-scale-mode false
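The commands above rely on the flag having three states: omitted, passed bare (true), and passed with an explicit value such as false. The sketch below reproduces that behaviour with plain `argparse` as an analogy only; it is not how azure-cli registers the parameter, and `parse_hlsm_flag` and `_str_to_bool` are hypothetical names.

```python
import argparse
from typing import List, Optional


def _str_to_bool(value: str) -> bool:
    """Convert a command-line string to a bool, rejecting anything else."""
    lowered = value.lower()
    if lowered in ("true", "t", "yes", "y", "1"):
        return True
    if lowered in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected true or false, got {value!r}")


def parse_hlsm_flag(argv: List[str]) -> Optional[bool]:
    """Return None if the flag is omitted, True if bare, else the parsed value."""
    parser = argparse.ArgumentParser(prog="aks-update-sketch")
    parser.add_argument(
        "--enable-high-log-scale-mode",
        nargs="?",      # the value after the flag is optional
        const=True,     # used when the flag is passed without a value
        default=None,   # used when the flag is not passed at all
        type=_str_to_bool,
    )
    return parser.parse_args(argv).enable_high_log_scale_mode
```

Keeping `None` separate from `False` lets an update command distinguish "leave the setting alone" from "turn it off".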

History Notes

[AKS] az aks update: Add --enable-high-log-scale-mode flag to enable/disable High Log Scale Mode for Container Logs
[AKS] Fix monitoring addon key lookup to handle both omsagent and omsAgent API response variants
[AKS] Add retry with exponential backoff for Log Analytics workspace creation conflicts


This checklist is used to make sure that common guidelines for a pull request are followed.

@azure-client-tools-bot-prd

azure-client-tools-bot-prd bot commented Mar 24, 2026

✔️ AzureCLI-FullTest
Module: profile (Python versions passed)
✔️ acr: latest (3.12, 3.13)
✔️ acs: latest (3.12, 3.13)
✔️ advisor: latest (3.12, 3.13)
✔️ ams: latest (3.12, 3.13)
✔️ apim: latest (3.12, 3.13)
✔️ appconfig: latest (3.12, 3.13)
✔️ appservice: latest (3.12, 3.13)
✔️ aro: latest (3.12, 3.13)
✔️ backup: latest (3.12, 3.13)
✔️ batch: latest (3.12, 3.13)
✔️ batchai: latest (3.12, 3.13)
✔️ billing: latest (3.12, 3.13)
✔️ botservice: latest (3.12, 3.13)
✔️ cdn: latest (3.12, 3.13)
✔️ cloud: latest (3.12, 3.13)
✔️ cognitiveservices: latest (3.12, 3.13)
✔️ compute_recommender: latest (3.12, 3.13)
✔️ computefleet: latest (3.12, 3.13)
✔️ config: latest (3.12, 3.13)
✔️ configure: latest (3.12, 3.13)
✔️ consumption: latest (3.12, 3.13)
✔️ container: latest (3.12, 3.13)
✔️ containerapp: latest (3.12, 3.13)
✔️ core: latest (3.12, 3.13)
✔️ cosmosdb: latest (3.12, 3.13)
✔️ databoxedge: latest (3.12, 3.13)
✔️ dls: latest (3.12, 3.13)
✔️ dms: latest (3.12, 3.13)
✔️ eventgrid: latest (3.12, 3.13)
✔️ eventhubs: latest (3.12, 3.13)
✔️ feedback: latest (3.12, 3.13)
✔️ find: latest (3.12, 3.13)
✔️ hdinsight: latest (3.12, 3.13)
✔️ identity: latest (3.12, 3.13)
✔️ iot: latest (3.12, 3.13)
✔️ keyvault: latest (3.12, 3.13)
✔️ lab: latest (3.12, 3.13)
✔️ managedservices: latest (3.12, 3.13)
✔️ maps: latest (3.12, 3.13)
✔️ marketplaceordering: latest (3.12, 3.13)
✔️ monitor: latest (3.12, 3.13)
✔️ mysql: latest (3.12, 3.13)
✔️ netappfiles: latest (3.12, 3.13)
✔️ network: latest (3.12, 3.13)
✔️ policyinsights: latest (3.12, 3.13)
✔️ postgresql: latest (3.12, 3.13)
✔️ privatedns: latest (3.12, 3.13)
✔️ profile: latest (3.12, 3.13)
✔️ rdbms: latest (3.12, 3.13)
✔️ redis: latest (3.12, 3.13)
✔️ relay: latest (3.12, 3.13)
✔️ resource: latest (3.12, 3.13)
✔️ role: latest (3.12, 3.13)
✔️ search: latest (3.12, 3.13)
✔️ security: latest (3.12, 3.13)
✔️ servicebus: latest (3.12, 3.13)
✔️ serviceconnector: latest (3.12, 3.13)
✔️ servicefabric: latest (3.12, 3.13)
✔️ signalr: latest (3.12, 3.13)
✔️ sql: latest (3.12, 3.13)
✔️ sqlvm: latest (3.12, 3.13)
✔️ storage: latest (3.12, 3.13)
✔️ synapse: latest (3.12, 3.13)
✔️ telemetry: latest (3.12, 3.13)
✔️ util: latest (3.12, 3.13)
✔️ vm: latest (3.12, 3.13)

@azure-client-tools-bot-prd

Hi @carlotaarvela,
Because fewer than 7 days remain in the current milestone, this PR will be reviewed in the next milestone.

@yonzhan
Collaborator

yonzhan commented Mar 24, 2026

Thank you for your contribution! We will review the pull request and get back to you soon.

@azure-client-tools-bot-prd

azure-client-tools-bot-prd bot commented Mar 24, 2026

❌ AzureCLI-BreakingChangeTest
❌ acr
rule | cmd_name | rule_message | suggest_message
1007 - ParaRemove | acr cache create | cmd acr cache create removed parameter identity | please add back parameter identity for cmd acr cache create
1007 - ParaRemove | acr cache update | cmd acr cache update removed parameter identity | please add back parameter identity for cmd acr cache update
⚠️ acs
rule | cmd_name | rule_message | suggest_message
⚠️ 1006 - ParaAdd | aks update | cmd aks update added parameter enable_high_log_scale_mode |

Please submit your Breaking Change Pre-announcement ASAP if you haven't already. Please note:

  • Breaking changes can only be merged during the designated breaking change window
  • A pre-announcement must be released at least one month in advance

For more details on how to introduce breaking changes, refer to the documentation: azure-cli/doc/how_to_introduce_breaking_changes.md

@github-actions

The git hooks are available for azure-cli and azure-cli-extensions repos. They could help you run required checks before creating the PR.

Please sync the latest code with latest dev branch (for azure-cli) or main branch (for azure-cli-extensions).
After that please run the following commands to enable git hooks:

pip install azdev --upgrade
azdev setup -c <your azure-cli repo path> -r <your azure-cli-extensions repo path>

@carlotaarvela carlotaarvela changed the title from "Update cnl hlsm workflow" to "[ACS] az aks update: Add --enable-high-log-scale-mode flag and fix monitoring addon key casing" Mar 24, 2026
@carlotaarvela carlotaarvela marked this pull request as ready for review March 24, 2026 20:17
@carlotaarvela carlotaarvela requested a review from NoriZC as a code owner March 24, 2026 20:17
Copilot AI review requested due to automatic review settings March 24, 2026 20:17
Contributor

Copilot AI left a comment

Pull request overview

This PR extends the AKS command module to support High Log Scale Mode (HLSM) configuration during az aks update, hardens monitoring addon handling against API key casing differences (omsagent vs omsAgent), and adds retry/backoff for default Log Analytics workspace provisioning conflicts.

Changes:

  • Add --enable-high-log-scale-mode (three-state) to az aks update, including validation and post-processing to update DCR/DCRA stream configuration when HLSM/CNL changes.
  • Normalize monitoring addon lookups by introducing CONST_MONITORING_ADDON_NAME_CAMELCASE and get_monitoring_addon_key() and applying it across update/enable/disable flows.
  • Add exponential backoff + jitter retry for Log Analytics workspace creation on 409 conflicts / ResourceExistsError, with accompanying unit tests.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

Show a summary per file
File | Description
src/azure-cli/azure/cli/command_modules/acs/managed_cluster_decorator.py | Adds addon-key normalization and HLSM/CNL post-processing triggers in update/create decorator flows.
src/azure-cli/azure/cli/command_modules/acs/custom.py | Uses addon-key helper for monitoring enable/disable checks; preserves CNL config during monitoring addon disable/re-enable cycles.
src/azure-cli/azure/cli/command_modules/acs/addonconfiguration.py | Adds retry with backoff and jitter for default workspace creation conflicts.
src/azure-cli/azure/cli/command_modules/acs/_helpers.py | Introduces get_monitoring_addon_key() helper to handle omsagent/omsAgent.
src/azure-cli/azure/cli/command_modules/acs/_consts.py | Adds CONST_MONITORING_ADDON_NAME_CAMELCASE = "omsAgent".
src/azure-cli/azure/cli/command_modules/acs/_params.py | Registers enable_high_log_scale_mode as a three-state flag.
src/azure-cli/azure/cli/command_modules/acs/_help.py | Documents --enable-high-log-scale-mode under aks update.
src/azure-cli/azure/cli/command_modules/acs/tests/latest/test_managed_cluster_decorator.py | Adds coverage for camelCase monitoring key resolution and HLSM/CNL post-processing behavior.
src/azure-cli/azure/cli/command_modules/acs/tests/latest/test_custom.py | Adds unit coverage for CNL config preservation, workspace retry behavior, and monitoring-enabled helper.
src/azure-cli/azure/cli/command_modules/acs/tests/latest/test_helpers.py | Adds tests for get_monitoring_addon_key().
src/azure-cli/azure/cli/command_modules/acs/tests/latest/test_aks_commands.py | Extends scenario test flow to cover CNL disable/enable and HLSM enable via aks update.

Member

@FumingZhang FumingZhang left a comment

The code related to the monitoring addon in the aks create/update path has become increasingly difficult to understand. Please create a task to refactor this section soon, before making ANY further changes to it.

@FumingZhang
Member

Queued live test to validate the change.

  • test_aks_create_acns_with_flow_logs

Queued live test on all existing cases related to monitoring addon

  • test_aks_create_default_service_with_monitoring_addon
  • test_aks_create_default_service_with_monitoring_addon_msi
  • test_aks_create_with_monitoring_aad_auth_msi
  • test_aks_create_with_monitoring_aad_auth_uai
  • test_aks_create_with_monitoring_aad_auth_msi_with_syslog
  • test_aks_create_with_monitoring_aad_auth_msi_with_datacollectionsettings
  • test_aks_create_with_monitoring_aad_auth_msi_with_datacollectionsettings_and_otheraddon
  • test_aks_create_with_monitoring_aad_auth_uai_with_syslog
  • test_aks_create_with_private_cluster_with_monitoring_aad_auth_msi_with_ampls
  • test_aks_create_with_monitoring_aad_auth_with_highlogscale
  • test_aks_create_with_private_cluster_with_monitoring_aad_auth_msi_with_ampls_with_highlogscale
  • test_aks_enable_monitoring_with_aad_auth_msi
  • test_aks_enable_monitoring_with_aad_auth_uai
  • test_aks_enable_monitoring_with_aad_auth_msi_with_syslog
  • test_aks_enable_monitoring_with_aad_auth_uai_with_syslog
  • test_aks_create_with_monitoring_legacy_auth
  • test_aks_create_with_azuremonitormetrics
  • test_aks_update_with_azuremonitormetrics
  • test_aks_update_to_msi_cluster_with_addons
  • test_aks_approuting_update_with_monitoring_addon_enabled
  • test_aks_create_acns_with_flow_logs

@rashmichandrashekar
Contributor

@carlotaarvela - could we please keep this PR simple, with only the relevant changes, and remove the unnecessary refactoring?

Member

@FumingZhang FumingZhang left a comment

LGTM

live test passed!

@carlotaarvela
Contributor Author

Live tests are passing

@FumingZhang FumingZhang changed the title from "[ACS] az aks update: Add --enable-high-log-scale-mode flag and fix monitoring addon key casing" to "[AKS] az aks update: Add --enable-high-log-scale-mode flag and fix monitoring addon key casing" Mar 26, 2026
@yanzhudd
Contributor

/azp run

@azure-pipelines

Azure Pipelines successfully started running 3 pipeline(s).

7 participants