[xds] Implement A114: WRR support for custom backend metrics#12645

Open
sauravzg wants to merge 5 commits into grpc:master from sauravzg:wrr-custom-metrics

@sauravzg (Collaborator) commented Feb 4, 2026

Description

This PR implements gRFC A114: WRR Support for Custom Backend Metrics.

It updates the weighted_round_robin policy to allow users to configure which backend metrics drive the load balancing weights.

Key Changes

  • Configuration: Supports the new metric_names_for_computing_utilization field in WeightedRoundRobinLbConfig.
  • Weight Calculation: Implements logic to resolve custom metrics (including map lookups like named_metrics.foo) when application_utilization is absent.
  • Refactor: Centralizes the complex metric lookup and validation logic (checking for NaN, <= 0, etc.) into a new internal utility MetricReportUtils.
  • Testing: Verifies correct precedence: application_utilization > custom_metrics (max valid value) > cpu_utilization.
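The precedence listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: the class `UtilizationSketch` and the methods `isValid` and `pickUtilization` are hypothetical names, and the custom metric values are assumed to have already been resolved from the load report into a plain list.

```java
import java.util.List;

/** Hypothetical sketch of the utilization precedence; not the PR's implementation. */
final class UtilizationSketch {
  private UtilizationSketch() {}

  /** A value is usable only if it is not NaN and is strictly greater than zero. */
  static boolean isValid(double value) {
    return !Double.isNaN(value) && value > 0;
  }

  /**
   * Picks the utilization used for weight computation:
   * application_utilization, else the max valid custom metric, else cpu_utilization.
   * The cpu_utilization fallback is returned as-is; the caller is assumed to validate it.
   */
  static double pickUtilization(
      double applicationUtilization, List<Double> customMetricValues, double cpuUtilization) {
    if (isValid(applicationUtilization)) {
      return applicationUtilization;
    }
    double maxCustom = 0;
    for (double value : customMetricValues) {
      if (isValid(value) && value > maxCustom) {
        maxCustom = value;
      }
    }
    if (maxCustom > 0) {
      return maxCustom;
    }
    return cpuUtilization;
  }

  public static void main(String[] args) {
    // application_utilization is valid, so it wins: prints 0.7
    System.out.println(pickUtilization(0.7, List.of(0.9), 0.2));
    // application_utilization is absent (0), NaN is skipped, max valid custom value: prints 0.6
    System.out.println(pickUtilization(0.0, List.of(0.3, Double.NaN, 0.6), 0.2));
    // no application_utilization and no custom metrics, so cpu_utilization is used: prints 0.2
    System.out.println(pickUtilization(0.0, List.of(), 0.2));
  }
}
```

Treating 0 as "absent" for `application_utilization` is an assumption of this sketch, chosen to match the `<= 0` validity check mentioned in the refactor bullet.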

@sauravzg force-pushed the wrr-custom-metrics branch 2 times, most recently from 418bd90 to d76e770 on February 6, 2026 06:19
@sauravzg (Collaborator, Author) commented

cc: @danielzhaotongliu to take a look at the PR.

@sauravzg force-pushed the wrr-custom-metrics branch from 785c5f9 to 63c5bf3 on March 3, 2026 10:16
@ejona86 self-requested a review on March 5, 2026 05:09
@shivaspeaks self-requested a review on March 23, 2026 14:49
@shivaspeaks added the kokoro:run label (tells Kokoro the code is safe and tests can be run) on Mar 24, 2026
@grpc-kokoro removed the kokoro:run label on Mar 24, 2026
Updates the Weighted Round Robin (WRR) load balancing policy to support
customizable utilization metrics via the `metric_names_for_computing_utilization` configuration.
This allows endpoint weights to be driven by arbitrary named metrics (e.g. `named_metrics.foo`)
or other standard metrics (e.g. `memory_utilization`) instead of solely `application_utilization`
or the `cpu_utilization` fallback.
Refactors metric resolution logic into `io.grpc.xds.internal.MetricReportUtils`
to handle the new map lookup and validation requirements.
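
The name resolution described in this commit message could look roughly like the sketch below. Everything in it is an assumption for illustration: `MetricLookupSketch`, `Report`, and `resolve` are hypothetical stand-ins, not the actual `MetricReportUtils` API, and the ORCA load report is replaced by a simplified class holding two standard fields and a named-metrics map.

```java
import java.util.Map;
import java.util.OptionalDouble;

/** Hypothetical sketch of metric-name resolution; not the PR's MetricReportUtils. */
final class MetricLookupSketch {
  private MetricLookupSketch() {}

  /** Simplified stand-in for a backend load report (assumption, not the real type). */
  static final class Report {
    final double cpuUtilization;
    final double memoryUtilization;
    final Map<String, Double> namedMetrics;

    Report(double cpuUtilization, double memoryUtilization, Map<String, Double> namedMetrics) {
      this.cpuUtilization = cpuUtilization;
      this.memoryUtilization = memoryUtilization;
      this.namedMetrics = namedMetrics;
    }
  }

  private static final String NAMED_METRICS_PREFIX = "named_metrics.";

  /** Resolves a configured metric name to a valid value, or empty if missing or invalid. */
  static OptionalDouble resolve(Report report, String metricName) {
    Double value;
    if (metricName.startsWith(NAMED_METRICS_PREFIX)) {
      // Map lookup: "named_metrics.foo" reads key "foo" from the named-metrics map.
      value = report.namedMetrics.get(metricName.substring(NAMED_METRICS_PREFIX.length()));
    } else {
      switch (metricName) {
        case "cpu_utilization":
          value = report.cpuUtilization;
          break;
        case "memory_utilization":
          value = report.memoryUtilization;
          break;
        default:
          value = null; // unknown names are ignored in this sketch
      }
    }
    // Validation: reject missing entries, NaN, and values <= 0.
    if (value == null || value.isNaN() || value <= 0) {
      return OptionalDouble.empty();
    }
    return OptionalDouble.of(value);
  }

  public static void main(String[] args) {
    Report report = new Report(0.2, 0.5, Map.of("foo", 0.8, "bad", -1.0));
    System.out.println(resolve(report, "named_metrics.foo"));   // present, value 0.8
    System.out.println(resolve(report, "memory_utilization"));  // present, value 0.5
    System.out.println(resolve(report, "named_metrics.bad"));   // empty: value is <= 0
  }
}
```

Returning an empty result for invalid values keeps the validation in one place, so a caller computing the maximum over several configured names only ever sees usable numbers.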
@sauravzg force-pushed the wrr-custom-metrics branch from 7117abe to 378ef63 on March 27, 2026 09:41
@sauravzg (Collaborator, Author) commented

/gemini review

@gemini-code-assist (bot) left a comment

Code Review

This pull request adds support for custom metrics to the Weighted Round Robin (WRR) load balancer. It adds a new configuration field, metricNamesForComputingUtilization, which lets users specify a list of metrics for calculating backend utilization. A new utility class, MetricReportUtils, resolves these metrics from ORCA load reports. The utilization calculation uses application utilization when it is present, otherwise the maximum valid value among the specified custom metrics, and falls back to CPU utilization when neither is available. The feature is currently guarded by the GRPC_EXPERIMENTAL_WRR_CUSTOM_METRICS experimental flag. I have no feedback to provide as there are no review comments.
