ING-1378: Optimized routing metrics #364

Merged
Westwooo merged 4 commits into master from ING-1378-optimized-routing-metrics
Mar 2, 2026

@Westwooo
Contributor

@Westwooo Westwooo commented Jan 7, 2026

This PR adds support for metrics that track the performance of optimal routing. This is done by:

  1. CNG propagates the address of the node local to it and its server group
  2. gocbcorex watches the cluster nodes to monitor the server group for each node
  3. When a node is selected to service a KV request, we check whether it is the local node or in the same server group and increment the appropriate metric
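
The check in step 3 can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Locality` type, `classifyEndpoint`, and the endpoint-to-group map are hypothetical names, and the sketch assumes endpoints are compared as plain strings.

```go
package main

import "fmt"

// Locality is a hypothetical label for where a KV request ended up.
type Locality int

const (
	LocalityLocal Locality = iota
	LocalityServerGroup
	LocalityRemote
)

// String returns a label that could be used as a metric name suffix.
func (l Locality) String() string {
	switch l {
	case LocalityLocal:
		return "local"
	case LocalityServerGroup:
		return "server_group"
	case LocalityRemote:
		return "remote"
	default:
		return fmt.Sprintf("Locality(%d)", int(l))
	}
}

// classifyEndpoint decides which counter a request to endpoint belongs to.
// groupByEndpoint is an assumed lookup table built from the watched nodes.
func classifyEndpoint(endpoint, localEndpoint, localGroup string, groupByEndpoint map[string]string) Locality {
	if localEndpoint != "" && endpoint == localEndpoint {
		return LocalityLocal
	}
	if group, ok := groupByEndpoint[endpoint]; ok && localGroup != "" && group == localGroup {
		return LocalityServerGroup
	}
	return LocalityRemote
}

func main() {
	groups := map[string]string{
		"node-a:11210": "group-1",
		"node-b:11210": "group-1",
		"node-c:11210": "group-2",
	}
	for _, ep := range []string{"node-a:11210", "node-b:11210", "node-c:11210"} {
		fmt.Println(ep, classifyEndpoint(ep, "node-a:11210", "group-1", groups))
	}
}
```

An endpoint with no known server group falls through to remote in this sketch, which is one possible choice rather than confirmed behaviour.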

To minimize the impact on the performance of CRUD ops, the actual metric emission is done off the hot path. The resolved endpoint is sent down a channel to a metric worker, which reads from the channel and emits the appropriate metric.
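
A rough sketch of that pattern, for illustration only: `metricsWorker`, `Record`, and the plain atomic counters are hypothetical stand-ins for the real worker and metrics. Dropping events when the buffer is full is an assumption of this sketch; the PR description does not say how a full channel is handled.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// metricsWorker is a hypothetical off-hot-path consumer of resolved endpoints.
type metricsWorker struct {
	events        chan string
	done          chan struct{}
	localEndpoint string
	sameGroup     map[string]bool // endpoints in the local server group (assumed input)

	local, serverGroup, remote, dropped atomic.Int64
}

func newMetricsWorker(localEndpoint string, sameGroup map[string]bool, buffer int) *metricsWorker {
	w := &metricsWorker{
		events:        make(chan string, buffer),
		done:          make(chan struct{}),
		localEndpoint: localEndpoint,
		sameGroup:     sameGroup,
	}
	go w.run()
	return w
}

// run drains the channel and bumps the matching counter until Close is called.
func (w *metricsWorker) run() {
	defer close(w.done)
	for ep := range w.events {
		switch {
		case ep == w.localEndpoint:
			w.local.Add(1)
		case w.sameGroup[ep]:
			w.serverGroup.Add(1)
		default:
			w.remote.Add(1)
		}
	}
}

// Record is what the hot path would call. It never blocks: if the buffer is
// full the event is dropped and counted (an assumption of this sketch).
func (w *metricsWorker) Record(endpoint string) {
	select {
	case w.events <- endpoint:
	default:
		w.dropped.Add(1)
	}
}

// Close stops accepting events and waits for the worker to drain the channel.
func (w *metricsWorker) Close() {
	close(w.events)
	<-w.done
}

// Summary formats the counters for display.
func (w *metricsWorker) Summary() string {
	return fmt.Sprintf("local=%d server_group=%d remote=%d dropped=%d",
		w.local.Load(), w.serverGroup.Load(), w.remote.Load(), w.dropped.Load())
}

func main() {
	w := newMetricsWorker("node-a:11210", map[string]bool{"node-b:11210": true}, 16)
	for _, ep := range []string{"node-a:11210", "node-b:11210", "node-c:11210"} {
		w.Record(ep)
	}
	w.Close()
	fmt.Println(w.Summary())
}
```

The non-blocking send keeps the cost on the request path to a single channel operation, at the price of losing samples under sustained load.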

@Westwooo Westwooo force-pushed the ING-1378-optimized-routing-metrics branch from 2f968b8 to 9cbb218 Compare January 7, 2026 12:45
@Westwooo Westwooo requested a review from brett19 January 7, 2026 13:58
Comment thread metrics.go Outdated
@brett19
Member

brett19 commented Jan 7, 2026

I would be a bit cautious about using context in our 'hot path', as there is quite a performance impact related to it because it uses runtime reflection to work. Is it possible to propagate the information about which connection string is the 'local' one to gocbcorex, so that we can directly (internally to gocbcorex) discover whether a node we are routing to is system-local or server-group-local?

@Westwooo
Contributor Author

Westwooo commented Jan 8, 2026

@brett19 I have changed it so that the address of the local node is part of the crud component and is propagated at startup, which should address the concerns about context being expensive.

@Westwooo Westwooo closed this Jan 8, 2026
@Westwooo Westwooo reopened this Jan 8, 2026
@Westwooo Westwooo force-pushed the ING-1378-optimized-routing-metrics branch 3 times, most recently from aa6183a to 2dac69b Compare January 8, 2026 16:35
@Westwooo Westwooo requested a review from chvck January 8, 2026 16:38
brett19 previously approved these changes Jan 8, 2026
Member

@brett19 brett19 left a comment

Looks good, although I think it might be good to include both 'local' and 'server group' local, since in most deployments the latter will be the most common.

@Westwooo Westwooo force-pushed the ING-1378-optimized-routing-metrics branch 5 times, most recently from 07729c7 to 8a5c4dd Compare January 13, 2026 07:34
@Westwooo Westwooo requested a review from brett19 January 13, 2026 09:47
@chvck chvck requested a review from Copilot January 13, 2026 11:19

Copilot AI left a comment

Pull request overview

This PR adds support for routing metrics that track whether KV requests are serviced locally, within the same server group, or remotely. The implementation propagates node locality information through the system and increments appropriate metrics based on endpoint selection.

Changes:

  • Added NodesWatcherHttp component to monitor cluster nodes and their server groups
  • Extended configuration structs to include local node address and server group information
  • Updated routing orchestration to track and categorize memd requests using OpenTelemetry metrics
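
As a rough illustration of the hostname-to-server-group mapping mentioned in the first bullet, the sketch below decodes a node list and builds a lookup table. The JSON shape, the `hostname` and `serverGroup` field names, and the `serverGroupsByHost` helper are assumptions made for the example, not the structures added in this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// nodeInfo and nodesPayload model an assumed JSON shape for the node list.
type nodeInfo struct {
	Hostname    string `json:"hostname"`
	ServerGroup string `json:"serverGroup"`
}

type nodesPayload struct {
	Nodes []nodeInfo `json:"nodes"`
}

// serverGroupsByHost builds a hostname -> server group lookup table.
// Nodes without a hostname are skipped.
func serverGroupsByHost(payload []byte) (map[string]string, error) {
	var p nodesPayload
	if err := json.Unmarshal(payload, &p); err != nil {
		return nil, fmt.Errorf("decoding nodes payload: %w", err)
	}
	groups := make(map[string]string, len(p.Nodes))
	for _, n := range p.Nodes {
		if n.Hostname == "" {
			continue
		}
		groups[n.Hostname] = n.ServerGroup
	}
	return groups, nil
}

func main() {
	payload := []byte(`{"nodes":[
		{"hostname":"node-a:8091","serverGroup":"group-1"},
		{"hostname":"node-b:8091","serverGroup":"group-2"}
	]}`)
	groups, err := serverGroupsByHost(payload)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(groups["node-a:8091"], groups["node-b:8091"])
}
```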

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| agent_options.go | Added GetNodes function, LocalNodeAddr, and ServerGroup fields to support routing metrics |
| agent.go | Initialized local KV endpoint and configured CRUD component with routing information |
| bucketstracking_agentmanager.go | Integrated NodesWatcherHttp to track node topology and server groups |
| nodeswatcher_http.go | New component that watches cluster nodes and maintains hostname-to-server-group mappings |
| streamwatcher_http.go | Added streamWatcherHttp_streamNodes function to stream node configuration updates |
| crud.go | Added GetServerGroupEndpoints method and propagated routing parameters through CRUD operations |
| vbucketrouter.go | Updated routing orchestration to increment metrics based on endpoint locality |
| metrics.go | Defined three new metrics counters for local, server group, and remote requests |
| contrib/cbconfig/cbconfig.go | Added ServerGroup field to node configuration JSON |
| crud_test.go | Updated test to pass empty routing parameters |

Comment thread metrics.go Outdated
Comment thread crud.go Outdated
Comment thread agent.go Outdated
Comment thread nodeswatcher_http.go Outdated
@Westwooo Westwooo force-pushed the ING-1378-optimized-routing-metrics branch from 8a5c4dd to ca1704d Compare January 13, 2026 14:24
Comment thread agent_options.go
Comment thread nodeswatcher_http.go Outdated
Comment thread nodeswatcher_http.go Outdated
@Westwooo Westwooo force-pushed the ING-1378-optimized-routing-metrics branch 2 times, most recently from 705abab to 38ff298 Compare January 14, 2026 09:50
@Westwooo Westwooo requested a review from chvck January 14, 2026 11:04
Collaborator

@chvck chvck left a comment

Looks good to me, will leave +2 to Brett.

@Westwooo Westwooo force-pushed the ING-1378-optimized-routing-metrics branch from 38ff298 to 780293f Compare January 21, 2026 15:11
Comment thread vbucketrouter.go Outdated
return emptyResp, err
}

if localKvEp == endpoint {
Member

Rather than having the metrics incremented through the vbucketrouter which is meant to be a simple reusable component, I think this should happen in one of the orchestration levels of gocbcorex proper.

Member

It's worth considering the 'hot path' here as well. If the user does not have metrics enabled, is this somewhere we have context to be able to disable it?
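
One way this could look, as a hedged sketch: the recorder is only wired in when metrics are enabled, so the hot path pays a single nil check otherwise. The `routingRecorder` interface, `dispatcher`, and `newDispatcher` are hypothetical names for illustration and do not reflect how gocbcorex is structured.

```go
package main

import "fmt"

// routingRecorder is a hypothetical hook for recording the chosen endpoint.
type routingRecorder interface {
	Record(endpoint string)
}

// countingRecorder is a trivial recorder used only for this demonstration.
type countingRecorder struct{ seen int }

func (c *countingRecorder) Record(string) { c.seen++ }

func (c *countingRecorder) String() string { return fmt.Sprintf("recorded=%d", c.seen) }

// dispatcher stands in for whichever orchestration layer picks the endpoint.
type dispatcher struct {
	recorder routingRecorder // nil when metrics are disabled
}

// newDispatcher leaves the recorder unset when metrics are disabled.
func newDispatcher(metricsEnabled bool, rec routingRecorder) *dispatcher {
	d := &dispatcher{}
	if metricsEnabled {
		d.recorder = rec
	}
	return d
}

// dispatch records the endpoint only if a recorder was configured.
func (d *dispatcher) dispatch(endpoint string) {
	if d.recorder != nil {
		d.recorder.Record(endpoint)
	}
	// The actual request would be sent to endpoint here.
}

func main() {
	rec := &countingRecorder{}
	newDispatcher(false, rec).dispatch("node-a:11210") // metrics off: nothing recorded
	newDispatcher(true, rec).dispatch("node-a:11210")  // metrics on: one event recorded
	fmt.Println(rec)
}
```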

@Westwooo Westwooo force-pushed the ING-1378-optimized-routing-metrics branch 3 times, most recently from 7b147ba to 2ccafad Compare January 22, 2026 16:01
@Westwooo Westwooo requested a review from brett19 January 26, 2026 14:10
@Westwooo Westwooo force-pushed the ING-1378-optimized-routing-metrics branch 2 times, most recently from 557d26e to bc7e18b Compare January 28, 2026 11:17
@chvck chvck requested a review from Copilot January 28, 2026 11:35

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Comment thread optimized_routing_metrics_worker.go Outdated
Comment thread crud.go Outdated
Comment thread crud.go Outdated
Comment thread agent.go
@Westwooo Westwooo force-pushed the ING-1378-optimized-routing-metrics branch from bc7e18b to 5da0011 Compare January 28, 2026 13:19
Comment thread optimized_routing_metrics_worker.go Outdated
@Westwooo Westwooo force-pushed the ING-1378-optimized-routing-metrics branch from 5da0011 to 1d6b716 Compare January 28, 2026 13:48
@Westwooo Westwooo force-pushed the ING-1378-optimized-routing-metrics branch 5 times, most recently from 30f1aa5 to a9ae6ef Compare February 23, 2026 07:14
Member

@brett19 brett19 left a comment

Approved. It would be good to merge your microbenchmarking code first to validate that this does not have a significant negative effect on performance though.
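
For illustration, a microbenchmark of the hot-path cost could look something like the sketch below, which times a non-blocking send on a buffered channel with a background consumer. It is not the microbenchmarking code referred to above; the function names and buffer size are assumptions, and `testing.Benchmark` is used so the sketch runs as a plain program.

```go
package main

import (
	"fmt"
	"testing"
)

// benchmarkNonBlockingSend measures a non-blocking send to a buffered channel
// that a background goroutine drains, mimicking a hand-off to a metric worker.
func benchmarkNonBlockingSend(b *testing.B) {
	events := make(chan string, 1024)
	done := make(chan struct{})
	go func() {
		defer close(done)
		for range events {
			// Discard events; a real worker would emit a metric here.
		}
	}()

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		select {
		case events <- "node-a:11210":
		default: // buffer full: drop rather than block the caller
		}
	}
	b.StopTimer()

	close(events)
	<-done
}

// report formats a benchmark result for display.
func report(name string, res testing.BenchmarkResult) string {
	return fmt.Sprintf("%s: %d iterations, %d ns/op", name, res.N, res.NsPerOp())
}

func main() {
	res := testing.Benchmark(benchmarkNonBlockingSend)
	fmt.Println(report("non-blocking send", res))
}
```

In a repository this would more naturally live in a `_test.go` file as `BenchmarkXxx` and be run with `go test -bench`, compared against a baseline without the send.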

@Westwooo Westwooo force-pushed the ING-1378-optimized-routing-metrics branch from a9ae6ef to 348974b Compare March 2, 2026 20:00
@Westwooo Westwooo merged commit b26465d into master Mar 2, 2026
10 checks passed
@Westwooo Westwooo deleted the ING-1378-optimized-routing-metrics branch March 2, 2026 20:38

4 participants