Skip to content

feat: local prometheus sdk metrics#322

Draft
andreas-karlsson wants to merge 5 commits intomainfrom
telemetry-simplification
Draft

feat: local prometheus sdk metrics#322
andreas-karlsson wants to merge 5 commits intomainfrom
telemetry-simplification

Conversation

@andreas-karlsson
Copy link
Contributor

@andreas-karlsson andreas-karlsson commented Mar 10, 2026

This PR first does some refactoring/cleanup of the telemetry and then adds a method to each SDK to extract prometheus metrics.

@andreas-karlsson andreas-karlsson changed the title unclamped historgrams feat: local prometheus sdk metrics Mar 11, 2026
uint64 memory_bytes = 7;

message ResolveLatency {
option (confidence.api.histogram) = {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was cute, but isn't really needed any longer cause we resample the histograms if client/server have different settings. Removing it deprecates quite a lot of code in builder.rs

let upper = ((i as f64 + 1.0) * LN_RATIO).exp();
writeln!(
w,
"confidence_resolve_latency_microseconds_bucket{{instance=\"{instance}\",le=\"{upper:.6e}\"}} {cumulative}"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want the unit as a label rather than part of metric name. Not sure what's idiomatic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Unit in metric name is actually mandated. We could consider to switch to SI seconds though.

}

message PrometheusSnapshotRequest {
string instance = 1;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instance is just an id used to disambiguate the metrics from different wasm instances. I'm considering if we should also allow a "histogram ratio" option to control how many buckets are produced.

err := s.maintenance(func(lr LocalResolver) error {
text, err := lr.PrometheusSnapshot()
if err != nil {
return err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log and return empty string.

return current.get().prometheusSnapshot();
} catch (ChicoryException e) {
handleFailure("prometheusSnapshot", e);
throw e;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log and return empty string.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants