[RPC Metric Part 1] Support two basic metrics in RPC client : Latency and error rate #89
[RPC Metric Part 1] Support two basic metrics in RPC client : Latency and error rate #89
Conversation
…ency, RPC error rate
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a3174e7ea3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
vlfig
left a comment
There was a problem hiding this comment.
Couple of comments, grab me if you need.
metrics/rpc_client.go
Outdated
| // RPCClientMetricsConfig holds labels for RPC client metrics. | ||
| // Empty strings are allowed; they will still be emitted as labels for filtering. | ||
| type RPCClientMetricsConfig struct { | ||
| Env string // e.g. "staging", "production" |
There was a problem hiding this comment.
I don't think this label should ever be populated by the application itself.
metrics/rpc_client.go
Outdated
| Env string // e.g. "staging", "production" | ||
| Network string // chain/network name | ||
| ChainID string // chain ID | ||
| RPCProvider string // RPC provider or node name (optional) |
There was a problem hiding this comment.
Where does this come from? I'd imagine this being called from logResult in rpc_client.go.
metrics/rpc_client.go
Outdated
| @@ -0,0 +1,125 @@ | |||
| // RPC client observability using Beholder. | |||
There was a problem hiding this comment.
Instead of a new module I'd imagine this becoming an expansion of metrics/client.go, which already has a (promauto) latency metric, to 1) include beholder as a "target" like in metrics/multinode.go; and 2) add the request error rate metric.
docs/rpc_observability.md
Outdated
| @@ -0,0 +1,54 @@ | |||
| # RPC Observability (Beholder) | |||
There was a problem hiding this comment.
Not sure you intended to commit this. I do like the idea of having a /docs folder in this style but I think that's better pursued with a broader effort. Leaving such a slim slice here would end up be more confusing, I think.
docs/rpc_observability.md
Outdated
|
|
||
| Create `RPCClientMetrics` with `metrics.NewRPCClientMetrics(metrics.RPCClientMetricsConfig{...})` and pass it as the last argument to `multinode.NewRPCClientBase(...)`. The follow-up interface refactor will make it easier for multinode/chain integrations to supply `env`, `network`, `chain_id`, and `rpc_provider`. | ||
|
|
||
| ## Follow-up: multinode integration (PR 2) |
There was a problem hiding this comment.
Better not mix the plan of what to do with description of what is.
vlfig
left a comment
There was a problem hiding this comment.
Couple of comments. Do tag Dmytro once you feel we're past these. I think we'll need approval from someone other than me.
| RPCCallLatency = promauto.NewHistogramVec(prometheus.HistogramOpts{ | ||
| Name: "rpc_call_latency", | ||
| Help: "The duration of an RPC call in milliseconds", | ||
| Help: "The duration of an RPC call in seconds", |
| rpcCallLatencyBeholder = "rpc_call_latency" | ||
| rpcCallErrorsTotalBeholder = "rpc_call_errors_total" |
There was a problem hiding this comment.
If we're defining these, let's use them above.
| } | ||
|
|
||
| // RPCClientMetricsConfig holds fixed labels for an RPC client instance. | ||
| type RPCClientMetricsConfig struct { |
There was a problem hiding this comment.
If this is going to be called from chainlink-evm's logResult, If I'm reading this correctly, the only "static" fields are ChainFamily and ChainID, no? The others come from the request and would be passed on each increment, not when creating the metrics instance. Did you test locally logResult calling RecordRequest instead of RPCCallLatency directly?
| latency: latency, | ||
| errorsTotal: errorsTotal, |
There was a problem hiding this comment.
maybe call them latencyHist and errorsCounter for clarity?
| // NoopRPCClientMetrics is a no-op implementation for when metrics are disabled. | ||
| type NoopRPCClientMetrics struct{} | ||
|
|
||
| func (NoopRPCClientMetrics) RecordRequest(context.Context, string, time.Duration, error) {} | ||
|
|
||
| var _ RPCClientMetrics = NoopRPCClientMetrics{} |
There was a problem hiding this comment.
I'm not exactly a golang wizz, but is this necessary?
There was a problem hiding this comment.
that's compile time interface check, recommended here
Description
Allow RPC client to capture two metrics, so it later can be exported to beholder
Requires Dependencies
Resolves Dependencies