Commit 6ec3723

Update metric_types.md for native histograms
Note that I used this opportunity to replace the term "client library" with "instrumentation library". I always thought that "client library" is confusing as it is not implementing a client in any way. (Technically, it implements a _server_, of which the Prometheus "server" is the client… 🤯) Even if we accept that "Prometheus client library" just means "a library to do something that has to do with Prometheus", the title "client library" still doesn't tell us what the library is actually for. (Note that the client_golang repository not only contains an instrumentation library, but also includes an _actual_ client library that helps you to implement clients that talk to the Prometheus HTTP API.)

Signed-off-by: beorn7 <beorn@grafana.com>
1 parent 4ce5b60 commit 6ec3723

2 files changed

Lines changed: 83 additions & 36 deletions

docs/concepts/metric_types.md

Lines changed: 80 additions & 33 deletions
Original file line number, diff line number, diff line change
@@ -3,11 +3,17 @@ title: Metric types
33
sort_rank: 2
44
---
55

6-
The Prometheus client libraries offer four core metric types. These are
7-
currently only differentiated in the client libraries (to enable APIs tailored
8-
to the usage of the specific types) and in the wire protocol. The Prometheus
9-
server does not yet make use of the type information and flattens all data into
10-
untyped time series. This may change in the future.
6+
The Prometheus instrumentation libraries offer four core metric types. With the
7+
exception of native histograms, these are currently only differentiated in the
8+
instrumentation libraries (to enable APIs tailored to the usage of the specific
9+
types) and in the exposition protocols. The Prometheus server does not yet make
10+
use of the type information and flattens all types except native histograms
11+
into untyped time series of floating point values. Native histograms, however,
12+
are ingested as time series of special composite histogram samples. In the
13+
future, Prometheus might handle other metric types as [composite
14+
types](/blog/2026/02/14/modernizing-prometheus-composite-samples/), too. There
15+
is also ongoing work to persist the type information of the simple float
16+
samples.
1117

1218
## Counter
1319

@@ -20,7 +26,7 @@ errors.
2026
Do not use a counter to expose a value that can decrease. For example, do not
2127
use a counter for the number of currently running processes; instead use a gauge.
2228

23-
Client library usage documentation for counters:
29+
Instrumentation library usage documentation for counters:
2430

2531
* [Go](http://godoc.org/github.com/prometheus/client_golang/prometheus#Counter)
2632
* [Java](https://prometheus.github.io/client_java/getting-started/metric-types/#counter)
@@ -38,7 +44,7 @@ Gauges are typically used for measured values like temperatures or current
3844
memory usage, but also "counts" that can go up and down, like the number of
3945
concurrent requests.
4046

41-
Client library usage documentation for gauges:
47+
Instrumentation library usage documentation for gauges:
4248

4349
* [Go](http://godoc.org/github.com/prometheus/client_golang/prometheus#Gauge)
4450
* [Java](https://prometheus.github.io/client_java/getting-started/metric-types/#gauge)
@@ -51,37 +57,78 @@ Client library usage documentation for gauges:
5157

5258
A _histogram_ samples observations (usually things like request durations or
5359
response sizes) and counts them in configurable buckets. It also provides a sum
54-
of all observed values.
55-
56-
A histogram with a base metric name of `<basename>` exposes multiple time series
57-
during a scrape:
58-
59-
* cumulative counters for the observation buckets, exposed as `<basename>_bucket{le="<upper inclusive bound>"}`
60+
of all observed values. As such, a histogram is essentially a bucketed counter.
61+
However, a histogram can also represent the current state of a distribution, in
62+
which case it is called a _gauge histogram_. In contrast to the usual
63+
counter-like histograms, gauge histograms are rarely directly exposed by
64+
instrumented programs and are thus not (yet) usable in instrumentation
65+
libraries, but they are represented in newer versions of the protobuf
66+
exposition format and in [OpenMetrics](https://openmetrics.io/). They are also
67+
created regularly by PromQL expressions. For example, the outcome of applying
68+
the `rate` function to a counter histogram is a gauge histogram, in the same
69+
way as the outcome of applying the `rate` function to a counter is a gauge.
70+
71+
Histograms exists in two fundamentally different versions: The more recent
72+
_native histograms_ and the older _classic histograms_.
73+
74+
A native histogram is exposed and ingested as composite samples, where each
75+
sample represents the count and sum of observations together with a dynamic set
76+
of buckets.
77+
78+
A classic histogram, however, consists of multiple time series of simple float
79+
samples. A classic histogram with a base metric name of `<basename>` results in
80+
the following time series:
81+
82+
* cumulative counters for the observation buckets, exposed as
83+
`<basename>_bucket{le="<upper inclusive bound>"}`
6084
* the **total sum** of all observed values, exposed as `<basename>_sum`
61-
* the **count** of events that have been observed, exposed as `<basename>_count` (identical to `<basename>_bucket{le="+Inf"}` above)
62-
63-
Use the
64-
[`histogram_quantile()` function](/docs/prometheus/latest/querying/functions/#histogram_quantile)
65-
to calculate quantiles from histograms or even aggregations of histograms. A
66-
histogram is also suitable to calculate an
67-
[Apdex score](http://en.wikipedia.org/wiki/Apdex). When operating on buckets,
68-
remember that the histogram is
69-
[cumulative](https://en.wikipedia.org/wiki/Histogram#Cumulative_histogram). See
70-
[histograms and summaries](/docs/practices/histograms) for details of histogram
71-
usage and differences to [summaries](#summary).
72-
73-
NOTE: Beginning with Prometheus v2.40, there is experimental support for native
74-
histograms. A native histogram requires only one time series, which includes a
75-
dynamic number of buckets in addition to the sum and count of
76-
observations. Native histograms allow much higher resolution at a fraction of
77-
the cost. Detailed documentation will follow once native histograms are closer
78-
to becoming a stable feature.
85+
* the **count** of events that have been observed, exposed as
86+
`<basename>_count` (identical to `<basename>_bucket{le="+Inf"}` above)
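
The series listed above can be illustrated with a minimal sketch. It is not taken from any instrumentation library; the function name `classic_histogram_series`, the bucket boundaries, and the observed values are all made up for illustration. It only shows how cumulative buckets, sum, and count relate to each other:

```python
import math


def classic_histogram_series(basename, upper_bounds, observations):
    """Return the float series a classic histogram would expose, as a dict.

    Hypothetical helper for illustration only, not a Prometheus API.
    """
    # The +Inf bucket is always present and counts every observation.
    bounds = sorted(upper_bounds) + [math.inf]
    series = {}
    for bound in bounds:
        le = "+Inf" if math.isinf(bound) else str(bound)
        # Buckets are cumulative: count every observation <= the upper bound.
        series[f'{basename}_bucket{{le="{le}"}}'] = sum(
            1 for value in observations if value <= bound
        )
    series[f"{basename}_sum"] = sum(observations)
    series[f"{basename}_count"] = len(observations)
    return series


series = classic_histogram_series(
    "http_request_duration_seconds",
    [0.1, 0.5, 1.0],
    [0.05, 0.3, 0.3, 0.7, 2.0],
)
for name, value in series.items():
    print(name, value)
```

In this sketch the `le="+Inf"` bucket and the `_count` series always carry the same value, which mirrors the "identical to" remark in the list above.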
87+
88+
Native histograms are generally much more efficient than classic histograms,
89+
allow much higher resolution, and do not require explicit configuration of
90+
bucket boundaries during instrumentation. Their bucketing schema ensures that
91+
they are always aggregatable with each other, even if the resolution might have
92+
changed, while classic histograms with different bucket boundaries are not
93+
generally aggregatable. If the instrumentation library you are using supports native
94+
histograms (currently this is the case for Go and Java), you should probably
95+
prefer native histograms over classic histograms.
96+
97+
If you are stuck with classic histograms for whatever reason, there is a way to
98+
get at least some of the benefits of native histograms: You can configure
99+
Prometheus to ingest classic histograms into a special form of native
100+
histograms, called Native Histograms with Custom Bucket boundaries (NHCB).
101+
NHCBs are stored as the same composite samples as usual native histograms with
102+
the same gain in efficiency. However, their buckets are still the same buckets
103+
statically configured during instrumentation, with their limited resolution and
104+
range and the same problems of aggregatability upon changing the bucket
105+
boundaries.
106+
107+
Use the [`histogram_quantile()`
108+
function](/docs/prometheus/latest/querying/functions/#histogram_quantile) to
109+
calculate quantiles from histograms or even aggregations of histograms. It
110+
works for both classic and native histograms, using a slightly different
111+
syntax. Histograms are also suitable to calculate an [Apdex
112+
score](http://en.wikipedia.org/wiki/Apdex).
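
To give a rough idea of how a quantile can be estimated from classic cumulative buckets, here is a simplified sketch based on linear interpolation within the bucket that contains the requested rank. It is an assumption-laden illustration, not the Prometheus implementation: the function name `estimate_quantile` is hypothetical, the lowest bucket is assumed to start at 0, and edge cases such as negative boundaries are ignored.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by upper bound,
    with math.inf as the last upper bound. Hypothetical helper, not a
    Prometheus API.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    if len(buckets) < 2:
        return math.nan
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # Assumption: the lowest bucket starts at 0.
    lower_bound, lower_count = 0.0, 0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # The rank falls into the +Inf bucket, so the best available
                # answer is the highest finite boundary.
                return buckets[-2][0]
            # Interpolate linearly within the bucket that contains the rank.
            share = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * share
        lower_bound, lower_count = upper_bound, count
    return math.nan


example_buckets = [(0.1, 1), (0.5, 3), (1.0, 4), (math.inf, 5)]
median = estimate_quantile(0.5, example_buckets)
p95 = estimate_quantile(0.95, example_buckets)
print(median, p95)
```

With these made-up buckets, the median lands inside the `le="0.5"` bucket and is interpolated to roughly 0.4, while the 95th percentile falls into the `+Inf` bucket and is therefore reported as the highest finite boundary. This is one reason why bucket boundaries of classic histograms limit the accuracy of the estimate.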
113+
114+
You can operate directly on the buckets of a classic histogram, as they are
115+
represented as individual series (called `<basename>_bucket{le="<upper
116+
inclusive bound>"}` as described above). Remember, however, that these buckets
117+
are [cumulative](https://en.wikipedia.org/wiki/Histogram#Cumulative_histogram),
118+
i.e. every bucket counts all observations less than or equal to the upper
119+
boundary provided as a label. With native histograms, use the
120+
[`histogram_fraction()`
121+
function](/docs/prometheus/latest/querying/functions/#histogram_fraction) to
122+
calculate fractions of observations within given boundaries.
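
The cumulative nature of classic buckets can be made concrete with a small sketch. The helper names `per_bucket_counts` and `fraction_at_or_below` are hypothetical and the numbers are invented; the second helper only works for a boundary that matches a configured bucket, which is the restriction that `histogram_fraction()` on native histograms is meant to relax.

```python
import math


def per_bucket_counts(cumulative):
    """Convert [(upper_bound, cumulative_count), ...] to per-bucket counts."""
    counts = []
    previous = 0
    for upper_bound, count in cumulative:
        # Subtract the previous cumulative count to undo the accumulation.
        counts.append((upper_bound, count - previous))
        previous = count
    return counts


def fraction_at_or_below(bound, cumulative):
    """Fraction of observations <= bound, for an exact bucket boundary."""
    total = cumulative[-1][1]  # the +Inf bucket holds the total count
    for upper_bound, count in cumulative:
        if upper_bound == bound:
            return count / total
    raise ValueError("bound does not match any configured bucket boundary")


cumulative = [(0.1, 1), (0.5, 3), (1.0, 4), (math.inf, 5)]
plain = per_bucket_counts(cumulative)
fraction = fraction_at_or_below(0.5, cumulative)
print(plain, fraction)
```

Here three of the five observations are at or below 0.5, so the fraction is 0.6, while the non-cumulative view shows that only two of them fall between 0.1 and 0.5.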
123+
124+
See [histograms and summaries](/docs/practices/histograms) for details of
125+
histogram usage and differences to [summaries](#summary).
79126

80127
NOTE: Beginning with Prometheus v3.0, the values of the `le` label of classic
81128
histograms are normalized during ingestion to follow the format of
82129
[OpenMetrics Canonical Numbers](https://github.com/prometheus/OpenMetrics/blob/main/specification/OpenMetrics.md#considerations-canonical-numbers).
83130

84-
Client library usage documentation for histograms:
131+
Instrumentation library usage documentation for histograms:
85132

86133
* [Go](http://godoc.org/github.com/prometheus/client_golang/prometheus#Histogram)
87134
* [Java](https://prometheus.github.io/client_java/getting-started/metric-types/#histogram)
@@ -111,7 +158,7 @@ to [histograms](#histogram).
111158
NOTE: Beginning with Prometheus v3.0, the values of the `quantile` label are normalized during
112159
ingestion to follow the format of [OpenMetrics Canonical Numbers](https://github.com/prometheus/OpenMetrics/blob/main/specification/OpenMetrics.md#considerations-canonical-numbers).
113160

114-
Client library usage documentation for summaries:
161+
Instrumentation library usage documentation for summaries:
115162

116163
* [Go](http://godoc.org/github.com/prometheus/client_golang/prometheus#Summary)
117164
* [Java](https://prometheus.github.io/client_java/getting-started/metric-types/#summary)

docs/practices/histograms.md

Lines changed: 3 additions & 3 deletions
Original file line number, diff line number, diff line change
@@ -401,9 +401,9 @@ Classic histogram version:
401401
histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) // GOOD.
402402

403403
Furthermore, should your SLO change and you now want to plot the 90th
404-
percentile, or you want to take into account the last 10 minutes
405-
instead of the last 5 minutes, you only have to adjust the expressions
406-
above and you do not need to reconfigure the clients.
404+
percentile, or you want to take into account the last 10 minutes instead of the
405+
last 5 minutes, you only have to adjust the expressions above and you do not
406+
need to reconfigure the instrumentation of the monitored programs.
407407

408408
### Errors of quantile estimation
409409
