Export BEAM processes to OTLP metrics endpoint #6188

Merged
cnkk merged 5 commits into master from beam-metrics-otel on Mar 24, 2026

Conversation

@cnkk
Member

@cnkk cnkk commented Mar 23, 2026

Vibed dashboard: https://plausible.grafana.net/d/beam-process-metrics/beam-process-metrics?orgId=1&from=now-1h&to=now&timezone=browser&refresh=30s&var-interval_s=10&var-top_n=20&var-process=$__all&var-namespace=plausible-app-6188

About the pinned opentelemetry_*_experimental deps:

  • The OTel metrics API isn't published yet, so both packages are pulled from git.
  • override: true is required because opentelemetry_exporter transitively depends on the 0.5.x Hex stubs, which don't contain the metrics modules (otel_meter, otel_exporter_metrics_otlp, etc.).
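
For illustration, the dependency entries described above might look roughly like the list below. This is a sketch, not the PR's actual mix.exs: the repository URL, the `sparse` paths, and the `<pinned-commit-sha>` placeholder are assumptions, and the real commit ref is not shown in this thread.

```elixir
# Sketch of the deps list entries (the value returned by deps/0 in mix.exs).
# ASSUMPTIONS: the git URL, the `sparse` paths and the ref placeholder are
# illustrative only; they are not taken from this PR.
[
  {:opentelemetry_api_experimental,
   git: "https://github.com/open-telemetry/opentelemetry-erlang.git",
   ref: "<pinned-commit-sha>",
   # check out only this directory of the repository
   sparse: "apps/opentelemetry_api_experimental",
   # take precedence over the 0.5.x Hex stub requested transitively
   override: true},
  {:opentelemetry_experimental,
   git: "https://github.com/open-telemetry/opentelemetry-erlang.git",
   ref: "<pinned-commit-sha>",
   sparse: "apps/opentelemetry_experimental",
   override: true},
  {:recon, "~> 2.5"}
]
```

Pinning both packages to the same ref keeps the API and SDK halves in step with each other.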

@cnkk cnkk added the preview label Mar 23, 2026
@cnkk cnkk marked this pull request as draft March 23, 2026 14:56
@github-actions

Preview environment👷🏼‍♀️🏗️
PR-6188

@cnkk cnkk marked this pull request as ready for review March 23, 2026 17:18
@cnkk
Member Author

cnkk commented Mar 23, 2026

@coderabbitai review

@coderabbitai
coderabbitai bot commented Mar 23, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
coderabbitai bot commented Mar 23, 2026

📝 Walkthrough

This pull request introduces experimental OpenTelemetry BEAM metrics collection. The changes add runtime configuration for conditional metrics export, a new BeamMetrics module that registers observable gauge instruments for monitoring BEAM processes, conditional initialization during application startup, and three new dependencies to support the functionality.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Configuration & Dependencies: config/runtime.exs, mix.exs | Adds conditional configuration for OpenTelemetry experimental metrics, with environment variables that toggle the feature and set the collection interval. Introduces three new dependencies: opentelemetry_api_experimental and opentelemetry_experimental (both from a pinned GitHub commit), and recon ~> 2.5. |
| OpenTelemetry Setup: lib/plausible/application.ex | Modifies setup_opentelemetry/0 to invoke Plausible.OpenTelemetry.BeamMetrics.setup/0 only when experimental metrics readers are configured. |
| BEAM Metrics Module: lib/plausible/open_telemetry/beam_metrics.ex | New module that registers three observable gauge instruments (memory, reductions, message queue length) and collects their values in a callback. The callback fetches the top processes with :recon.proc_count/2, drops processes that are no longer valid, and builds observations with process attributes (PID, registered name, current function, initial call). |
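
The configuration toggle and the conditional startup described in this table can be sketched as two small pure functions. This is a hedged illustration, not the code from the PR: the environment variable names are hypothetical, and the reader map shape (`:otel_metric_reader`, `export_interval_ms`) is an assumption about how opentelemetry_experimental is configured.

```elixir
defmodule BeamMetricsConfigSketch do
  # HYPOTHETICAL env var names; the PR's real names are not shown in this thread.
  @enabled_var "OTEL_BEAM_METRICS_ENABLED"
  @interval_var "OTEL_BEAM_METRICS_INTERVAL_MS"

  # Builds the reader list that config/runtime.exs could pass to
  # `config :opentelemetry_experimental, readers: ...`.
  # ASSUMPTION: the reader map shape below is illustrative.
  def readers(env) when is_map(env) do
    if Map.get(env, @enabled_var) == "true" do
      interval_ms = env |> Map.get(@interval_var, "10000") |> String.to_integer()

      [
        %{
          module: :otel_metric_reader,
          config: %{
            export_interval_ms: interval_ms,
            exporter: {:otel_exporter_metrics_otlp, %{}}
          }
        }
      ]
    else
      []
    end
  end

  # Mirrors the conditional in setup_opentelemetry/0: run `setup_fun`
  # only when at least one reader is configured.
  def maybe_setup(readers, setup_fun) when is_list(readers) and is_function(setup_fun, 0) do
    if readers == [], do: :skipped, else: setup_fun.()
  end
end
```

In a real runtime config the map would come from `System.get_env()`, and `setup_fun` would be a reference to the metrics module's setup/0.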

Sequence Diagram

sequenceDiagram
    participant App as Plausible.Application
    participant OTel as OpenTelemetry
    participant BM as BeamMetrics
    participant Recon as recon (BEAM)
    participant Exporter as OTLP Exporter

    App->>BM: setup_opentelemetry() calls setup()
    BM->>OTel: Create instrumentation scope & meter
    BM->>OTel: Register 3 observable gauges (memory, reductions, msg_queue)
    BM->>OTel: Register callback with meter
    BM-->>App: Return :ok

    loop Collection Cycle (every interval_ms)
        OTel->>BM: Invoke registered callback
        BM->>Recon: collect_observations() calls recon.proc_count()
        Recon-->>BM: Return top N processes
        BM->>BM: Filter valid processes & build attributes
        BM-->>OTel: Return {value, attributes} observations
        OTel->>Exporter: Export metric batch
    end
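
The collection cycle in the diagram can be approximated with the standard library alone. The sketch below is not the PR's implementation: it swaps `:recon.proc_count/2` for `Process.list/0` plus `Process.info/2` so that it runs without extra dependencies, and the module and function names are hypothetical. It returns `{value, attributes}` tuples in the shape the diagram describes.

```elixir
defmodule TopProcessesSketch do
  @metrics [:memory, :reductions, :message_queue_len]
  @info_keys [:registered_name, :current_function, :initial_call]

  # Returns up to `n` `{value, attributes}` tuples, largest `metric` first.
  def observations(metric, n) when metric in @metrics and is_integer(n) and n >= 0 do
    Process.list()
    |> Enum.flat_map(fn pid ->
      # Process.info/2 returns nil for a process that exited after
      # Process.list/0 was called, so dead processes are dropped here.
      case Process.info(pid, [metric | @info_keys]) do
        nil -> []
        info -> [{Keyword.fetch!(info, metric), attributes(pid, info)}]
      end
    end)
    |> Enum.sort_by(fn {value, _attributes} -> value end, :desc)
    |> Enum.take(n)
  end

  defp attributes(pid, info) do
    %{
      "beam.process.pid" => inspect(pid),
      "beam.process.registered_name" => format_name(info[:registered_name]),
      "beam.process.current_function" => format_mfa(info[:current_function]),
      "beam.process.initial_call" => format_mfa(info[:initial_call])
    }
  end

  # Unregistered processes report [] as their name when queried with a key list.
  defp format_name([]), do: ""
  defp format_name(name), do: inspect(name)

  defp format_mfa({module, function, arity}), do: Exception.format_mfa(module, function, arity)
  defp format_mfa(other), do: inspect(other)
end
```

Calling `TopProcessesSketch.observations(:memory, 20)` yields at most 20 observations, one per live process, ordered by memory use.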

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title directly and clearly summarizes the main change: exporting BEAM processes data to an OTLP metrics endpoint, which aligns perfectly with the file changes. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Description check | ✅ Passed | The pull request description references a Grafana dashboard link and provides context about the experimental OpenTelemetry dependencies being pinned, which relates to the implementation. |



@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (3)
lib/plausible/open_telemetry/beam_metrics.ex (2)

64-70: Consider simplifying the return structure.

The flat_map with a single-element list can be simplified to Enum.map:

♻️ Simplified implementation
   def observe_top_processes(_callback_args) do
-    Enum.flat_map(@metrics, fn metric ->
+    Enum.map(@metrics, fn metric ->
       {gauge_name, _opts} = Map.fetch!(@instruments, metric)
       observations = collect_observations(metric)
-      [{gauge_name, observations}]
+      {gauge_name, observations}
     end)
   end
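
A quick way to check that the two forms return the same list is the standalone sketch below. The `collect` function and the use of the metric atom as the gauge name are stand-ins for illustration; the real code looks the gauge name up in its instruments map.

```elixir
metrics = [:memory, :reductions, :message_queue_len]

# Stand-in for collect_observations/1; the real function queries the BEAM.
collect = fn metric -> [{0, %{"metric" => Atom.to_string(metric)}}] end

# flat_map over single-element lists concatenates them back into one list...
via_flat_map = Enum.flat_map(metrics, fn metric -> [{metric, collect.(metric)}] end)

# ...which is exactly what map produces directly.
via_map = Enum.map(metrics, fn metric -> {metric, collect.(metric)} end)

IO.inspect(via_flat_map == via_map)
# prints: true
```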
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/plausible/open_telemetry/beam_metrics.ex` around lines 64 - 70, The
observe_top_processes function is using Enum.flat_map to return single-item
lists; replace the flat_map with Enum.map so each metric maps directly to a
tuple instead of a single-element list. In the observe_top_processes function,
iterate over `@metrics` with Enum.map, fetch the instrument via
Map.fetch!(`@instruments`, metric) to get gauge_name, call
collect_observations(metric) for observations, and return {gauge_name,
observations} for each entry (preserving the current tuple shape used
elsewhere).

94-102: Numeric metric values are converted to strings in attributes.

The memory, reductions, and message_queue_len values are converted to strings in the attributes map. While these values are already captured as the gauge observation value, including them as string attributes may affect:

  1. Query capabilities: Numeric attributes enable range queries and aggregations in observability backends
  2. Storage efficiency: String representation uses more storage

If the string conversion is intentional (e.g., for label-based filtering in Grafana), this is fine. Otherwise, consider keeping them as integers or omitting them from attributes since they're already the observation values.

♻️ Keep numeric attributes as integers
     %{
       "beam.process.pid" => inspect(pid),
       "beam.process.registered_name" => registered_name,
       "beam.process.current_function" => current_function,
-      "beam.process.initial_call" => initial_call,
-      "beam.process.memory" => to_string(Keyword.get(info, :memory, 0)),
-      "beam.process.reductions" => to_string(Keyword.get(info, :reductions, 0)),
-      "beam.process.message_queue_len" => to_string(Keyword.get(info, :message_queue_len, 0))
+      "beam.process.initial_call" => initial_call
     }

Or keep as integers if needed for cross-referencing:

-      "beam.process.memory" => to_string(Keyword.get(info, :memory, 0)),
-      "beam.process.reductions" => to_string(Keyword.get(info, :reductions, 0)),
-      "beam.process.message_queue_len" => to_string(Keyword.get(info, :message_queue_len, 0))
+      "beam.process.memory" => Keyword.get(info, :memory, 0),
+      "beam.process.reductions" => Keyword.get(info, :reductions, 0),
+      "beam.process.message_queue_len" => Keyword.get(info, :message_queue_len, 0)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/plausible/open_telemetry/beam_metrics.ex` around lines 94 - 102, The
attributes map in lib/plausible/open_telemetry/beam_metrics.ex currently
converts numeric values to strings for "beam.process.memory",
"beam.process.reductions", and "beam.process.message_queue_len"; update the map
so these attributes remain numeric (or remove them entirely if redundant with
the gauge observation) by replacing the to_string(Keyword.get(info, :memory,
0)), to_string(Keyword.get(info, :reductions, 0)), and
to_string(Keyword.get(info, :message_queue_len, 0)) entries with their integer
values (e.g., Keyword.get(info, :memory, 0) etc.) or drop the keys, ensuring
this change is applied where the attributes map is constructed in the
BeamMetrics module so observability backends can treat them as numbers.
mix.exs (1)

112-123: Git-based experimental dependencies pinned to a specific commit.

Using experimental OpenTelemetry packages from git with a pinned ref is reasonable for this feature, but be aware:

  1. These packages are explicitly "experimental" and may have breaking changes
  2. The override: true flag can mask version conflicts with other OpenTelemetry dependencies
  3. Updates require manually tracking the upstream repository

Consider documenting the rationale for this specific commit ref and establishing a process for periodic updates.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mix.exs` around lines 112 - 123, The git-pinned experimental OpenTelemetry
deps (opentelemetry_api_experimental and opentelemetry_experimental) currently
pin a specific ref and use override: true without explanation; update mix.exs to
add a concise comment just above these dependency entries that documents why
this exact commit/ref is used, the risks (experimental/breaking changes and
override masking), and a defined update cadence or owner responsible for
tracking upstream (e.g., add a TODO with cadence and contact), and either
justify or remove the override: true setting for these deps in that comment so
future maintainers know whether the override is intentional.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 06e4b32d-a049-46dd-a503-240fe9b8c395

📥 Commits

Reviewing files that changed from the base of the PR and between fca00f9 and c9a2e44.

⛔ Files ignored due to path filters (1)
  • mix.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
  • config/runtime.exs
  • lib/plausible/application.ex
  • lib/plausible/open_telemetry/beam_metrics.ex
  • mix.exs

@cnkk cnkk added this pull request to the merge queue Mar 24, 2026
Merged via the queue into master with commit dc51b4c Mar 24, 2026
24 checks passed
@cnkk cnkk deleted the beam-metrics-otel branch March 24, 2026 10:00
2 participants