Overview

This epic refreshes the tracing telemetry plan to bring it in line with the canonical plugin/hook pattern established by the metrics (#443) and logging (#442) epics, both closed 2026-04-27.

The original plan (2026-02-10) and its nine sub-issues (#469–#477) predate two major refactors that reshape tracing's design surface:
- [...] ModelOutputThunk execution metadata into GenerationMetadata.

The original sub-issues have been retired and replaced by the phased plan below. See the companion comment on this epic for a feature-level mapping of retired issues to their replacements.

Current state (2026-05-07)

- Tracing lives in mellea/telemetry/tracing.py and mellea/telemetry/backend_instrumentation.py.
- Uses direct inline instrumentation in all 5 backends (~20 call sites); no plugin/hook usage.
- No tracing_plugins.py; tracing is the only telemetry pillar still on pre-plugin patterns.
- [...] mot._meta["_telemetry_span"] so it survives coroutine boundaries; collides with ModelOutputThunk structural cleanup - ._meta partitioning, raw responses, _thinking #909 item 2.
- [...] mot.generation (the normalized GenerationMetadata).
- [...] gen_ai.system attribute (current GenAI semconv is gen_ai.provider.name, which metrics uses). @planetf1's PR feat(telemetry): close five OTel GenAI semantic convention emission gaps (#1035) #1036 is adding dual-emission.
- Env vars split and non-uniform: MELLEA_TRACE_APPLICATION, MELLEA_TRACE_BACKEND, MELLEA_TRACE_CONSOLE. No single umbrella flag. Only honors generic OTEL_EXPORTER_OTLP_ENDPOINT; no OTEL_EXPORTER_OTLP_TRACES_ENDPOINT support.
- [...] importlib.reload (vs logging's lazy init).
- [...] core/base.py:520–541 before the generation_error hook fires; would block a tracing plugin hooking that event.
- [...] gen_ai.conversation.id, most gen_ai.request.* parameters, SpanKind.CLIENT on backend calls.

Scope

Phase 1: Foundation (serial, blocking)

These must land before Phase 2/3. Likely one PR, separate tracking issues for clarity.
- [...] mellea/telemetry/tracing_plugins.py, move instrumentation out of backends, read from mot.generation. Naturally resolves ModelOutputThunk structural cleanup - ._meta partitioning, raw responses, _thinking #909 item 2 (removes mot._meta["_telemetry_span"]).
- refactor: rename tracing env vars to plural and align with OTel semconv #1046: Rename env vars to plural MELLEA_TRACES_* (aligned with OTEL_EXPORTER_OTLP_TRACES_ENDPOINT and MELLEA_METRICS_*), add umbrella enable flag, align attribute names (gen_ai.provider.name), honor OTEL_EXPORTER_OTLP_TRACES_ENDPOINT, switch to lazy init.
- [...] generation_error plugin hook so backends and core code no longer touch span state directly.

Phase 2: Coverage (parallelizable once Phase 1 lands)

- [...] stdlib/session.py / stdlib/functional.py).
- [...] tool_post_invoke hook (replaces Instrument tool calling to track tool invocations, arguments, and results #474).

Phase 3: Polish

- [...] classify_error and ERROR_TYPE_* from mellea/telemetry/metrics_plugins.py instead of a parallel telemetry/errors.py (retires Create a standardized error classification system that maps exceptions to semantic error types for consistent error reporting in traces #475).

Acceptance Criteria

- mellea/telemetry/tracing_plugins.py exists and parallels metrics_plugins.py in shape.
- No backend file imports tracing helpers directly.
- mot._meta["_telemetry_span"] is removed.
- Env vars renamed to plural MELLEA_TRACES_* (aligned with OTel standard partner var and MELLEA_METRICS_*). Old MELLEA_TRACE_* names emit deprecation warnings for one release.
- All emitted gen_ai.* attributes match current OTel GenAI semconv (gen_ai.provider.name, not gen_ai.system).
- Span events emitted for streaming milestones (TTFB, first token, complete).
- Tool calls produce spans via tool_post_invoke.
- OTEL_EXPORTER_OTLP_TRACES_ENDPOINT is respected when set.

Coordination

- Hook timing in streaming path: epic feat: streaming validation - per-chunk requirement checking with early exit #891 has an open concern about when GENERATION_POST_CALL fires in the streaming path; whatever resolves that concern will also determine how cleanly a tracing plugin on the same hook can observe streaming generations.
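For the env var items above (plural MELLEA_TRACES_* names with a one-release deprecation window, and honoring OTEL_EXPORTER_OTLP_TRACES_ENDPOINT), a minimal stdlib-only sketch of the lookup order might look like the following. This is not the planned implementation: the concrete new names (MELLEA_TRACES_APPLICATION and so on), the helper names read_flag and resolve_traces_endpoint, and the set of accepted truthy strings are all assumptions made for illustration. Only the MELLEA_TRACES_* prefix, the three old names, and the two OTEL_* variables come from this epic.

```python
import os
import warnings
from typing import Mapping, Optional

# Hypothetical new-name -> old-name mapping. The plan only fixes the
# MELLEA_TRACES_* prefix; the exact suffixes here are assumed.
_RENAMED = {
    "MELLEA_TRACES_APPLICATION": "MELLEA_TRACE_APPLICATION",
    "MELLEA_TRACES_BACKEND": "MELLEA_TRACE_BACKEND",
    "MELLEA_TRACES_CONSOLE": "MELLEA_TRACE_CONSOLE",
}

# Assumed set of strings treated as "enabled".
_TRUTHY = {"1", "true", "yes", "on"}


def read_flag(new_name: str, env: Mapping[str, str] = os.environ) -> bool:
    """Read a boolean flag, preferring the new name over the deprecated one."""
    if new_name in env:
        return env[new_name].strip().lower() in _TRUTHY
    old_name = _RENAMED.get(new_name)
    if old_name is not None and old_name in env:
        # The old name still works, but the caller is told to migrate.
        warnings.warn(
            f"{old_name} is deprecated; use {new_name} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return env[old_name].strip().lower() in _TRUTHY
    return False


def resolve_traces_endpoint(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Prefer the traces-specific OTLP endpoint, then the generic one."""
    return env.get("OTEL_EXPORTER_OTLP_TRACES_ENDPOINT") or env.get(
        "OTEL_EXPORTER_OTLP_ENDPOINT"
    )
```

Passing the environment as a mapping keeps both helpers testable without mutating os.environ, and calling them on first use rather than at import time would fit the lazy-init goal listed in Phase 1.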
Related / Retired