obs: telemetry wrapper intermittent writeDataPoint drops (follow-up to #157) #212

Merged
klappy merged 1 commit into main from obs/telemetry-wrapper-emit-loss-2026-05-16
May 16, 2026

Conversation

Owner

@klappy klappy commented May 16, 2026

Diagnostic observation documenting a residual gap in the per-tool withTelemetry wrapper introduced by klappy/oddkit#157 and promoted by klappy/oddkit#162.

What

Two smoke passes against worker_version = '0.28.0' (main preview + prod) each dropped 2 telemetry rows out of 16 successful handler returns. Different tools dropped on different runs. Retries emit cleanly.

Suggested fix

Wrap the wrapper's emit in ctx.waitUntil() so the isolate stays alive through the Analytics Engine (AE) flush. Small change, no architecture impact.
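
A minimal sketch of what that change could look like. The wrapper signature, the `Dataset` and `Ctx` interfaces, and the data point fields here are illustrative assumptions, not the actual withTelemetry code, which this PR does not show. Note also that the Workers types declare `writeDataPoint` as returning `void`, so whether `waitUntil` alone closes the gap would still need a smoke pass to confirm.

```typescript
// Minimal structural types so the sketch is self-contained. In a real Worker
// these would come from @cloudflare/workers-types (assumption for illustration).
interface DataPoint { indexes?: string[]; blobs?: string[]; doubles?: number[] }
interface Dataset { writeDataPoint(point: DataPoint): void }
interface Ctx { waitUntil(promise: Promise<unknown>): void }

type ToolHandler<A, R> = (args: A, ctx: Ctx) => Promise<R>;

// Hypothetical wrapper shape: runs the handler, then emits one telemetry row.
function withTelemetry<A, R>(
  tool: string,
  dataset: Dataset,
  handler: ToolHandler<A, R>,
): ToolHandler<A, R> {
  return async (args, ctx) => {
    const started = Date.now();
    const result = await handler(args, ctx);

    // Run the emit inside a promise and swallow its errors, so a telemetry
    // failure can never fail the tool call itself.
    const emit = (async () => {
      dataset.writeDataPoint({
        indexes: [tool],
        blobs: [tool],
        doubles: [Date.now() - started],
      });
    })().catch(() => {});

    // Hand the promise to the runtime so the invocation is kept alive until
    // it settles, even if the response stream has already closed.
    ctx.waitUntil(emit);
    return result;
  };
}
```

The handler's return value is unchanged, so callers would not need to be touched; only the emit path is handed to the runtime.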

Why filed here and not as an issue

The PAT scope for this work covers contents + PRs but not issues on klappy/oddkit. Canon observation is the natural durable format for this knowledge anyway — the diagnostic belongs in the searchable knowledge base, not buried in a GitHub issue tracker.

Not blocking the shipped promotion

klappy/oddkit#162 already shipped. The wrapper works on most calls; the residual drop (2 of 16 rows per pass, 12.5%) is strictly better than the pre-wrapper 27% wire-edge gap. This is a follow-up fix, not a regression.


Note

Low Risk
Docs-only change adding an internal observation; no production code or runtime behavior is modified.

Overview
Adds a new Canon observation documenting intermittent loss of writeDataPoint telemetry rows from the withTelemetry wrapper despite successful tool execution.

Captures smoke-test evidence across preview/prod and proposes a follow-up implementation fix (wrapping emission in ctx.waitUntil) to avoid Cloudflare Workers isolate lifecycle flush races.

Reviewed by Cursor Bugbot for commit e75e559.

Two consecutive smoke passes against worker_version 0.28.0 (main preview
and prod) each emitted only 14 of 16 tool_call rows despite all
handlers returning successfully. The dropped tools differ between runs,
ruling out per-tool wrapper attachment as the cause.

Most likely cause: writeDataPoint enqueue lost when SSE response closes
isolate before AE flush completes. Fix: wrap emit in ctx.waitUntil.

Not a promotion blocker — wrapper works on most calls, residual drop is
strictly better than the pre-wrapper 27% wire-edge gap. Per-tool
correctness (bytes_in matches no-space JSON of args exactly when
emitted) is verified.
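
The bytes_in check mentioned above can be sketched as follows. The helper name `bytesIn` is hypothetical, and measuring UTF-8 bytes (rather than string length) is an assumption about how the wrapper counts; the source only states that bytes_in matches the no-space JSON of the args.

```typescript
// Hypothetical helper: byte length of the compact JSON encoding of a tool's args.
function bytesIn(args: unknown): number {
  // JSON.stringify without an indent argument emits no insignificant whitespace.
  const compact = JSON.stringify(args);
  // Count UTF-8 bytes rather than UTF-16 code units, so non-ASCII args
  // are measured by their encoded size (assumption about the metric).
  return new TextEncoder().encode(compact).length;
}

// {"q":"hi","n":1} is 16 ASCII characters, hence 16 bytes.
const example = bytesIn({ q: "hi", n: 1 });
```

Comparing this value against the bytes_in field of an emitted row would be one way to repeat the per-tool correctness check.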
@github-actions

Canon Quality — Frontmatter Schema ✅

All 41 file(s) in writings/ conform to klappy://canon/meta/frontmatter-schema.

Validator: scripts/validate-frontmatter.py · Canon: klappy://canon/constraints/frontmatter-validation-before-merge · Run: #156

@github-actions

Canon Quality — oddkit_audit

No dead klappy:// references or legacy link patterns found in writings/. 42 files scanned.

Spec: klappy://docs/oddkit/specs/oddkit-audit · Workflow: .github/workflows/canon-quality.yml · Run: #156

@klappy klappy merged commit 5aaaf86 into main May 16, 2026
3 checks passed