Diagnose log injection smoke test flakiness instead of masking it #11075

Draft
bm1549 wants to merge 3 commits into master from brian.marks/fix-log-injection-smoke-test-flake

Conversation

bm1549 (Contributor) commented Apr 9, 2026

What Does This Do

Adds diagnostic instrumentation to the `check raw file injection` smoke test so that the next CI failure reveals the root cause instead of a bare "Condition not satisfied after 30s" with traceCount=0.

Changes to LogInjectionSmokeTest:

  1. waitForTraceCountAlive: checks process liveness on every poll iteration; if the process dies, it fails immediately with the exit code and the last 20 lines of process output
  2. Enriched timeout errors: on timeout, dumps whether the process is alive, traceCount, RC polls received, and the last 30 lines of process output
  3. Call waitForTraceCount(4) before waitFor, so all traces are confirmed while the process is still alive, and assert the waitFor return value, so a hung process produces a clear error
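
The fail-fast polling described in item 1 can be sketched roughly as follows. This is a minimal Java illustration, not the Groovy code from this PR: the `TracePolling` class, its parameters, and the use of suppliers in place of the test's real process handle and trace counter are all assumptions made to keep the example self-contained.

```java
import java.util.List;
import java.util.function.BooleanSupplier;
import java.util.function.IntSupplier;
import java.util.function.Supplier;

// Hypothetical helper; names and signature are illustrative, not taken from the PR.
final class TracePolling {

  /**
   * Polls until at least `expected` traces have arrived. Fails immediately if the
   * process is no longer alive instead of waiting out the full timeout.
   */
  static void waitForTraceCountAlive(
      IntSupplier traceCount,
      BooleanSupplier processAlive,
      IntSupplier exitCode,
      Supplier<List<String>> processOutput,
      int expected,
      long timeoutMillis,
      long pollMillis) {
    long deadline = System.currentTimeMillis() + timeoutMillis;
    while (true) {
      int received = traceCount.getAsInt();
      // Check the success condition first: a process that exits normally after
      // delivering every trace must not be reported as a crash.
      if (received >= expected) {
        return;
      }
      if (!processAlive.getAsBoolean()) {
        throw new IllegalStateException(
            "Process exited with code " + exitCode.getAsInt()
                + " while waiting for " + expected + " traces (received " + received + ").\n"
                + "Last process output:\n" + tail(processOutput.get(), 20));
      }
      if (System.currentTimeMillis() >= deadline) {
        throw new IllegalStateException(
            "Timed out waiting for " + expected + " traces; traceCount=" + received);
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while polling for traces", e);
      }
    }
  }

  /** Returns the last `n` lines joined by newlines (all lines if there are fewer than `n`). */
  static String tail(List<String> lines, int n) {
    return String.join("\n", lines.subList(Math.max(0, lines.size() - n), lines.size()));
  }
}
```

With a real `java.lang.Process`, the liveness and exit-code suppliers could be `process::isAlive` and `process::exitValue`.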

Motivation

CI Visibility data for the last 30 days on master shows 10 failures of check raw file injection:

| Failure mode | Count | Line | Duration | Root cause |
|---|---|---|---|---|
| traceCount=0 at waitForTraceCount(2) | 9/10 | 368 | 30.3s | Unknown (no diagnostics) |
| logLines.size()=3 at assertRawLogLinesWithInjection | 1/10 | 229 | 8.3s | Incomplete log file |

The failure distribution is bimodal: successful runs complete in 3.5-8.7s (80 data points, none above 9s), while the traceCount=0 failures all sit at 30.3s, which is the 30s polling timeout. There is nothing in between. The process either delivers traces within seconds or never delivers them at all, so a timeout increase would only delay the same failure.

<9s:  ████████████████████████████████████████  80/80 passes
9-30s:                                           0 runs
30s:  █████████                                  9/10 failures (at timeout)

The current test is blind during the wait — it just polls traceCount in a loop. We don't know if the process crashed, hung during agent init, failed to connect to the test server, or something else entirely. This PR makes the next failure self-diagnosing.

Example output when process crashes:

Process exited with code 1 while waiting for 2 traces (received 0, RC polls: 3).
Last process output:
[dd.trace ...] ERROR ... NullPointerException during instrumentation
...

Example output on timeout (process alive but not sending traces):

Timed out waiting for 2 traces after 30s. traceCount=0, process.alive=true, RC polls received: 142.
Last process output:
[dd.trace ...] DEBUG ... Still loading instrumentations...
...
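
For reference, a message with the shape of the timeout example above could be assembled like this. It is a small Java sketch, not the PR's Groovy code; `TimeoutDiagnostics` and its parameters are hypothetical names chosen for illustration.

```java
import java.util.List;

// Hypothetical formatter; the message layout mirrors the timeout example above.
final class TimeoutDiagnostics {

  /** Builds the enriched timeout message, appending the last `tailLines` lines of output. */
  static String describe(
      int expectedTraces,
      int timeoutSeconds,
      int traceCount,
      boolean processAlive,
      int rcPolls,
      List<String> processOutput,
      int tailLines) {
    int from = Math.max(0, processOutput.size() - tailLines);
    return "Timed out waiting for " + expectedTraces + " traces after " + timeoutSeconds + "s. "
        + "traceCount=" + traceCount
        + ", process.alive=" + processAlive
        + ", RC polls received: " + rcPolls + ".\n"
        + "Last process output:\n"
        + String.join("\n", processOutput.subList(from, processOutput.size()));
  }
}
```

Keeping every field on the first line means the state is still visible when a CI report truncates the failure message.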

Additional Notes

  • Only LogInjectionSmokeTest.groovy is changed
  • No timeout increase — the 30s defaultPoll is kept as-is
  • All 11 historically flaky backends pass locally
  • rcClientMessages.size() tells us whether the agent connected to the test server at all (RC polls hit /v0.7/config every 200ms)
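
One way to read these diagnostics together is sketched below in Java. The `FlakeTriage` class and its verdict names are hypothetical and reflect an interpretation of the notes above, not logic from the PR. The only figure taken from the source is the 200ms RC poll interval, and the elapsed-time estimate assumes polls arrive at a steady rate.

```java
// Hypothetical triage helper: one possible reading of the diagnostics, not code from the PR.
final class FlakeTriage {

  enum Verdict {
    TRACES_ARRIVED,        // at least one trace was received
    PROCESS_EXITED,        // the process died before sending any trace
    AGENT_NEVER_POLLED,    // alive, but no RC poll reached the test server
    POLLING_BUT_NO_TRACES  // alive and polling RC, yet no trace was sent
  }

  // RC polls hit /v0.7/config every 200ms according to the note above.
  static final long RC_POLL_INTERVAL_MILLIS = 200;

  static Verdict classify(boolean processAlive, int traceCount, int rcPolls) {
    if (traceCount > 0) {
      return Verdict.TRACES_ARRIVED;
    }
    if (!processAlive) {
      return Verdict.PROCESS_EXITED;
    }
    if (rcPolls == 0) {
      return Verdict.AGENT_NEVER_POLLED;
    }
    return Verdict.POLLING_BUT_NO_TRACES;
  }

  /** Rough time the agent has been polling, assuming one poll per interval. */
  static long approxPollingMillis(int rcPolls) {
    return rcPolls * RC_POLL_INTERVAL_MILLIS;
  }
}
```

Under that assumption, the 142 polls in the timeout example correspond to about 28.4s of polling, which would mean the agent reached the test server early in the 30s window.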

Contributor Checklist

tag: no release notes
tag: ai generated

🤖 Generated with Claude Code

The `check raw file injection` test has been flaking across 11+ logging
backend variants for months. CI Visibility data shows 90% of failures are
`traceCount=0` at `waitForTraceCount(2)` after exactly 30s — the JVM +
agent bytecode instrumentation simply takes >30s on overloaded CI machines.

Changes:
- Add `startupPoll` with 120s timeout for the initial `waitForTraceCount(2)`
  that covers JVM startup + agent init, giving 4x headroom over the current
  30s `defaultPoll`
- Add `waitForTraceCountAlive` that checks process liveness on each poll
  iteration, turning silent 30-120s timeouts into instant, actionable errors
  when the process crashes
- Reorder `waitForTraceCount(4)` before `waitFor` to confirm all traces are
  delivered while the process is still alive
- Assert `waitFor` return value for a clear error if the process hangs

tag: no release notes

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
bm1549 added the labels "type: bug" (Bug report and fix), "comp: core" (Tracer core), "tag: no release notes" (Changes to exclude from release notes), and "tag: ai generated" (Largely based on code generated by an AI or LLM) on Apr 9, 2026

pr-commenter bot commented Apr 9, 2026

Benchmarks

Startup

Parameters

| Parameter | Baseline | Candidate |
|---|---|---|
| baseline_or_candidate | baseline | candidate |
| git_branch | master | brian.marks/fix-log-injection-smoke-test-flake |
| git_commit_date | 1775744045 | 1775764729 |
| git_commit_sha | b266e2d | 9a554fc |
| release_version | 1.62.0-SNAPSHOT~b266e2d0c2 | 1.62.0-SNAPSHOT~9a554fcbf3 |

See matching parameters

| Parameter | Baseline | Candidate |
|---|---|---|
| application | insecure-bank | insecure-bank |
| ci_job_date | 1775771791 | 1775771791 |
| ci_job_id | 1583602920 | 1583602920 |
| ci_pipeline_id | 106987322 | 106987322 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-0-859ey7d8 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-0-859ey7d8 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| module | Agent | Agent |
| parent | None | None |

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 61 metrics, 10 unstable metrics.

Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.62.0-SNAPSHOT~9a554fcbf3, baseline=1.62.0-SNAPSHOT~b266e2d0c2

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.068 s) : 0, 1068067
Total [baseline] (8.849 s) : 0, 8849434
Agent [candidate] (1.06 s) : 0, 1059507
Total [candidate] (8.835 s) : 0, 8834611
section iast
Agent [baseline] (1.224 s) : 0, 1224300
Total [baseline] (9.572 s) : 0, 9572168
Agent [candidate] (1.226 s) : 0, 1225816
Total [candidate] (9.582 s) : 0, 9581846
  • baseline results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.068 s | - |
| Agent | iast | 1.224 s | 156.232 ms (14.6%) |
| Total | tracing | 8.849 s | - |
| Total | iast | 9.572 s | 722.733 ms (8.2%) |

  • candidate results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.06 s | - |
| Agent | iast | 1.226 s | 166.309 ms (15.7%) |
| Total | tracing | 8.835 s | - |
| Total | iast | 9.582 s | 747.235 ms (8.5%) |

gantt
    title insecure-bank - break down per module: candidate=1.62.0-SNAPSHOT~9a554fcbf3, baseline=1.62.0-SNAPSHOT~b266e2d0c2

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.251 ms) : 0, 1251
crashtracking [candidate] (1.224 ms) : 0, 1224
BytebuddyAgent [baseline] (638.326 ms) : 0, 638326
BytebuddyAgent [candidate] (633.791 ms) : 0, 633791
AgentMeter [baseline] (29.743 ms) : 0, 29743
AgentMeter [candidate] (29.517 ms) : 0, 29517
GlobalTracer [baseline] (251.297 ms) : 0, 251297
GlobalTracer [candidate] (249.054 ms) : 0, 249054
AppSec [baseline] (32.346 ms) : 0, 32346
AppSec [candidate] (31.921 ms) : 0, 31921
Debugger [baseline] (60.086 ms) : 0, 60086
Debugger [candidate] (59.322 ms) : 0, 59322
Remote Config [baseline] (615.472 µs) : 0, 615
Remote Config [candidate] (594.755 µs) : 0, 595
Telemetry [baseline] (8.223 ms) : 0, 8223
Telemetry [candidate] (8.115 ms) : 0, 8115
Flare Poller [baseline] (9.78 ms) : 0, 9780
Flare Poller [candidate] (9.731 ms) : 0, 9731
section iast
crashtracking [baseline] (1.236 ms) : 0, 1236
crashtracking [candidate] (1.23 ms) : 0, 1230
BytebuddyAgent [baseline] (800.972 ms) : 0, 800972
BytebuddyAgent [candidate] (802.871 ms) : 0, 802871
AgentMeter [baseline] (11.395 ms) : 0, 11395
AgentMeter [candidate] (11.478 ms) : 0, 11478
GlobalTracer [baseline] (239.817 ms) : 0, 239817
GlobalTracer [candidate] (240.472 ms) : 0, 240472
AppSec [baseline] (31.14 ms) : 0, 31140
AppSec [candidate] (33.427 ms) : 0, 33427
Debugger [baseline] (60.062 ms) : 0, 60062
Debugger [candidate] (56.782 ms) : 0, 56782
Remote Config [baseline] (539.582 µs) : 0, 540
Remote Config [candidate] (517.713 µs) : 0, 518
Telemetry [baseline] (13.207 ms) : 0, 13207
Telemetry [candidate] (13.151 ms) : 0, 13151
Flare Poller [baseline] (3.612 ms) : 0, 3612
Flare Poller [candidate] (3.403 ms) : 0, 3403
IAST [baseline] (25.906 ms) : 0, 25906
IAST [candidate] (25.928 ms) : 0, 25928
Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.62.0-SNAPSHOT~9a554fcbf3, baseline=1.62.0-SNAPSHOT~b266e2d0c2

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.06 s) : 0, 1060135
Total [baseline] (11.099 s) : 0, 11098805
Agent [candidate] (1.074 s) : 0, 1073912
Total [candidate] (11.183 s) : 0, 11183254
section appsec
Agent [baseline] (1.25 s) : 0, 1250296
Total [baseline] (11.213 s) : 0, 11212607
Agent [candidate] (1.255 s) : 0, 1255449
Total [candidate] (11.269 s) : 0, 11268944
section iast
Agent [baseline] (1.225 s) : 0, 1224831
Total [baseline] (11.304 s) : 0, 11303781
Agent [candidate] (1.224 s) : 0, 1223713
Total [candidate] (11.347 s) : 0, 11347404
section profiling
Agent [baseline] (1.186 s) : 0, 1185957
Total [baseline] (11.021 s) : 0, 11020816
Agent [candidate] (1.184 s) : 0, 1184064
Total [candidate] (11.091 s) : 0, 11090510
  • baseline results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.06 s | - |
| Agent | appsec | 1.25 s | 190.162 ms (17.9%) |
| Agent | iast | 1.225 s | 164.696 ms (15.5%) |
| Agent | profiling | 1.186 s | 125.823 ms (11.9%) |
| Total | tracing | 11.099 s | - |
| Total | appsec | 11.213 s | 113.802 ms (1.0%) |
| Total | iast | 11.304 s | 204.975 ms (1.8%) |
| Total | profiling | 11.021 s | -77.99 ms (-0.7%) |

  • candidate results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.074 s | - |
| Agent | appsec | 1.255 s | 181.537 ms (16.9%) |
| Agent | iast | 1.224 s | 149.801 ms (13.9%) |
| Agent | profiling | 1.184 s | 110.152 ms (10.3%) |
| Total | tracing | 11.183 s | - |
| Total | appsec | 11.269 s | 85.69 ms (0.8%) |
| Total | iast | 11.347 s | 164.149 ms (1.5%) |
| Total | profiling | 11.091 s | -92.744 ms (-0.8%) |

gantt
    title petclinic - break down per module: candidate=1.62.0-SNAPSHOT~9a554fcbf3, baseline=1.62.0-SNAPSHOT~b266e2d0c2

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.249 ms) : 0, 1249
crashtracking [candidate] (1.258 ms) : 0, 1258
BytebuddyAgent [baseline] (633.195 ms) : 0, 633195
BytebuddyAgent [candidate] (641.6 ms) : 0, 641600
AgentMeter [baseline] (29.293 ms) : 0, 29293
AgentMeter [candidate] (29.873 ms) : 0, 29873
GlobalTracer [baseline] (249.556 ms) : 0, 249556
GlobalTracer [candidate] (252.545 ms) : 0, 252545
AppSec [baseline] (32.047 ms) : 0, 32047
AppSec [candidate] (32.56 ms) : 0, 32560
Debugger [baseline] (59.996 ms) : 0, 59996
Debugger [candidate] (60.974 ms) : 0, 60974
Remote Config [baseline] (601.662 µs) : 0, 602
Remote Config [candidate] (606.693 µs) : 0, 607
Telemetry [baseline] (8.12 ms) : 0, 8120
Telemetry [candidate] (8.226 ms) : 0, 8226
Flare Poller [baseline] (9.842 ms) : 0, 9842
Flare Poller [candidate] (9.751 ms) : 0, 9751
section appsec
crashtracking [baseline] (1.23 ms) : 0, 1230
crashtracking [candidate] (1.233 ms) : 0, 1233
BytebuddyAgent [baseline] (661.306 ms) : 0, 661306
BytebuddyAgent [candidate] (666.516 ms) : 0, 666516
AgentMeter [baseline] (12.116 ms) : 0, 12116
AgentMeter [candidate] (12.135 ms) : 0, 12135
GlobalTracer [baseline] (250.33 ms) : 0, 250330
GlobalTracer [candidate] (250.0 ms) : 0, 250000
AppSec [baseline] (185.036 ms) : 0, 185036
AppSec [candidate] (184.951 ms) : 0, 184951
Debugger [baseline] (66.366 ms) : 0, 66366
Debugger [candidate] (66.622 ms) : 0, 66622
Remote Config [baseline] (619.492 µs) : 0, 619
Remote Config [candidate] (606.051 µs) : 0, 606
Telemetry [baseline] (8.678 ms) : 0, 8678
Telemetry [candidate] (8.651 ms) : 0, 8651
Flare Poller [baseline] (3.515 ms) : 0, 3515
Flare Poller [candidate] (3.586 ms) : 0, 3586
IAST [baseline] (24.675 ms) : 0, 24675
IAST [candidate] (24.683 ms) : 0, 24683
section iast
crashtracking [baseline] (1.235 ms) : 0, 1235
crashtracking [candidate] (1.223 ms) : 0, 1223
BytebuddyAgent [baseline] (801.776 ms) : 0, 801776
BytebuddyAgent [candidate] (801.43 ms) : 0, 801430
AgentMeter [baseline] (11.415 ms) : 0, 11415
AgentMeter [candidate] (11.39 ms) : 0, 11390
GlobalTracer [baseline] (239.354 ms) : 0, 239354
GlobalTracer [candidate] (239.093 ms) : 0, 239093
AppSec [baseline] (31.846 ms) : 0, 31846
AppSec [candidate] (33.159 ms) : 0, 33159
Debugger [baseline] (59.442 ms) : 0, 59442
Debugger [candidate] (57.981 ms) : 0, 57981
Remote Config [baseline] (540.82 µs) : 0, 541
Remote Config [candidate] (530.053 µs) : 0, 530
Telemetry [baseline] (12.909 ms) : 0, 12909
Telemetry [candidate] (13.349 ms) : 0, 13349
Flare Poller [baseline] (3.426 ms) : 0, 3426
Flare Poller [candidate] (3.429 ms) : 0, 3429
IAST [baseline] (26.564 ms) : 0, 26564
IAST [candidate] (25.817 ms) : 0, 25817
section profiling
crashtracking [baseline] (1.181 ms) : 0, 1181
crashtracking [candidate] (1.164 ms) : 0, 1164
BytebuddyAgent [baseline] (693.053 ms) : 0, 693053
BytebuddyAgent [candidate] (691.543 ms) : 0, 691543
AgentMeter [baseline] (9.163 ms) : 0, 9163
AgentMeter [candidate] (9.131 ms) : 0, 9131
GlobalTracer [baseline] (206.862 ms) : 0, 206862
GlobalTracer [candidate] (206.912 ms) : 0, 206912
AppSec [baseline] (32.47 ms) : 0, 32470
AppSec [candidate] (32.576 ms) : 0, 32576
Debugger [baseline] (65.757 ms) : 0, 65757
Debugger [candidate] (65.603 ms) : 0, 65603
Remote Config [baseline] (587.963 µs) : 0, 588
Remote Config [candidate] (572.814 µs) : 0, 573
Telemetry [baseline] (7.871 ms) : 0, 7871
Telemetry [candidate] (7.863 ms) : 0, 7863
Flare Poller [baseline] (3.559 ms) : 0, 3559
Flare Poller [candidate] (3.57 ms) : 0, 3570
ProfilingAgent [baseline] (94.125 ms) : 0, 94125
ProfilingAgent [candidate] (93.889 ms) : 0, 93889
Profiling [baseline] (94.698 ms) : 0, 94698
Profiling [candidate] (94.466 ms) : 0, 94466

Load

Parameters

| Parameter | Baseline | Candidate |
|---|---|---|
| baseline_or_candidate | baseline | candidate |
| git_branch | master | brian.marks/fix-log-injection-smoke-test-flake |
| git_commit_date | 1775744045 | 1775764729 |
| git_commit_sha | b266e2d | 9a554fc |
| release_version | 1.62.0-SNAPSHOT~b266e2d0c2 | 1.62.0-SNAPSHOT~9a554fcbf3 |

See matching parameters

| Parameter | Baseline | Candidate |
|---|---|---|
| application | insecure-bank | insecure-bank |
| ci_job_date | 1775772278 | 1775772278 |
| ci_job_id | 1583602921 | 1583602921 |
| ci_pipeline_id | 106987322 | 106987322 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-0-li2j6yne 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-0-li2j6yne 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |

Summary

Found 3 performance improvements and 2 performance regressions! Performance is the same for 16 metrics, 15 unstable metrics.

| scenario | Δ mean agg_http_req_duration_p50 | Δ mean agg_http_req_duration_p95 | Δ mean throughput | candidate mean agg_http_req_duration_p50 | candidate mean agg_http_req_duration_p95 | candidate mean throughput | baseline mean agg_http_req_duration_p50 | baseline mean agg_http_req_duration_p95 | baseline mean throughput |
|---|---|---|---|---|---|---|---|---|---|
| scenario:load:petclinic:no_agent:high_load | better: [-1.808ms; -0.485ms] or [-9.685%; -2.597%] | unsure: [-3.163ms; -0.213ms] or [-10.144%; -0.684%] | unstable: [-11.942op/s; +41.880op/s] or [-4.911%; +17.223%] | 17.526ms | 29.498ms | 258.125op/s | 18.673ms | 31.186ms | 243.156op/s |
| scenario:load:petclinic:appsec:high_load | worse: [+0.459ms; +1.619ms] or [+2.454%; +8.651%] | worse: [+0.701ms; +2.494ms] or [+2.327%; +8.282%] | unstable: [-35.920op/s; +13.732op/s] or [-14.629%; +5.593%] | 19.750ms | 31.706ms | 234.438op/s | 18.711ms | 30.109ms | 245.531op/s |
| scenario:load:petclinic:profiling:high_load | better: [-1.697ms; -0.737ms] or [-8.895%; -3.862%] | better: [-2.469ms; -0.865ms] or [-7.945%; -2.783%] | unstable: [-12.744op/s; +38.057op/s] or [-5.265%; +15.722%] | 17.862ms | 29.414ms | 254.719op/s | 19.079ms | 31.082ms | 242.062op/s |

Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~9a554fcbf3, baseline=1.62.0-SNAPSHOT~b266e2d0c2
    dateFormat X
    axisFormat %s
section baseline
no_agent (19.2 ms) : 19005, 19394
.   : milestone, 19200,
appsec (19.006 ms) : 18815, 19197
.   : milestone, 19006,
code_origins (18.062 ms) : 17883, 18240
.   : milestone, 18062,
iast (17.73 ms) : 17556, 17904
.   : milestone, 17730,
profiling (19.284 ms) : 19086, 19482
.   : milestone, 19284,
tracing (17.86 ms) : 17682, 18037
.   : milestone, 17860,
section candidate
no_agent (18.081 ms) : 17896, 18265
.   : milestone, 18081,
appsec (19.911 ms) : 19706, 20116
.   : milestone, 19911,
code_origins (18.134 ms) : 17955, 18312
.   : milestone, 18134,
iast (17.644 ms) : 17472, 17816
.   : milestone, 17644,
profiling (18.324 ms) : 18139, 18509
.   : milestone, 18324,
tracing (17.918 ms) : 17740, 18096
.   : milestone, 17918,
  • baseline results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 19.2 ms [19.005 ms, 19.394 ms] | - |
| appsec | 19.006 ms [18.815 ms, 19.197 ms] | -193.382 µs (-1.0%) |
| code_origins | 18.062 ms [17.883 ms, 18.24 ms] | -1.138 ms (-5.9%) |
| iast | 17.73 ms [17.556 ms, 17.904 ms] | -1.47 ms (-7.7%) |
| profiling | 19.284 ms [19.086 ms, 19.482 ms] | 84.007 µs (0.4%) |
| tracing | 17.86 ms [17.682 ms, 18.037 ms] | -1.34 ms (-7.0%) |

  • candidate results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 18.081 ms [17.896 ms, 18.265 ms] | - |
| appsec | 19.911 ms [19.706 ms, 20.116 ms] | 1.831 ms (10.1%) |
| code_origins | 18.134 ms [17.955 ms, 18.312 ms] | 53.041 µs (0.3%) |
| iast | 17.644 ms [17.472 ms, 17.816 ms] | -436.568 µs (-2.4%) |
| profiling | 18.324 ms [18.139 ms, 18.509 ms] | 243.551 µs (1.3%) |
| tracing | 17.918 ms [17.74 ms, 18.096 ms] | -162.566 µs (-0.9%) |

Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~9a554fcbf3, baseline=1.62.0-SNAPSHOT~b266e2d0c2
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.243 ms) : 1231, 1255
.   : milestone, 1243,
iast (3.337 ms) : 3288, 3386
.   : milestone, 3337,
iast_FULL (6.056 ms) : 5995, 6117
.   : milestone, 6056,
iast_GLOBAL (3.74 ms) : 3679, 3801
.   : milestone, 3740,
profiling (2.279 ms) : 2258, 2301
.   : milestone, 2279,
tracing (1.881 ms) : 1865, 1897
.   : milestone, 1881,
section candidate
no_agent (1.252 ms) : 1240, 1264
.   : milestone, 1252,
iast (3.306 ms) : 3257, 3354
.   : milestone, 3306,
iast_FULL (5.891 ms) : 5832, 5950
.   : milestone, 5891,
iast_GLOBAL (3.677 ms) : 3615, 3739
.   : milestone, 3677,
profiling (2.367 ms) : 2344, 2390
.   : milestone, 2367,
tracing (1.867 ms) : 1852, 1882
.   : milestone, 1867,
  • baseline results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.243 ms [1.231 ms, 1.255 ms] | - |
| iast | 3.337 ms [3.288 ms, 3.386 ms] | 2.094 ms (168.5%) |
| iast_FULL | 6.056 ms [5.995 ms, 6.117 ms] | 4.813 ms (387.3%) |
| iast_GLOBAL | 3.74 ms [3.679 ms, 3.801 ms] | 2.497 ms (200.9%) |
| profiling | 2.279 ms [2.258 ms, 2.301 ms] | 1.036 ms (83.4%) |
| tracing | 1.881 ms [1.865 ms, 1.897 ms] | 637.931 µs (51.3%) |

  • candidate results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.252 ms [1.24 ms, 1.264 ms] | - |
| iast | 3.306 ms [3.257 ms, 3.354 ms] | 2.054 ms (164.1%) |
| iast_FULL | 5.891 ms [5.832 ms, 5.95 ms] | 4.64 ms (370.7%) |
| iast_GLOBAL | 3.677 ms [3.615 ms, 3.739 ms] | 2.426 ms (193.8%) |
| profiling | 2.367 ms [2.344 ms, 2.39 ms] | 1.115 ms (89.1%) |
| tracing | 1.867 ms [1.852 ms, 1.882 ms] | 615.089 µs (49.1%) |

Dacapo

Parameters

| Parameter | Baseline | Candidate |
|---|---|---|
| baseline_or_candidate | baseline | candidate |
| git_branch | master | brian.marks/fix-log-injection-smoke-test-flake |
| git_commit_date | 1775744045 | 1775764729 |
| git_commit_sha | b266e2d | 9a554fc |
| release_version | 1.62.0-SNAPSHOT~b266e2d0c2 | 1.62.0-SNAPSHOT~9a554fcbf3 |

See matching parameters

| Parameter | Baseline | Candidate |
|---|---|---|
| application | biojava | biojava |
| ci_job_date | 1775771998 | 1775771998 |
| ci_job_id | 1583602922 | 1583602922 |
| ci_pipeline_id | 106987322 | 106987322 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-0-8p30iw02 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-0-8p30iw02 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.

Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~9a554fcbf3, baseline=1.62.0-SNAPSHOT~b266e2d0c2
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.674 s) : 15674000, 15674000
.   : milestone, 15674000,
appsec (15.087 s) : 15087000, 15087000
.   : milestone, 15087000,
iast (18.249 s) : 18249000, 18249000
.   : milestone, 18249000,
iast_GLOBAL (18.156 s) : 18156000, 18156000
.   : milestone, 18156000,
profiling (14.801 s) : 14801000, 14801000
.   : milestone, 14801000,
tracing (14.971 s) : 14971000, 14971000
.   : milestone, 14971000,
section candidate
no_agent (14.951 s) : 14951000, 14951000
.   : milestone, 14951000,
appsec (14.747 s) : 14747000, 14747000
.   : milestone, 14747000,
iast (18.143 s) : 18143000, 18143000
.   : milestone, 18143000,
iast_GLOBAL (18.111 s) : 18111000, 18111000
.   : milestone, 18111000,
profiling (14.864 s) : 14864000, 14864000
.   : milestone, 14864000,
tracing (14.882 s) : 14882000, 14882000
.   : milestone, 14882000,
  • baseline results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 15.674 s [15.674 s, 15.674 s] | - |
| appsec | 15.087 s [15.087 s, 15.087 s] | -587.0 ms (-3.7%) |
| iast | 18.249 s [18.249 s, 18.249 s] | 2.575 s (16.4%) |
| iast_GLOBAL | 18.156 s [18.156 s, 18.156 s] | 2.482 s (15.8%) |
| profiling | 14.801 s [14.801 s, 14.801 s] | -873.0 ms (-5.6%) |
| tracing | 14.971 s [14.971 s, 14.971 s] | -703.0 ms (-4.5%) |

  • candidate results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 14.951 s [14.951 s, 14.951 s] | - |
| appsec | 14.747 s [14.747 s, 14.747 s] | -204.0 ms (-1.4%) |
| iast | 18.143 s [18.143 s, 18.143 s] | 3.192 s (21.3%) |
| iast_GLOBAL | 18.111 s [18.111 s, 18.111 s] | 3.16 s (21.1%) |
| profiling | 14.864 s [14.864 s, 14.864 s] | -87.0 ms (-0.6%) |
| tracing | 14.882 s [14.882 s, 14.882 s] | -69.0 ms (-0.5%) |

Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~9a554fcbf3, baseline=1.62.0-SNAPSHOT~b266e2d0c2
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.492 ms) : 1481, 1504
.   : milestone, 1492,
appsec (3.87 ms) : 3647, 4094
.   : milestone, 3870,
iast (2.272 ms) : 2203, 2341
.   : milestone, 2272,
iast_GLOBAL (2.317 ms) : 2247, 2386
.   : milestone, 2317,
profiling (2.095 ms) : 2041, 2150
.   : milestone, 2095,
tracing (2.091 ms) : 2038, 2145
.   : milestone, 2091,
section candidate
no_agent (1.488 ms) : 1476, 1499
.   : milestone, 1488,
appsec (3.818 ms) : 3596, 4040
.   : milestone, 3818,
iast (2.267 ms) : 2198, 2336
.   : milestone, 2267,
iast_GLOBAL (2.314 ms) : 2245, 2383
.   : milestone, 2314,
profiling (2.122 ms) : 2066, 2177
.   : milestone, 2122,
tracing (2.088 ms) : 2034, 2142
.   : milestone, 2088,
  • baseline results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.492 ms [1.481 ms, 1.504 ms] | - |
| appsec | 3.87 ms [3.647 ms, 4.094 ms] | 2.378 ms (159.4%) |
| iast | 2.272 ms [2.203 ms, 2.341 ms] | 780.076 µs (52.3%) |
| iast_GLOBAL | 2.317 ms [2.247 ms, 2.386 ms] | 824.479 µs (55.2%) |
| profiling | 2.095 ms [2.041 ms, 2.15 ms] | 602.821 µs (40.4%) |
| tracing | 2.091 ms [2.038 ms, 2.145 ms] | 598.787 µs (40.1%) |

  • candidate results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.488 ms [1.476 ms, 1.499 ms] | - |
| appsec | 3.818 ms [3.596 ms, 4.04 ms] | 2.33 ms (156.7%) |
| iast | 2.267 ms [2.198 ms, 2.336 ms] | 779.35 µs (52.4%) |
| iast_GLOBAL | 2.314 ms [2.245 ms, 2.383 ms] | 826.48 µs (55.6%) |
| profiling | 2.122 ms [2.066 ms, 2.177 ms] | 634.237 µs (42.6%) |
| tracing | 2.088 ms [2.034 ms, 2.142 ms] | 600.575 µs (40.4%) |

The `check raw file injection` test flakes across 11+ logging backend
variants. CI Visibility data shows the failure is bimodal — successful
runs complete in 3-9s, but failures sit at exactly 30s (the
PollingConditions timeout) with traceCount=0. Nothing in between. This
means the process either works or is totally broken — no amount of
timeout increase will help.

The current test is blind during the 30s wait — it just polls
traceCount with no diagnostics when the process crashes or hangs.

Changes:
- Add `waitForTraceCountAlive` that checks process liveness on every
  poll iteration. If the process dies, it fails immediately with the
  exit code, RC poll count, and last 20 lines of process output.
- On timeout, enrich the error with diagnostic state (process alive?,
  traceCount, RC polls received, last 30 lines of output) so the next
  CI failure tells us whether it's a crash, a hang, or a connectivity
  issue.
- Reorder `waitForTraceCount(4)` before `waitFor` to confirm all
  traces are delivered while the process is still alive.
- Assert `waitFor` return value for a clear error if the process hangs.
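
The last bullet can be illustrated with a short Java sketch. It assumes the test's `waitFor` behaves like `java.lang.Process#waitFor(long, TimeUnit)`, which returns false when the timeout elapses before the process exits. `ExitAssertions` and `TimedWait` are hypothetical names introduced only for this example.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper; not taken from the PR.
final class ExitAssertions {

  /** Same shape as java.lang.Process#waitFor(long, TimeUnit), so `process::waitFor` fits. */
  @FunctionalInterface
  interface TimedWait {
    boolean await(long timeout, TimeUnit unit) throws InterruptedException;
  }

  /** Throws if the wait reports that the timeout elapsed before the process exited. */
  static void assertExited(TimedWait waitFor, long timeout, TimeUnit unit) {
    boolean exited;
    try {
      exited = waitFor.await(timeout, unit);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting for process exit", e);
    }
    if (!exited) {
      // Ignoring this return value would let a hung process slip through silently.
      throw new IllegalStateException(
          "Process did not exit within " + timeout + " " + unit + "; it may be hung");
    }
  }
}
```

A call such as `ExitAssertions.assertExited(process::waitFor, 30, TimeUnit.SECONDS)` would come after the final trace-count wait, matching the reordering in the previous bullet.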

tag: no release notes

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
bm1549 changed the title from "Fix log injection smoke test flakiness from startup timeout" to "Diagnose log injection smoke test flakiness instead of masking it" on Apr 9, 2026
The liveness check fired before the trace count check, so a normal
process exit after delivering all traces was treated as a failure.
Check traceCount >= count first and return early if satisfied.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>