Diagnose log injection smoke test flakiness instead of masking it by bm1549 · Pull Request #11075 · DataDog/dd-trace-java

bm1549 · 2026-04-09T18:33:13Z

What Does This Do

Adds diagnostic instrumentation to the check raw file injection smoke test so the next CI failure tells us the root cause instead of a bare "Condition not satisfied after 30s" with traceCount=0.

Changes to LogInjectionSmokeTest:

waitForTraceCountAlive — checks process liveness on every poll iteration; if the process dies, fails immediately with exit code + last 20 lines of process output
Enriched timeout errors — on timeout, dumps: process alive?, traceCount, RC polls received, last 30 lines of process output
Reorder waitForTraceCount(4) before waitFor + assert waitFor return value

Motivation

CI Visibility data for the last 30 days on master shows 10 failures of check raw file injection:

Failure mode	Count	Line	Duration	Root cause
`traceCount=0` at `waitForTraceCount(2)`	9/10	368	30.3s	Unknown — no diagnostics
`logLines.size()=3` at `assertRawLogLinesWithInjection`	1/10	229	8.3s	Incomplete log file

The failure distribution is bimodal — successful runs complete in 3.5-8.7s (80 data points, zero above 9s), while failures sit at exactly 30.3s. There is nothing in between. This means the process either works or is totally broken — a timeout increase would just delay the same failure.

<9s:  ████████████████████████████████████████  80/80 passes
9-30s:                                           0 runs
30s:  █████████                                  9/10 failures (at timeout)

The current test is blind during the wait — it just polls traceCount in a loop. We don't know if the process crashed, hung during agent init, failed to connect to the test server, or something else entirely. This PR makes the next failure self-diagnosing.

Example output when process crashes:

Process exited with code 1 while waiting for 2 traces (received 0, RC polls: 3).
Last process output:
[dd.trace ...] ERROR ... NullPointerException during instrumentation
...

Example output on timeout (process alive but not sending traces):

Timed out waiting for 2 traces after 30s. traceCount=0, process.alive=true, RC polls received: 142.
Last process output:
[dd.trace ...] DEBUG ... Still loading instrumentations...
...

Additional Notes

Only LogInjectionSmokeTest.groovy is changed
No timeout increase — the 30s defaultPoll is kept as-is
All 11 historically flaky backends pass locally
rcClientMessages.size() tells us whether the agent connected to the test server at all (RC polls hit /v0.7/config every 200ms)

Contributor Checklist

Format the title according to the contribution guidelines
Assign the type: and (comp: or inst:) labels
Avoid using close, fix, or any linking keywords when referencing an issue
Update the CODEOWNERS file on source file addition, migration, or deletion — N/A (no file additions)
Update public documentation with any new configuration flags or behaviors — N/A (test-only change)

tag: no release notes
tag: ai generated

🤖 Generated with Claude Code

The `check raw file injection` test has been flaking across 11+ logging backend variants for months. CI Visibility data shows 90% of failures are `traceCount=0` at `waitForTraceCount(2)` after exactly 30s — the JVM + agent bytecode instrumentation simply takes >30s on overloaded CI machines. Changes: - Add `startupPoll` with 120s timeout for the initial `waitForTraceCount(2)` that covers JVM startup + agent init, giving 4x headroom over the current 30s `defaultPoll` - Add `waitForTraceCountAlive` that checks process liveness on each poll iteration, turning silent 30-120s timeouts into instant, actionable errors when the process crashes - Reorder `waitForTraceCount(4)` before `waitFor` to confirm all traces are delivered while the process is still alive - Assert `waitFor` return value for a clear error if the process hangs tag: no release note Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

pr-commenter · 2026-04-09T19:20:03Z

Benchmarks

Startup

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	brian.marks/fix-log-injection-smoke-test-flake
git_commit_date	1775744045	1775764729
git_commit_sha	`b266e2d`	`9a554fc`
release_version	1.62.0-SNAPSHOT~b266e2d0c2	1.62.0-SNAPSHOT~9a554fcbf3

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1775771791	1775771791
ci_job_id	1583602920	1583602920
ci_pipeline_id	106987322	106987322
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-0-859ey7d8 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-0-859ey7d8 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module	Agent	Agent
parent	None	None

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 61 metrics, 10 unstable metrics.

Startup time reports for insecure-bank

gantt
    title insecure-bank - global startup overhead: candidate=1.62.0-SNAPSHOT~9a554fcbf3, baseline=1.62.0-SNAPSHOT~b266e2d0c2

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.068 s) : 0, 1068067
Total [baseline] (8.849 s) : 0, 8849434
Agent [candidate] (1.06 s) : 0, 1059507
Total [candidate] (8.835 s) : 0, 8834611
section iast
Agent [baseline] (1.224 s) : 0, 1224300
Total [baseline] (9.572 s) : 0, 9572168
Agent [candidate] (1.226 s) : 0, 1225816
Total [candidate] (9.582 s) : 0, 9581846

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.068 s	-
Agent	iast	1.224 s	156.232 ms (14.6%)
Total	tracing	8.849 s	-
Total	iast	9.572 s	722.733 ms (8.2%)

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.06 s	-
Agent	iast	1.226 s	166.309 ms (15.7%)
Total	tracing	8.835 s	-
Total	iast	9.582 s	747.235 ms (8.5%)

gantt
    title insecure-bank - break down per module: candidate=1.62.0-SNAPSHOT~9a554fcbf3, baseline=1.62.0-SNAPSHOT~b266e2d0c2

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.251 ms) : 0, 1251
crashtracking [candidate] (1.224 ms) : 0, 1224
BytebuddyAgent [baseline] (638.326 ms) : 0, 638326
BytebuddyAgent [candidate] (633.791 ms) : 0, 633791
AgentMeter [baseline] (29.743 ms) : 0, 29743
AgentMeter [candidate] (29.517 ms) : 0, 29517
GlobalTracer [baseline] (251.297 ms) : 0, 251297
GlobalTracer [candidate] (249.054 ms) : 0, 249054
AppSec [baseline] (32.346 ms) : 0, 32346
AppSec [candidate] (31.921 ms) : 0, 31921
Debugger [baseline] (60.086 ms) : 0, 60086
Debugger [candidate] (59.322 ms) : 0, 59322
Remote Config [baseline] (615.472 µs) : 0, 615
Remote Config [candidate] (594.755 µs) : 0, 595
Telemetry [baseline] (8.223 ms) : 0, 8223
Telemetry [candidate] (8.115 ms) : 0, 8115
Flare Poller [baseline] (9.78 ms) : 0, 9780
Flare Poller [candidate] (9.731 ms) : 0, 9731
section iast
crashtracking [baseline] (1.236 ms) : 0, 1236
crashtracking [candidate] (1.23 ms) : 0, 1230
BytebuddyAgent [baseline] (800.972 ms) : 0, 800972
BytebuddyAgent [candidate] (802.871 ms) : 0, 802871
AgentMeter [baseline] (11.395 ms) : 0, 11395
AgentMeter [candidate] (11.478 ms) : 0, 11478
GlobalTracer [baseline] (239.817 ms) : 0, 239817
GlobalTracer [candidate] (240.472 ms) : 0, 240472
AppSec [baseline] (31.14 ms) : 0, 31140
AppSec [candidate] (33.427 ms) : 0, 33427
Debugger [baseline] (60.062 ms) : 0, 60062
Debugger [candidate] (56.782 ms) : 0, 56782
Remote Config [baseline] (539.582 µs) : 0, 540
Remote Config [candidate] (517.713 µs) : 0, 518
Telemetry [baseline] (13.207 ms) : 0, 13207
Telemetry [candidate] (13.151 ms) : 0, 13151
Flare Poller [baseline] (3.612 ms) : 0, 3612
Flare Poller [candidate] (3.403 ms) : 0, 3403
IAST [baseline] (25.906 ms) : 0, 25906
IAST [candidate] (25.928 ms) : 0, 25928

Startup time reports for petclinic

gantt
    title petclinic - global startup overhead: candidate=1.62.0-SNAPSHOT~9a554fcbf3, baseline=1.62.0-SNAPSHOT~b266e2d0c2

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.06 s) : 0, 1060135
Total [baseline] (11.099 s) : 0, 11098805
Agent [candidate] (1.074 s) : 0, 1073912
Total [candidate] (11.183 s) : 0, 11183254
section appsec
Agent [baseline] (1.25 s) : 0, 1250296
Total [baseline] (11.213 s) : 0, 11212607
Agent [candidate] (1.255 s) : 0, 1255449
Total [candidate] (11.269 s) : 0, 11268944
section iast
Agent [baseline] (1.225 s) : 0, 1224831
Total [baseline] (11.304 s) : 0, 11303781
Agent [candidate] (1.224 s) : 0, 1223713
Total [candidate] (11.347 s) : 0, 11347404
section profiling
Agent [baseline] (1.186 s) : 0, 1185957
Total [baseline] (11.021 s) : 0, 11020816
Agent [candidate] (1.184 s) : 0, 1184064
Total [candidate] (11.091 s) : 0, 11090510

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.06 s	-
Agent	appsec	1.25 s	190.162 ms (17.9%)
Agent	iast	1.225 s	164.696 ms (15.5%)
Agent	profiling	1.186 s	125.823 ms (11.9%)
Total	tracing	11.099 s	-
Total	appsec	11.213 s	113.802 ms (1.0%)
Total	iast	11.304 s	204.975 ms (1.8%)
Total	profiling	11.021 s	-77.99 ms (-0.7%)

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.074 s	-
Agent	appsec	1.255 s	181.537 ms (16.9%)
Agent	iast	1.224 s	149.801 ms (13.9%)
Agent	profiling	1.184 s	110.152 ms (10.3%)
Total	tracing	11.183 s	-
Total	appsec	11.269 s	85.69 ms (0.8%)
Total	iast	11.347 s	164.149 ms (1.5%)
Total	profiling	11.091 s	-92.744 ms (-0.8%)

gantt
    title petclinic - break down per module: candidate=1.62.0-SNAPSHOT~9a554fcbf3, baseline=1.62.0-SNAPSHOT~b266e2d0c2

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.249 ms) : 0, 1249
crashtracking [candidate] (1.258 ms) : 0, 1258
BytebuddyAgent [baseline] (633.195 ms) : 0, 633195
BytebuddyAgent [candidate] (641.6 ms) : 0, 641600
AgentMeter [baseline] (29.293 ms) : 0, 29293
AgentMeter [candidate] (29.873 ms) : 0, 29873
GlobalTracer [baseline] (249.556 ms) : 0, 249556
GlobalTracer [candidate] (252.545 ms) : 0, 252545
AppSec [baseline] (32.047 ms) : 0, 32047
AppSec [candidate] (32.56 ms) : 0, 32560
Debugger [baseline] (59.996 ms) : 0, 59996
Debugger [candidate] (60.974 ms) : 0, 60974
Remote Config [baseline] (601.662 µs) : 0, 602
Remote Config [candidate] (606.693 µs) : 0, 607
Telemetry [baseline] (8.12 ms) : 0, 8120
Telemetry [candidate] (8.226 ms) : 0, 8226
Flare Poller [baseline] (9.842 ms) : 0, 9842
Flare Poller [candidate] (9.751 ms) : 0, 9751
section appsec
crashtracking [baseline] (1.23 ms) : 0, 1230
crashtracking [candidate] (1.233 ms) : 0, 1233
BytebuddyAgent [baseline] (661.306 ms) : 0, 661306
BytebuddyAgent [candidate] (666.516 ms) : 0, 666516
AgentMeter [baseline] (12.116 ms) : 0, 12116
AgentMeter [candidate] (12.135 ms) : 0, 12135
GlobalTracer [baseline] (250.33 ms) : 0, 250330
GlobalTracer [candidate] (250.0 ms) : 0, 250000
AppSec [baseline] (185.036 ms) : 0, 185036
AppSec [candidate] (184.951 ms) : 0, 184951
Debugger [baseline] (66.366 ms) : 0, 66366
Debugger [candidate] (66.622 ms) : 0, 66622
Remote Config [baseline] (619.492 µs) : 0, 619
Remote Config [candidate] (606.051 µs) : 0, 606
Telemetry [baseline] (8.678 ms) : 0, 8678
Telemetry [candidate] (8.651 ms) : 0, 8651
Flare Poller [baseline] (3.515 ms) : 0, 3515
Flare Poller [candidate] (3.586 ms) : 0, 3586
IAST [baseline] (24.675 ms) : 0, 24675
IAST [candidate] (24.683 ms) : 0, 24683
section iast
crashtracking [baseline] (1.235 ms) : 0, 1235
crashtracking [candidate] (1.223 ms) : 0, 1223
BytebuddyAgent [baseline] (801.776 ms) : 0, 801776
BytebuddyAgent [candidate] (801.43 ms) : 0, 801430
AgentMeter [baseline] (11.415 ms) : 0, 11415
AgentMeter [candidate] (11.39 ms) : 0, 11390
GlobalTracer [baseline] (239.354 ms) : 0, 239354
GlobalTracer [candidate] (239.093 ms) : 0, 239093
AppSec [baseline] (31.846 ms) : 0, 31846
AppSec [candidate] (33.159 ms) : 0, 33159
Debugger [baseline] (59.442 ms) : 0, 59442
Debugger [candidate] (57.981 ms) : 0, 57981
Remote Config [baseline] (540.82 µs) : 0, 541
Remote Config [candidate] (530.053 µs) : 0, 530
Telemetry [baseline] (12.909 ms) : 0, 12909
Telemetry [candidate] (13.349 ms) : 0, 13349
Flare Poller [baseline] (3.426 ms) : 0, 3426
Flare Poller [candidate] (3.429 ms) : 0, 3429
IAST [baseline] (26.564 ms) : 0, 26564
IAST [candidate] (25.817 ms) : 0, 25817
section profiling
crashtracking [baseline] (1.181 ms) : 0, 1181
crashtracking [candidate] (1.164 ms) : 0, 1164
BytebuddyAgent [baseline] (693.053 ms) : 0, 693053
BytebuddyAgent [candidate] (691.543 ms) : 0, 691543
AgentMeter [baseline] (9.163 ms) : 0, 9163
AgentMeter [candidate] (9.131 ms) : 0, 9131
GlobalTracer [baseline] (206.862 ms) : 0, 206862
GlobalTracer [candidate] (206.912 ms) : 0, 206912
AppSec [baseline] (32.47 ms) : 0, 32470
AppSec [candidate] (32.576 ms) : 0, 32576
Debugger [baseline] (65.757 ms) : 0, 65757
Debugger [candidate] (65.603 ms) : 0, 65603
Remote Config [baseline] (587.963 µs) : 0, 588
Remote Config [candidate] (572.814 µs) : 0, 573
Telemetry [baseline] (7.871 ms) : 0, 7871
Telemetry [candidate] (7.863 ms) : 0, 7863
Flare Poller [baseline] (3.559 ms) : 0, 3559
Flare Poller [candidate] (3.57 ms) : 0, 3570
ProfilingAgent [baseline] (94.125 ms) : 0, 94125
ProfilingAgent [candidate] (93.889 ms) : 0, 93889
Profiling [baseline] (94.698 ms) : 0, 94698
Profiling [candidate] (94.466 ms) : 0, 94466

Load

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	brian.marks/fix-log-injection-smoke-test-flake
git_commit_date	1775744045	1775764729
git_commit_sha	`b266e2d`	`9a554fc`
release_version	1.62.0-SNAPSHOT~b266e2d0c2	1.62.0-SNAPSHOT~9a554fcbf3

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1775772278	1775772278
ci_job_id	1583602921	1583602921
ci_pipeline_id	106987322	106987322
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-0-li2j6yne 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-0-li2j6yne 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 3 performance improvements and 2 performance regressions! Performance is the same for 16 metrics, 15 unstable metrics.

scenario	Δ mean agg_http_req_duration_p50	Δ mean agg_http_req_duration_p95	Δ mean throughput	candidate mean agg_http_req_duration_p50	candidate mean agg_http_req_duration_p95	candidate mean throughput	baseline mean agg_http_req_duration_p50	baseline mean agg_http_req_duration_p95	baseline mean throughput
scenario:load:petclinic:no_agent:high_load	better [-1.808ms; -0.485ms] or [-9.685%; -2.597%]	unsure [-3.163ms; -0.213ms] or [-10.144%; -0.684%]	unstable [-11.942op/s; +41.880op/s] or [-4.911%; +17.223%]	17.526ms	29.498ms	258.125op/s	18.673ms	31.186ms	243.156op/s
scenario:load:petclinic:appsec:high_load	worse [+0.459ms; +1.619ms] or [+2.454%; +8.651%]	worse [+0.701ms; +2.494ms] or [+2.327%; +8.282%]	unstable [-35.920op/s; +13.732op/s] or [-14.629%; +5.593%]	19.750ms	31.706ms	234.438op/s	18.711ms	30.109ms	245.531op/s
scenario:load:petclinic:profiling:high_load	better [-1.697ms; -0.737ms] or [-8.895%; -3.862%]	better [-2.469ms; -0.865ms] or [-7.945%; -2.783%]	unstable [-12.744op/s; +38.057op/s] or [-5.265%; +15.722%]	17.862ms	29.414ms	254.719op/s	19.079ms	31.082ms	242.062op/s

Request duration reports for petclinic

gantt
    title petclinic - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~9a554fcbf3, baseline=1.62.0-SNAPSHOT~b266e2d0c2
    dateFormat X
    axisFormat %s
section baseline
no_agent (19.2 ms) : 19005, 19394
.   : milestone, 19200,
appsec (19.006 ms) : 18815, 19197
.   : milestone, 19006,
code_origins (18.062 ms) : 17883, 18240
.   : milestone, 18062,
iast (17.73 ms) : 17556, 17904
.   : milestone, 17730,
profiling (19.284 ms) : 19086, 19482
.   : milestone, 19284,
tracing (17.86 ms) : 17682, 18037
.   : milestone, 17860,
section candidate
no_agent (18.081 ms) : 17896, 18265
.   : milestone, 18081,
appsec (19.911 ms) : 19706, 20116
.   : milestone, 19911,
code_origins (18.134 ms) : 17955, 18312
.   : milestone, 18134,
iast (17.644 ms) : 17472, 17816
.   : milestone, 17644,
profiling (18.324 ms) : 18139, 18509
.   : milestone, 18324,
tracing (17.918 ms) : 17740, 18096
.   : milestone, 17918,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	19.2 ms [19.005 ms, 19.394 ms]	-
appsec	19.006 ms [18.815 ms, 19.197 ms]	-193.382 µs (-1.0%)
code_origins	18.062 ms [17.883 ms, 18.24 ms]	-1.138 ms (-5.9%)
iast	17.73 ms [17.556 ms, 17.904 ms]	-1.47 ms (-7.7%)
profiling	19.284 ms [19.086 ms, 19.482 ms]	84.007 µs (0.4%)
tracing	17.86 ms [17.682 ms, 18.037 ms]	-1.34 ms (-7.0%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	18.081 ms [17.896 ms, 18.265 ms]	-
appsec	19.911 ms [19.706 ms, 20.116 ms]	1.831 ms (10.1%)
code_origins	18.134 ms [17.955 ms, 18.312 ms]	53.041 µs (0.3%)
iast	17.644 ms [17.472 ms, 17.816 ms]	-436.568 µs (-2.4%)
profiling	18.324 ms [18.139 ms, 18.509 ms]	243.551 µs (1.3%)
tracing	17.918 ms [17.74 ms, 18.096 ms]	-162.566 µs (-0.9%)

Request duration reports for insecure-bank

gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~9a554fcbf3, baseline=1.62.0-SNAPSHOT~b266e2d0c2
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.243 ms) : 1231, 1255
.   : milestone, 1243,
iast (3.337 ms) : 3288, 3386
.   : milestone, 3337,
iast_FULL (6.056 ms) : 5995, 6117
.   : milestone, 6056,
iast_GLOBAL (3.74 ms) : 3679, 3801
.   : milestone, 3740,
profiling (2.279 ms) : 2258, 2301
.   : milestone, 2279,
tracing (1.881 ms) : 1865, 1897
.   : milestone, 1881,
section candidate
no_agent (1.252 ms) : 1240, 1264
.   : milestone, 1252,
iast (3.306 ms) : 3257, 3354
.   : milestone, 3306,
iast_FULL (5.891 ms) : 5832, 5950
.   : milestone, 5891,
iast_GLOBAL (3.677 ms) : 3615, 3739
.   : milestone, 3677,
profiling (2.367 ms) : 2344, 2390
.   : milestone, 2367,
tracing (1.867 ms) : 1852, 1882
.   : milestone, 1867,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	1.243 ms [1.231 ms, 1.255 ms]	-
iast	3.337 ms [3.288 ms, 3.386 ms]	2.094 ms (168.5%)
iast_FULL	6.056 ms [5.995 ms, 6.117 ms]	4.813 ms (387.3%)
iast_GLOBAL	3.74 ms [3.679 ms, 3.801 ms]	2.497 ms (200.9%)
profiling	2.279 ms [2.258 ms, 2.301 ms]	1.036 ms (83.4%)
tracing	1.881 ms [1.865 ms, 1.897 ms]	637.931 µs (51.3%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	1.252 ms [1.24 ms, 1.264 ms]	-
iast	3.306 ms [3.257 ms, 3.354 ms]	2.054 ms (164.1%)
iast_FULL	5.891 ms [5.832 ms, 5.95 ms]	4.64 ms (370.7%)
iast_GLOBAL	3.677 ms [3.615 ms, 3.739 ms]	2.426 ms (193.8%)
profiling	2.367 ms [2.344 ms, 2.39 ms]	1.115 ms (89.1%)
tracing	1.867 ms [1.852 ms, 1.882 ms]	615.089 µs (49.1%)

Dacapo

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	brian.marks/fix-log-injection-smoke-test-flake
git_commit_date	1775744045	1775764729
git_commit_sha	`b266e2d`	`9a554fc`
release_version	1.62.0-SNAPSHOT~b266e2d0c2	1.62.0-SNAPSHOT~9a554fcbf3

See matching parameters

	Baseline	Candidate
application	biojava	biojava
ci_job_date	1775771998	1775771998
ci_job_id	1583602922	1583602922
ci_pipeline_id	106987322	106987322
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-0-8p30iw02 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-0-8p30iw02 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.

Execution time for biojava

gantt
    title biojava - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~9a554fcbf3, baseline=1.62.0-SNAPSHOT~b266e2d0c2
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.674 s) : 15674000, 15674000
.   : milestone, 15674000,
appsec (15.087 s) : 15087000, 15087000
.   : milestone, 15087000,
iast (18.249 s) : 18249000, 18249000
.   : milestone, 18249000,
iast_GLOBAL (18.156 s) : 18156000, 18156000
.   : milestone, 18156000,
profiling (14.801 s) : 14801000, 14801000
.   : milestone, 14801000,
tracing (14.971 s) : 14971000, 14971000
.   : milestone, 14971000,
section candidate
no_agent (14.951 s) : 14951000, 14951000
.   : milestone, 14951000,
appsec (14.747 s) : 14747000, 14747000
.   : milestone, 14747000,
iast (18.143 s) : 18143000, 18143000
.   : milestone, 18143000,
iast_GLOBAL (18.111 s) : 18111000, 18111000
.   : milestone, 18111000,
profiling (14.864 s) : 14864000, 14864000
.   : milestone, 14864000,
tracing (14.882 s) : 14882000, 14882000
.   : milestone, 14882000,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	15.674 s [15.674 s, 15.674 s]	-
appsec	15.087 s [15.087 s, 15.087 s]	-587.0 ms (-3.7%)
iast	18.249 s [18.249 s, 18.249 s]	2.575 s (16.4%)
iast_GLOBAL	18.156 s [18.156 s, 18.156 s]	2.482 s (15.8%)
profiling	14.801 s [14.801 s, 14.801 s]	-873.0 ms (-5.6%)
tracing	14.971 s [14.971 s, 14.971 s]	-703.0 ms (-4.5%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	14.951 s [14.951 s, 14.951 s]	-
appsec	14.747 s [14.747 s, 14.747 s]	-204.0 ms (-1.4%)
iast	18.143 s [18.143 s, 18.143 s]	3.192 s (21.3%)
iast_GLOBAL	18.111 s [18.111 s, 18.111 s]	3.16 s (21.1%)
profiling	14.864 s [14.864 s, 14.864 s]	-87.0 ms (-0.6%)
tracing	14.882 s [14.882 s, 14.882 s]	-69.0 ms (-0.5%)

Execution time for tomcat

gantt
    title tomcat - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~9a554fcbf3, baseline=1.62.0-SNAPSHOT~b266e2d0c2
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.492 ms) : 1481, 1504
.   : milestone, 1492,
appsec (3.87 ms) : 3647, 4094
.   : milestone, 3870,
iast (2.272 ms) : 2203, 2341
.   : milestone, 2272,
iast_GLOBAL (2.317 ms) : 2247, 2386
.   : milestone, 2317,
profiling (2.095 ms) : 2041, 2150
.   : milestone, 2095,
tracing (2.091 ms) : 2038, 2145
.   : milestone, 2091,
section candidate
no_agent (1.488 ms) : 1476, 1499
.   : milestone, 1488,
appsec (3.818 ms) : 3596, 4040
.   : milestone, 3818,
iast (2.267 ms) : 2198, 2336
.   : milestone, 2267,
iast_GLOBAL (2.314 ms) : 2245, 2383
.   : milestone, 2314,
profiling (2.122 ms) : 2066, 2177
.   : milestone, 2122,
tracing (2.088 ms) : 2034, 2142
.   : milestone, 2088,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.492 ms [1.481 ms, 1.504 ms]	-
appsec	3.87 ms [3.647 ms, 4.094 ms]	2.378 ms (159.4%)
iast	2.272 ms [2.203 ms, 2.341 ms]	780.076 µs (52.3%)
iast_GLOBAL	2.317 ms [2.247 ms, 2.386 ms]	824.479 µs (55.2%)
profiling	2.095 ms [2.041 ms, 2.15 ms]	602.821 µs (40.4%)
tracing	2.091 ms [2.038 ms, 2.145 ms]	598.787 µs (40.1%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.488 ms [1.476 ms, 1.499 ms]	-
appsec	3.818 ms [3.596 ms, 4.04 ms]	2.33 ms (156.7%)
iast	2.267 ms [2.198 ms, 2.336 ms]	779.35 µs (52.4%)
iast_GLOBAL	2.314 ms [2.245 ms, 2.383 ms]	826.48 µs (55.6%)
profiling	2.122 ms [2.066 ms, 2.177 ms]	634.237 µs (42.6%)
tracing	2.088 ms [2.034 ms, 2.142 ms]	600.575 µs (40.4%)

The `check raw file injection` test flakes across 11+ logging backend variants. CI Visibility data shows the failure is bimodal — successful runs complete in 3-9s, but failures sit at exactly 30s (the PollingConditions timeout) with traceCount=0. Nothing in between. This means the process either works or is totally broken — no amount of timeout increase will help. The current test is blind during the 30s wait — it just polls traceCount with no diagnostics when the process crashes or hangs. Changes: - Add `waitForTraceCountAlive` that checks process liveness on every poll iteration. If the process dies, it fails immediately with the exit code, RC poll count, and last 20 lines of process output. - On timeout, enrich the error with diagnostic state (process alive?, traceCount, RC polls received, last 30 lines of output) so the next CI failure tells us whether it's a crash, a hang, or a connectivity issue. - Reorder `waitForTraceCount(4)` before `waitFor` to confirm all traces are delivered while the process is still alive. - Assert `waitFor` return value for a clear error if the process hangs. tag: no release notes Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

The liveness check fired before the trace count check, so a normal process exit after delivering all traces was treated as a failure. Check traceCount >= count first and return early if satisfied. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

bm1549 added type: bug Bug report and fix comp: core Tracer core tag: no release notes Changes to exclude from release notes tag: ai generated Largely based on code generated by an AI or LLM labels Apr 9, 2026

bm1549 changed the title ~~Fix log injection smoke test flakiness from startup timeout~~ Diagnose log injection smoke test flakiness instead of masking it Apr 9, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Diagnose log injection smoke test flakiness instead of masking it#11075

Diagnose log injection smoke test flakiness instead of masking it#11075
bm1549 wants to merge 3 commits intomasterfrom
brian.marks/fix-log-injection-smoke-test-flake

bm1549 commented Apr 9, 2026 •

edited

Loading

Uh oh!

pr-commenter bot commented Apr 9, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

bm1549 commented Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What Does This Do

Motivation

Additional Notes

Contributor Checklist

Uh oh!

pr-commenter bot commented Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Benchmarks

Startup

Parameters

Summary

Load

Parameters

Summary

Dacapo

Parameters

Summary

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

bm1549 commented Apr 9, 2026 •

edited

Loading

pr-commenter bot commented Apr 9, 2026 •

edited

Loading