
Optimize PendingTrace span registration and time tracking #11078

Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 2 commits into master from brian.marks/perf-pending-trace
Apr 10, 2026

@bm1549 bm1549 commented Apr 10, 2026

What Does This Do

Two optimizations to PendingTrace, which is on the critical path for every span start and finish:

  1. Replace volatile write of lastReferenced with lazySet: The lastReferenced field is written on every getCurrentTimeNano() call (span start/finish) but only read by a background PendingTraceBuffer thread for approximate timeout detection. The full StoreLoad memory fence from a volatile write is unnecessary — lazySet provides release-store semantics (StoreStore barrier only), which is sufficient since the reader tolerates staleness on the order of seconds.

  2. Guard ROOT_SPAN CAS with volatile read: registerSpan() calls ROOT_SPAN.compareAndSet(this, null, span) on every span creation, but only the first call (root span) succeeds. For all subsequent spans, the CAS fails after acquiring exclusive cache line ownership. Adding if (rootSpan == null) before the CAS replaces the failed CAS (which requires cache line ownership + write barrier) with a cheap volatile read for non-root spans.
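The two patterns described above can be sketched in isolation. This is a minimal illustration, not the actual PendingTrace code: the class `TraceSketch`, its accessor methods, and the use of `Object` for the span type are hypothetical stand-ins; only the names `LAST_REFERENCED`, `ROOT_SPAN`, `lastReferenced`, `rootSpan`, `touch` and `registerSpan` are taken from the PR.

```java
import java.util.concurrent.atomic.AtomicLongFieldUpdater;
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

// Hypothetical stand-in for PendingTrace; only the two patterns are modeled.
final class TraceSketch {
  private static final AtomicLongFieldUpdater<TraceSketch> LAST_REFERENCED =
      AtomicLongFieldUpdater.newUpdater(TraceSketch.class, "lastReferenced");
  private static final AtomicReferenceFieldUpdater<TraceSketch, Object> ROOT_SPAN =
      AtomicReferenceFieldUpdater.newUpdater(TraceSketch.class, Object.class, "rootSpan");

  // Field updaters require the target fields to be declared volatile.
  private volatile long lastReferenced;
  private volatile Object rootSpan;

  // Pattern 1: ordered (lazy) store instead of a plain volatile write.
  void touch(long nowNanos) {
    LAST_REFERENCED.lazySet(this, nowNanos);
  }

  // Pattern 2: a volatile read guards the CAS, so once a root span is set
  // later callers return without attempting the CAS at all.
  boolean registerSpan(Object span) {
    if (rootSpan == null) {
      return ROOT_SPAN.compareAndSet(this, null, span);
    }
    return false;
  }

  long lastReferenced() {
    return lastReferenced;
  }

  Object rootSpan() {
    return rootSpan;
  }
}
```

The guard does not change the outcome of a race: if two threads both observe `rootSpan == null`, both still attempt the CAS and exactly one succeeds, which is the same result as the unguarded version.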

Motivation

PendingTrace.getCurrentTimeNano() and registerSpan() execute on every span start. In high-throughput applications creating millions of spans per second, eliminating a memory fence and a failed CAS per span adds up.

Why this is faster

  • lazySet vs volatile write: On x86, volatile writes emit a lock prefix or mfence (StoreLoad barrier). lazySet emits only a StoreStore barrier, which is a no-op on x86 (TSO memory model). On ARM, it avoids the dmb ish full barrier, using only dmb ishst (store barrier). This eliminates serialization of the store buffer on every span start/finish.
  • CAS guard: compareAndSet requires exclusive cache line ownership (MESI E/M state) even when it fails. A volatile read only needs shared ownership (S state). For non-root spans (the vast majority), this avoids a cache line transition on the rootSpan field.

The buffer worker may see a slightly stale lastReferenced, which could trigger marginally earlier trace splitting but cannot cause data loss.
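To see why a stale read is harmless here, consider a sketch of the reader side. Everything in it is an assumption made for illustration: the class `StalenessSketch`, the method `looksIdle`, and the one-second `TIMEOUT_NANOS` are hypothetical, and the real PendingTraceBuffer logic may differ.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of a writer that touches a timestamp and a background
// reader that uses it for approximate timeout detection.
final class StalenessSketch {
  // Assumed timeout, chosen only for the example.
  static final long TIMEOUT_NANOS = TimeUnit.SECONDS.toNanos(1);

  private final AtomicLong lastReferenced = new AtomicLong();

  // Hot path, called by application threads on span start/finish.
  void touch(long nowNanos) {
    lastReferenced.lazySet(nowNanos);
  }

  // Background path: a slightly older value can only make the trace look
  // idle a little sooner than it really is.
  boolean looksIdle(long nowNanos) {
    return nowNanos - lastReferenced.get() > TIMEOUT_NANOS;
  }
}
```

Because the comparison is against a timeout that is orders of magnitude larger than the delay before a lazy store becomes visible, a stale value shifts the decision by a negligible amount.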

Benchmark results (8 threads, JDK 21, macOS aarch64, Fork=1, Warmup=2, Measurement=3)

| Benchmark | Baseline (ops/ns) | Optimized (ops/ns) | Change |
|---|---|---|---|
| getCurrentTimeNano | 0.019 | 0.019 | 0% |
| getCurrentTimeNano_contended | 0.014 | 0.015 | +7.1% |
| startAndFinishSpan | ~0.0001 | ~0.0001 | ~0% |
| systemNanoTime (control) | 0.019 | 0.019 | 0% |

Note: The lazySet optimization primarily reduces memory fence cost, which is more impactful on ARM than x86 (where StoreStore is already free under TSO). The getCurrentTimeNano_contended benchmark shows modest improvement under cross-thread contention. The startAndFinishSpan benchmark includes significant overhead from span creation/finish that dominates the CAS-guard savings.
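For reference, the Change column is a plain relative difference. The helper below is a hypothetical illustration (the class and method names are not from the PR) of how the contended figure can be reproduced from the two reported scores.

```java
// Hypothetical helper reproducing the "Change" column arithmetic.
final class RelativeChange {
  // Relative change of optimized versus baseline, in percent.
  static double percentChange(double baseline, double optimized) {
    return (optimized - baseline) / baseline * 100.0;
  }

  public static void main(String[] args) {
    // Reported scores for getCurrentTimeNano_contended.
    System.out.printf("%.1f%%%n", percentChange(0.014, 0.015));
  }
}
```

With the reported 0.014 and 0.015 ops/ns this gives about 7.1%. Both scores are rounded to three decimal places, so the underlying values could lie anywhere in [0.0135, 0.0145) and [0.0145, 0.0155); the true change could therefore be anywhere from roughly 0% to roughly 15%, and the figure should be read as indicative only.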

Human readability score: 9.5/10

Both changes use patterns already present in the same file (AtomicLongFieldUpdater for other fields, volatile read guards elsewhere).

Additional Notes

Also adds TimeSourceBenchmark.java JMH benchmark covering PendingTrace.getCurrentTimeNano(), contended variant, startAndFinishSpan, System.nanoTime() baseline, and CoreTracer.getTimeWithNanoTicks().

tag: no release note
tag: ai generated

Contributor Checklist

  • Format the title according to the contribution guidelines
  • Assign the type: and comp: labels
  • Avoid linking keywords — use solves instead

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@bm1549 bm1549 added the type: enhancement (Enhancements and improvements), comp: core (Tracer core), and tag: ai generated (Largely based on code generated by an AI or LLM) labels Apr 10, 2026

pr-commenter bot commented Apr 10, 2026

Benchmarks

⚠️ Warning: Baseline build not found for merge-base commit. Comparing against the latest commit on master instead.

Startup

Parameters

| Parameter | Baseline | Candidate |
|---|---|---|
| baseline_or_candidate | baseline | candidate |
| git_branch | master | brian.marks/perf-pending-trace |
| git_commit_date | 1775810809 | 1775825423 |
| git_commit_sha | 067d0d2 | 2e6fda2 |
| release_version | 1.62.0-SNAPSHOT~067d0d2c4b | 1.62.0-SNAPSHOT~2e6fda2251 |

Matching parameters

| Parameter | Baseline | Candidate |
|---|---|---|
| application | insecure-bank | insecure-bank |
| ci_job_date | 1775827167 | 1775827167 |
| ci_job_id | 1585203292 | 1585203292 |
| ci_pipeline_id | 107084930 | 107084930 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-1-3j97dfry 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-1-3j97dfry 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| module | Agent | Agent |
| parent | None | None |

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 58 metrics, 13 unstable metrics.

Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.62.0-SNAPSHOT~2e6fda2251, baseline=1.62.0-SNAPSHOT~067d0d2c4b

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.056 s) : 0, 1055960
Total [baseline] (8.843 s) : 0, 8842824
Agent [candidate] (1.057 s) : 0, 1057315
Total [candidate] (8.845 s) : 0, 8845293
section iast
Agent [baseline] (1.22 s) : 0, 1219681
Total [baseline] (9.53 s) : 0, 9529530
Agent [candidate] (1.223 s) : 0, 1222930
Total [candidate] (9.56 s) : 0, 9560114
  • baseline results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.056 s | - |
| Agent | iast | 1.22 s | 163.721 ms (15.5%) |
| Total | tracing | 8.843 s | - |
| Total | iast | 9.53 s | 686.706 ms (7.8%) |

  • candidate results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.057 s | - |
| Agent | iast | 1.223 s | 165.615 ms (15.7%) |
| Total | tracing | 8.845 s | - |
| Total | iast | 9.56 s | 714.822 ms (8.1%) |
gantt
    title insecure-bank - break down per module: candidate=1.62.0-SNAPSHOT~2e6fda2251, baseline=1.62.0-SNAPSHOT~067d0d2c4b

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.244 ms) : 0, 1244
crashtracking [candidate] (1.225 ms) : 0, 1225
BytebuddyAgent [baseline] (631.876 ms) : 0, 631876
BytebuddyAgent [candidate] (633.35 ms) : 0, 633350
AgentMeter [baseline] (29.5 ms) : 0, 29500
AgentMeter [candidate] (29.396 ms) : 0, 29396
GlobalTracer [baseline] (248.753 ms) : 0, 248753
GlobalTracer [candidate] (248.594 ms) : 0, 248594
AppSec [baseline] (32.158 ms) : 0, 32158
AppSec [candidate] (31.994 ms) : 0, 31994
Debugger [baseline] (59.197 ms) : 0, 59197
Debugger [candidate] (58.987 ms) : 0, 58987
Remote Config [baseline] (603.013 µs) : 0, 603
Remote Config [candidate] (586.69 µs) : 0, 587
Telemetry [baseline] (8.068 ms) : 0, 8068
Telemetry [candidate] (8.071 ms) : 0, 8071
Flare Poller [baseline] (8.327 ms) : 0, 8327
Flare Poller [candidate] (8.955 ms) : 0, 8955
section iast
crashtracking [baseline] (1.229 ms) : 0, 1229
crashtracking [candidate] (1.228 ms) : 0, 1228
BytebuddyAgent [baseline] (798.738 ms) : 0, 798738
BytebuddyAgent [candidate] (800.079 ms) : 0, 800079
AgentMeter [baseline] (11.35 ms) : 0, 11350
AgentMeter [candidate] (11.38 ms) : 0, 11380
GlobalTracer [baseline] (238.276 ms) : 0, 238276
GlobalTracer [candidate] (239.459 ms) : 0, 239459
IAST [baseline] (25.726 ms) : 0, 25726
IAST [candidate] (25.74 ms) : 0, 25740
AppSec [baseline] (31.673 ms) : 0, 31673
AppSec [candidate] (31.877 ms) : 0, 31877
Debugger [baseline] (60.177 ms) : 0, 60177
Debugger [candidate] (59.489 ms) : 0, 59489
Remote Config [baseline] (545.751 µs) : 0, 546
Remote Config [candidate] (532.508 µs) : 0, 533
Telemetry [baseline] (11.903 ms) : 0, 11903
Telemetry [candidate] (13.16 ms) : 0, 13160
Flare Poller [baseline] (3.437 ms) : 0, 3437
Flare Poller [candidate] (3.613 ms) : 0, 3613
Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.62.0-SNAPSHOT~2e6fda2251, baseline=1.62.0-SNAPSHOT~067d0d2c4b

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.07 s) : 0, 1070157
Total [baseline] (11.192 s) : 0, 11191836
Agent [candidate] (1.056 s) : 0, 1055775
Total [candidate] (11.124 s) : 0, 11124195
section appsec
Agent [baseline] (1.255 s) : 0, 1255083
Total [baseline] (11.134 s) : 0, 11133619
Agent [candidate] (1.251 s) : 0, 1251266
Total [candidate] (11.097 s) : 0, 11096773
section iast
Agent [baseline] (1.229 s) : 0, 1228663
Total [baseline] (11.398 s) : 0, 11397508
Agent [candidate] (1.234 s) : 0, 1233916
Total [candidate] (11.412 s) : 0, 11411800
section profiling
Agent [baseline] (1.184 s) : 0, 1183985
Total [baseline] (11.154 s) : 0, 11153842
Agent [candidate] (1.183 s) : 0, 1182693
Total [candidate] (11.147 s) : 0, 11146517
  • baseline results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.07 s | - |
| Agent | appsec | 1.255 s | 184.926 ms (17.3%) |
| Agent | iast | 1.229 s | 158.505 ms (14.8%) |
| Agent | profiling | 1.184 s | 113.828 ms (10.6%) |
| Total | tracing | 11.192 s | - |
| Total | appsec | 11.134 s | -58.217 ms (-0.5%) |
| Total | iast | 11.398 s | 205.672 ms (1.8%) |
| Total | profiling | 11.154 s | -37.994 ms (-0.3%) |

  • candidate results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.056 s | - |
| Agent | appsec | 1.251 s | 195.492 ms (18.5%) |
| Agent | iast | 1.234 s | 178.142 ms (16.9%) |
| Agent | profiling | 1.183 s | 126.918 ms (12.0%) |
| Total | tracing | 11.124 s | - |
| Total | appsec | 11.097 s | -27.422 ms (-0.2%) |
| Total | iast | 11.412 s | 287.605 ms (2.6%) |
| Total | profiling | 11.147 s | 22.322 ms (0.2%) |
gantt
    title petclinic - break down per module: candidate=1.62.0-SNAPSHOT~2e6fda2251, baseline=1.62.0-SNAPSHOT~067d0d2c4b

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.231 ms) : 0, 1231
crashtracking [candidate] (1.223 ms) : 0, 1223
BytebuddyAgent [baseline] (637.992 ms) : 0, 637992
BytebuddyAgent [candidate] (632.125 ms) : 0, 632125
AgentMeter [baseline] (29.687 ms) : 0, 29687
AgentMeter [candidate] (29.521 ms) : 0, 29521
GlobalTracer [baseline] (252.356 ms) : 0, 252356
GlobalTracer [candidate] (248.719 ms) : 0, 248719
AppSec [baseline] (32.642 ms) : 0, 32642
AppSec [candidate] (32.005 ms) : 0, 32005
Debugger [baseline] (60.938 ms) : 0, 60938
Debugger [candidate] (59.912 ms) : 0, 59912
Remote Config [baseline] (613.165 µs) : 0, 613
Remote Config [candidate] (599.686 µs) : 0, 600
Telemetry [baseline] (8.263 ms) : 0, 8263
Telemetry [candidate] (8.088 ms) : 0, 8088
Flare Poller [baseline] (9.975 ms) : 0, 9975
Flare Poller [candidate] (7.369 ms) : 0, 7369
section appsec
crashtracking [baseline] (1.234 ms) : 0, 1234
crashtracking [candidate] (1.223 ms) : 0, 1223
BytebuddyAgent [baseline] (665.82 ms) : 0, 665820
BytebuddyAgent [candidate] (664.144 ms) : 0, 664144
AgentMeter [baseline] (12.164 ms) : 0, 12164
AgentMeter [candidate] (12.106 ms) : 0, 12106
GlobalTracer [baseline] (250.104 ms) : 0, 250104
GlobalTracer [candidate] (249.193 ms) : 0, 249193
IAST [baseline] (24.674 ms) : 0, 24674
IAST [candidate] (24.564 ms) : 0, 24564
AppSec [baseline] (185.175 ms) : 0, 185175
AppSec [candidate] (184.65 ms) : 0, 184650
Debugger [baseline] (66.315 ms) : 0, 66315
Debugger [candidate] (66.182 ms) : 0, 66182
Remote Config [baseline] (624.708 µs) : 0, 625
Remote Config [candidate] (602.227 µs) : 0, 602
Telemetry [baseline] (8.783 ms) : 0, 8783
Telemetry [candidate] (8.496 ms) : 0, 8496
Flare Poller [baseline] (3.593 ms) : 0, 3593
Flare Poller [candidate] (3.553 ms) : 0, 3553
section iast
crashtracking [baseline] (1.227 ms) : 0, 1227
crashtracking [candidate] (1.228 ms) : 0, 1228
BytebuddyAgent [baseline] (804.963 ms) : 0, 804963
BytebuddyAgent [candidate] (806.867 ms) : 0, 806867
AgentMeter [baseline] (11.382 ms) : 0, 11382
AgentMeter [candidate] (11.556 ms) : 0, 11556
GlobalTracer [baseline] (239.877 ms) : 0, 239877
GlobalTracer [candidate] (241.705 ms) : 0, 241705
IAST [baseline] (25.752 ms) : 0, 25752
IAST [candidate] (26.182 ms) : 0, 26182
AppSec [baseline] (31.737 ms) : 0, 31737
AppSec [candidate] (32.221 ms) : 0, 32221
Debugger [baseline] (58.073 ms) : 0, 58073
Debugger [candidate] (61.99 ms) : 0, 61990
Remote Config [baseline] (526.351 µs) : 0, 526
Remote Config [candidate] (1.137 ms) : 0, 1137
Telemetry [baseline] (14.941 ms) : 0, 14941
Telemetry [candidate] (11.212 ms) : 0, 11212
Flare Poller [baseline] (3.611 ms) : 0, 3611
Flare Poller [candidate] (3.502 ms) : 0, 3502
section profiling
crashtracking [baseline] (1.189 ms) : 0, 1189
crashtracking [candidate] (1.18 ms) : 0, 1180
BytebuddyAgent [baseline] (691.811 ms) : 0, 691811
BytebuddyAgent [candidate] (690.616 ms) : 0, 690616
AgentMeter [baseline] (9.129 ms) : 0, 9129
AgentMeter [candidate] (9.098 ms) : 0, 9098
GlobalTracer [baseline] (206.93 ms) : 0, 206930
GlobalTracer [candidate] (207.037 ms) : 0, 207037
AppSec [baseline] (32.343 ms) : 0, 32343
AppSec [candidate] (32.429 ms) : 0, 32429
Debugger [baseline] (64.703 ms) : 0, 64703
Debugger [candidate] (65.451 ms) : 0, 65451
Remote Config [baseline] (572.916 µs) : 0, 573
Remote Config [candidate] (561.8 µs) : 0, 562
Telemetry [baseline] (8.552 ms) : 0, 8552
Telemetry [candidate] (7.775 ms) : 0, 7775
Flare Poller [baseline] (3.53 ms) : 0, 3530
Flare Poller [candidate] (3.533 ms) : 0, 3533
ProfilingAgent [baseline] (93.851 ms) : 0, 93851
ProfilingAgent [candidate] (93.724 ms) : 0, 93724
Profiling [baseline] (94.421 ms) : 0, 94421
Profiling [candidate] (94.287 ms) : 0, 94287

Load

Parameters

| Parameter | Baseline | Candidate |
|---|---|---|
| baseline_or_candidate | baseline | candidate |
| git_branch | master | brian.marks/perf-pending-trace |
| git_commit_date | 1775810809 | 1775825423 |
| git_commit_sha | 067d0d2 | 2e6fda2 |
| release_version | 1.62.0-SNAPSHOT~067d0d2c4b | 1.62.0-SNAPSHOT~2e6fda2251 |

Matching parameters

| Parameter | Baseline | Candidate |
|---|---|---|
| application | insecure-bank | insecure-bank |
| ci_job_date | 1775827640 | 1775827640 |
| ci_job_id | 1585203295 | 1585203295 |
| ci_pipeline_id | 107084930 | 107084930 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-1-2v5azfl4 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-1-2v5azfl4 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |

Summary

Found 0 performance improvements and 3 performance regressions! Performance is the same for 17 metrics, 16 unstable metrics.

| scenario | Δ mean agg_http_req_duration_p50 | Δ mean agg_http_req_duration_p95 | Δ mean throughput | candidate mean agg_http_req_duration_p50 | candidate mean agg_http_req_duration_p95 | candidate mean throughput | baseline mean agg_http_req_duration_p50 | baseline mean agg_http_req_duration_p95 | baseline mean throughput |
|---|---|---|---|---|---|---|---|---|---|
| scenario:load:petclinic:iast:high_load | worse: [+365.440µs; +1223.146µs] or [+2.050%; +6.861%] | same: [-352.074µs; +1602.467µs] or [-1.197%; +5.447%] | unstable: [-35.387op/s; +16.637op/s] or [-13.847%; +6.510%] | 18.622ms | 30.046ms | 246.188op/s | 17.827ms | 29.421ms | 255.562op/s |
| scenario:load:petclinic:profiling:high_load | worse: [+1.427ms; +2.003ms] or [+7.921%; +11.116%] | worse: [+1.337ms; +2.461ms] or [+4.571%; +8.412%] | unstable: [-45.653op/s; +4.028op/s] or [-17.978%; +1.586%] | 19.729ms | 31.148ms | 233.125op/s | 18.014ms | 29.249ms | 253.938op/s |
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~2e6fda2251, baseline=1.62.0-SNAPSHOT~067d0d2c4b
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.236 ms) : 1224, 1248
.   : milestone, 1236,
iast (3.241 ms) : 3195, 3286
.   : milestone, 3241,
iast_FULL (5.928 ms) : 5868, 5988
.   : milestone, 5928,
iast_GLOBAL (3.685 ms) : 3629, 3741
.   : milestone, 3685,
profiling (2.173 ms) : 2154, 2193
.   : milestone, 2173,
tracing (1.868 ms) : 1852, 1883
.   : milestone, 1868,
section candidate
no_agent (1.228 ms) : 1217, 1240
.   : milestone, 1228,
iast (3.367 ms) : 3318, 3416
.   : milestone, 3367,
iast_FULL (5.888 ms) : 5829, 5947
.   : milestone, 5888,
iast_GLOBAL (3.595 ms) : 3536, 3654
.   : milestone, 3595,
profiling (2.146 ms) : 2126, 2166
.   : milestone, 2146,
tracing (1.923 ms) : 1906, 1939
.   : milestone, 1923,
  • baseline results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.236 ms [1.224 ms, 1.248 ms] | - |
| iast | 3.241 ms [3.195 ms, 3.286 ms] | 2.004 ms (162.1%) |
| iast_FULL | 5.928 ms [5.868 ms, 5.988 ms] | 4.692 ms (379.5%) |
| iast_GLOBAL | 3.685 ms [3.629 ms, 3.741 ms] | 2.449 ms (198.1%) |
| profiling | 2.173 ms [2.154 ms, 2.193 ms] | 936.83 µs (75.8%) |
| tracing | 1.868 ms [1.852 ms, 1.883 ms] | 631.208 µs (51.1%) |

  • candidate results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.228 ms [1.217 ms, 1.24 ms] | - |
| iast | 3.367 ms [3.318 ms, 3.416 ms] | 2.139 ms (174.1%) |
| iast_FULL | 5.888 ms [5.829 ms, 5.947 ms] | 4.66 ms (379.4%) |
| iast_GLOBAL | 3.595 ms [3.536 ms, 3.654 ms] | 2.367 ms (192.7%) |
| profiling | 2.146 ms [2.126 ms, 2.166 ms] | 918.283 µs (74.8%) |
| tracing | 1.923 ms [1.906 ms, 1.939 ms] | 694.472 µs (56.5%) |
Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~2e6fda2251, baseline=1.62.0-SNAPSHOT~067d0d2c4b
    dateFormat X
    axisFormat %s
section baseline
no_agent (18.424 ms) : 18235, 18614
.   : milestone, 18424,
appsec (18.469 ms) : 18282, 18655
.   : milestone, 18469,
code_origins (18.102 ms) : 17921, 18283
.   : milestone, 18102,
iast (18.257 ms) : 18071, 18443
.   : milestone, 18257,
profiling (18.375 ms) : 18193, 18557
.   : milestone, 18375,
tracing (17.759 ms) : 17584, 17934
.   : milestone, 17759,
section candidate
no_agent (19.046 ms) : 18852, 19240
.   : milestone, 19046,
appsec (18.664 ms) : 18478, 18850
.   : milestone, 18664,
code_origins (18.061 ms) : 17882, 18239
.   : milestone, 18061,
iast (18.954 ms) : 18764, 19144
.   : milestone, 18954,
profiling (20.026 ms) : 19829, 20223
.   : milestone, 20026,
tracing (17.844 ms) : 17671, 18017
.   : milestone, 17844,
  • baseline results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 18.424 ms [18.235 ms, 18.614 ms] | - |
| appsec | 18.469 ms [18.282 ms, 18.655 ms] | 44.647 µs (0.2%) |
| code_origins | 18.102 ms [17.921 ms, 18.283 ms] | -321.824 µs (-1.7%) |
| iast | 18.257 ms [18.071 ms, 18.443 ms] | -167.234 µs (-0.9%) |
| profiling | 18.375 ms [18.193 ms, 18.557 ms] | -49.124 µs (-0.3%) |
| tracing | 17.759 ms [17.584 ms, 17.934 ms] | -664.73 µs (-3.6%) |

  • candidate results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 19.046 ms [18.852 ms, 19.24 ms] | - |
| appsec | 18.664 ms [18.478 ms, 18.85 ms] | -381.768 µs (-2.0%) |
| code_origins | 18.061 ms [17.882 ms, 18.239 ms] | -985.31 µs (-5.2%) |
| iast | 18.954 ms [18.764 ms, 19.144 ms] | -91.771 µs (-0.5%) |
| profiling | 20.026 ms [19.829 ms, 20.223 ms] | 979.996 µs (5.1%) |
| tracing | 17.844 ms [17.671 ms, 18.017 ms] | -1.202 ms (-6.3%) |

Dacapo

Parameters

| Parameter | Baseline | Candidate |
|---|---|---|
| baseline_or_candidate | baseline | candidate |
| git_branch | master | brian.marks/perf-pending-trace |
| git_commit_date | 1775810809 | 1775825423 |
| git_commit_sha | 067d0d2 | 2e6fda2 |
| release_version | 1.62.0-SNAPSHOT~067d0d2c4b | 1.62.0-SNAPSHOT~2e6fda2251 |

Matching parameters

| Parameter | Baseline | Candidate |
|---|---|---|
| application | biojava | biojava |
| ci_job_date | 1775827356 | 1775827356 |
| ci_job_id | 1585203297 | 1585203297 |
| ci_pipeline_id | 107084930 | 107084930 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-2-0oma0rf7 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-2-0oma0rf7 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 12 metrics, 0 unstable metrics.

Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~2e6fda2251, baseline=1.62.0-SNAPSHOT~067d0d2c4b
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.067 s) : 15067000, 15067000
.   : milestone, 15067000,
appsec (14.849 s) : 14849000, 14849000
.   : milestone, 14849000,
iast (18.291 s) : 18291000, 18291000
.   : milestone, 18291000,
iast_GLOBAL (18.327 s) : 18327000, 18327000
.   : milestone, 18327000,
profiling (15.038 s) : 15038000, 15038000
.   : milestone, 15038000,
tracing (14.949 s) : 14949000, 14949000
.   : milestone, 14949000,
section candidate
no_agent (14.944 s) : 14944000, 14944000
.   : milestone, 14944000,
appsec (14.928 s) : 14928000, 14928000
.   : milestone, 14928000,
iast (18.697 s) : 18697000, 18697000
.   : milestone, 18697000,
iast_GLOBAL (18.013 s) : 18013000, 18013000
.   : milestone, 18013000,
profiling (15.318 s) : 15318000, 15318000
.   : milestone, 15318000,
tracing (15.088 s) : 15088000, 15088000
.   : milestone, 15088000,
  • baseline results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 15.067 s [15.067 s, 15.067 s] | - |
| appsec | 14.849 s [14.849 s, 14.849 s] | -218.0 ms (-1.4%) |
| iast | 18.291 s [18.291 s, 18.291 s] | 3.224 s (21.4%) |
| iast_GLOBAL | 18.327 s [18.327 s, 18.327 s] | 3.26 s (21.6%) |
| profiling | 15.038 s [15.038 s, 15.038 s] | -29.0 ms (-0.2%) |
| tracing | 14.949 s [14.949 s, 14.949 s] | -118.0 ms (-0.8%) |

  • candidate results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 14.944 s [14.944 s, 14.944 s] | - |
| appsec | 14.928 s [14.928 s, 14.928 s] | -16.0 ms (-0.1%) |
| iast | 18.697 s [18.697 s, 18.697 s] | 3.753 s (25.1%) |
| iast_GLOBAL | 18.013 s [18.013 s, 18.013 s] | 3.069 s (20.5%) |
| profiling | 15.318 s [15.318 s, 15.318 s] | 374.0 ms (2.5%) |
| tracing | 15.088 s [15.088 s, 15.088 s] | 144.0 ms (1.0%) |
Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~2e6fda2251, baseline=1.62.0-SNAPSHOT~067d0d2c4b
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.489 ms) : 1478, 1501
.   : milestone, 1489,
appsec (2.537 ms) : 2482, 2592
.   : milestone, 2537,
iast (2.279 ms) : 2209, 2348
.   : milestone, 2279,
iast_GLOBAL (2.326 ms) : 2256, 2396
.   : milestone, 2326,
profiling (2.093 ms) : 2038, 2148
.   : milestone, 2093,
tracing (2.079 ms) : 2025, 2133
.   : milestone, 2079,
section candidate
no_agent (1.486 ms) : 1474, 1498
.   : milestone, 1486,
appsec (2.546 ms) : 2490, 2601
.   : milestone, 2546,
iast (2.277 ms) : 2207, 2346
.   : milestone, 2277,
iast_GLOBAL (2.329 ms) : 2259, 2399
.   : milestone, 2329,
profiling (2.111 ms) : 2055, 2166
.   : milestone, 2111,
tracing (2.077 ms) : 2023, 2131
.   : milestone, 2077,
  • baseline results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.489 ms [1.478 ms, 1.501 ms] | - |
| appsec | 2.537 ms [2.482 ms, 2.592 ms] | 1.048 ms (70.3%) |
| iast | 2.279 ms [2.209 ms, 2.348 ms] | 789.331 µs (53.0%) |
| iast_GLOBAL | 2.326 ms [2.256 ms, 2.396 ms] | 836.792 µs (56.2%) |
| profiling | 2.093 ms [2.038 ms, 2.148 ms] | 603.47 µs (40.5%) |
| tracing | 2.079 ms [2.025 ms, 2.133 ms] | 590.041 µs (39.6%) |

  • candidate results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.486 ms [1.474 ms, 1.498 ms] | - |
| appsec | 2.546 ms [2.49 ms, 2.601 ms] | 1.06 ms (71.3%) |
| iast | 2.277 ms [2.207 ms, 2.346 ms] | 790.592 µs (53.2%) |
| iast_GLOBAL | 2.329 ms [2.259 ms, 2.399 ms] | 842.707 µs (56.7%) |
| profiling | 2.111 ms [2.055 ms, 2.166 ms] | 624.483 µs (42.0%) |
| tracing | 2.077 ms [2.023 ms, 2.131 ms] | 590.791 µs (39.8%) |

@bm1549 bm1549 marked this pull request as ready for review April 10, 2026 13:31
@bm1549 bm1549 requested a review from a team as a code owner April 10, 2026 13:31
@bm1549 bm1549 requested a review from dougqh April 10, 2026 13:31
  @Override
  void touch() {
-   lastReferenced = timeSource.getNanoTicks();
+   LAST_REFERENCED.lazySet(this, timeSource.getNanoTicks());

@dougqh dougqh Apr 10, 2026

Interesting. I haven't really benchmarked lazySet. My guess is this makes a bigger difference on ARM with its more relaxed memory model than x86, but we should measure.
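A quick way to get a first impression on a given machine is a hand-rolled timing loop such as the sketch below. It is only a rough illustration under stated assumptions: the class `LazySetTiming`, its methods, and the iteration count are hypothetical, it runs on a single thread, and the JIT compiler may optimize the two loops differently. A JMH benchmark like the one added in this PR is the more reliable tool.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical single-threaded timing sketch; not a substitute for JMH.
final class LazySetTiming {
  static final int ITERATIONS = 5_000_000;

  // Times ITERATIONS volatile writes and returns elapsed nanoseconds.
  static long timeVolatileSet(AtomicLong target) {
    long start = System.nanoTime();
    for (int i = 0; i < ITERATIONS; i++) {
      target.set(i);
    }
    return System.nanoTime() - start;
  }

  // Times ITERATIONS ordered (lazy) writes and returns elapsed nanoseconds.
  static long timeLazySet(AtomicLong target) {
    long start = System.nanoTime();
    for (int i = 0; i < ITERATIONS; i++) {
      target.lazySet(i);
    }
    return System.nanoTime() - start;
  }

  public static void main(String[] args) {
    AtomicLong target = new AtomicLong();
    // Warm-up rounds so both loops are compiled before the measured run.
    for (int round = 0; round < 5; round++) {
      timeVolatileSet(target);
      timeLazySet(target);
    }
    long volatileNanos = timeVolatileSet(target);
    long lazyNanos = timeLazySet(target);
    System.out.printf("volatile set: %.2f ns/op%n", (double) volatileNanos / ITERATIONS);
    System.out.printf("lazySet:      %.2f ns/op%n", (double) lazyNanos / ITERATIONS);
  }
}
```

Running it on both an x86 and an ARM host would give a coarse comparison of the per-write cost on each architecture; the absolute numbers should not be compared with the JMH scores in the PR description.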


@dougqh dougqh left a comment

This looks good to me.

I'd like to figure out a way to hook in the Spring PetClinic throughput tests that I've been doing, so we can get a better idea of overall impact. Based on the profiling I've done, I suspect the overall gain here is quite small.

While this is in the top 20 CPU time consumers in my stress test, the total time was <1%, so I'm guessing we won't see too much overall difference from this.

All that said, it is a straightforward, well-contained change, so why not?

@bm1549 bm1549 added this pull request to the merge queue Apr 10, 2026

dd-octo-sts bot commented Apr 10, 2026

/merge


gh-worker-devflow-routing-ef8351 bot commented Apr 10, 2026

View all feedbacks in Devflow UI.

2026-04-10 13:47:52 UTC ℹ️ Start processing command /merge


2026-04-10 13:47:56 UTC ℹ️ MergeQueue: pull request added to the queue

The expected merge time in master is approximately 2h (p90).


2026-04-10 15:01:36 UTC ℹ️ MergeQueue: This merge request was merged

@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 10, 2026
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d bot merged commit f1608f5 into master Apr 10, 2026
573 checks passed
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d bot deleted the brian.marks/perf-pending-trace branch April 10, 2026 15:01
@github-actions github-actions bot added this to the 1.62.0 milestone Apr 10, 2026

@dougqh dougqh left a comment

In a local throughput test, this appears to be neutral to slightly negative.
Unfortunately, macro tests are subject to run-to-run variance, so I can't say conclusively yet.


dougqh commented Apr 10, 2026

In a local throughput test, this appears to be neutral to slightly negative. Unfortunately, macro tests are subject to run-to-run variance, so I can't say conclusively yet.

I ran a more complete throughput test at a variety of heap sizes. Overall, the change looks good.
There is a slight improvement at larger heap sizes, and the result is neutral at smaller heap sizes.

That's in line with what I expected from the benchmark results included in the PR and from the profiles from stress tests.

Labels

comp: core (Tracer core), tag: ai generated (Largely based on code generated by an AI or LLM), type: enhancement (Enhancements and improvements)

2 participants