Optimize PendingTrace span registration and time tracking by bm1549 · Pull Request #11078 · DataDog/dd-trace-java

bm1549 · 2026-04-10T11:28:05Z

What Does This Do

Two optimizations to PendingTrace, which is on the critical path for every span start and finish:

Replace volatile write of lastReferenced with lazySet: The lastReferenced field is written on every getCurrentTimeNano() call (span start/finish) but only read by a background PendingTraceBuffer thread for approximate timeout detection. The full StoreLoad memory fence from a volatile write is unnecessary. lazySet provides release-store semantics (earlier loads and stores cannot be reordered after the write, but no StoreLoad fence follows it), which is sufficient since the reader tolerates staleness on the order of seconds.
Guard ROOT_SPAN CAS with volatile read: registerSpan() calls ROOT_SPAN.compareAndSet(this, null, span) on every span creation, but only the first call (root span) succeeds. For all subsequent spans, the CAS fails after acquiring exclusive cache line ownership. Adding if (rootSpan == null) before the CAS replaces the failed CAS (which requires cache line ownership + write barrier) with a cheap volatile read for non-root spans.
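
For readers who want to see the two patterns side by side, here is a minimal sketch. It is not the actual PendingTrace code: the class name `PendingTraceSketch`, the `Object`-typed span, and the accessor methods are assumptions made for illustration. Only the field-updater names and the `lazySet` / guarded `compareAndSet` calls mirror what the description above states.

```java
import java.util.concurrent.atomic.AtomicLongFieldUpdater;
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

/** Hypothetical stand-in for PendingTrace; only the two patterns are modeled. */
final class PendingTraceSketch {
  private static final AtomicLongFieldUpdater<PendingTraceSketch> LAST_REFERENCED =
      AtomicLongFieldUpdater.newUpdater(PendingTraceSketch.class, "lastReferenced");
  private static final AtomicReferenceFieldUpdater<PendingTraceSketch, Object> ROOT_SPAN =
      AtomicReferenceFieldUpdater.newUpdater(PendingTraceSketch.class, Object.class, "rootSpan");

  // Field updaters require non-static volatile fields.
  private volatile long lastReferenced;
  private volatile Object rootSpan;

  /** Release-ordered store: no StoreLoad fence after the write. */
  void touch(long nanoTicks) {
    LAST_REFERENCED.lazySet(this, nanoTicks);
  }

  /** Returns true only for the call that installs the root span. */
  boolean registerSpan(Object span) {
    // Volatile read first; the CAS is attempted only while no root is set.
    if (rootSpan == null) {
      return ROOT_SPAN.compareAndSet(this, null, span);
    }
    return false;
  }

  long lastReferenced() {
    return lastReferenced;
  }

  Object rootSpan() {
    return rootSpan;
  }
}
```

The guard does not change correctness: two threads can both observe `null` and both attempt the CAS, and the CAS still guarantees that exactly one of them wins.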

Motivation

PendingTrace.getCurrentTimeNano() and registerSpan() execute on every span start. In high-throughput applications creating millions of spans per second, eliminating a memory fence and a failed CAS per span adds up.

Why this is faster

lazySet vs volatile write: On x86, a volatile write is compiled to a store followed by a StoreLoad barrier (a lock-prefixed instruction or mfence). lazySet needs only release ordering, which requires no fence instruction on x86 (TSO memory model). On ARM, it avoids the dmb ish full barrier in favor of a cheaper store-ordering barrier such as dmb ishst. This eliminates serialization of the store buffer on every span start/finish.
CAS guard: compareAndSet requires exclusive cache line ownership (MESI E/M state) even when it fails. A volatile read only needs shared ownership (S state). For non-root spans (the vast majority), this avoids a cache line transition on the rootSpan field.
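
To make the "only the first call succeeds" point concrete, the sketch below counts how many CAS attempts are issued with and without the guard. It uses a plain `AtomicReference` rather than the field updater in PendingTrace, and the class and method names are hypothetical. It only counts attempts in a single thread; it does not measure cache-line traffic.

```java
import java.util.concurrent.atomic.AtomicReference;

final class CasGuardCount {
  /** Registers `spans` spans in sequence and returns how many CAS attempts were made. */
  static int casAttempts(int spans, boolean guarded) {
    AtomicReference<Object> root = new AtomicReference<>();
    int attempts = 0;
    for (int i = 0; i < spans; i++) {
      // Guarded: skip the CAS once a root is visible. Unguarded: always attempt it.
      if (!guarded || root.get() == null) {
        attempts++;
        root.compareAndSet(null, "span-" + i);
      }
    }
    return attempts;
  }

  public static void main(String[] args) {
    System.out.println("unguarded attempts: " + casAttempts(1000, false));
    System.out.println("guarded attempts:   " + casAttempts(1000, true));
  }
}
```

In this single-threaded run the unguarded path attempts one CAS per span, while the guarded path attempts exactly one in total.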

The buffer worker may see a slightly stale lastReferenced, which could trigger marginally earlier trace splitting but cannot cause data loss.
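
The staleness argument can be illustrated with a small sketch of a timeout check. The 1-second timeout, the 1 ms staleness, and the `looksIdle` helper are assumptions chosen for the example, not values or methods taken from PendingTraceBuffer.

```java
final class StalenessSketch {
  /** True when the observed last-reference time is older than the timeout. */
  static boolean looksIdle(long observedLastReferencedNanos, long nowNanos, long timeoutNanos) {
    return nowNanos - observedLastReferencedNanos > timeoutNanos;
  }

  public static void main(String[] args) {
    long now = 10_000_000_000L;    // 10 s on an arbitrary nanosecond clock
    long timeout = 1_000_000_000L; // hypothetical 1 s idle timeout
    long staleness = 1_000_000L;   // reader's view lags the writer by 1 ms

    long recent = now - 500_000_000L;       // trace touched 0.5 s ago
    long nearTimeout = now - 999_500_000L;  // trace touched 0.9995 s ago

    // Far from the timeout, a stale view gives the same answer as a fresh one.
    System.out.println(looksIdle(recent, now, timeout));
    System.out.println(looksIdle(recent - staleness, now, timeout));

    // Within the staleness window of the timeout, the stale view fires slightly early.
    System.out.println(looksIdle(nearTimeout, now, timeout));
    System.out.println(looksIdle(nearTimeout - staleness, now, timeout));
  }
}
```

The decision can only differ when the true idle time is within the staleness window of the timeout, which is why the effect is limited to marginally earlier handling.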

Benchmark results (8 threads, JDK 21, macOS aarch64, Fork=1, Warmup=2, Measurement=3)

Benchmark	Baseline (ops/ns)	Optimized (ops/ns)	Change
`getCurrentTimeNano`	0.019	0.019	0%
`getCurrentTimeNano_contended`	0.014	0.015	+7.1%
`startAndFinishSpan`	~0.0001	~0.0001	~0%
`systemNanoTime` (control)	0.019	0.019	0%

Note: The lazySet optimization primarily reduces memory fence cost, which is more impactful on ARM than x86 (where StoreStore is already free under TSO). The getCurrentTimeNano_contended benchmark shows modest improvement under cross-thread contention. The startAndFinishSpan benchmark includes significant overhead from span creation/finish that dominates the CAS-guard savings.

Human readability score: 9.5/10

Both changes use patterns already present in the same file (AtomicLongFieldUpdater for other fields, volatile read guards elsewhere).

Additional Notes

Also adds TimeSourceBenchmark.java JMH benchmark covering PendingTrace.getCurrentTimeNano(), contended variant, startAndFinishSpan, System.nanoTime() baseline, and CoreTracer.getTimeWithNanoTicks().

tag: no release note
tag: ai generated

Contributor Checklist

Format the title according to the contribution guidelines
Assign the type: and comp: labels
Avoid linking keywords — use solves instead

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

pr-commenter · 2026-04-10T12:15:28Z

Benchmarks

⚠️ Warning: Baseline build not found for merge-base commit. Comparing against the latest commit on master instead.

Startup

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	brian.marks/perf-pending-trace
git_commit_date	1775810809	1775825423
git_commit_sha	`067d0d2`	`2e6fda2`
release_version	1.62.0-SNAPSHOT~067d0d2c4b	1.62.0-SNAPSHOT~2e6fda2251

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1775827167	1775827167
ci_job_id	1585203292	1585203292
ci_pipeline_id	107084930	107084930
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-1-3j97dfry 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-1-3j97dfry 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module	Agent	Agent
parent	None	None

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 58 metrics, 13 unstable metrics.

Startup time reports for insecure-bank

gantt
    title insecure-bank - global startup overhead: candidate=1.62.0-SNAPSHOT~2e6fda2251, baseline=1.62.0-SNAPSHOT~067d0d2c4b

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.056 s) : 0, 1055960
Total [baseline] (8.843 s) : 0, 8842824
Agent [candidate] (1.057 s) : 0, 1057315
Total [candidate] (8.845 s) : 0, 8845293
section iast
Agent [baseline] (1.22 s) : 0, 1219681
Total [baseline] (9.53 s) : 0, 9529530
Agent [candidate] (1.223 s) : 0, 1222930
Total [candidate] (9.56 s) : 0, 9560114

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.056 s	-
Agent	iast	1.22 s	163.721 ms (15.5%)
Total	tracing	8.843 s	-
Total	iast	9.53 s	686.706 ms (7.8%)
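
The report does not define the Δ tracing column. The numbers are consistent with it being the variant's duration minus the tracing-only duration, with the percentage taken relative to the tracing-only duration. The sketch below checks that reading against the Agent/iast row using the microsecond values from the gantt data. The class and method names are hypothetical.

```java
final class StartupOverhead {
  /** Extra startup time of a variant over the tracing-only run, in microseconds. */
  static long deltaMicros(long variantMicros, long tracingMicros) {
    return variantMicros - tracingMicros;
  }

  /** The same difference as a percentage of the tracing-only duration. */
  static double deltaPercent(long variantMicros, long tracingMicros) {
    return 100.0 * (variantMicros - tracingMicros) / tracingMicros;
  }

  public static void main(String[] args) {
    long tracing = 1_055_960L; // Agent [baseline], tracing
    long iast = 1_219_681L;    // Agent [baseline], iast
    System.out.printf(
        "%d us (%.1f%%)%n", deltaMicros(iast, tracing), deltaPercent(iast, tracing));
  }
}
```

Under this reading, the difference is 163721 µs, which matches the 163.721 ms (15.5%) reported for that row.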

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.057 s	-
Agent	iast	1.223 s	165.615 ms (15.7%)
Total	tracing	8.845 s	-
Total	iast	9.56 s	714.822 ms (8.1%)

gantt
    title insecure-bank - break down per module: candidate=1.62.0-SNAPSHOT~2e6fda2251, baseline=1.62.0-SNAPSHOT~067d0d2c4b

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.244 ms) : 0, 1244
crashtracking [candidate] (1.225 ms) : 0, 1225
BytebuddyAgent [baseline] (631.876 ms) : 0, 631876
BytebuddyAgent [candidate] (633.35 ms) : 0, 633350
AgentMeter [baseline] (29.5 ms) : 0, 29500
AgentMeter [candidate] (29.396 ms) : 0, 29396
GlobalTracer [baseline] (248.753 ms) : 0, 248753
GlobalTracer [candidate] (248.594 ms) : 0, 248594
AppSec [baseline] (32.158 ms) : 0, 32158
AppSec [candidate] (31.994 ms) : 0, 31994
Debugger [baseline] (59.197 ms) : 0, 59197
Debugger [candidate] (58.987 ms) : 0, 58987
Remote Config [baseline] (603.013 µs) : 0, 603
Remote Config [candidate] (586.69 µs) : 0, 587
Telemetry [baseline] (8.068 ms) : 0, 8068
Telemetry [candidate] (8.071 ms) : 0, 8071
Flare Poller [baseline] (8.327 ms) : 0, 8327
Flare Poller [candidate] (8.955 ms) : 0, 8955
section iast
crashtracking [baseline] (1.229 ms) : 0, 1229
crashtracking [candidate] (1.228 ms) : 0, 1228
BytebuddyAgent [baseline] (798.738 ms) : 0, 798738
BytebuddyAgent [candidate] (800.079 ms) : 0, 800079
AgentMeter [baseline] (11.35 ms) : 0, 11350
AgentMeter [candidate] (11.38 ms) : 0, 11380
GlobalTracer [baseline] (238.276 ms) : 0, 238276
GlobalTracer [candidate] (239.459 ms) : 0, 239459
IAST [baseline] (25.726 ms) : 0, 25726
IAST [candidate] (25.74 ms) : 0, 25740
AppSec [baseline] (31.673 ms) : 0, 31673
AppSec [candidate] (31.877 ms) : 0, 31877
Debugger [baseline] (60.177 ms) : 0, 60177
Debugger [candidate] (59.489 ms) : 0, 59489
Remote Config [baseline] (545.751 µs) : 0, 546
Remote Config [candidate] (532.508 µs) : 0, 533
Telemetry [baseline] (11.903 ms) : 0, 11903
Telemetry [candidate] (13.16 ms) : 0, 13160
Flare Poller [baseline] (3.437 ms) : 0, 3437
Flare Poller [candidate] (3.613 ms) : 0, 3613

Startup time reports for petclinic

gantt
    title petclinic - global startup overhead: candidate=1.62.0-SNAPSHOT~2e6fda2251, baseline=1.62.0-SNAPSHOT~067d0d2c4b

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.07 s) : 0, 1070157
Total [baseline] (11.192 s) : 0, 11191836
Agent [candidate] (1.056 s) : 0, 1055775
Total [candidate] (11.124 s) : 0, 11124195
section appsec
Agent [baseline] (1.255 s) : 0, 1255083
Total [baseline] (11.134 s) : 0, 11133619
Agent [candidate] (1.251 s) : 0, 1251266
Total [candidate] (11.097 s) : 0, 11096773
section iast
Agent [baseline] (1.229 s) : 0, 1228663
Total [baseline] (11.398 s) : 0, 11397508
Agent [candidate] (1.234 s) : 0, 1233916
Total [candidate] (11.412 s) : 0, 11411800
section profiling
Agent [baseline] (1.184 s) : 0, 1183985
Total [baseline] (11.154 s) : 0, 11153842
Agent [candidate] (1.183 s) : 0, 1182693
Total [candidate] (11.147 s) : 0, 11146517

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.07 s	-
Agent	appsec	1.255 s	184.926 ms (17.3%)
Agent	iast	1.229 s	158.505 ms (14.8%)
Agent	profiling	1.184 s	113.828 ms (10.6%)
Total	tracing	11.192 s	-
Total	appsec	11.134 s	-58.217 ms (-0.5%)
Total	iast	11.398 s	205.672 ms (1.8%)
Total	profiling	11.154 s	-37.994 ms (-0.3%)

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.056 s	-
Agent	appsec	1.251 s	195.492 ms (18.5%)
Agent	iast	1.234 s	178.142 ms (16.9%)
Agent	profiling	1.183 s	126.918 ms (12.0%)
Total	tracing	11.124 s	-
Total	appsec	11.097 s	-27.422 ms (-0.2%)
Total	iast	11.412 s	287.605 ms (2.6%)
Total	profiling	11.147 s	22.322 ms (0.2%)

gantt
    title petclinic - break down per module: candidate=1.62.0-SNAPSHOT~2e6fda2251, baseline=1.62.0-SNAPSHOT~067d0d2c4b

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.231 ms) : 0, 1231
crashtracking [candidate] (1.223 ms) : 0, 1223
BytebuddyAgent [baseline] (637.992 ms) : 0, 637992
BytebuddyAgent [candidate] (632.125 ms) : 0, 632125
AgentMeter [baseline] (29.687 ms) : 0, 29687
AgentMeter [candidate] (29.521 ms) : 0, 29521
GlobalTracer [baseline] (252.356 ms) : 0, 252356
GlobalTracer [candidate] (248.719 ms) : 0, 248719
AppSec [baseline] (32.642 ms) : 0, 32642
AppSec [candidate] (32.005 ms) : 0, 32005
Debugger [baseline] (60.938 ms) : 0, 60938
Debugger [candidate] (59.912 ms) : 0, 59912
Remote Config [baseline] (613.165 µs) : 0, 613
Remote Config [candidate] (599.686 µs) : 0, 600
Telemetry [baseline] (8.263 ms) : 0, 8263
Telemetry [candidate] (8.088 ms) : 0, 8088
Flare Poller [baseline] (9.975 ms) : 0, 9975
Flare Poller [candidate] (7.369 ms) : 0, 7369
section appsec
crashtracking [baseline] (1.234 ms) : 0, 1234
crashtracking [candidate] (1.223 ms) : 0, 1223
BytebuddyAgent [baseline] (665.82 ms) : 0, 665820
BytebuddyAgent [candidate] (664.144 ms) : 0, 664144
AgentMeter [baseline] (12.164 ms) : 0, 12164
AgentMeter [candidate] (12.106 ms) : 0, 12106
GlobalTracer [baseline] (250.104 ms) : 0, 250104
GlobalTracer [candidate] (249.193 ms) : 0, 249193
IAST [baseline] (24.674 ms) : 0, 24674
IAST [candidate] (24.564 ms) : 0, 24564
AppSec [baseline] (185.175 ms) : 0, 185175
AppSec [candidate] (184.65 ms) : 0, 184650
Debugger [baseline] (66.315 ms) : 0, 66315
Debugger [candidate] (66.182 ms) : 0, 66182
Remote Config [baseline] (624.708 µs) : 0, 625
Remote Config [candidate] (602.227 µs) : 0, 602
Telemetry [baseline] (8.783 ms) : 0, 8783
Telemetry [candidate] (8.496 ms) : 0, 8496
Flare Poller [baseline] (3.593 ms) : 0, 3593
Flare Poller [candidate] (3.553 ms) : 0, 3553
section iast
crashtracking [baseline] (1.227 ms) : 0, 1227
crashtracking [candidate] (1.228 ms) : 0, 1228
BytebuddyAgent [baseline] (804.963 ms) : 0, 804963
BytebuddyAgent [candidate] (806.867 ms) : 0, 806867
AgentMeter [baseline] (11.382 ms) : 0, 11382
AgentMeter [candidate] (11.556 ms) : 0, 11556
GlobalTracer [baseline] (239.877 ms) : 0, 239877
GlobalTracer [candidate] (241.705 ms) : 0, 241705
IAST [baseline] (25.752 ms) : 0, 25752
IAST [candidate] (26.182 ms) : 0, 26182
AppSec [baseline] (31.737 ms) : 0, 31737
AppSec [candidate] (32.221 ms) : 0, 32221
Debugger [baseline] (58.073 ms) : 0, 58073
Debugger [candidate] (61.99 ms) : 0, 61990
Remote Config [baseline] (526.351 µs) : 0, 526
Remote Config [candidate] (1.137 ms) : 0, 1137
Telemetry [baseline] (14.941 ms) : 0, 14941
Telemetry [candidate] (11.212 ms) : 0, 11212
Flare Poller [baseline] (3.611 ms) : 0, 3611
Flare Poller [candidate] (3.502 ms) : 0, 3502
section profiling
crashtracking [baseline] (1.189 ms) : 0, 1189
crashtracking [candidate] (1.18 ms) : 0, 1180
BytebuddyAgent [baseline] (691.811 ms) : 0, 691811
BytebuddyAgent [candidate] (690.616 ms) : 0, 690616
AgentMeter [baseline] (9.129 ms) : 0, 9129
AgentMeter [candidate] (9.098 ms) : 0, 9098
GlobalTracer [baseline] (206.93 ms) : 0, 206930
GlobalTracer [candidate] (207.037 ms) : 0, 207037
AppSec [baseline] (32.343 ms) : 0, 32343
AppSec [candidate] (32.429 ms) : 0, 32429
Debugger [baseline] (64.703 ms) : 0, 64703
Debugger [candidate] (65.451 ms) : 0, 65451
Remote Config [baseline] (572.916 µs) : 0, 573
Remote Config [candidate] (561.8 µs) : 0, 562
Telemetry [baseline] (8.552 ms) : 0, 8552
Telemetry [candidate] (7.775 ms) : 0, 7775
Flare Poller [baseline] (3.53 ms) : 0, 3530
Flare Poller [candidate] (3.533 ms) : 0, 3533
ProfilingAgent [baseline] (93.851 ms) : 0, 93851
ProfilingAgent [candidate] (93.724 ms) : 0, 93724
Profiling [baseline] (94.421 ms) : 0, 94421
Profiling [candidate] (94.287 ms) : 0, 94287

Load

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	brian.marks/perf-pending-trace
git_commit_date	1775810809	1775825423
git_commit_sha	`067d0d2`	`2e6fda2`
release_version	1.62.0-SNAPSHOT~067d0d2c4b	1.62.0-SNAPSHOT~2e6fda2251

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1775827640	1775827640
ci_job_id	1585203295	1585203295
ci_pipeline_id	107084930	107084930
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-1-2v5azfl4 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-1-2v5azfl4 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 3 performance regressions! Performance is the same for 17 metrics, 16 unstable metrics.

scenario	Δ mean agg_http_req_duration_p50	Δ mean agg_http_req_duration_p95	Δ mean throughput	candidate mean agg_http_req_duration_p50	candidate mean agg_http_req_duration_p95	candidate mean throughput	baseline mean agg_http_req_duration_p50	baseline mean agg_http_req_duration_p95	baseline mean throughput
scenario:load:petclinic:iast:high_load	worse [+365.440µs; +1223.146µs] or [+2.050%; +6.861%]	same [-352.074µs; +1602.467µs] or [-1.197%; +5.447%]	unstable [-35.387op/s; +16.637op/s] or [-13.847%; +6.510%]	18.622ms	30.046ms	246.188op/s	17.827ms	29.421ms	255.562op/s
scenario:load:petclinic:profiling:high_load	worse [+1.427ms; +2.003ms] or [+7.921%; +11.116%]	worse [+1.337ms; +2.461ms] or [+4.571%; +8.412%]	unstable [-45.653op/s; +4.028op/s] or [-17.978%; +1.586%]	19.729ms	31.148ms	233.125op/s	18.014ms	29.249ms	253.938op/s

Request duration reports for insecure-bank

gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~2e6fda2251, baseline=1.62.0-SNAPSHOT~067d0d2c4b
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.236 ms) : 1224, 1248
.   : milestone, 1236,
iast (3.241 ms) : 3195, 3286
.   : milestone, 3241,
iast_FULL (5.928 ms) : 5868, 5988
.   : milestone, 5928,
iast_GLOBAL (3.685 ms) : 3629, 3741
.   : milestone, 3685,
profiling (2.173 ms) : 2154, 2193
.   : milestone, 2173,
tracing (1.868 ms) : 1852, 1883
.   : milestone, 1868,
section candidate
no_agent (1.228 ms) : 1217, 1240
.   : milestone, 1228,
iast (3.367 ms) : 3318, 3416
.   : milestone, 3367,
iast_FULL (5.888 ms) : 5829, 5947
.   : milestone, 5888,
iast_GLOBAL (3.595 ms) : 3536, 3654
.   : milestone, 3595,
profiling (2.146 ms) : 2126, 2166
.   : milestone, 2146,
tracing (1.923 ms) : 1906, 1939
.   : milestone, 1923,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	1.236 ms [1.224 ms, 1.248 ms]	-
iast	3.241 ms [3.195 ms, 3.286 ms]	2.004 ms (162.1%)
iast_FULL	5.928 ms [5.868 ms, 5.988 ms]	4.692 ms (379.5%)
iast_GLOBAL	3.685 ms [3.629 ms, 3.741 ms]	2.449 ms (198.1%)
profiling	2.173 ms [2.154 ms, 2.193 ms]	936.83 µs (75.8%)
tracing	1.868 ms [1.852 ms, 1.883 ms]	631.208 µs (51.1%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	1.228 ms [1.217 ms, 1.24 ms]	-
iast	3.367 ms [3.318 ms, 3.416 ms]	2.139 ms (174.1%)
iast_FULL	5.888 ms [5.829 ms, 5.947 ms]	4.66 ms (379.4%)
iast_GLOBAL	3.595 ms [3.536 ms, 3.654 ms]	2.367 ms (192.7%)
profiling	2.146 ms [2.126 ms, 2.166 ms]	918.283 µs (74.8%)
tracing	1.923 ms [1.906 ms, 1.939 ms]	694.472 µs (56.5%)

Request duration reports for petclinic

gantt
    title petclinic - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~2e6fda2251, baseline=1.62.0-SNAPSHOT~067d0d2c4b
    dateFormat X
    axisFormat %s
section baseline
no_agent (18.424 ms) : 18235, 18614
.   : milestone, 18424,
appsec (18.469 ms) : 18282, 18655
.   : milestone, 18469,
code_origins (18.102 ms) : 17921, 18283
.   : milestone, 18102,
iast (18.257 ms) : 18071, 18443
.   : milestone, 18257,
profiling (18.375 ms) : 18193, 18557
.   : milestone, 18375,
tracing (17.759 ms) : 17584, 17934
.   : milestone, 17759,
section candidate
no_agent (19.046 ms) : 18852, 19240
.   : milestone, 19046,
appsec (18.664 ms) : 18478, 18850
.   : milestone, 18664,
code_origins (18.061 ms) : 17882, 18239
.   : milestone, 18061,
iast (18.954 ms) : 18764, 19144
.   : milestone, 18954,
profiling (20.026 ms) : 19829, 20223
.   : milestone, 20026,
tracing (17.844 ms) : 17671, 18017
.   : milestone, 17844,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	18.424 ms [18.235 ms, 18.614 ms]	-
appsec	18.469 ms [18.282 ms, 18.655 ms]	44.647 µs (0.2%)
code_origins	18.102 ms [17.921 ms, 18.283 ms]	-321.824 µs (-1.7%)
iast	18.257 ms [18.071 ms, 18.443 ms]	-167.234 µs (-0.9%)
profiling	18.375 ms [18.193 ms, 18.557 ms]	-49.124 µs (-0.3%)
tracing	17.759 ms [17.584 ms, 17.934 ms]	-664.73 µs (-3.6%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	19.046 ms [18.852 ms, 19.24 ms]	-
appsec	18.664 ms [18.478 ms, 18.85 ms]	-381.768 µs (-2.0%)
code_origins	18.061 ms [17.882 ms, 18.239 ms]	-985.31 µs (-5.2%)
iast	18.954 ms [18.764 ms, 19.144 ms]	-91.771 µs (-0.5%)
profiling	20.026 ms [19.829 ms, 20.223 ms]	979.996 µs (5.1%)
tracing	17.844 ms [17.671 ms, 18.017 ms]	-1.202 ms (-6.3%)

Dacapo

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	brian.marks/perf-pending-trace
git_commit_date	1775810809	1775825423
git_commit_sha	`067d0d2`	`2e6fda2`
release_version	1.62.0-SNAPSHOT~067d0d2c4b	1.62.0-SNAPSHOT~2e6fda2251

See matching parameters

	Baseline	Candidate
application	biojava	biojava
ci_job_date	1775827356	1775827356
ci_job_id	1585203297	1585203297
ci_pipeline_id	107084930	107084930
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-2-0oma0rf7 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-2-0oma0rf7 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 12 metrics, 0 unstable metrics.

Execution time for biojava

gantt
    title biojava - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~2e6fda2251, baseline=1.62.0-SNAPSHOT~067d0d2c4b
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.067 s) : 15067000, 15067000
.   : milestone, 15067000,
appsec (14.849 s) : 14849000, 14849000
.   : milestone, 14849000,
iast (18.291 s) : 18291000, 18291000
.   : milestone, 18291000,
iast_GLOBAL (18.327 s) : 18327000, 18327000
.   : milestone, 18327000,
profiling (15.038 s) : 15038000, 15038000
.   : milestone, 15038000,
tracing (14.949 s) : 14949000, 14949000
.   : milestone, 14949000,
section candidate
no_agent (14.944 s) : 14944000, 14944000
.   : milestone, 14944000,
appsec (14.928 s) : 14928000, 14928000
.   : milestone, 14928000,
iast (18.697 s) : 18697000, 18697000
.   : milestone, 18697000,
iast_GLOBAL (18.013 s) : 18013000, 18013000
.   : milestone, 18013000,
profiling (15.318 s) : 15318000, 15318000
.   : milestone, 15318000,
tracing (15.088 s) : 15088000, 15088000
.   : milestone, 15088000,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	15.067 s [15.067 s, 15.067 s]	-
appsec	14.849 s [14.849 s, 14.849 s]	-218.0 ms (-1.4%)
iast	18.291 s [18.291 s, 18.291 s]	3.224 s (21.4%)
iast_GLOBAL	18.327 s [18.327 s, 18.327 s]	3.26 s (21.6%)
profiling	15.038 s [15.038 s, 15.038 s]	-29.0 ms (-0.2%)
tracing	14.949 s [14.949 s, 14.949 s]	-118.0 ms (-0.8%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	14.944 s [14.944 s, 14.944 s]	-
appsec	14.928 s [14.928 s, 14.928 s]	-16.0 ms (-0.1%)
iast	18.697 s [18.697 s, 18.697 s]	3.753 s (25.1%)
iast_GLOBAL	18.013 s [18.013 s, 18.013 s]	3.069 s (20.5%)
profiling	15.318 s [15.318 s, 15.318 s]	374.0 ms (2.5%)
tracing	15.088 s [15.088 s, 15.088 s]	144.0 ms (1.0%)

Execution time for tomcat

gantt
    title tomcat - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~2e6fda2251, baseline=1.62.0-SNAPSHOT~067d0d2c4b
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.489 ms) : 1478, 1501
.   : milestone, 1489,
appsec (2.537 ms) : 2482, 2592
.   : milestone, 2537,
iast (2.279 ms) : 2209, 2348
.   : milestone, 2279,
iast_GLOBAL (2.326 ms) : 2256, 2396
.   : milestone, 2326,
profiling (2.093 ms) : 2038, 2148
.   : milestone, 2093,
tracing (2.079 ms) : 2025, 2133
.   : milestone, 2079,
section candidate
no_agent (1.486 ms) : 1474, 1498
.   : milestone, 1486,
appsec (2.546 ms) : 2490, 2601
.   : milestone, 2546,
iast (2.277 ms) : 2207, 2346
.   : milestone, 2277,
iast_GLOBAL (2.329 ms) : 2259, 2399
.   : milestone, 2329,
profiling (2.111 ms) : 2055, 2166
.   : milestone, 2111,
tracing (2.077 ms) : 2023, 2131
.   : milestone, 2077,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.489 ms [1.478 ms, 1.501 ms]	-
appsec	2.537 ms [2.482 ms, 2.592 ms]	1.048 ms (70.3%)
iast	2.279 ms [2.209 ms, 2.348 ms]	789.331 µs (53.0%)
iast_GLOBAL	2.326 ms [2.256 ms, 2.396 ms]	836.792 µs (56.2%)
profiling	2.093 ms [2.038 ms, 2.148 ms]	603.47 µs (40.5%)
tracing	2.079 ms [2.025 ms, 2.133 ms]	590.041 µs (39.6%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.486 ms [1.474 ms, 1.498 ms]	-
appsec	2.546 ms [2.49 ms, 2.601 ms]	1.06 ms (71.3%)
iast	2.277 ms [2.207 ms, 2.346 ms]	790.592 µs (53.2%)
iast_GLOBAL	2.329 ms [2.259 ms, 2.399 ms]	842.707 µs (56.7%)
profiling	2.111 ms [2.055 ms, 2.166 ms]	624.483 µs (42.0%)
tracing	2.077 ms [2.023 ms, 2.131 ms]	590.791 µs (39.8%)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

dougqh · 2026-04-10T13:39:37Z

dd-trace-core/src/main/java/datadog/trace/core/PendingTrace.java

  @Override
  void touch() {
-    lastReferenced = timeSource.getNanoTicks();
+    LAST_REFERENCED.lazySet(this, timeSource.getNanoTicks());


Interesting. I haven't really benchmarked lazySet. My guess is this makes a bigger difference on ARM with its more relaxed memory model than x86, but we should measure.

dougqh

This looks good to me.

I'd like to figure out a way to hook in the Spring PetClinic throughput tests that I've been doing, so we can get a better idea of overall impact. Based on the profiling I've done, I suspect the overall gain here is quite small.

While this is in the top 20 CPU time consumers in my stress test, the total time was <1%, so I'm guessing we won't see too much overall difference from this.

All that said, it is a straightforward, well-contained change, so why not?

dd-octo-sts · 2026-04-10T13:47:48Z

/merge

gh-worker-devflow-routing-ef8351 · 2026-04-10T13:47:52Z

View all feedbacks in Devflow UI.

2026-04-10 13:47:52 UTC ℹ️ Start processing command /merge

2026-04-10 13:47:56 UTC ℹ️ MergeQueue: pull request added to the queue

The expected merge time in master is approximately 2h (p90).

2026-04-10 15:01:36 UTC ℹ️ MergeQueue: This merge request was merged

dougqh · 2026-04-10T16:56:43Z

In a local throughput test, this appears to be neutral to slightly negative. Unfortunately, macro tests are subject to run-to-run variance, so I can't say conclusively yet.

I ran a more complete throughput test at a variety of heap sizes. Overall, the change looks good.
Slight improvement at larger heap sizes. Neutral at smaller heap sizes.

That's in line with what I expected from the benchmark results included in the PR and from the profiles from stress tests.

Optimize PendingTrace hot paths with lazySet and CAS guard

f852039

tag: no release note
tag: ai generated
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

bm1549 added the type: enhancement, comp: core, and tag: ai generated labels Apr 10, 2026

Add startAndFinishSpan and contended benchmarks to TimeSourceBenchmark

2e6fda2

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

bm1549 marked this pull request as ready for review April 10, 2026 13:31

bm1549 requested a review from a team as a code owner April 10, 2026 13:31

bm1549 requested a review from dougqh April 10, 2026 13:31

dougqh reviewed Apr 10, 2026

dougqh approved these changes Apr 10, 2026

bm1549 added this pull request to the merge queue Apr 10, 2026

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 10, 2026

gh-worker-dd-mergequeue-cf854d bot merged commit f1608f5 into master Apr 10, 2026
573 checks passed

gh-worker-dd-mergequeue-cf854d bot deleted the brian.marks/perf-pending-trace branch April 10, 2026 15:01

github-actions bot added this to the 1.62.0 milestone Apr 10, 2026

dougqh reviewed Apr 10, 2026

gh-worker-dd-mergequeue-cf854d[bot] merged 2 commits into master from brian.marks/perf-pending-trace
