Tag intermediate initializationError retries with test.final_status=skip in JUnit XML #11010
cbeauchesne wants to merge 15 commits into master
buildSrc/src/main/kotlin/dd-trace-java.configure-tests.gradle.kts
eb73143 to fc75f5f
      System.err.println("File not found: " + xmlFile);
      System.exit(1);
    }
    var doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xmlFile);
❔ question: Should we set some flags on the factory here (restricting entity resolution, for example) to prevent security issues?
Which security issue do you have in mind? The entire workflow and its data are derived from the public content of this repo, and the script itself can be modified during a PR.
I think @PerfectSlayer is referring to XML external entity (XXE) attacks or similar tricks: https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html#java.
If it's those, everything that runs here is produced by the PR content, including the script that executes the command. So I don't think there is any increase in the attack surface.
Indeed. That said, I think it doesn't hurt to have them, at least as a "processing" contract, and it's simple: it's just configuring the factory.
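For reference, "just configuring the factory" along the lines of the OWASP cheat sheet could look like the sketch below. The class and method names (`HardenedXmlParsing`, `newHardenedFactory`, `parses`) are hypothetical and not part of this PR. The `disallow-doctype-decl` feature URI is the Xerces one understood by the JDK's built-in parser; a different parser implementation on the classpath might reject it.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

// Hypothetical helper, not the PR's code: a factory that refuses DOCTYPE declarations.
public class HardenedXmlParsing {

  public static DocumentBuilderFactory newHardenedFactory() throws ParserConfigurationException {
    var factory = DocumentBuilderFactory.newInstance();
    // Ask the implementation to apply its secure-processing limits.
    factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
    // Reject any document carrying a DOCTYPE, which rules out external entities altogether.
    factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    factory.setXIncludeAware(false);
    factory.setExpandEntityReferences(false);
    return factory;
  }

  // Returns true when the XML parses with the hardened factory, false when it is rejected.
  public static boolean parses(String xml) {
    try {
      newHardenedFactory()
          .newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      return true;
    } catch (Exception e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // A plain JUnit-style document is accepted.
    System.out.println(parses("<testsuite/>")); // prints "true"
    // A document declaring an entity is rejected (the parser also logs a fatal error to stderr).
    System.out.println(parses("<!DOCTYPE t [<!ENTITY x \"y\">]><t>&x;</t>")); // prints "false"
  }
}
```

JUnit reports written by Gradle should not contain a DOCTYPE, so rejecting it outright is the least surprising setting under that assumption.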
.gitlab/TagInitializationErrors.java
     *
     * <p>Gradle generates synthetic "initializationError" testcases in JUnit reports for setup methods.
     * When a setup is retried and eventually succeeds, multiple testcases are created, with only the
     * last one passing. All intermediate attempts are marked skip so Test Optimization is not misled.
🎯 suggestion: It would help to describe the expected changes here. You can reuse material from the PR description 😉
I wonder if you could provide, in the Javadoc, a sample of the unmodified JUnit report file and the expected output.
Also worth knowing: since this code runs on Java 25, it's possible to use Markdown Javadoc: https://blog.jetbrains.com/idea/2025/04/markdown-in-java-docs-shut-up-and-take-my-comments/
Benchmarks
Startup
Parameters
See matching parameters
Summary
Found 1 performance improvements and 0 performance regressions! Performance is the same for 60 metrics, 10 unstable metrics.
Startup time reports for petclinic
gantt
title petclinic - global startup overhead: candidate=1.61.0-SNAPSHOT~3494f99b04, baseline=1.61.0-SNAPSHOT~2365c1251f
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.062 s) : 0, 1061861
Total [baseline] (11.123 s) : 0, 11122890
Agent [candidate] (1.064 s) : 0, 1064231
Total [candidate] (11.113 s) : 0, 11112663
section appsec
Agent [baseline] (1.252 s) : 0, 1251744
Total [baseline] (11.184 s) : 0, 11184237
Agent [candidate] (1.249 s) : 0, 1248782
Total [candidate] (11.131 s) : 0, 11131358
section iast
Agent [baseline] (1.227 s) : 0, 1227077
Total [baseline] (11.417 s) : 0, 11416964
Agent [candidate] (1.232 s) : 0, 1231560
Total [candidate] (11.403 s) : 0, 11403212
section profiling
Agent [baseline] (1.191 s) : 0, 1190856
Total [baseline] (11.169 s) : 0, 11169336
Agent [candidate] (1.182 s) : 0, 1181915
Total [candidate] (11.113 s) : 0, 11112528
gantt
title petclinic - break down per module: candidate=1.61.0-SNAPSHOT~3494f99b04, baseline=1.61.0-SNAPSHOT~2365c1251f
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.196 ms) : 0, 1196
crashtracking [candidate] (1.217 ms) : 0, 1217
BytebuddyAgent [baseline] (633.976 ms) : 0, 633976
BytebuddyAgent [candidate] (637.853 ms) : 0, 637853
AgentMeter [baseline] (29.787 ms) : 0, 29787
AgentMeter [candidate] (29.679 ms) : 0, 29679
GlobalTracer [baseline] (250.919 ms) : 0, 250919
GlobalTracer [candidate] (250.755 ms) : 0, 250755
AppSec [baseline] (32.376 ms) : 0, 32376
AppSec [candidate] (32.333 ms) : 0, 32333
Debugger [baseline] (60.436 ms) : 0, 60436
Debugger [candidate] (60.661 ms) : 0, 60661
Remote Config [baseline] (600.191 µs) : 0, 600
Remote Config [candidate] (612.583 µs) : 0, 613
Telemetry [baseline] (8.088 ms) : 0, 8088
Telemetry [candidate] (8.131 ms) : 0, 8131
Flare Poller [baseline] (8.203 ms) : 0, 8203
Flare Poller [candidate] (6.665 ms) : 0, 6665
section appsec
crashtracking [baseline] (1.202 ms) : 0, 1202
crashtracking [candidate] (1.211 ms) : 0, 1211
BytebuddyAgent [baseline] (662.685 ms) : 0, 662685
BytebuddyAgent [candidate] (662.141 ms) : 0, 662141
AgentMeter [baseline] (12.104 ms) : 0, 12104
AgentMeter [candidate] (12.099 ms) : 0, 12099
GlobalTracer [baseline] (249.887 ms) : 0, 249887
GlobalTracer [candidate] (248.695 ms) : 0, 248695
AppSec [baseline] (185.025 ms) : 0, 185025
AppSec [candidate] (184.58 ms) : 0, 184580
Debugger [baseline] (66.712 ms) : 0, 66712
Debugger [candidate] (66.405 ms) : 0, 66405
Remote Config [baseline] (613.282 µs) : 0, 613
Remote Config [candidate] (599.511 µs) : 0, 600
Telemetry [baseline] (8.689 ms) : 0, 8689
Telemetry [candidate] (8.535 ms) : 0, 8535
Flare Poller [baseline] (3.669 ms) : 0, 3669
Flare Poller [candidate] (3.49 ms) : 0, 3490
IAST [baseline] (24.585 ms) : 0, 24585
IAST [candidate] (24.525 ms) : 0, 24525
section iast
crashtracking [baseline] (1.205 ms) : 0, 1205
crashtracking [candidate] (1.212 ms) : 0, 1212
BytebuddyAgent [baseline] (802.56 ms) : 0, 802560
BytebuddyAgent [candidate] (806.836 ms) : 0, 806836
AgentMeter [baseline] (11.396 ms) : 0, 11396
AgentMeter [candidate] (11.462 ms) : 0, 11462
GlobalTracer [baseline] (239.377 ms) : 0, 239377
GlobalTracer [candidate] (239.401 ms) : 0, 239401
AppSec [baseline] (31.252 ms) : 0, 31252
AppSec [candidate] (31.967 ms) : 0, 31967
Debugger [baseline] (59.471 ms) : 0, 59471
Debugger [candidate] (58.82 ms) : 0, 58820
Remote Config [baseline] (531.435 µs) : 0, 531
Remote Config [candidate] (529.619 µs) : 0, 530
Telemetry [baseline] (15.004 ms) : 0, 15004
Telemetry [candidate] (14.952 ms) : 0, 14952
Flare Poller [baseline] (3.643 ms) : 0, 3643
Flare Poller [candidate] (4.056 ms) : 0, 4056
IAST [baseline] (25.892 ms) : 0, 25892
IAST [candidate] (26.01 ms) : 0, 26010
section profiling
crashtracking [baseline] (1.187 ms) : 0, 1187
crashtracking [candidate] (1.196 ms) : 0, 1196
BytebuddyAgent [baseline] (692.622 ms) : 0, 692622
BytebuddyAgent [candidate] (690.665 ms) : 0, 690665
AgentMeter [baseline] (9.195 ms) : 0, 9195
AgentMeter [candidate] (9.101 ms) : 0, 9101
GlobalTracer [baseline] (209.428 ms) : 0, 209428
GlobalTracer [candidate] (206.515 ms) : 0, 206515
AppSec [baseline] (32.997 ms) : 0, 32997
AppSec [candidate] (32.539 ms) : 0, 32539
Debugger [baseline] (66.928 ms) : 0, 66928
Debugger [candidate] (65.436 ms) : 0, 65436
Remote Config [baseline] (590.651 µs) : 0, 591
Remote Config [candidate] (555.714 µs) : 0, 556
Telemetry [baseline] (7.944 ms) : 0, 7944
Telemetry [candidate] (7.79 ms) : 0, 7790
Flare Poller [baseline] (3.652 ms) : 0, 3652
Flare Poller [candidate] (3.503 ms) : 0, 3503
ProfilingAgent [baseline] (94.823 ms) : 0, 94823
ProfilingAgent [candidate] (93.362 ms) : 0, 93362
Profiling [baseline] (95.4 ms) : 0, 95400
Profiling [candidate] (93.916 ms) : 0, 93916
Startup time reports for insecure-bank
gantt
title insecure-bank - global startup overhead: candidate=1.61.0-SNAPSHOT~3494f99b04, baseline=1.61.0-SNAPSHOT~2365c1251f
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.067 s) : 0, 1067173
Total [baseline] (8.895 s) : 0, 8895138
Agent [candidate] (1.056 s) : 0, 1056317
Total [candidate] (8.862 s) : 0, 8861907
section iast
Agent [baseline] (1.222 s) : 0, 1222148
Total [baseline] (9.575 s) : 0, 9574900
Agent [candidate] (1.224 s) : 0, 1224400
Total [candidate] (9.577 s) : 0, 9577011
gantt
title insecure-bank - break down per module: candidate=1.61.0-SNAPSHOT~3494f99b04, baseline=1.61.0-SNAPSHOT~2365c1251f
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.221 ms) : 0, 1221
crashtracking [candidate] (1.205 ms) : 0, 1205
BytebuddyAgent [baseline] (639.602 ms) : 0, 639602
BytebuddyAgent [candidate] (632.513 ms) : 0, 632513
AgentMeter [baseline] (29.757 ms) : 0, 29757
AgentMeter [candidate] (29.268 ms) : 0, 29268
GlobalTracer [baseline] (251.474 ms) : 0, 251474
GlobalTracer [candidate] (248.525 ms) : 0, 248525
AppSec [baseline] (32.353 ms) : 0, 32353
AppSec [candidate] (32.096 ms) : 0, 32096
Debugger [baseline] (60.027 ms) : 0, 60027
Debugger [candidate] (59.623 ms) : 0, 59623
Remote Config [baseline] (597.629 µs) : 0, 598
Remote Config [candidate] (599.564 µs) : 0, 600
Telemetry [baseline] (8.146 ms) : 0, 8146
Telemetry [candidate] (8.096 ms) : 0, 8096
Flare Poller [baseline] (7.515 ms) : 0, 7515
Flare Poller [candidate] (8.209 ms) : 0, 8209
section iast
crashtracking [baseline] (1.206 ms) : 0, 1206
crashtracking [candidate] (1.195 ms) : 0, 1195
BytebuddyAgent [baseline] (800.362 ms) : 0, 800362
BytebuddyAgent [candidate] (801.836 ms) : 0, 801836
AgentMeter [baseline] (11.397 ms) : 0, 11397
AgentMeter [candidate] (11.402 ms) : 0, 11402
GlobalTracer [baseline] (238.698 ms) : 0, 238698
GlobalTracer [candidate] (238.938 ms) : 0, 238938
AppSec [baseline] (31.679 ms) : 0, 31679
AppSec [candidate] (27.822 ms) : 0, 27822
Debugger [baseline] (57.429 ms) : 0, 57429
Debugger [candidate] (61.716 ms) : 0, 61716
Remote Config [baseline] (533.967 µs) : 0, 534
Remote Config [candidate] (528.162 µs) : 0, 528
Telemetry [baseline] (14.26 ms) : 0, 14260
Telemetry [candidate] (14.369 ms) : 0, 14369
Flare Poller [baseline] (3.977 ms) : 0, 3977
Flare Poller [candidate] (4.358 ms) : 0, 4358
IAST [baseline] (25.79 ms) : 0, 25790
IAST [candidate] (25.784 ms) : 0, 25784
Load
Parameters
See matching parameters
Summary
Found 4 performance improvements and 0 performance regressions! Performance is the same for 15 metrics, 17 unstable metrics.
Request duration reports for petclinic
gantt
title petclinic - request duration [CI 0.99] : candidate=1.61.0-SNAPSHOT~3494f99b04, baseline=1.61.0-SNAPSHOT~2365c1251f
dateFormat X
axisFormat %s
section baseline
no_agent (18.728 ms) : 18537, 18919
. : milestone, 18728,
appsec (19.041 ms) : 18849, 19233
. : milestone, 19041,
code_origins (18.504 ms) : 18322, 18686
. : milestone, 18504,
iast (18.61 ms) : 18422, 18797
. : milestone, 18610,
profiling (20.56 ms) : 20347, 20772
. : milestone, 20560,
tracing (18.301 ms) : 18122, 18481
. : milestone, 18301,
section candidate
no_agent (18.683 ms) : 18495, 18871
. : milestone, 18683,
appsec (19.693 ms) : 19492, 19894
. : milestone, 19693,
code_origins (18.56 ms) : 18375, 18744
. : milestone, 18560,
iast (18.293 ms) : 18112, 18474
. : milestone, 18293,
profiling (19.427 ms) : 19232, 19622
. : milestone, 19427,
tracing (18.848 ms) : 18659, 19038
. : milestone, 18848,
Request duration reports for insecure-bank
gantt
title insecure-bank - request duration [CI 0.99] : candidate=1.61.0-SNAPSHOT~3494f99b04, baseline=1.61.0-SNAPSHOT~2365c1251f
dateFormat X
axisFormat %s
section baseline
no_agent (1.381 ms) : 1368, 1394
. : milestone, 1381,
iast (3.651 ms) : 3596, 3705
. : milestone, 3651,
iast_FULL (6.859 ms) : 6785, 6933
. : milestone, 6859,
iast_GLOBAL (3.915 ms) : 3841, 3990
. : milestone, 3915,
profiling (2.597 ms) : 2569, 2626
. : milestone, 2597,
tracing (2.061 ms) : 2041, 2081
. : milestone, 2061,
section candidate
no_agent (1.344 ms) : 1332, 1357
. : milestone, 1344,
iast (3.451 ms) : 3411, 3490
. : milestone, 3451,
iast_FULL (6.537 ms) : 6467, 6607
. : milestone, 6537,
iast_GLOBAL (3.893 ms) : 3824, 3961
. : milestone, 3893,
profiling (2.415 ms) : 2391, 2440
. : milestone, 2415,
tracing (2.08 ms) : 2062, 2098
. : milestone, 2080,
Dacapo
Parameters
See matching parameters
Summary
Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.
Execution time for biojava
gantt
title biojava - execution time [CI 0.99] : candidate=1.61.0-SNAPSHOT~3494f99b04, baseline=1.61.0-SNAPSHOT~2365c1251f
dateFormat X
axisFormat %s
section baseline
no_agent (14.969 s) : 14969000, 14969000
. : milestone, 14969000,
appsec (14.569 s) : 14569000, 14569000
. : milestone, 14569000,
iast (17.978 s) : 17978000, 17978000
. : milestone, 17978000,
iast_GLOBAL (17.74 s) : 17740000, 17740000
. : milestone, 17740000,
profiling (14.739 s) : 14739000, 14739000
. : milestone, 14739000,
tracing (14.76 s) : 14760000, 14760000
. : milestone, 14760000,
section candidate
no_agent (14.923 s) : 14923000, 14923000
. : milestone, 14923000,
appsec (14.733 s) : 14733000, 14733000
. : milestone, 14733000,
iast (18.033 s) : 18033000, 18033000
. : milestone, 18033000,
iast_GLOBAL (18.013 s) : 18013000, 18013000
. : milestone, 18013000,
profiling (14.921 s) : 14921000, 14921000
. : milestone, 14921000,
tracing (14.858 s) : 14858000, 14858000
. : milestone, 14858000,
Execution time for tomcat
gantt
title tomcat - execution time [CI 0.99] : candidate=1.61.0-SNAPSHOT~3494f99b04, baseline=1.61.0-SNAPSHOT~2365c1251f
dateFormat X
axisFormat %s
section baseline
no_agent (1.493 ms) : 1481, 1505
. : milestone, 1493,
appsec (2.559 ms) : 2504, 2614
. : milestone, 2559,
iast (2.283 ms) : 2214, 2352
. : milestone, 2283,
iast_GLOBAL (2.33 ms) : 2261, 2399
. : milestone, 2330,
profiling (2.129 ms) : 2072, 2185
. : milestone, 2129,
tracing (2.094 ms) : 2041, 2148
. : milestone, 2094,
section candidate
no_agent (1.499 ms) : 1488, 1511
. : milestone, 1499,
appsec (3.851 ms) : 3629, 4073
. : milestone, 3851,
iast (2.287 ms) : 2218, 2356
. : milestone, 2287,
iast_GLOBAL (2.33 ms) : 2261, 2400
. : milestone, 2330,
profiling (2.137 ms) : 2080, 2194
. : milestone, 2137,
tracing (2.094 ms) : 2040, 2147
. : milestone, 2094,
.gitlab/TagInitializationErrors.java
    if (!modified) return;
    var transformer = TransformerFactory.newInstance().newTransformer();
    transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    transformer.transform(new DOMSource(doc), new StreamResult(xmlFile));
note: This modifies the file in place. What happens if the app fails? Does it leave invalid documents?
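One common way to avoid leaving a half-written report is to write to a temporary file in the same directory and then move it over the original. The sketch below illustrates that idea only; the names (`AtomicRewriteSketch`, `replaceAtomically`, `demo`) are hypothetical and this is not what the PR does. It takes the new content as bytes rather than driving a `Transformer`, to keep the example short.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch: replace a file's content without ever exposing a partial write.
public class AtomicRewriteSketch {

  public static void replaceAtomically(Path target, byte[] newContent) throws IOException {
    // The temp file lives next to the target so the final move stays on one file system.
    Path dir = target.toAbsolutePath().getParent();
    Path tmp = Files.createTempFile(dir, target.getFileName().toString(), ".tmp");
    try {
      Files.write(tmp, newContent);
      // If the process dies before this line, the original report is still intact.
      Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE, StandardCopyOption.REPLACE_EXISTING);
    } finally {
      // No-op after a successful move; cleans up the temp file after a failure.
      Files.deleteIfExists(tmp);
    }
  }

  // Small round trip used to exercise the helper in a throwaway directory.
  public static String demo() {
    try {
      Path dir = Files.createTempDirectory("junit-xml");
      Path report = dir.resolve("TEST-example.xml");
      Files.writeString(report, "<testsuite>old</testsuite>");
      replaceAtomically(report, "<testsuite>new</testsuite>".getBytes(StandardCharsets.UTF_8));
      return Files.readString(report);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints "<testsuite>new</testsuite>"
  }
}
```

Two caveats: whether an atomic move replaces an existing target is implementation specific according to the `Files.move` documentation, and on POSIX systems `createTempFile` creates the file with owner-only permissions, which the replaced report then carries.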
Co-authored-by: Brice Dutheil <brice.dutheil@gmail.com>
Motivation
When a JUnit setup method (e.g. @BeforeAll) fails and is retried via Gradle's retry plugin, Gradle generates a synthetic <testcase name="initializationError"> for each attempt. If the final retry succeeds, the build passes, but Test Optimization receives all intermediate failure entries with no indication that they were retried, making them appear as genuine failures in the dashboard.
What Does This Do
Add a doLast post-processor to every Test task that rewrites the JUnit XML reports after execution. For any suite with multiple initializationError testcases (i.e. retries occurred), all entries except the last one are tagged with test.final_status=skip.
The last entry is left unmodified, allowing Test Optimization to apply its default status inference based on the actual outcome. Files with only one (or zero) initializationError testcases are not modified.
The post-processor runs as a doLast action directly on the test task, keeping it within the task's up-to-date and caching scope so it doesn't interfere with downstream consumers of the JUnit reports.
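The rewrite described above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: the class and method names are hypothetical, and the way the tag is encoded (a `<properties>` child holding a property named `dd_tags[test.final_status]`) is assumed, since the exact output format is not shown in this description.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch of the post-processing step.
public class TagRetriesSketch {

  // Assumption: how the tag is carried in the XML; the PR's real encoding may differ.
  static final String TAG_NAME = "dd_tags[test.final_status]";

  // Tags every initializationError testcase except the last one in each suite.
  // Returns the number of testcases that were tagged.
  public static int tagIntermediateRetries(Document doc) {
    int tagged = 0;
    NodeList suites = doc.getElementsByTagName("testsuite");
    for (int i = 0; i < suites.getLength(); i++) {
      Element suite = (Element) suites.item(i);
      // Collect first: the NodeList is live, so it should not be walked while the DOM changes.
      List<Element> initErrors = new ArrayList<>();
      NodeList cases = suite.getElementsByTagName("testcase");
      for (int j = 0; j < cases.getLength(); j++) {
        Element testcase = (Element) cases.item(j);
        if ("initializationError".equals(testcase.getAttribute("name"))) {
          initErrors.add(testcase);
        }
      }
      // The last attempt is left untouched; zero or one entries means nothing to do.
      int intermediate = Math.max(0, initErrors.size() - 1);
      for (Element testcase : initErrors.subList(0, intermediate)) {
        Element properties = doc.createElement("properties");
        Element property = doc.createElement("property");
        property.setAttribute("name", TAG_NAME);
        property.setAttribute("value", "skip");
        properties.appendChild(property);
        testcase.insertBefore(properties, testcase.getFirstChild());
        tagged++;
      }
    }
    return tagged;
  }

  // Convenience wrapper for experiments: parses a string and reports how many entries were tagged.
  public static int tagXml(String xml) {
    try {
      Document doc =
          DocumentBuilderFactory.newInstance()
              .newDocumentBuilder()
              .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      return tagIntermediateRetries(doc);
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    String threeAttempts =
        "<testsuite>"
            + "<testcase name=\"initializationError\"><failure/></testcase>"
            + "<testcase name=\"initializationError\"><failure/></testcase>"
            + "<testcase name=\"initializationError\"/>"
            + "</testsuite>";
    System.out.println(tagXml(threeAttempts)); // prints "2"
  }
}
```

Returning the count makes it easy for a caller to skip rewriting files where nothing changed, which matches the statement that reports with zero or one initializationError entries are not modified.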
Additional Notes
Contributor Checklist
type: and (comp: or inst:) labels in addition to any other useful labels
close, fix, or any linking keywords when referencing an issue. Use solves instead, and assign the PR milestone to the issue
Jira ticket: [PROJ-IDENT]
Note: Once your PR is ready to merge, add it to the merge queue by commenting /merge. /merge -c cancels the queue request. /merge -f --reason "reason" skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.