I used Opus to compare Arkime vs Rust vs Python and found some issues in Arkime which I've now fixed. Here are the issues it thinks are in Rust/Python if helpful.
JA4 Reference Implementation Bugs
Found while comparing the Rust and Python JA4 reference implementations against Arkime's capture implementation across 188 pcap files from the Arkime test suite.
Python (ja4.py) Bugs
1. JA4: Empty extension list hashes to SHA256("") instead of 000000000000
Pcaps: https3-301-get.pcap, socks-https-example.pcap
When a TLS Client Hello has no extensions, Python produces the SHA256 hash of an empty string (e3b0c44298fc) for the 3rd segment instead of 000000000000.
| Tool | JA4 |
| --- | --- |
| Python | t10d230100_6a57a6f57151_e3b0c44298fc |
| Rust | t10d230100_6a57a6f57151_000000000000 |
| Arkime | t10d230100_6a57a6f57151_000000000000 |
Both pcaps show the same issue. Rust and Arkime agree on 000000000000.
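A minimal sketch of the guard this implies, assuming the segment is a 12-character truncated SHA256 of a pre-built string. `ja4_segment_hash` and `EMPTY_SEGMENT` are hypothetical names for illustration, not functions in ja4.py:

```python
import hashlib

# Placeholder used by Rust and Arkime when there is nothing to hash.
EMPTY_SEGMENT = "000000000000"


def ja4_segment_hash(raw: str) -> str:
    """Return the 12-hex-char truncated SHA256 of raw, or zeros if raw is empty.

    Hypothetical helper: `raw` stands for whatever string the implementation
    builds for the 3rd JA4 segment before hashing.
    """
    if not raw:
        # Without this guard, the result is the hash of the empty string.
        return EMPTY_SEGMENT
    return hashlib.sha256(raw.encode("ascii")).hexdigest()[:12]


# Hashing an empty string directly reproduces the value Python reports.
print(hashlib.sha256(b"").hexdigest()[:12])  # e3b0c44298fc
print(ja4_segment_hash(""))                  # 000000000000
```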
2. JA4H: HTTP version misdetected (1.0 reported as 1.1)
Pcaps: http-empty-useragent.pcap, v6-http.pcap
Python reports HTTP/1.1 (11) but the actual request line in the pcap is HTTP/1.0. Verified via tshark:
$ tshark -r http-empty-useragent.pcap -Y "http.request" -T fields -e http.request.version
HTTP/1.0
| Tool | JA4H |
| --- | --- |
| Python | ge11nn110000_d295f7cacc7a_... |
| Rust | (no HTTP in this pcap for Rust) |
| Arkime | ge10nn010000_b8bcd45ac095_... |
v6-http.pcap has the same version issue:
| Tool | JA4H |
| --- | --- |
| Python | ge11nn05en00_dff23709e538_... |
| Arkime | ge10nn05en00_dff23709e538_... |
Note the header hashes match for v6-http.pcap (dff23709e538), confirming the only difference is the version field.
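For illustration, here is one way to derive the two-digit version code straight from the request line. This is a sketch only: `ja4h_version` is a hypothetical helper, and ja4.py may obtain the version by a different route.

```python
def ja4h_version(request_line: str) -> str:
    """Map an HTTP request line to the two-digit JA4H version code.

    Hypothetical helper: "GET / HTTP/1.0" yields "10", "GET / HTTP/1.1" yields "11".
    """
    # The protocol token is the last space-separated field of the request line.
    proto = request_line.strip().rsplit(" ", 1)[-1]
    if not proto.startswith("HTTP/"):
        raise ValueError(f"not an HTTP request line: {request_line!r}")
    major, _, minor = proto[len("HTTP/"):].partition(".")
    # Assumption: a version without a minor part (e.g. "HTTP/2") gets minor 0.
    return f"{major}{minor or '0'}"


print(ja4h_version("GET / HTTP/1.0"))  # 10
```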
3. JA4H: Header count inflated (likely counting across requests)
Pcaps: http-empty-useragent.pcap, https-connect.pcap
Python reports far more headers than are present in individual HTTP requests. This may be caused by accumulating headers across multiple HTTP requests in the same TCP stream instead of counting per-request.
http-empty-useragent.pcap: The pcap contains a single GET request with 1 header (User-Agent):
GET / HTTP/1.0
User-Agent:
| Tool | Header count in JA4H | Full JA4H |
| --- | --- | --- |
| Python | 11 | ge11nn110000_d295f7cacc7a_... |
| Arkime | 01 | ge10nn010000_b8bcd45ac095_... |
https-connect.pcap:
| Tool | Header count in JA4H | Full JA4H |
| --- | --- | --- |
| Python | 38 | co11nn380000_35ef01bf733f_... |
| Arkime | 01 | co10nn010000_b8bcd45ac095_... |
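A sketch of per-request counting, so the expected value can be checked by hand. `ja4h_header_count` is a hypothetical helper, not code from ja4.py. It assumes the input is the raw bytes of a single request and, per my reading of the JA4H spec, that Cookie and Referer are left out of the count:

```python
def ja4h_header_count(raw_request: bytes) -> str:
    """Count the headers of ONE request and return a two-digit string.

    Hypothetical helper. Assumes Cookie and Referer are excluded from the
    count and that the value is capped at 99.
    """
    # Keep only the head of this request; ignore any body or later requests.
    head = raw_request.split(b"\r\n\r\n", 1)[0]
    header_lines = head.split(b"\r\n")[1:]  # drop the request line
    names = [
        line.split(b":", 1)[0].strip().lower()
        for line in header_lines
        if b":" in line
    ]
    counted = [n for n in names if n not in (b"cookie", b"referer")]
    return f"{min(len(counted), 99):02d}"


# The single request in http-empty-useragent.pcap, as quoted above.
print(ja4h_header_count(b"GET / HTTP/1.0\r\nUser-Agent:\r\n\r\n"))  # 01
```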
4. JA4H: Cookie value hashes differ from Rust reference
Pcap: single-packets.pcap
For requests with cookies, the 4th JA4H segment (cookie field=value hash) differs between Python and the Rust reference, despite the raw cookie fields/values appearing identical.
Example — request with cookies pardot, visitor_id413862, visitor_id413862-hash:
| Tool | JA4H |
| --- | --- |
| Python | ge11cr06enus_8c2f9ef95269_d23bf79698dc_c1eaa758c543 |
| Rust | ge11cr06enus_8c2f9ef95269_d23bf79698dc_69e42fa741fe |
| Arkime | ge11cr06enus_8c2f9ef95269_d23bf79698dc_69e42fa741fe |
First 3 segments match (method, headers, cookie names). Only the 4th segment (cookie values hash) differs. Rust and Arkime produce identical results for all 6 unique JA4H values in this pcap.
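The segment-by-segment comparison can be reproduced with a small helper. This is a sketch for triage only: `differing_segments` is a hypothetical name, and it shows where two fingerprints diverge, not why.

```python
def differing_segments(a: str, b: str) -> list[int]:
    """Return the 1-based positions of the "_"-separated segments that differ."""
    parts_a, parts_b = a.split("_"), b.split("_")
    if len(parts_a) != len(parts_b):
        raise ValueError("fingerprints have different segment counts")
    return [
        i
        for i, (x, y) in enumerate(zip(parts_a, parts_b), start=1)
        if x != y
    ]


python_ja4h = "ge11cr06enus_8c2f9ef95269_d23bf79698dc_c1eaa758c543"
rust_ja4h = "ge11cr06enus_8c2f9ef95269_d23bf79698dc_69e42fa741fe"
print(differing_segments(python_ja4h, rust_ja4h))  # [4]
```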
Rust Reference Notes
No bugs found in the Rust implementation. All JA4/JA4S/JA4H/JA4T values that both Rust and Arkime produce match exactly across all 188 test pcaps.
Minor coverage note: Rust does not produce JA4 fingerprints for DTLS Client Hello packets in arkime_synthetic.pcap (Arkime produces dd2d020400_c1929292aa6b_c91bed236abd). This is a coverage gap, not a correctness issue.
Test Methodology
Compared three implementations across 188 pcap files from arkime/tests/pcap/:
- Rust: /path/to/ja4 -j <pcap> (reference implementation)
- Python: python3 ja4.py -J <pcap> (reference implementation)
- Arkime: ./capture --tests -o plugins=ja4plus.so -r <pcap>
All pcap files mentioned are available in the Arkime repository.