Improve Apple Silicon performance #3
Open
Labels
performance: Optimizations that enhance speed and performance of the module
Description
The implementation seems to perform significantly worse on Apple Silicon (and possibly on other AArch64 chips). Although we pick the vector length via std.simd.suggestVectorLength (which likely corresponds to 128-bit vectors on this target), something else might be required to take better advantage of vectors on such platforms.
Hyperfine benchmark results on an Apple M4 Pro chip:
Benchmark 1: ./hparse/zig-out/bin/hparse
  Time (mean ± σ):      1.464 s ±  0.011 s    [User: 1.457 s, System: 0.005 s]
  Range (min … max):    1.445 s …  1.481 s    10 runs

Benchmark 2: ./picohttpparser/picohttpparser
  Time (mean ± σ):      964.7 ms ±  13.4 ms    [User: 959.8 ms, System: 3.3 ms]
  Range (min … max):    947.9 ms … 988.6 ms    10 runs

Benchmark 3: ./bench-httparse/target/release/bench-httparse
  Time (mean ± σ):      752.5 ms ± 246.2 ms    [User: 675.9 ms, System: 2.7 ms]
  Range (min … max):    650.8 ms … 1452.6 ms    10 runs

Summary
  ./bench-httparse/target/release/bench-httparse ran
    1.28 ± 0.42 times faster than ./picohttpparser/picohttpparser
    1.95 ± 0.64 times faster than ./hparse/zig-out/bin/hparse
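
For anyone double-checking the summary, here is a minimal Python sketch of how the "times faster" figures follow from the means and standard deviations above. It assumes standard first-order error propagation for a ratio of two means, which reproduces the reported numbers. The function name `speedup` and the dictionary keys are made up for illustration and are not part of any tool.

```python
import math

def speedup(slow_mean, slow_sd, fast_mean, fast_sd):
    """Return (ratio, uncertainty) for slow_mean / fast_mean.

    Assumption: relative errors of the two means add in quadrature
    (first-order error propagation for a quotient).
    """
    ratio = slow_mean / fast_mean
    rel_err = math.sqrt((slow_sd / slow_mean) ** 2 + (fast_sd / fast_mean) ** 2)
    return ratio, ratio * rel_err

# Mean and standard deviation in seconds, copied from the benchmark output.
results = {
    "hparse": (1.464, 0.011),
    "picohttpparser": (0.9647, 0.0134),
    "httparse": (0.7525, 0.2462),
}

if __name__ == "__main__":
    fast_mean, fast_sd = results["httparse"]
    for name in ("picohttpparser", "hparse"):
        mean, sd = results[name]
        ratio, err = speedup(mean, sd, fast_mean, fast_sd)
        print(f"httparse ran {ratio:.2f} ± {err:.2f} times faster than {name}")
```

Almost all of the ± 0.42 and ± 0.64 uncertainty comes from the httparse run itself (σ of 246.2 ms on a 752.5 ms mean, with a 1452.6 ms maximum), so the gap between hparse and picohttpparser is measured much more precisely than either comparison against httparse.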