
perf(snapshots): Parallelize image hashing with rayon #3250

Open
NicoHinderling wants to merge 2 commits into master from 03-26-perf_snapshots_parallelize_image_hashing_with_rayon


NicoHinderling commented Mar 26, 2026

Use rayon's par_iter to hash all images concurrently instead of sequentially. Also increase the hash read buffer from 8 KB to 64 KB to reduce syscall overhead. On a 753-image, 99 MB dataset, this reduces hashing time from 5.3 s to 0.8 s (a 6.6x speedup).
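
The diff itself is not shown in this conversation, so the following is only a rough, std-only sketch of the two ideas in the description: streaming each file through a 64 KB buffer, and hashing many files concurrently. Everything in it is an assumption for illustration: the names `hash_file`, `hash_all`, and `BUF_SIZE` are hypothetical, `DefaultHasher` stands in for whatever digest the real code computes, and `std::thread::scope` stands in for rayon's par_iter so the sketch needs no external crates.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs::File;
use std::hash::Hasher;
use std::io::{self, Read};
use std::path::{Path, PathBuf};
use std::thread;

/// Read buffer size: the 64 KB figure quoted in the description.
const BUF_SIZE: usize = 64 * 1024;

/// Hash one file by streaming it through a fixed-size buffer.
/// `DefaultHasher` is a placeholder, not the digest the PR uses.
fn hash_file(path: &Path) -> io::Result<u64> {
    let mut file = File::open(path)?;
    let mut hasher = DefaultHasher::new();
    let mut buf = vec![0u8; BUF_SIZE];
    loop {
        // A larger buffer means fewer read() calls per file.
        let n = file.read(&mut buf)?;
        if n == 0 {
            break; // end of file
        }
        hasher.write(&buf[..n]);
    }
    Ok(hasher.finish())
}

/// Hash all files concurrently, returning hashes in input order.
/// Scoped threads over contiguous chunks stand in for rayon here.
fn hash_all(paths: &[PathBuf]) -> io::Result<Vec<u64>> {
    if paths.is_empty() {
        return Ok(Vec::new());
    }
    let workers = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    // Ceiling division so every path lands in exactly one chunk.
    let chunk_len = (paths.len() + workers - 1) / workers;

    thread::scope(|s| {
        // Spawn one thread per chunk; each returns its own hashes.
        let handles: Vec<_> = paths
            .chunks(chunk_len)
            .map(|chunk| {
                s.spawn(move || {
                    chunk
                        .iter()
                        .map(|p| hash_file(p))
                        .collect::<io::Result<Vec<u64>>>()
                })
            })
            .collect();

        // Join in spawn order so output order matches input order.
        let mut out = Vec::with_capacity(paths.len());
        for handle in handles {
            out.extend(handle.join().expect("hashing thread panicked")?);
        }
        Ok(out)
    })
}
```

With rayon available, the manual chunking and joining in `hash_all` would presumably collapse to something like `paths.par_iter().map(|p| hash_file(p)).collect()`, with rayon's work-stealing pool balancing files of uneven size better than fixed chunks do.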

NicoHinderling commented Mar 26, 2026

NicoHinderling marked this pull request as ready for review March 26, 2026 21:18
NicoHinderling requested review from a team as code owners March 26, 2026 21:18
NicoHinderling changed the base branch from fix/chunk-snapshot-uploads to graphite-base/3250 March 26, 2026 21:28
NicoHinderling force-pushed the 03-26-perf_snapshots_parallelize_image_hashing_with_rayon branch from 97fb068 to 3c123e0 March 26, 2026 21:28
NicoHinderling changed the base branch from graphite-base/3250 to master March 26, 2026 21:29

3 participants