-
Notifications
You must be signed in to change notification settings - Fork 85
Open
Description
Context
PR #859 added cost_distance() with NumPy and Dask+NumPy backends. GPU backends (CuPy, Dask+CuPy) were intentionally deferred because the algorithm requires a fundamentally different approach on GPU.
Problem
The current implementation uses multi-source Dijkstra with a binary min-heap, which is inherently sequential — it processes pixels in priority-queue order. This maps poorly to GPU parallelism where thousands of threads need independent work.
Possible GPU algorithms
- Parallel label-correcting (Bellman-Ford variant): Wavefront propagation where all pixels at the current frontier are relaxed in parallel. Simple to implement on GPU but may require many iterations to converge.
- Delta-stepping: Bucketed relaxation that groups edges by cost into "delta" buckets. Pixels within each bucket can be relaxed in parallel. Better work-efficiency than naive Bellman-Ford but more complex to implement.
- Parallel Dijkstra with work-efficient heap: Use a GPU-friendly priority queue (e.g., bucket queue backed by atomic operations). Harder to implement but closest to the CPU algorithm's efficiency.
References
- Meyer & Sanders, "Delta-stepping: a parallelizable shortest path algorithm" (2003)
- Ortega-Arranz et al., "A new GPU-based approach to the Shortest Path problem" (2013)
- Davidson et al., "Work-Efficient Parallel GPU Methods for Single-Source Shortest Paths" (2014)
Scope
- CuPy backend for
cost_distance() - Dask+CuPy backend for
cost_distance() - Update README feature matrix GPU columns
- Add GPU test parametrization to
test_cost_distance.py
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels