`VertexId`/`EdgeId` are `uint64_t` aliases used across the codebase; `Weight` is an `int64_t` for weighted edges. `VertexEdgeMapping` stores the `[edge_start_index, edge_end_index)` interval for a vertex in the global edge array. `CSRGraph` is the sequential graph view with `vertex_array` (prefix sums) and `edge_array` (adjacency list). `DistributedCSRGraph` wraps distributed vertex/edge arrays plus a matching array of edge weights. The vertex distribution (`std::shared_ptr<DistributionStrategy>`) defines ownership for vertices and, by extension, component roots in distributed algorithms.
`DistributionStrategy` is the abstract interface; every strategy must implement `owner`, `local_size`, `to_local_index`, `to_global_index`, and `total_elements`. `BlockDistribution` is constructed from a prefix-sum array over workers: rank `k` owns indices `[_distribution[k], _distribution[k+1])`, and all conversions between global and local indices use this array. `RoundRobinDistribution` owns every `num_ranks`-th element. `to_local_index` throws if invoked on a rank that does not own the element.
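A minimal sketch of the prefix-sum ownership logic described for `BlockDistribution`; the class name and exact member layout here are illustrative, not the repository's implementation:

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Illustrative prefix-sum block distribution; not the repository's class.
class BlockDistributionSketch {
public:
    // prefix_sums has num_ranks + 1 entries; rank k owns
    // [prefix_sums[k], prefix_sums[k+1]).
    explicit BlockDistributionSketch(std::vector<uint64_t> prefix_sums)
        : _distribution(std::move(prefix_sums)) {}

    std::size_t owner(uint64_t global_index) const {
        for (std::size_t rank = 0; rank + 1 < _distribution.size(); ++rank) {
            if (global_index < _distribution[rank + 1]) return rank;
        }
        throw std::out_of_range("global index beyond total_elements()");
    }

    uint64_t local_size(std::size_t rank) const {
        return _distribution[rank + 1] - _distribution[rank];
    }

    uint64_t to_local_index(uint64_t global_index, std::size_t rank) const {
        if (owner(global_index) != rank) {
            throw std::runtime_error("element not owned by this rank");
        }
        return global_index - _distribution[rank];
    }

    uint64_t to_global_index(uint64_t local_index, std::size_t rank) const {
        return _distribution[rank] + local_index;
    }

    uint64_t total_elements() const { return _distribution.back(); }

private:
    std::vector<uint64_t> _distribution;  // prefix sums, size = num_ranks + 1
};
```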
- `DistributedArray<T>` is constructed with a `DistributionStrategy` and a `kamping::Communicator`; an optional third argument fills the local slice with a constant. Key methods:
  - `initialize_local(elements, rank)` swaps in a pre-built local slice; it asserts that the input matches the strategy's `local_size(rank)`.
  - `set(global_index, value)` queues remote writes that are flushed via `exchange(sub_comm)`; local indices are written immediately. `set(global_index, value, OperationType)` merges values with `OperationType::{IDENTITY, MIN, MAX}` locally or after `exchange`.
  - `get(global_index)` is only valid on the owning rank; it throws if called elsewhere.
  - `local_data()` exposes the mutable local vector for in-place updates.
  - `print_local()` / `print_local_vertex()` are debugging helpers.
  - `gather()` exists for debugging/inspection only; production code must not rely on it because it materializes the entire array on one rank.
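A hedged usage sketch of the queue/merge/flush cycle described above; `relax_distances` is an illustrative helper, and the exact `set`/`exchange` signatures are paraphrased from the prose rather than copied from the code:

```cpp
#include <cstdint>
#include <vector>

// Illustrative helper, not repository code: writes a BFS level into the
// distributed distance array and flushes the queued remote writes.
template <typename Communicator>
void relax_distances(DistributedArray<uint64_t>& distances,
                     const std::vector<VertexId>& reached,
                     uint64_t level,
                     Communicator& comm) {
    for (VertexId v : reached) {
        // Owned indices are written immediately; remote indices are queued
        // and merged with MIN on the owning rank after the exchange.
        distances.set(v, level, OperationType::MIN);
    }
    // One collective step flushes all queued remote writes.
    distances.exchange(comm);
}
```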
- `DistributedSet<T>` wraps a `std::vector<T>` owned by a communicator; `insert` appends, `remove` erases the first match, and `contains` performs a linear scan.
  - `deduplicate()` sorts the local vector and removes duplicates; use it before communication when uniqueness matters.
  - `filter(predicate)` removes all elements for which `predicate(element)` returns `true`.
  - `redistribute(mapping)` prepares an all-to-all exchange based on a user-supplied ownership function. Elements mapped to the current rank remain local; the rest are sent and replaced with the received buffer. `max_send_buffer_size` tracks the peak send volume for diagnostics.
  - `redistribute(mapping, join_communication)` temporarily splits the communicator (with `MPI_Comm_split` through kamping) when only a subset of ranks should participate.
  - `updateCommunicator(new_comm)` swaps the communicator in place so the container aligns with externally split communicators.
- `SortedDistributedSet<T>` maintains the sorted invariant by inserting with `lower_bound` and reusing the base-class utilities.
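A usage sketch under the same caveat: it assumes `DistributedSet` is constructible from the communicator and that `graph` is a pointer-like handle to the `DistributedCSRGraph`; `stage_and_ship` and `already_visited` are illustrative names, not repository code:

```cpp
#include <vector>

// Illustrative staging helper, not repository code.
template <typename Communicator, typename GraphPtr, typename VisitedPredicate>
void stage_and_ship(Communicator& comm, const GraphPtr& graph,
                    const std::vector<VertexId>& candidates,
                    VisitedPredicate already_visited) {
    DistributedSet<VertexId> staged(comm);
    for (VertexId v : candidates) staged.insert(v);
    staged.deduplicate();             // sort + unique on the local vector
    staged.filter(already_visited);   // drop already-processed vertices
    // Ship every remaining element to the rank the mapping assigns it to;
    // elements mapped to the current rank stay in the local vector.
    staged.redistribute([&](VertexId v) { return graph->vertex_dist->owner(v); });
}
```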
- `SetBasedDistributedFrontier` wraps a `DistributedSet<VertexId>` for BFS-style waves. Construction seeds the frontier with vertices whose owner matches the local rank. It exposes:
  - `add(vertex)` and `clear()` for local staging.
  - `exchange(mapping)` to move vertices to their owning rank (typically `graph->vertex_dist->owner`).
  - `deduplicate()` and `filter(predicate)` to prune duplicates or already-processed vertices.
  - `local_frontier()` for the current slice, plus `max_send_buffer_size()` and `updateCommunicator()` passthroughs.
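One BFS wave with this frontier, sketched under assumptions: `neighbors_of` and `already_reached` stand in for the adjacency lookup and the distance check, and `local_frontier()` is assumed to return a copyable vertex vector:

```cpp
#include <vector>

// Illustrative wave expansion; only the frontier methods come from the
// description above, everything else is a placeholder.
template <typename GraphPtr, typename NeighborFn, typename ReachedFn>
void expand_wave(SetBasedDistributedFrontier& frontier, const GraphPtr& graph,
                 NeighborFn neighbors_of, ReachedFn already_reached) {
    // Assumption: local_frontier() hands back a vector we can copy.
    std::vector<VertexId> current = frontier.local_frontier();
    frontier.clear();                       // restart local staging
    for (VertexId v : current) {
        for (VertexId neighbor : neighbors_of(v)) {
            frontier.add(neighbor);         // owner may be a remote rank
        }
    }
    frontier.exchange([&](VertexId v) { return graph->vertex_dist->owner(v); });
    frontier.deduplicate();
    frontier.filter(already_reached);       // prune vertices already settled
}
```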
- `TraditionalDistributedFrontier<T>` is an alternative queue built on `std::unordered_set`. Key methods:
  - Constructors accept a `DistributionStrategy` and communicator, optionally with an initial frontier vector.
  - `add` queues local or remote vertices while preventing duplicates with `_visited` (see the sketch after this list).
  - `exchange()` performs an all-to-all, merges `_next_frontier` with the received vertices, assigns BFS levels in `_distances`, clears the staging buffers, and returns `true` when all ranks report an empty frontier (via a logical AND).
  - `visited(element)` queries `_visited`; `local_frontier()` and `distances()` expose the current frontier and the per-vertex depth map.
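A minimal sketch of the duplicate-avoiding `add` described above; `_visited` and `_next_frontier` follow the prose, while `_owner_of`, `_rank`, and `_send_staging` are assumed stand-ins for the strategy lookup and the per-rank staging buffers:

```cpp
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Illustrative only: _owner_of stands in for DistributionStrategy::owner,
// _rank and _send_staging are assumed names; the rest follows the prose.
struct FrontierAddSketch {
    std::unordered_set<uint64_t> _visited;
    std::unordered_set<uint64_t> _next_frontier;
    std::unordered_map<int, std::vector<uint64_t>> _send_staging;
    std::function<int(uint64_t)> _owner_of;
    int _rank = 0;

    void add(uint64_t vertex) {
        if (_visited.count(vertex) > 0) return;   // duplicate: drop it
        _visited.insert(vertex);
        int owner = _owner_of(vertex);
        if (owner == _rank) {
            _next_frontier.insert(vertex);        // stays on this rank
        } else {
            // staged per destination rank for the next all-to-all exchange()
            _send_staging[owner].push_back(vertex);
        }
    }
};
```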
- `DistributedBFS` consumes a `DistributedCSRGraph`, communicator, and initial frontier. Internals:
  - Maintains `_distances` as a `DistributedArray<uint64_t>` initialized to `std::numeric_limits<uint64_t>::max()`.
  - The BFS loop: visit every vertex in the current local frontier, set its distance with `OperationType::MIN`, collect neighbors, rebuild the frontier, `exchange()` to send non-local vertices home, `deduplicate()`, and `filter()` away already-reached vertices. `global_active` is derived from a logical-OR allreduce over `local_active` (see the sketch below).
  - Accessors: `distances()`, `frontier()`, `get_current_distance()`, and `updateCommunicator()`.
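The `global_active` check reduces to a single collective. Shown here with plain MPI for clarity; the codebase drives the same reduction through kamping's `allreduce` wrapper:

```cpp
#include <mpi.h>

// Returns true while at least one rank still has frontier work left.
bool any_rank_active(bool local_active, MPI_Comm comm) {
    int local = local_active ? 1 : 0;
    int global = 0;
    // global_active = logical OR of every rank's local_active flag.
    MPI_Allreduce(&local, &global, 1, MPI_INT, MPI_LOR, comm);
    return global != 0;
}
```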
- `TraditionalDistributedBFS` uses `TraditionalDistributedFrontier` to propagate frontiers level by level until `exchange()` signals global quiescence. Useful for comparisons with the set-based variant.
- `BFSBasedDistributedConnectedComponent` iterates over the block of vertices owned by each rank (a block distribution is expected). Whenever `_bfs_runner.distances().get(vertex)` is still `∞`, the vertex seeds a new BFS and the component count increments. The final count is reduced via a `plus` allreduce (see the sketch after this list).
- `BFSBasedDistributedConnectedComponentWithSubCommOptimization` extends the above by splitting communicators once ranks exhaust their local vertex range, shrinking the active communicator for later BFS waves.
- `LabelPropagation` and `MinLabelPropagation` prototypes demonstrate how `DistributedArray` can be reused for iterative label relaxation, but the current implementations are incomplete/commented out.
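A hedged sketch of the counting loop described for `BFSBasedDistributedConnectedComponent`; `distance_of` and `run_bfs_from` are illustrative callbacks standing in for `_bfs_runner.distances().get(vertex)` and a freshly seeded BFS:

```cpp
#include <mpi.h>
#include <cstdint>
#include <functional>
#include <limits>

// Illustrative component counting over this rank's owned vertex block.
uint64_t count_components(uint64_t first_owned, uint64_t last_owned,
                          const std::function<uint64_t(uint64_t)>& distance_of,
                          const std::function<void(uint64_t)>& run_bfs_from,
                          MPI_Comm comm) {
    uint64_t local_roots = 0;
    for (uint64_t v = first_owned; v < last_owned; ++v) {
        // A vertex whose distance is still "infinite" was not reached by any
        // earlier BFS, so it roots a new component.
        if (distance_of(v) == std::numeric_limits<uint64_t>::max()) {
            run_bfs_from(v);
            ++local_roots;
        }
    }
    uint64_t global_roots = 0;
    MPI_Allreduce(&local_roots, &global_roots, 1, MPI_UINT64_T, MPI_SUM, comm);
    return global_roots;
}
```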
- `src/main.cpp` exposes a CLI (`--generator`, `--start`, `--print-graph`, `--algorithm`). It builds distributed vertex/edge arrays from KaGen output, constructs a `DistributedCSRGraph`, and runs either `DistributedBFS` or `BFSBasedDistributedConnectedComponent`. Timers are gathered and printed as JSON.
- `src/run_bfs.cpp` mirrors the setup but selects a random global source vertex, runs an initial warm-up BFS to amortize MPI startup, and then measures a second BFS.
- `src/run_cc.cpp` is the connected-components driver; it prints summary statistics (component count, rank count) after execution.
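The warm-up-then-measure pattern described for `src/run_bfs.cpp`, sketched with plain MPI timing primitives; `measure_second_bfs` and `run_bfs` are placeholders, not the driver's actual code:

```cpp
#include <mpi.h>
#include <functional>

// Illustrative measurement harness: one warm-up run, then a timed run with
// barriers so every rank starts and stops the clock together.
double measure_second_bfs(const std::function<void()>& run_bfs, MPI_Comm comm) {
    run_bfs();                       // warm-up amortizes MPI startup costs
    MPI_Barrier(comm);               // align ranks before starting the timer
    double start = MPI_Wtime();
    run_bfs();                       // measured run
    MPI_Barrier(comm);               // include the slowest rank in the timing
    return MPI_Wtime() - start;
}
```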
- Avoid `DistributedArray::gather` in production logic; it is intended for debugging small cases only.
- `DistributedArray::initialize_local` currently throws if the element count mismatches the strategy; a TODO suggests a known bug, so be cautious when reusing it with different distributions.
- Many algorithms assume `BlockDistribution` for vertex ownership; ensure alternative strategies provide equivalent invariants before reuse.
- Kamping collectives (`alltoallv`, `allgatherv`, `allreduce`, `barrier`) are used to coordinate distributed state; synchronize timers before measurements when reproducibility matters.