`CsrParAssembler::assemble_pattern` significantly slower than `CsrAssembler::assemble_pattern` on a single thread #58

Benchmarks showed roughly 30-70% overhead for the parallel variant when run with `RAYON_NUM_THREADS=1`. The discrepancy appears to stem primarily from `rayon`: preliminary investigation showed that replacing e.g. `into_par_iter` with `into_iter` removes most of the overhead. Further overhead could be removed by using atomic locks, though this requires more thought on how to handle the multi-threaded case efficiently.
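To make the lock-related part of the overhead concrete, here is a minimal sketch using only the standard library. It is not the crate's actual implementation: the function names, the `elements` connectivity input, and the assumption that the parallel assembler guards each row with a lock are all hypothetical and chosen for illustration only. It contrasts a plain sequential pattern assembly with a variant that takes a per-row `Mutex` on every update, which costs something even when only one thread is running.

```rust
use std::collections::BTreeSet;
use std::sync::Mutex;

/// Sequential sketch: collect the sorted column indices of each row.
/// `elements` lists the node indices touched by each (hypothetical) element.
pub fn assemble_pattern_seq(num_rows: usize, elements: &[Vec<usize>]) -> Vec<Vec<usize>> {
    let mut rows = vec![BTreeSet::new(); num_rows];
    for element in elements {
        for &i in element {
            for &j in element {
                rows[i].insert(j);
            }
        }
    }
    rows.into_iter().map(|r| r.into_iter().collect()).collect()
}

/// Lock-guarded sketch: same result, but every row sits behind a `Mutex`
/// (an assumption about how a parallel assembler might protect rows).
/// The lock is acquired once per row update, even on a single thread.
pub fn assemble_pattern_locked(num_rows: usize, elements: &[Vec<usize>]) -> Vec<Vec<usize>> {
    let rows: Vec<Mutex<BTreeSet<usize>>> =
        (0..num_rows).map(|_| Mutex::new(BTreeSet::new())).collect();
    for element in elements {
        for &i in element {
            let mut row = rows[i].lock().unwrap();
            row.extend(element.iter().copied());
        }
    }
    rows.into_iter()
        .map(|m| m.into_inner().unwrap().into_iter().collect())
        .collect()
}
```

Both functions return the same pattern for the same input, so timing them against each other with `std::time::Instant` on a large mesh would give a rough idea of how much of the single-thread gap is due to locking rather than to `rayon` itself.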
