Refactor: Replace unbounded thread spawning with rayon parallel iterators #300

Copilot · 2026-01-21T00:06:10Z

Several locations spawn unbounded threads during environment discovery, which can cause thread exhaustion on systems with many conda environments or symlinks.

Changes

pet-conda/lib.rs: get_conda_environments() now uses par_iter() instead of spawning one thread per path
pet-conda/environment_locations.rs: get_conda_environment_paths() uses par_iter() for environment enumeration
pet-homebrew/sym_links.rs: get_known_symlinks() uses par_iter() for symlink resolution

Added rayon = "1.11.0" to both crates.

Before/After

// Before: unbounded thread spawning
let mut threads = vec![];
for path in paths {
    threads.push(thread::spawn(move || get_conda_environment_info(&path, &mgr)));
}
// join all...

// After: bounded parallel iteration via rayon
paths.par_iter()
    .filter_map(|path| get_conda_environment_info(path, manager))
    .collect()

crates/pet/src/find.rs left unchanged—it uses thread::scope with a fixed number of locators/directories, which is already bounded.

Original prompt

This section details on the original issue you should resolve

<issue_title>Refactor: Consider using thread pool instead of unbounded thread spawning</issue_title>
<issue_description>## Summary
Several locations spawn an unbounded number of threads during environment discovery. On systems with many Python environments (e.g., many conda envs or large PATH), this could lead to thread exhaustion or excessive context switching.

Affected Locations

1. crates/pet-conda/src/lib.rs (lines 368-378)
fn get_conda_environments(paths: &Vec<PathBuf>, manager: &Option<CondaManager>) -> Vec<CondaEnvironment> {
    let mut threads = vec![];
    for path in paths {
        let path = path.clone();
        let mgr = manager.clone();
        threads.push(thread::spawn(move || {
            // ...
        }));
    }
    // ...
}
2. crates/pet-homebrew/src/sym_links.rs (lines 33-55)
let threads = symlinks
    .iter()
    .map(|symlink| {
        std::thread::spawn(move || {
            // ...
        })
    })
    .collect::<Vec<_>>();
3. crates/pet/src/find.rs

Multiple thread::scope with spawns for each path/locator.

Proposed Solutions

Option 1: Use rayon for parallel iteration
use rayon::prelude::*;

fn get_conda_environments(paths: &[PathBuf], manager: &Option<CondaManager>) -> Vec<CondaEnvironment> {
    paths.par_iter()
        .filter_map(|path| get_conda_environment_info(path, manager))
        .collect()
}
Option 2: Use bounded thread pool
use std::sync::mpsc;
use threadpool::ThreadPool;

let pool = ThreadPool::new(num_cpus::get());
for path in paths {
    pool.execute(move || {
        // ...
    });
}
pool.join();
Option 3: Use thread::scope with chunking
thread::scope(|s| {
    for chunk in paths.chunks(num_cpus::get()) {
        s.spawn(|| {
            for path in chunk {
                // process path
            }
        });
    }
});
Benefits

Controlled parallelism based on CPU count

Better resource management

Avoid thread exhaustion on systems with hundreds of environments

rayon provides work-stealing for better load balancing

Considerations

rayon adds a dependency but is widely used and well-maintained

thread::scope (used in many places already) is good for structured concurrency

The impact depends on typical environment counts

Priority

Low - Current implementation works but could cause issues at scale.</issue_description>

<agent_instructions>Fix the issue. See if the recommendation makes sense and create a fix.</agent_instructions>

Comments on the Issue (you are @copilot in this section)

Fixes Refactor: Consider using thread pool instead of unbounded thread spawning #288

💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

…tors - Add rayon 1.11.0 dependency to pet-conda and pet-homebrew crates - Refactor get_conda_environments() in lib.rs to use par_iter() - Refactor get_conda_environment_paths() in environment_locations.rs to use par_iter() - Refactor get_known_symlinks() in sym_links.rs to use par_iter() This provides controlled parallelism based on CPU count, better resource management, and avoids potential thread exhaustion on systems with hundreds of Python environments. Co-authored-by: karthiknadig <3840081+karthiknadig@users.noreply.github.com>

karthiknadig · 2026-01-21T02:04:52Z

PR Review: Replace unbounded thread spawning with rayon

cc @copilot

Overall Assessment: Positive with minor observations

The PR correctly addresses a legitimate issue—unbounded thread spawning can cause thread exhaustion on systems with many conda environments (hundreds is common for data scientists).

Strengths

Correct problem identification: The current code spawns one OS thread per path/symlink, which is inefficient and resource-intensive. On systems with 200+ conda environments, this creates 200+ threads.
Good solution choice: Rayon is the idiomatic Rust solution for this:
- Work-stealing thread pool (bounded to CPU count by default)
- Efficient parallel iteration with par_iter()
- Well-maintained and widely used in the Rust ecosystem
Clean refactoring: The changes are minimal and semantically equivalent across all three locations.
Correct decision to leave find.rs unchanged: The existing thread::scope with bounded spawns (fixed number of locators/workspace directories) is appropriate.

Observations

Semantic equivalence in get_known_symlinks(): The transformation correctly preserves the original behavior—capturing known_symlinks before parallel iteration matches the original per-thread cloning.
Rayon global pool interaction: Since find.rs uses thread::scope which spawns locators in parallel, and each Conda locator now internally uses rayon, all parallel work beneficially shares the same bounded thread pool.
The filter_map in lib.rs: Nice simplification—it cleanly handles the Option return instead of the original vec wrapper pattern.

Verification

// Before (environment_locations.rs)
threads.push(thread::spawn(move || get_environments(&path)));

// After
.flat_map(|path| get_environments(path))

Semantically correct—par_iter() borrows by reference and get_environments takes &Path.

Recommendation

Looks good to merge once CI passes. The dependency addition (rayon 1.11) is reasonable and the refactoring is sound.

Initial plan

b2a7706

Copilot AI assigned Copilot and karthiknadig Jan 21, 2026

Copilot started work on behalf of karthiknadig January 21, 2026 00:06 View session

Copilot AI changed the title ~~[WIP] Refactor to use thread pool for environment discovery~~ Refactor: Replace unbounded thread spawning with rayon parallel iterators Jan 21, 2026

Copilot AI requested a review from karthiknadig January 21, 2026 00:16

Copilot finished work on behalf of karthiknadig January 21, 2026 00:16

Copilot started work on behalf of karthiknadig January 21, 2026 02:05 View session

Copilot finished work on behalf of karthiknadig January 21, 2026 02:06

karthiknadig approved these changes Jan 21, 2026

View reviewed changes

karthiknadig marked this pull request as ready for review January 21, 2026 02:37

karthiknadig enabled auto-merge (squash) January 21, 2026 02:37

vs-code-engineering bot added this to the January 2026 milestone Jan 21, 2026

roblourens approved these changes Jan 21, 2026

View reviewed changes

karthiknadig merged commit 82e47f0 into main Jan 21, 2026
26 checks passed

karthiknadig deleted the copilot/refactor-thread-spawning branch January 21, 2026 03:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Refactor: Replace unbounded thread spawning with rayon parallel iterators #300

Refactor: Replace unbounded thread spawning with rayon parallel iterators #300

Copilot AI commented Jan 21, 2026 •

edited

Loading

Uh oh!

karthiknadig commented Jan 21, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Refactor: Replace unbounded thread spawning with rayon parallel iterators #300

Refactor: Replace unbounded thread spawning with rayon parallel iterators #300

Conversation

Copilot AI commented Jan 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

Before/After

Affected Locations

1. crates/pet-conda/src/lib.rs (lines 368-378)

2. crates/pet-homebrew/src/sym_links.rs (lines 33-55)

3. crates/pet/src/find.rs

Proposed Solutions

Option 1: Use rayon for parallel iteration

Option 2: Use bounded thread pool

Option 3: Use thread::scope with chunking

Benefits

Considerations

Priority

Comments on the Issue (you are @copilot in this section)

Uh oh!

karthiknadig commented Jan 21, 2026

PR Review: Replace unbounded thread spawning with rayon

Overall Assessment: Positive with minor observations

Strengths

Observations

Verification

Recommendation

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Copilot AI commented Jan 21, 2026 •

edited

Loading

1. `crates/pet-conda/src/lib.rs` (lines 368-378)

2. `crates/pet-homebrew/src/sym_links.rs` (lines 33-55)

3. `crates/pet/src/find.rs`

Option 1: Use `rayon` for parallel iteration

Option 3: Use `thread::scope` with chunking