Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Jan 21, 2026

  • Create a new LocatorCache<K, V> generic caching abstraction in pet-core
  • Add get(), get_or_insert_with(), insert(), contains_key(), clear(), values(), len(), is_empty(), and clone_map() methods
  • Create type aliases EnvironmentCache and ManagerCache for common uses
  • Refactor pet-conda to use the new cache abstraction
  • Refactor pet-linux-global-python to use the new cache abstraction
  • Run tests to verify the changes work correctly
  • Request code review
  • Run CodeQL security checker
  • Address PR review feedback:
    • Add insert_many() method for atomic batch inserts (addresses lock granularity concern)
    • Add #[must_use] attribute to get_or_insert_with()
    • Update pet-linux-global-python to use insert_many() for symlinks + executable

Summary

This PR introduces a generic LocatorCache<K, V> abstraction in the pet-core crate that consolidates the common caching patterns used across multiple locators. The cache provides:

  1. Thread-safe operations using RwLock
  2. Efficient read access with separate read/write locks
  3. Double-check locking in get_or_insert_with() to prevent redundant computation
  4. Atomic batch inserts via insert_many() for better lock efficiency
  5. Type aliases for common use cases (EnvironmentCache, ManagerCache)

Security Summary

No security vulnerabilities were discovered by CodeQL analysis.

Original prompt

This section details on the original issue you should resolve

<issue_title>Refactor: Consolidate caching patterns across locators into a shared abstraction</issue_title>
<issue_description>## Summary
Multiple locators implement nearly identical caching patterns for environments and managers. Consolidating these into a shared abstraction would reduce code duplication and ensure consistent behavior.

Current Duplicate Patterns

Pattern: Cache with mutex + contains_key check

Found in:

  • crates/pet-conda/src/lib.rs
  • crates/pet-poetry/src/lib.rs
  • crates/pet-windows-store/src/lib.rs
  • crates/pet-linux-global-python/src/lib.rs
// This pattern appears in multiple files:
pub struct SomeLocator {
    environments: Arc<Mutex<HashMap<PathBuf, PythonEnvironment>>>,
}

impl SomeLocator {
    fn find_with_cache(&self) -> Option<...> {
        let environments = self.environments.lock().unwrap();
        if environments.contains_key(&key) {
            return Some(environments.get(&key).unwrap().clone());
        }
        // ... compute and cache
    }
    
    fn clear(&self) {
        self.environments.lock().unwrap().clear();
    }
}

Proposed Solution

Create a generic caching wrapper

// In pet-core or a new pet-cache crate
pub struct LocatorCache<K, V> {
    cache: RwLock<HashMap<K, V>>,
}

impl<K: Eq + Hash, V: Clone> LocatorCache<K, V> {
    pub fn new() -> Self {
        Self { cache: RwLock::new(HashMap::new()) }
    }
    
    pub fn get(&self, key: &K) -> Option<V> {
        self.cache.read().unwrap().get(key).cloned()
    }
    
    pub fn get_or_insert_with<F>(&self, key: K, f: F) -> V
    where
        F: FnOnce() -> Option<V>,
    {
        // Check read lock first
        if let Some(v) = self.cache.read().unwrap().get(&key) {
            return v.clone();
        }
        // Upgrade to write lock and compute
        if let Some(value) = f() {
            self.cache.write().unwrap().insert(key, value.clone());
            value
        }
    }
    
    pub fn clear(&self) {
        self.cache.write().unwrap().clear();
    }
}

// Type aliases for common uses
pub type EnvironmentCache = LocatorCache<PathBuf, PythonEnvironment>;
pub type ManagerCache = LocatorCache<PathBuf, EnvManager>;

Usage in locators

pub struct Conda {
    pub environments: EnvironmentCache,
    pub managers: ManagerCache,
    // ...
}

impl Conda {
    fn try_from(&self, env: &PythonEnv) -> Option<PythonEnvironment> {
        self.environments.get_or_insert_with(path.clone(), || {
            // compute environment
        })
    }
}

Benefits

  1. Consistency: All locators behave the same way
  2. Less code: Remove duplicate boilerplate
  3. Centralized improvements: RwLock upgrade, better error handling, etc. benefit all locators
  4. Testability: Cache behavior can be tested once

Priority

Low - Requires significant refactoring but improves maintainability long-term.</issue_description>

Comments on the Issue (you are @copilot in this section)


💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

…ux-global-python

Co-authored-by: karthiknadig <3840081+karthiknadig@users.noreply.github.com>
Copilot AI changed the title [WIP] Refactor caching patterns across locators into a shared abstraction Refactor: Consolidate caching patterns into shared LocatorCache abstraction Jan 21, 2026
Copilot AI requested a review from karthiknadig January 21, 2026 01:56
@karthiknadig
Copy link
Member

PR Review: Refactor Caching Patterns into Shared LocatorCache Abstraction

cc @copilot


Overall: Generally good, with a few concerns worth discussing


✅ What I Like

  1. Well-designed LocatorCache abstraction - The generic cache with RwLock is cleaner than duplicate Mutex-based patterns scattered across locators.

  2. Correct double-check locking in get_or_insert_with - The implementation properly:

    • Checks with read lock first
    • Computes the value outside any lock (preventing holding the lock during expensive operations)
    • Re-checks after acquiring write lock before inserting (prevents race conditions)
  3. RwLock instead of Mutex - Better for read-heavy workloads since multiple readers can access concurrently.

  4. Good unit tests - The cache module includes test coverage for the key operations.

  5. Net code reduction - 226 additions vs 92 deletions for adding a reusable abstraction is reasonable.


⚠️ Concerns

1. Lock Granularity Change in pet-linux-global-python

The original code in find_and_report_global_pythons_in held one lock for multiple inserts:

let mut reported_executables = reported_executables.lock().unwrap();
for symlink in symlinks {
    reported_executables.insert(symlink.clone(), env.clone());
}
if let Some(exe) = env.executable.clone() {
    reported_executables.insert(exe, env.clone());
}

The new code acquires/releases the lock on each insert. This:

  • Is less efficient (more lock acquisitions)
  • Allows another thread to interleave between symlink inserts

This is likely fine in practice, but semantically different. Consider adding a batch insert_many method if this pattern is common.

2. clone_map() clones the entire HashMap

In get_info_for_telemetry, the code calls self.environments.clone_map(). For large caches, this is expensive. The original self.environments.lock().unwrap().clone() has the same cost, so this isn't a regression, but a convenience method like for_each() or using values() (which the PR does add and use elsewhere) might be more idiomatic.

3. conda_executable still uses RwLock manually in pet-conda

Minor inconsistency - the PR changes environments and managers to use RwLock via LocatorCache, but conda_executable: Arc<RwLock<Option<PathBuf>>> still uses RwLock directly. Consider if this should also be wrapped or if a simple option cache pattern is needed.

4. The type aliases aren't used

The PR defines:

pub type EnvironmentCache = LocatorCache<PathBuf, PythonEnvironment>;
pub type ManagerCache = LocatorCache<PathBuf, EnvManager>;

But pet-conda still uses LocatorCache<PathBuf, CondaManager> (which is different from EnvManager). These aliases are nice for documentation but won't unify the conda manager cache. Not a bug, just an observation.


🔍 Minor Nits

  1. The bin parameter in find_and_report_global_pythons_in changed from &str to &Path - a nice improvement but undocumented in the PR description.

  2. Consider #[must_use] on get_or_insert_with to prevent accidentally ignoring the return value.


Verdict

I'd lean toward approving with minor suggestions. The core abstraction is sound, tests are included, and it achieves the stated goal of consolidating duplicate patterns. The lock granularity change is worth being aware of but unlikely to cause issues in practice.

…st_use] to get_or_insert_with

Co-authored-by: karthiknadig <3840081+karthiknadig@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Jan 21, 2026

PR Review: Refactor Caching Patterns into Shared LocatorCache Abstraction

cc @copilot...

Addressed in c8f43b0:

  1. Lock granularity: Added insert_many() method and updated pet-linux-global-python to use it for atomic batch inserts of symlinks + executable.

  2. #[must_use]: Added to get_or_insert_with().

Re: other observations:

  • conda_executable remains RwLock<Option<PathBuf>> intentionally - it's a single configuration value, not a key-value cache
  • Type aliases are for documentation/future use - CondaManager differs from EnvManager by design

@karthiknadig
Copy link
Member

Thanks @copilot - the fixes look good!

  • insert_many() with single write lock ✅
  • #[must_use] on get_or_insert_with
  • Test coverage for batch insert ✅

The explanations for conda_executable and the type aliases make sense. LGTM 👍

@karthiknadig karthiknadig marked this pull request as ready for review January 21, 2026 02:36
@karthiknadig karthiknadig enabled auto-merge (squash) January 21, 2026 02:36
@vs-code-engineering vs-code-engineering bot added this to the January 2026 milestone Jan 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Refactor: Consolidate caching patterns across locators into a shared abstraction

2 participants