39 changes: 38 additions & 1 deletion README.md
@@ -278,12 +278,49 @@ This is most useful when using sccache for Rust compilation, as rustc supports u

---

Normalizing Paths with `SCCACHE_BASEDIRS`
-----------------------------------------

By default, sccache requires absolute paths to match for cache hits. To enable cache sharing across different build directories, you can set `SCCACHE_BASEDIRS` to strip a base directory from paths before hashing:

```bash
export SCCACHE_BASEDIRS=/home/user/project
```

You can also specify multiple base directories by separating them with `|` (pipe character). When multiple directories are provided, the longest matching prefix is used:

```bash
export SCCACHE_BASEDIRS="/home/user/project|/home/user/workspace"
```

Path matching is **case-insensitive** on Windows and **case-sensitive** on other operating systems.

This is similar to ccache's `CCACHE_BASEDIR` and helps when:
* Building the same project from different directories
* Sharing cache between CI jobs with different checkout paths
* Multiple developers building under home directories that differ by username
* Working with multiple project checkouts simultaneously

**Note:** Only absolute paths are supported. Relative paths will be ignored with a warning.

You can also configure this in the sccache config file:

```toml
# Single directory
basedirs = ["/home/user/project"]

# Or multiple directories
basedirs = ["/home/user/project", "/home/user/workspace"]
```
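
The normalization described above can be sketched in a few lines. This is a minimal illustration, not sccache's actual implementation: `strip_basedir` is a hypothetical helper, it compares whole path components case-sensitively (so it omits the case-insensitive matching described for Windows), and the `./` prefix follows the example given in `docs/Configuration.md`.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not an sccache API): rewrite `path` relative to the
/// longest matching entry in `basedirs`, or return it unchanged if none match.
fn strip_basedir(path: &Path, basedirs: &[PathBuf]) -> PathBuf {
    // `Path::starts_with` compares whole components, so "/home/user/project"
    // is not treated as a prefix of "/home/user/project2".
    let best = basedirs
        .iter()
        .filter(|base| path.starts_with(base))
        .max_by_key(|base| base.components().count());

    match best {
        Some(base) => match path.strip_prefix(base) {
            Ok(rest) => Path::new(".").join(rest),
            Err(_) => path.to_path_buf(),
        },
        None => path.to_path_buf(),
    }
}
```

With `basedirs` set to `/home/user` and `/home/user/project`, the path `/home/user/project/src/main.c` is rewritten against the longer entry, while a path outside every base directory is left as-is.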

---

Known Caveats
-------------

### General

* Absolute paths to files must match to get a cache hit. This means that even if you are using a shared cache, everyone will have to build at the same absolute path (i.e. not in `$HOME`) in order to benefit each other. In Rust this includes the source for third party crates which are stored in `$HOME/.cargo/registry/cache` by default.
* By default, absolute paths to files must match to get a cache hit. To work around this, use `SCCACHE_BASEDIRS` (see above) to normalize paths before hashing.

### Rust

14 changes: 14 additions & 0 deletions docs/Configuration.md
@@ -6,6 +6,19 @@
# If specified, wait this long for the server to start up.
server_startup_timeout_ms = 10000

# Base directory (or directories) to strip from paths for cache key computation.
# Similar to ccache's CCACHE_BASEDIR. This enables cache hits across
# different absolute paths when compiling the same source code.
# Can be an array of paths. When multiple paths are provided,
# the longest matching prefix is used.
# Path matching is case-insensitive on Windows and case-sensitive on other OSes.
# For example, if basedir is "/home/user/project", then paths like
# "/home/user/project/src/main.c" will be normalized to "./src/main.c"
# for caching purposes.
basedirs = ["/home/user/project"]
# Or multiple directories:
# basedirs = ["/home/user/project", "/home/user/workspace"]
Comment on lines +9 to +20

Bit of a rephrasing suggestion:

Suggested change
# Base directories to strip from source paths during cache key
# computation.
#
# Similar to ccache's CCACHE_BASEDIR, but supports multiple paths.
#
# 'basedirs' enables cache hits across different absolute root
# paths when compiling the same source code, such as between
# parallel checkouts of the same project, Git worktrees, or different
# users in a shared environment.
# When multiple paths are provided, the longest matching prefix
# is applied.
#
# Path matching is case-insensitive on Windows and case-sensitive on other OSes.
#
# Example:
# basedirs = ["/home/user/project"] results in the path prefix rewrite:
# "/home/user/project/src/main.c" -> "./src/main.c"
basedirs = ["/home/user/project"]
# basedirs = ["/home/user/project", "/home/user/workspace"]

I'd not say "can be an array of paths" if the "one path" example already is an array. What happens if you do basedirs = "/home/foo/bar"? (I hope it's not the dreaded Python-like behaviour that it starts chewing away an array of characters one by one…)


[dist]
# where to find the scheduler
scheduler_url = "http://1.2.3.4:10600"
@@ -134,6 +147,7 @@ Note that some env variables may need sccache server restart to take effect.

* `SCCACHE_ALLOW_CORE_DUMPS` to enable core dumps by the server
* `SCCACHE_CONF` configuration file path
* `SCCACHE_BASEDIRS` base directory (or directories) to strip from paths for cache key computation. Similar to ccache's `CCACHE_BASEDIR`, it enables cache hits across different absolute paths when compiling the same source code. Separate multiple directories with `|` (pipe character); the longest matching prefix is used. Path matching is **case-insensitive** on Windows and **case-sensitive** on other operating systems. The environment variable takes precedence over the config file. Only absolute paths are supported; relative paths are ignored with a warning.

Usually when you provide multiple directories or file paths in environment variables (such as PATH, LD_LIBRARY_PATH, LD_PRELOAD, etc.), the convention is to use ; as a separator. Why was | chosen here instead?

Contributor

nit: Linux way is to use : as a PATH delimiter, but Windows uses : after the drive letter

nit: Linux way is to use : as a PATH delimiter

Ah, yes. Of course I mixed it up with CMake, which generally uses ; as an array separator…

Author

I chose | because both , and : can be part of a valid path. I didn't think of other options, but testing it blew my mind:

> ls -ld test*dir
drwxr-xr-x 2 felixoid felixoid 6 Dec 24 14:13  test:dir
drwxr-xr-x 2 felixoid felixoid 6 Dec 24 14:12 'test;dir'
drwxr-xr-x 2 felixoid felixoid 6 Dec 24 14:13 'test|dir'

The colon doesn't look like an option to me because of the c:/ format. A comma is too familiar.

I am open to suggestions. Maybe ; is a decent compromise.
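
To make the trade-off in this thread concrete, here is a minimal sketch of separator-based parsing. `parse_basedirs` and `parse_platform` are hypothetical names, not functions from this PR. The first splits on an arbitrary separator character. The second uses the standard library's `std::env::split_paths`, which follows the platform's `PATH` convention (`:` on Unix, `;` on Windows).

```rust
use std::path::PathBuf;

/// Hypothetical parser: split a raw `SCCACHE_BASEDIRS`-style value on `sep`,
/// skipping empty segments (for example from a trailing separator).
fn parse_basedirs(raw: &str, sep: char) -> Vec<PathBuf> {
    raw.split(sep)
        .filter(|segment| !segment.is_empty())
        .map(PathBuf::from)
        .collect()
}

/// Hypothetical alternative: reuse the platform's PATH separator via the
/// standard library instead of choosing a custom one.
fn parse_platform(raw: &str) -> Vec<PathBuf> {
    std::env::split_paths(raw).collect()
}
```

Splitting `C:/src` on `:` yields `C` and `/src`, which is the drive-letter problem raised above. Whichever separator is chosen, a directory whose name contains that character cannot be listed, as the `ls` output shows for `:`, `;`, and `|` alike.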

* `SCCACHE_CACHED_CONF`
* `SCCACHE_IDLE_TIMEOUT` how long the local daemon process waits for more client requests before exiting, in seconds. Set to `0` to run sccache permanently
* `SCCACHE_STARTUP_NOTIFY` specify a path to a socket which will be used for server completion notification
27 changes: 27 additions & 0 deletions src/cache/cache.rs
@@ -381,6 +381,10 @@ pub trait Storage: Send + Sync {
// Enable by default, only in local mode
PreprocessorCacheModeConfig::default()
}
/// Return the base directories for path normalization if configured
fn basedirs(&self) -> &[PathBuf] {
&[]
}
/// Return the preprocessor cache entry for a given preprocessor key,
/// if it exists.
/// Only applicable when using preprocessor cache mode.
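
The default-method pattern in this hunk can be shown in isolation. This is a minimal sketch, assuming a heavily simplified stand-in for the `Storage` trait (the real trait has many more methods and `Send + Sync` bounds); `NoBasedirs` and `WithBasedirs` are hypothetical types used only for illustration.

```rust
use std::path::PathBuf;

/// Simplified stand-in for the trait in the diff: backends that do not
/// override `basedirs` report an empty slice.
trait Storage {
    fn basedirs(&self) -> &[PathBuf] {
        &[]
    }
}

/// Hypothetical backend that relies on the default (no normalization).
struct NoBasedirs;

impl Storage for NoBasedirs {}

/// Hypothetical backend that stores its own list, as `DiskCache` does in this PR.
struct WithBasedirs {
    basedirs: Vec<PathBuf>,
}

impl Storage for WithBasedirs {
    fn basedirs(&self) -> &[PathBuf] {
        &self.basedirs
    }
}
```

Under this design only backends that override the method take part in path normalization. In the hunks shown here, `DiskCache` is the only one that does.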
@@ -736,12 +740,35 @@ pub fn storage_from_config(
let preprocessor_cache_mode_config = config.fallback_cache.preprocessor_cache_mode;
let rw_mode = config.fallback_cache.rw_mode.into();
debug!("Init disk cache with dir {:?}, size {}", dir, size);

// Keep only absolute basedirs; relative ones are dropped with a warning
let basedirs: Vec<PathBuf> = config
.basedirs
.iter()
.filter_map(|p| {
if p.is_absolute() {
Some(p.clone())
} else {
warn!(
"Ignoring relative basedir path: {:?}. Only absolute paths are supported.",
p
);
Comment on lines +752 to +755

If a relative basedir is an invalid configuration, wouldn't it make more sense to error here and get the dev/server admin to fix their config?

Author

By "error," you mean "fail to start"? That could be an option as well; I didn't think of it as a misconfiguration.

Looks OK to me to hard fail here.

None
}
})
.collect();

if !basedirs.is_empty() {
debug!("Using basedirs for path normalization: {:?}", basedirs);
}

Ok(Arc::new(DiskCache::new(
dir,
size,
pool,
preprocessor_cache_mode_config,
rw_mode,
basedirs,
)))
}
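
The hard-fail alternative raised in the review thread on this hunk could look roughly like the following. It is a sketch under assumptions, not code from the PR: `validate_basedirs` is a hypothetical function, and the `String` error type stands in for whatever error type the real function would return.

```rust
use std::path::PathBuf;

/// Hypothetical stricter variant: reject the whole configuration if any
/// basedir is relative, instead of dropping it with a warning.
fn validate_basedirs(basedirs: &[PathBuf]) -> Result<Vec<PathBuf>, String> {
    match basedirs.iter().find(|p| !p.is_absolute()) {
        Some(bad) => Err(format!(
            "basedir {:?} is relative; only absolute paths are supported",
            bad
        )),
        None => Ok(basedirs.to_vec()),
    }
}
```

Compared with the `filter_map` in the diff, this reports the misconfiguration at startup rather than silently running with a reduced set of base directories.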

6 changes: 6 additions & 0 deletions src/cache/disk.rs
@@ -74,6 +74,7 @@ pub struct DiskCache {
preprocessor_cache_mode_config: PreprocessorCacheModeConfig,
preprocessor_cache: Arc<Mutex<LazyDiskCache>>,
rw_mode: CacheMode,
basedirs: Vec<PathBuf>,
}

impl DiskCache {
@@ -84,6 +85,7 @@ impl DiskCache {
pool: &tokio::runtime::Handle,
preprocessor_cache_mode_config: PreprocessorCacheModeConfig,
rw_mode: CacheMode,
basedirs: Vec<PathBuf>,
) -> DiskCache {
DiskCache {
lru: Arc::new(Mutex::new(LazyDiskCache::Uninit {
@@ -99,6 +101,7 @@ impl DiskCache {
max_size,
})),
rw_mode,
basedirs,
}
}
}
@@ -181,6 +184,9 @@ impl Storage for DiskCache {
fn preprocessor_cache_mode_config(&self) -> PreprocessorCacheModeConfig {
self.preprocessor_cache_mode_config
}
fn basedirs(&self) -> &[PathBuf] {
&self.basedirs
}
async fn get_preprocessor_cache_entry(&self, key: &str) -> Result<Option<Box<dyn ReadSeek>>> {
let key = normalize_key(key);
Ok(self