Standalone CLI tool that warms Qumulo cluster caches by recursively walking a directory tree and
calling the fetch-data REST API for each file. No file data is transferred to the client; the
cluster reads data directly into server-side caches.
- Python 3.10+. No dependencies beyond the standard library.
- Qumulo Core 7.8.4+ for the fetch-data API.
python3 -m pip install --user --break-system-packages git+https://github.com/Qumulo/qfetch.git
--break-system-packages is safe here. qfetch has no dependencies, so there is
nothing to conflict with.
qfetch --host <host> --token <token> --path <dir> [options]
Or, from within the project directory:
python3 -m qfetch --host <host> --token <token> --path <dir>
Create a long-lived access token using the Qumulo CLI:
qq auth_create_access_token --self -f token.json
Then use --token-file token.json.
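For scripts that call the REST API themselves, a token file can be turned into an HTTP header along these lines. This is a minimal sketch: the `bearer_token` field name is an assumption about the file layout (check your own token.json), and `auth_header_from_file` is a hypothetical helper, not part of qfetch.

```python
import json


def auth_header_from_file(path: str, key: str = "bearer_token") -> dict[str, str]:
    """Build an Authorization header from a token JSON file.

    Assumption: the token is stored under `key`; pass a different key
    if your file uses another field name.
    """
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    return {"Authorization": f"Bearer {data[key]}"}
```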
| Flag | Default | Description |
|---|---|---|
| --host | (required) | Hostname, IP, comma-separated list, or IP range |
| --token | | API bearer token |
| --token-file | | Path to token JSON file (as created by qq auth_create_access_token -f) |
| --path | (required) | Directory path to fetch |
| --port | 8000 | REST API port |
| --walkers | 4 | Parallel directory walker threads |
| --workers | 8 | Parallel file fetch threads |
| --insecure | off | Disable SSL certificate verification |
| --max-bytes | unlimited | Max bytes to fetch per file (B, KB, MB, GB, TB suffixes) |
| --no-progress | off | Disable progress output on stderr |
One of --token or --token-file is required.
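As an illustration of how a --max-bytes value with a suffix might map to a byte count, here is a small sketch. It assumes binary multiples (1 KB = 1024 bytes); the README does not say whether qfetch uses binary or decimal units, and `parse_size` is a hypothetical name.

```python
import re

# Assumption: binary multiples. qfetch itself may use decimal (1 KB = 1000 B).
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text: str) -> int:
    """Convert a string such as '512MB' or '4096' into a byte count."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([KMGT]?B)?\s*", text, re.IGNORECASE)
    if match is None:
        raise ValueError(f"invalid size: {text!r}")
    number, unit = match.groups()
    # A bare number is treated as bytes.
    return int(float(number) * _UNITS[(unit or "B").upper()])
```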
Distribute connections across cluster nodes for higher throughput. Each thread gets a sticky connection to one node, assigned round-robin.
# Comma-separated
qfetch --host 10.0.0.1,10.0.0.2,10.0.0.3 ...
# IP range (expands last octet)
qfetch --host 10.0.0.1-10.0.0.4 ...
# Mixed
qfetch --host node1,10.0.0.1-10.0.0.3 ...
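The host syntax above could be expanded roughly as follows. This is a sketch of the described behavior, not qfetch's actual parser; `expand_hosts` and `assign_hosts` are hypothetical names, and rejecting ranges that cross the last octet is an assumption drawn from the "expands last octet" note.

```python
import ipaddress


def _is_ipv4(text: str) -> bool:
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False


def expand_hosts(spec: str) -> list[str]:
    """Expand 'a,b,10.0.0.1-10.0.0.3' into a flat list of hosts."""
    hosts: list[str] = []
    for part in (p.strip() for p in spec.split(",")):
        if not part:
            continue
        start, sep, end = part.partition("-")
        if sep and _is_ipv4(start) and _is_ipv4(end):
            first = int(ipaddress.IPv4Address(start))
            last = int(ipaddress.IPv4Address(end))
            # Assumption: only the last octet may vary within a range.
            if first >> 8 != last >> 8 or last < first:
                raise ValueError(f"unsupported range: {part!r}")
            hosts.extend(str(ipaddress.IPv4Address(n)) for n in range(first, last + 1))
        else:
            # Anything else (including hyphenated hostnames) is kept as-is.
            hosts.append(part)
    return hosts


def assign_hosts(hosts: list[str], n_threads: int) -> list[str]:
    """Give each thread index one sticky host, round-robin."""
    return [hosts[i % len(hosts)] for i in range(n_threads)]
```

With three hosts and five threads, `assign_hosts` hands the first two hosts a second thread each, so connections stay spread across nodes even when the thread count is not a multiple of the node count.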
$ qfetch \
--host 10.100.0.33-10.100.0.36 \
--token-file token.json \
--insecure \
--walkers 32 \
--workers 64 \
--path /data
Discovered: 3897 files | Fetched: 3897/3897 files | 3.7 GB fetched (1.5 GB/s)
{
"files_found": 3897,
"files_fetched": 3897,
"bytes_fetched": 4013286225,
"elapsed_seconds": 2.49,
"bytes_per_second": 1614567465
}
Progress is printed to stderr; the JSON summary goes to stdout.
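Because the two streams are separate, a wrapper script can capture the summary while leaving progress visible on the terminal. A minimal sketch, with `run_and_parse` as a hypothetical helper; `check=True` is a general precaution and assumes the command exits non-zero on failure, which this README does not state.

```python
import json
import subprocess


def run_and_parse(cmd: list[str]) -> dict:
    """Run a command, let stderr pass through, and parse stdout as JSON."""
    result = subprocess.run(cmd, stdout=subprocess.PIPE, text=True, check=True)
    return json.loads(result.stdout)
```

For example, `run_and_parse(["qfetch", "--host", host, "--token-file", "token.json", "--path", "/data"])` would return a dict with keys such as `files_found` and `bytes_fetched`, matching the sample output above.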
resolve_path(--path)
|
v
dir_queue (seeded with root)
|
v
N walker threads ──> list directory entries (paginated) ───┐
| |
├── subdirectories -> back into dir_queue |
└── files -> file_queue |
| |
v |
M fetcher threads ──> POST /fetch-data (loop) |
| |
v |
Progress counters <─────────────────────────┘
- Walkers expand the tree in parallel (BFS-like). A WalkCoordinator tracks in-flight directories and signals completion when all are enumerated.
- Fetchers drain the file queue and call the fetch-data API in a loop until each file is fully cached.
- Connections are persistent per thread (http.client.HTTPSConnection with thread-local storage) to avoid TLS handshake overhead.
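The walker/fetcher design described above can be sketched with two queues and a pending-directory counter. This is an illustration under assumptions, not qfetch's code: the in-memory `tree` dict stands in for paginated directory listings, `fetch` stands in for the fetch-data loop, and the internals of this `WalkCoordinator` are guessed from its description.

```python
import queue
import threading


class WalkCoordinator:
    """Counts directories that are queued or currently being listed."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending = 0
        self.done = threading.Event()

    def add(self, n: int = 1) -> None:
        with self._lock:
            self._pending += n

    def finish(self) -> None:
        with self._lock:
            self._pending -= 1
            if self._pending == 0:
                self.done.set()


def walk_and_fetch(tree, root, fetch, walkers=4, workers=8):
    """tree maps a directory to (subdirs, files); fetch(path) runs once per file."""
    dir_queue, file_queue = queue.Queue(), queue.Queue()
    coord = WalkCoordinator()
    coord.add()  # the root directory is in flight
    dir_queue.put(root)

    def walker():
        while (path := dir_queue.get()) is not None:
            subdirs, files = tree[path]
            # Register children before finishing the parent so the
            # pending count cannot reach zero while work remains.
            coord.add(len(subdirs))
            for d in subdirs:
                dir_queue.put(d)
            for f in files:
                file_queue.put(f)
            coord.finish()

    def fetcher():
        while (path := file_queue.get()) is not None:
            fetch(path)

    walker_threads = [threading.Thread(target=walker) for _ in range(walkers)]
    fetcher_threads = [threading.Thread(target=fetcher) for _ in range(workers)]
    for t in walker_threads + fetcher_threads:
        t.start()

    coord.done.wait()  # every directory has been enumerated
    for _ in walker_threads:
        dir_queue.put(None)  # sentinel: stop walkers
    for t in walker_threads:
        t.join()
    for _ in fetcher_threads:
        file_queue.put(None)  # sentinels land after all queued files
    for t in fetcher_threads:
        t.join()
```

The sentinels for the fetchers are enqueued only after the walk has completed, so the FIFO order of the file queue guarantees that every discovered file is fetched before any fetcher exits.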
From the project directory:
python3 -m unittest