fix: respect HF_HUB_OFFLINE in download_model to avoid network calls #614
Actionable comments posted: 1
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@fastembed/common/model_management.py`:
- Around line 398-400: Update the HF_HUB_OFFLINE check to treat common truthy
variants (e.g., "1", "true", "yes", "on") case-insensitively instead of only
"1": read env = os.environ.get("HF_HUB_OFFLINE", "0").lower() and if env in
{"1","true","yes","on"} and not local_files_only set local_files_only = True and
kwargs["local_files_only"] = True (this change should be applied where
local_files_only and kwargs["local_files_only"] are currently set).
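The check suggested in this comment could look roughly like the sketch below. It is an illustration only, not the code merged in this PR: the helper name `is_hf_offline` and its `environ` parameter are hypothetical and exist just to make the logic easy to test in isolation.

```python
import os
from typing import Mapping, Optional

# Truthy spellings named in the review comment, compared case-insensitively.
_TRUTHY = {"1", "true", "yes", "on"}


def is_hf_offline(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when HF_HUB_OFFLINE holds a truthy value.

    Hypothetical helper for illustration; `environ` defaults to os.environ
    and can be replaced with a plain dict in tests.
    """
    env = os.environ if environ is None else environ
    return env.get("HF_HUB_OFFLINE", "0").lower() in _TRUTHY
```

A caller would then set both `local_files_only` and `kwargs["local_files_only"]` to `True` when this returns `True` and `local_files_only` is not already set.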
When HF_HUB_OFFLINE is set to a truthy value (1, true, yes, on), download_model() should behave as if local_files_only=True so that no network calls are made.

Currently, even with the local-cache-first pass (which may fail due to missing metadata), the retry loop still calls download_files_from_huggingface() without local_files_only, which triggers model_info(), a network API call that fails immediately in offline mode. This causes an unnecessary fallback to a GCS download from storage.googleapis.com.

By setting local_files_only=True when HF_HUB_OFFLINE is enabled:
1. The HF local cache pass works if the model is cached
2. The retry loop skips the network-dependent HF path entirely
3. retrieve_model_gcs() only checks for local fast-* directories
4. No network calls are attempted at all

The truthy value check aligns with huggingface_hub's own parsing of HF_HUB_OFFLINE, which accepts "1", "true", "yes", "on" (case-insensitive).

This is critical for air-gapped / restricted environments where both HuggingFace and Google Cloud Storage are unreachable.

Made-with: Cursor
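The control flow described in this commit message can be sketched as follows. This is a simplified model under stated assumptions, not fastembed's actual `download_model()`: the function name `download_model_sketch` and the injected callables `load_from_hf` and `find_local_gcs_dir` are hypothetical stand-ins for the HuggingFace and GCS code paths, and the GCS network download that the real code may attempt is left out.

```python
from pathlib import Path
from typing import Any, Callable, Optional


def download_model_sketch(
    hf_source: Optional[str],
    offline: bool,
    load_from_hf: Callable[..., str],
    find_local_gcs_dir: Callable[[], Optional[str]],
    **kwargs: Any,
) -> Path:
    """Hypothetical, simplified model of the lookup order in the commit message."""
    local_files_only = bool(kwargs.get("local_files_only", False))

    # Offline mode forces local-only behaviour for every later step.
    if offline and not local_files_only:
        local_files_only = True
        kwargs["local_files_only"] = True

    # Pass 1: try the HF local cache; any failure falls through.
    if hf_source:
        try:
            return Path(load_from_hf(hf_source, **{**kwargs, "local_files_only": True}))
        except Exception:
            pass

    # Pass 2: network-dependent HF path, skipped when local_files_only is set.
    if hf_source and not local_files_only:
        return Path(load_from_hf(hf_source, **kwargs))

    # Pass 3: only look for an already extracted local directory.
    # (The real code may also download from GCS when the network is allowed.)
    local_dir = find_local_gcs_dir()
    if local_dir is not None:
        return Path(local_dir)
    raise ValueError("model could not be located")
```

With `offline=True`, the stand-in for the HF loader is called exactly once, in local-only mode, and the function never reaches the network-dependent branch.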
51f5789 to 5b9e072
Hey @amasolov, thanks for pointing it out and creating a fix!
Though, @amasolov, are you sure you were using the latest version of fastembed? In the latest version there is this code:

    if hf_source:
        try:
            cache_kwargs = deepcopy(kwargs)
            cache_kwargs["local_files_only"] = True
            return Path(
                cls.download_files_from_huggingface(
                    hf_source,
                    cache_dir=cache_dir,
                    extra_patterns=extra_patterns,
                    **cache_kwargs,
                )
            )
        except Exception:
            pass
        finally:
            enable_progress_bars()

It tries to read a model from disk if it exists and, if it does not, falls back to the normal download process.

I tried running this simple snippet twice, once with an internet connection available (to download the model) and then completely without one, and it ran successfully:

    from fastembed import TextEmbedding

    te = TextEmbedding(cache_dir='./offline_models')
    print(next(te.embed('qwerty')))
Nevertheless, I still find it a good thing to add; it is available as of fastembed 0.8.0.
Fixes #615
Related: #565
Summary
When `HF_HUB_OFFLINE` is set to a truthy value (`1`, `true`, `yes`, `on`), `download_model()` should not attempt any network calls. Currently, even though there is a `local_files_only=True` first pass, if it fails (e.g. missing metadata file), the retry loop still calls `download_files_from_huggingface()` without `local_files_only`, which triggers `model_info()`, a network API call that immediately raises `EnvironmentError` in offline mode. This causes an unnecessary fallback to GCS, downloading ~83MB from `storage.googleapis.com` on every startup.

In air-gapped / restricted environments where both HuggingFace and Google Cloud Storage are unreachable, this means fastembed cannot load models that are already present in the local cache.

Fix
Set `local_files_only=True` at the top of `download_model()` when `HF_HUB_OFFLINE` is set to a truthy value. This ensures:
1. The HF local cache pass, `snapshot_download(..., local_files_only=True)`, works if the model is cached
2. `retries` is set to 1 (no unnecessary retries)
3. The retry loop skips the network-dependent HF path (`hf_source and not local_files_only` is `False`)
4. `retrieve_model_gcs()` only checks for local `fast-*` directories without downloading

The truthy value check (`1`, `TRUE`, `YES`, `ON`, case-insensitive) aligns with `huggingface_hub`'s own parsing of `HF_HUB_OFFLINE`.

Context
This was discovered while deploying NeMo Guardrails on Red Hat OpenShift AI in corporate air-gapped environments. With `HF_HUB_OFFLINE=1`, fastembed's `all-MiniLM-L6-v2` model (pre-cached in the container image during build) could not be loaded: the HF path raised an offline error, and the GCS fallback tried to download from `storage.googleapis.com`, which was also blocked.

All Submissions:
New Feature Submissions:
`pre-commit` with `pip3 install pre-commit` and set up hooks with `pre-commit install`?