Fix discovery watcher ignoring frontend's --model-path#1

Open
Pernekhan wants to merge 4 commits into main from claude/fix-discovery-model-path-IkGWj

@Pernekhan
Collaborator

Summary

  • Fix ModelWatcher ignoring the frontend's --model-path when processing discovered workers in disaggregated serving setups
  • Pass the frontend's local model path into ModelWatcher and use it to re-point discovered worker card file references (config.json, tokenizer.json, etc.) to the frontend's local directory before download_config()
  • Since files exist locally after update_dir(), download_config() returns early via has_local_files() without attempting any HuggingFace download

Problem

When the dynamo frontend is deployed with --model-path pointing to a local tokenizer directory, the discovery watcher ignores this path entirely. It attempts to load model config/tokenizer from the path advertised by discovered workers — a local filesystem path that only exists on the worker nodes. This causes model registration to fail (/v1/models returns empty, inference returns 404).

Changes

  1. lib/llm/src/discovery/watcher.rs — Added local_model_path: Option<PathBuf> field to ModelWatcher. In do_worker_set_registration(), calls card.update_dir() with the frontend's local path before download_config().
  2. lib/llm/src/model_card.rs — Made update_dir() public so the watcher can call it.
  3. lib/llm/src/entrypoint/input/common.rs — Passes LocalModel::path() to ModelWatcher::new().
  4. lib/llm/src/entrypoint/input/http.rs — Same plumbing through run_watcher().
  5. lib/llm/src/entrypoint/input/grpc.rs — Same plumbing through run_watcher().
  6. lib/llm/tests/http_metrics.rs — Updated test call sites with None for the new parameter.
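
A minimal sketch of the plumbing described in the changes above, assuming simplified stand-in types. `CardSketch` and `WatcherSketch` are hypothetical and are not the real model card or `ModelWatcher`; only the names `local_model_path` and `update_dir` come from this PR, and the bodies here are assumptions about their behavior, not the actual implementation.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for a discovered worker's model card.
/// It only tracks the file references the frontend needs.
#[derive(Debug, Clone)]
pub struct CardSketch {
    pub files: Vec<PathBuf>,
}

impl CardSketch {
    /// Assumed behavior of `update_dir`: keep each file name, but
    /// re-point it to `dir` (the frontend's local directory).
    pub fn update_dir(&mut self, dir: &Path) {
        for file in &mut self.files {
            // Compute the new path first so the borrow of `file` ends
            // before we assign to it.
            let repointed = file.file_name().map(|name| dir.join(name));
            if let Some(path) = repointed {
                *file = path;
            }
        }
    }
}

/// Hypothetical stand-in for the watcher, carrying the optional
/// frontend path passed in at construction time.
pub struct WatcherSketch {
    local_model_path: Option<PathBuf>,
}

impl WatcherSketch {
    pub fn new(local_model_path: Option<PathBuf>) -> Self {
        Self { local_model_path }
    }

    /// Re-point the card only when the frontend supplied a path;
    /// with `None` the worker-advertised paths are left untouched.
    pub fn repoint_card(&self, card: &mut CardSketch) {
        if let Some(dir) = &self.local_model_path {
            card.update_dir(dir);
        }
    }
}
```

Keeping the field an `Option<PathBuf>` means call sites that have no frontend path (such as the updated tests passing `None`) keep the previous behavior.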

Test plan

  • Deploy disaggregated prefill + decode workers with local model weights
  • Deploy frontend with --model-path pointing to local tokenizer directory and --discovery-backend=kubernetes
  • Verify /v1/models returns the discovered model
  • Verify /v1/chat/completions returns successful inference results
  • Verify no HuggingFace download attempts in frontend logs when local files are present
  • Verify existing single-node deployments (no --model-path on frontend) still work via HF download fallback

https://claude.ai/code/session_01FDtNTJPnuwHymY56WGNUg4

…enizer

When the dynamo frontend is deployed with --model-path pointing to a local
tokenizer directory, the discovery watcher was ignoring this path entirely.
Instead, it attempted to load model config/tokenizer from the path advertised
by discovered workers, which is a local filesystem path that only exists on
the worker nodes. This caused model registration to fail in disaggregated
serving setups.

The fix passes the frontend's local model path into ModelWatcher and uses it
to re-point the discovered worker's card file references (config.json,
tokenizer.json, etc.) to the frontend's local directory before attempting
download_config(). Since the files exist locally, download_config() returns
early via has_local_files() without attempting any HuggingFace download.

https://claude.ai/code/session_01FDtNTJPnuwHymY56WGNUg4
@Pernekhan force-pushed the claude/fix-discovery-model-path-IkGWj branch 2 times, most recently from a129347 to 4232f93 on March 24, 2026 18:42
Extract `prepare_card_for_download` from `do_worker_set_registration` so the
watcher's card preparation logic can be unit-tested independently.

The key test `test_prepare_card_with_local_path_succeeds` exercises the actual
watcher code path and FAILS without the fix — verified by removing the
`update_dir` call from `prepare_card_for_download`.

https://claude.ai/code/session_01FDtNTJPnuwHymY56WGNUg4
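
For illustration, the extracted helper and the early-return behavior could look roughly like the sketch below. `CardWithFiles` is a hypothetical stand-in type, the signature of `prepare_card_for_download` is an assumption (the real one is not shown in this PR description), and `download_config` here returns an error instead of performing any real download.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for the model card; not the real type.
#[derive(Debug, Clone)]
pub struct CardWithFiles {
    pub files: Vec<PathBuf>,
}

impl CardWithFiles {
    /// Assumed behavior: re-point every file reference to `dir`.
    pub fn update_dir(&mut self, dir: &Path) {
        for file in &mut self.files {
            let repointed = file.file_name().map(|name| dir.join(name));
            if let Some(path) = repointed {
                *file = path;
            }
        }
    }

    /// True when every referenced file exists on this machine.
    pub fn has_local_files(&self) -> bool {
        !self.files.is_empty() && self.files.iter().all(|f| f.is_file())
    }

    /// Returns early when the files are already local. This sketch
    /// does no networking, so the fallback path is modeled as an error.
    pub fn download_config(&self) -> io::Result<()> {
        if self.has_local_files() {
            return Ok(());
        }
        Err(io::Error::new(
            io::ErrorKind::NotFound,
            "files not local; a real implementation would download here",
        ))
    }
}

/// Assumed shape of the extracted helper: apply the frontend's local
/// path (if any) before the caller attempts `download_config()`.
pub fn prepare_card_for_download(card: &mut CardWithFiles, local_model_path: Option<&Path>) {
    if let Some(dir) = local_model_path {
        card.update_dir(dir);
    }
}
```

A unit test in this style creates a temporary directory containing `config.json` and `tokenizer.json`, builds a card whose paths point at a directory that does not exist locally, and checks that `download_config()` succeeds only when the local path is passed to the helper.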
@Pernekhan force-pushed the claude/fix-discovery-model-path-IkGWj branch from 4232f93 to b7d4a74 on March 24, 2026 23:09