fix(install): skip cache for unpinned virtual deps when on-disk hash matches lockfile (closes #1548)#1553
Open
danielmeppiel wants to merge 1 commit into
Open
Conversation
…matches lockfile When a git/virtual-file dependency was recorded in the lockfile without a resolved_commit (e.g. ADO partial-clone fallback could not pin a SHA, or a virtual-file dep was carved out without a commit anchor), the second install bypassed the content-hash cache-skip and re-downloaded the package every time. Re-clones of floating refs can yield a slightly different fresh hash, which then trips the supply-chain mismatch check in sources.py with a false-positive 'supply-chain attack' alert. Add cache-skip parity with the existing resolved_commit and registry branches: when the lockfile recorded a content_hash for an unpinned dep and the on-disk content still hashes to that value, skip re-download. Supply-chain detection is preserved -- the new branch only fires when on-disk content is verified unchanged; any real divergence falls through to the fresh-download flow where the existing mismatch check still runs. closes #1548 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor
There was a problem hiding this comment.
Pull request overview
Fixes a false-positive supply-chain Content hash mismatch failure on the second apm install run for unpinned git/virtual-file dependencies by allowing cache reuse when the on-disk content still matches the lockfile-recorded content_hash.
Changes:
- Extend
_resolve_download_strategycache-skip logic to handle lockfile entries that have acontent_hashbut noresolved_commit(and no ref drift / update mode). - Add unit tests covering the new skip/no-skip behavior for unpinned git dependencies based on
verify_package_hash. - Add a changelog entry describing the fix.
Show a summary per file
| File | Description |
|---|---|
src/apm_cli/install/phases/integrate.py |
Adds a cache-skip branch for unpinned deps when on-disk content matches the lockfile content_hash. |
tests/unit/install/test_integrate_phase3w5.py |
Adds regression tests for the new unpinned-dependency cache-skip behavior (positive + negative + guard). |
CHANGELOG.md |
Documents the fix under Unreleased. |
Copilot's findings
- Files reviewed: 3/3 changed files
- Comments generated: 1
|
|
||
| ### Fixed | ||
|
|
||
| - `apm install` no longer fails the second run with a false-positive `Content hash mismatch ... supply-chain attack` error for unpinned git/virtual-file dependencies whose lockfile entry could not record a `resolved_commit` (e.g. ADO partial-clone fallback paths). When the on-disk content still hashes to the lockfile-recorded value, the redundant re-download is now skipped, mirroring the existing cache-skip parity with pinned and registry deps. Real content divergence still falls through to the supply-chain verification path. (closes #1548) -- by @OskarKlintrot |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes #1548.
apm installwas failing on the second run with a false-positiveContent hash mismatch ... supply-chain attackerror for unpinned git/virtual-file dependencies. This PR adds the missing cache-skip branch so the redundant re-download (and the false alert it triggered) goes away, without weakening supply-chain detection.Problem
When a git or virtual-file dep was recorded in
apm.lock.yamlwithout aresolved_commit-- typically because the underlying transport fell back from partial-clone to full bare-clone (observed on Azure DevOps repos whose server does not support partial-clone filter v2) and APM had no SHA to pin against the floating ref -- the cache-skip logic in_resolve_download_strategy(src/apm_cli/install/phases/integrate.py:113-160) had no branch covering the case. The two existing branches required either:locked_dep.resolved_committo be set and not"cached", orlocked_dep.source == "registry".Unpinned git deps fell through both,
lockfile_matchstayedFalse, and the package was re-downloaded on every install. A re-clone of a floating ref can produce a fresh hash that does not byte-match the lockfile-recordedcontent_hash, so the supply-chain check atsrc/apm_cli/install/sources.py:766-787thensys.exit(1)with "supply-chain attack" wording -- the false positive reported in #1548.Fix
Add cache-skip parity for the unpinned class. In
_resolve_download_strategy, after the existing registry branch, add aneliffor(locked_dep, locked_dep.content_hash, not ref_changed, not update_refs). When_should_skip_redownloadconfirms the on-disk content still hashes to the lockfile-recorded value, setlockfile_match = True(with the_via_content_hash_onlyflag for the existing self-heal pipeline). The package is intact, so re-downloading it is wasted work and the source of the false positive.This matches Option 1 from the triage brief. Option 2 (record a
resolved_commiteven on partial-clone-fallback paths) is more architecturally correct but ripples through the lockfile schema and self-heal chain; this PR keeps the change isolated and reversible.Supply-chain boundary
The supply-chain check is preserved. The new branch only fires when
verify_package_hash(install_path, locked_dep.content_hash)returnsTrue-- i.e. on-disk bytes already hash to the lockfile-recorded value. We never suppress a fresh download that would otherwise have produced a different hash than the lockfile; we only suppress a redundant download whose result we have already verified against the lockfile via the on-disk content. Real divergence (file actually tampered) still falls through to the fresh-download path wheresources.pyruns the existing mismatch check.Tests
tests/unit/install/test_integrate_phase3w5.py-- three new cases onTestResolveDownloadStrategy:test_unpinned_git_dep_skips_when_on_disk_hash_matches-- positive: withresolved_commit=None,content_hash="deadbeef",source="git"andverify_package_hash -> True, assertsskip_download is True. Fails at HEAD without the fix; passes after. Verified the mutation-break gate bygit stashof the integrate.py change: this test fails, confirming the fix is the cause of the green.test_unpinned_git_dep_redownloads_when_on_disk_hash_differs-- negative gate: same setup,verify_package_hash -> False, assertsskip_download is False. Proves a real on-disk mismatch is NOT masked by the new branch.test_unpinned_git_dep_no_skip_when_lockfile_lacks_content_hash-- guard: when neitherresolved_commitnorcontent_hashare present there is no integrity anchor, so the branch must not fire.Full suite:
1638 passedacrosstests/unit/install/,tests/unit/test_install_command.py,tests/unit/test_drift_registry.py,tests/unit/test_drift_detection.py.How to test locally
apm.ymlwith a git dep on a repo whose server cannot serve a partial-clone (or any virtual-file dep that lands without aresolved_commit).apm installonce -- this succeeds and writes a lockfile entry withcontent_hashset andresolved_commitempty.apm installa second time. Before this PR:[x] Content hash mismatch ... supply-chain attack. After this PR: install succeeds, the cached package is reused, no re-download, no false alert.Validation evidence
uv run --extra dev ruff check src/ tests/-- silent.uv run --extra dev ruff format --check src/ tests/-- silent.uv run --extra dev python -m pylint --disable=all --enable=R0801 --min-similarity-lines=10 --fail-on=R0801 src/apm_cli/-- 10.00/10.bash scripts/lint-auth-signals.sh-- clean.test_integrate_phase3w5.py; broader: 1638 / 1638 across install + drift modules.ASCII-only.
Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com