-
Notifications
You must be signed in to change notification settings - Fork 66
[sha512] image/copy: allow users to force digest algorithm in CLI (e.g. skopeo copy --force-digest)
#552
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
✅ A new PR has been created in buildah to vendor these changes: containers/buildah#6597 |
mtrmac
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this is valuable.
On the process side, I’m not quite sure how to go about building this — I’d like to avoid exposing and API that plain doesn’t work for some transports. I think I’d prefer:
- Building this ~one complete feature at a time, e.g. if we add
PutBlobOptions.Digests, then actually implement it in all transports immediately. We should never be failing to enforceForceDigest…and reporting success. (In many cases, aSituation{CannotChangeDigestReason: "a sha256-only transport not updated yet", digest: sha256Value}in a transport would be fine, that’s what the tracker is for, as long as the enforcement of the rules is all there immediately.) - Not providing the public API until all pieces are in place. If we need something for testing, a temporary
BrokenSetForceDigestAlgorithm /* Unstable API, may be changed or removed any time */?
Without a deep review or revisiting earlier PRs, I think missing pieces here are
- multi-platform images: should not be too difficult, but is fairly orthogonal
TryReusingBlob: this +BlobInfoCachewill be significant, it has ~the same prerequisite question “what layer reuse is valid cross-digest” as c/storage — and that’s a major reason why I’d prefer to build this a feature at a time, so that we can get plainPutBlobor configs done without blocking on this- The
PutBlobPartialcode path (plausibly should be just a failure becausepull --force-digest=makes little sense, and would be difficult to implement, I’m not sure) - DiffID computation (maybe keep sha256 and fail on schema1??)
ZstdChunkedcompression using sha512 digests for all its data
image/copy/single.go
Outdated
| if destInfo.Digest != srcInfo.Digest { | ||
| return fmt.Errorf("Internal error: copying uncompressed config blob %s changed digest to %s", srcInfo.Digest, destInfo.Digest) | ||
| // Allow digest algorithm changes when forcing a specific digest algorithm | ||
| forcingDifferentAlgo := ic.c.options.digestOptions.MustUseSet() != "" && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- See the similar check
copyBlobFromStream - “MustUse” Is not the only reason a copy might choose a different algorithm (e.g. we might be pushing to a sha256-only-transport or registry), hard-coding a check only for
MustUsehere is overly specific - Compare the end of
copyLayers, where the equality check is immediately paired with manifest edits [and see elsewhere about manifest edits], so that we always set up edits if we know they are needed, and we minimize the risk of the two getting out of sync. I’m not sure whether the check+update should live incopyConfigorcopyUpdatedConfigAndManifest(guessing the latter), but they should be together.
image/copy/single.go
Outdated
| } | ||
|
|
||
| ic.c.Printf("Writing manifest to image destination\n") | ||
| manifestDigest, err := manifest.Digest(man) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(This is another part that will need updating, for multi-arch images.)
|
FWIW containers/buildah#6597 hits |
I think #512 has a fix, but I’m not going to stay around today to wait for containers/buildah#6552 to confirm. |
I can go ahead with this. Let me know if you'd much rather the other preference though. |
|
I didn’t mean the two bullet points as alternatives, but as two parts of the same plan. My main concern here is having some way of tracking what doesn’t work yet; all of that will need to either be implemented or at least be updated to cleanly fail. So I was thinking “one feature at a time, in all transports”. Doing it that way also means we immediately run into difficulties if some transports’ structures make a design difficult. There may be other ways to track things or to structure the implementation. And maybe that whole discussion is premature, and we need a prototype PR, or a few, sort of like this, testing the full workflow to just run into things and see what are the things that will need to be done. |
The goal is to allow building the digest-choice-dependent machinery from the bottom up without committing to an API while we don't understand the problem space; for now, we don't expose any way for users to actually make a choice. This code is never called at this point, so this should not change behavior. Signed-off-by: Miloslav Trmač <mitr@redhat.com>
for now, purely to let us internally work on implementing digest choice / conversions; there is no way for a caller to set it. Should not change behavior, the field has no users. Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This is: - UNUSABLE (no external API) - INCOMPLETE (manifest digests, TryReusingBlob, ...) - BROKEN (various code assumes that digest returned from upload matches input) - INCONSISTENT (PutBlobOptions.Digests is only implemented in one transport, and silently ignored elsewhere) Signed-off-by: Miloslav Trmač <mitr@redhat.com>
8a75ccc to
f15717e
Compare
Add UpdateConfigDigest() method to the manifest interface to support updating config descriptor digests when forcing different digest algorithms. This foundational change allows callers to update config digests uniformly across different manifest formats (OCI, Docker schema2) while properly failing for unsupported formats (Docker schema1, which lacks a separate config descriptor). Implementations: - OCI and Docker schema2: Update config digest directly - Docker schema1: Return error (no separate config descriptor) This is needed to support forcing digest algorithms in image copy operations, where the config blob may be re-hashed with a different algorithm. Signed-off-by: Lokesh Mandvekar <lsm5@redhat.com>
Add public API to force a specific digest algorithm when copying images. This allows users to override the default sha256 digest algorithm. The API is marked as "Broken" (unstable) because: - Not all transports support it yet - Multi-arch images are not yet handled - Some edge cases still need work See containers#552 for status. Key behaviors: - Validates algorithm is available on the system - Prevents reconfiguration (fails if already set) - Respects existing digestOptions if already configured This is the user-facing entry point; actual implementation follows in subsequent commits. Signed-off-by: Lokesh Mandvekar <lsm5@redhat.com>
Implement the core logic to force digest algorithms when copying images: 1. Config digest handling: - Modify copyConfig() to return new digest when algorithm changes - Add updateManifestConfigDigest() helper to update config digest in manifest - Use manifest abstraction layer for uniform digest updates 2. Manifest digest selection: - Use digestOptions.Choose() to select manifest digest algorithm - Apply selected algorithm via manifest.DigestWithAlgorithm() 3. Digest mismatch handling: - Allow digest changes only when algorithms differ - Detect corruption when same algorithm produces different digest This builds on the manifest abstraction layer and API from previous commits to enable the actual digest forcing functionality. Note: Guards for unsupported scenarios (multi-arch, schema1, etc.) are added in the next commit to keep this focused on core logic. Signed-off-by: Lokesh Mandvekar <lsm5@redhat.com>
Add guards and limitations for cases where digest forcing is not yet supported or incompatible with certain features: 1. Multi-arch images (multiple.go): - Not yet implemented; needs per-instance config digest updates - Fail cleanly with clear error message - See containers#552 (comment) 2. Docker schema1 manifests (single.go): - Lack separate config descriptor, only support sha256 - Fail early when forcing non-sha256 algorithms 3. zstd:chunked compression (single.go): - Uses sha256 for internal TOC and chunk digests - Fall back to plain zstd when forcing non-sha256 - Fail if requireCompressionFormatMatch is set 4. PutBlobPartial optimization (single.go): - Incompatible with forcing different digest (needs full layer) - Disable partial pull when algorithms don't match 5. Blob reuse optimization (single.go): - Skip blob reuse when digest algorithms don't match - Future: could support cross-digest reuse via BlobInfoCache These guards ensure clean failures and fallbacks rather than silent corruption or confusing errors. Signed-off-by: Lokesh Mandvekar <lsm5@redhat.com>
f15717e to
eca922f
Compare
|
@lsm5 should this still have a draft tag? |
mtrmac
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A review primarily of “manifest: Add UpdateConfigDigest() to manifest abstraction layer”.
|
|
||
| // UpdateConfigDigest updates the config descriptor's digest in the manifest. | ||
| func (m *manifestSchema1) UpdateConfigDigest(newDigest digest.Digest) error { | ||
| return fmt.Errorf("cannot update config digest for schema1 manifest") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not really “cannot update”, the digest doesn’t exist at all. Something like “internal error: caller asked to update non-existent config digest in schema1 manifest”?
Similarly in manifest/.
|
|
||
| func TestSchema1UpdateConfigDigestFails(t *testing.T) { | ||
| m := manifestSchema1FromFixture(t, "schema2-to-schema1-by-docker.json") | ||
| newDigest := digest.Digest("sha512:abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890ab") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Using a sha256 digest would be a tiny bit more representative — sha512 is … unreasonable for schema1.)
|
|
||
| err := m.UpdateConfigDigest(newDigest) | ||
| require.Error(t, err) | ||
| assert.Contains(t, err.Error(), "cannot update config digest for schema1 manifest") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I’m generally not much of a fan of tests for error messages, but it doesn’t really hurt. [It’s definitely not worth defining a new error type just for this purpose.])
|
|
||
| // UpdateConfigDigest updates the config descriptor's digest in the manifest. | ||
| // This returns an error if the manifest does not support config digest updates (e.g., schema1 manifests). | ||
| UpdateConfigDigest(newDigest digest.Digest) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
manifest.Manifest is a public type, so adding new methods to it would be an API break :-|
The image/ callers use specific types and not this interface, so it should be fine to just not add the method to the interface.
| } | ||
|
|
||
| // UpdateConfigDigest updates the config descriptor's digest in the manifest. | ||
| func (m *manifestSchema2) UpdateConfigDigest(newDigest digest.Digest) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a general principle, I’d like the image/ package to have unit tests if at all possible. (It’s isolated and memory-only, so it is easily possible unlike copy/, and it’s frequently doing somewhat non-obvious things — and it’s easier to hold the line when it almost has full coverage already.)
| } | ||
|
|
||
| // FIXME: Single image only; multi-arch needs per-instance config digest updates. | ||
| // See https://github.com/containers/container-libs/pull/552#discussion_r2611627578 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don’t see how this comment matches this code. (And, below, there is manifestDigestAlgo already trying to handle that…)
| // FIXME: Single image only; multi-arch needs per-instance config digest updates. | ||
| // See https://github.com/containers/container-libs/pull/552#discussion_r2611627578 | ||
| if newConfigDigest != nil { | ||
| man, err = ic.updateManifestConfigDigest(man, pendingImage, *newConfigDigest) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way updateManifestConfigDigest etc. works, by first serializing the in-memory data into man a bit above, then parsing it here, updating, and serializing again, is unnecessarily costly.
Instead, we can still edit pendingImage, and serialize into man only after this edit:
(Also, conceptually — we have types.Image.UpdatedImage with ManifestUpdateOptions.LayerInfos, so adding ManifestUpdateOptions.ConfigInfo *types.BlobInfo would be symmetrical, and useful for external callers.)
manifestUpdatesForConfig, err := ic.copyConfig() // see elsewhere about closely co-locating the code that changes values and that arranges for manifest updates
if ! /* generalization of noPendingManifestUpdates */(manifestUpdatesForConfig) {
if ic.cannotModifyManifestReason != "" { /* fail, just to be sure */ }
pendingImage, err = pendingImage.UpdatedImage(ctx, manifestUpdatesForConfig)
}
man, _, err := pendingImage.Manifest(ctx)And to implement that, genericManifest.UpdateConfigDigest does not need to exist — add a ConfigInfo field to ManifestUpdateOptions and implement it by calling the manifests’ UpdateConfigInfo.
| // UpdateConfigDigest updates the config descriptor's digest in the manifest. | ||
| // This returns an error if the manifest does not support config digest updates. | ||
| func (m *OCI1) UpdateConfigDigest(newDigest digest.Digest) error { | ||
| m.Config.Digest = newDigest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not sufficient alone — we might need to update (well, clear) .Data and URLs.
So [as much as the use of BlobInfo to mean both “information about a blob” and “edits to a blob” is unclean, and how we have the ugly CompressionOperation: PreserveOriginal, CompressionAlgorithm: notNil semantics there], I think it’s better if this is UpdateConfigInfo(types.BlobInfo), mirroring UpdateLayerInfos (and probably sharing a helper to do the single-descriptor update). Then we can have copyBlobFromStream worry about producing a correct BlobInfo in that situation.
mtrmac
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review of “image/copy: Add BrokenSetForceDestinationDigestAlgorithm() API”. ACK to having that API.
| if !algo.Available() { | ||
| return fmt.Errorf("digest algorithm %q is not available", algo.String()) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
digests.MustUse already contains this check.
| // It currently only enforces the digest algorithm for a subset of transports and operations. | ||
| // See https://github.com/containers/container-libs/pull/552 for implementation status. | ||
| func (o *Options) BrokenSetForceDestinationDigestAlgorithm(algo digest.Algorithm) error { | ||
| if o.digestOptions.MustUseSet() != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if o.digestOptions != digests.Options{} would be more general for the future — if some part of a caller is setting any of the options using some future API, this one should not entirely overwrite that configuration.
| } | ||
| // FIXME: digestsOptions exists to gradually build the feature. Provide public API once fully implemented. | ||
| // Set default only if not configured by BrokenSetForceDestinationDigestAlgorithm | ||
| if options.digestOptions.MustUseSet() == "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
digestOptions != digests.Options{}
| assert.Error(t, err) | ||
| } | ||
|
|
||
| func TestBrokenSetForceDestinationDigestAlgorithm(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This belongs to copy_test.go, and TestOptionsBroken… to match the convention.
mtrmac
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A review of “image/copy: Implement core digest forcing logic” (though see earlier about the config digest uploads)
| if stream.info.Digest != "" && uploadedInfo.Digest != stream.info.Digest { | ||
| return types.BlobInfo{}, fmt.Errorf("Internal error writing blob %s, blob with digest %s saved with digest %s", srcInfo.Digest, stream.info.Digest, uploadedInfo.Digest) | ||
| // If algorithms match, the whole digest values must match | ||
| if stream.info.Digest.Algorithm() == uploadedInfo.Digest.Algorithm() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conceptually, this goes:
- If
stream.Info.Digestis set (otherwise N/A) - … and if the algorithms match (otherwise the comparison is meaningless)
- then the full values should match.
Sure, it’s all within effectively a series of &&, so the ultimate behavior is the same either way, but I think it’s a good habit to only consume values when they are known to exist and to be relevant.
| // Choose the digest algorithm based on digest options | ||
| manifestDigestAlgo, err := ic.c.options.digestOptions.Choose(digests.Situation{}) | ||
| if err != nil { | ||
| return nil, "", fmt.Errorf("choosing manifest digest algorithm: %w", err) | ||
| } | ||
| manifestDigest, err := manifest.DigestWithAlgorithm(man, manifestDigestAlgo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably doesn’t work in general — with a sha256-only transport (e.g. an old registry), we would successfully upload the manifest, compute a sha512 digest, put the sha512 digest into a manifest list — but the sha512 digest would not be resolvable by the transport.
So we will need some cooperation with the transport to choose a digest, that’s easy enough (private.ImageDestination.PutManifestWithOptions(ctx, man, PutManifestOptions{IsNotTopLevel: instanceDigest != nil, Digests: …, CannotChangeManifestReason: …} and have that new method choose an algorithm and compute it. (Changing how we do IsNotTopLevel so that we don’t have to know the instanceDigest value in advance.)
The
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have noted some of the registry concerns in https://issues.redhat.com/browse/PROJQUAY-10117 ; there is also pre-existing upstream discussion in opencontainers/distribution-spec#543 and opencontainers/distribution-spec#494 (although I didn’t carefully read either of them).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note in particular opencontainers/distribution-spec#543 (comment) .
mtrmac
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finally, a review of “image/copy: Add guards for unsupported digest forcing scenarios” — looking only at the proposed changes, not really checking whether anything is missing.
Though the PutBlobPartial path probably needs a guard as well.
| if srcInfo.Digest.Algorithm() != forcedAlgo { | ||
| logrus.Debugf("Skipping blob reuse for %s: digest algorithm %s doesn't match forced algorithm %s", | ||
| srcInfo.Digest, srcInfo.Digest.Algorithm(), forcedAlgo) | ||
| canTryReuse = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably also check tocDigest.
| return types.BlobInfo{}, "", fmt.Errorf("trying to reuse blob %s at destination: %w", srcInfo.Digest, err) | ||
| // FIXME: Blob reuse disabled when forcing different digest algorithm. | ||
| canTryReuse := true | ||
| if forcedAlgo := ic.c.options.digestOptions.MustUseSet(); forcedAlgo != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this might check for “is not the default digestOptions” (not feeling too strongly about this) — and definitely Warnf if we skip this important performance aspect.
|
|
||
| // FIXME: Multi-arch not supported yet; would need config digest updates for each instance. | ||
| // See https://github.com/containers/container-libs/pull/552#discussion_r2611627578 | ||
| if c.options.digestOptions.MustUseSet() != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can happen at the very top of the function — and it should check for any non-default digestOptions.
Depends on #512 . Only see HEAD commit here.