Skip to content

Conversation

@lsm5
Copy link
Member

@lsm5 lsm5 commented Dec 10, 2025

Depends on #512 . Only see HEAD commit here.

@github-actions github-actions bot added the image Related to "image" package label Dec 10, 2025
podmanbot pushed a commit to podmanbot/buildah that referenced this pull request Dec 10, 2025
@podmanbot
Copy link

✅ A new PR has been created in buildah to vendor these changes: containers/buildah#6597

Copy link
Contributor

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is valuable.

On the process side, I’m not quite sure how to go about building this — I’d like to avoid exposing and API that plain doesn’t work for some transports. I think I’d prefer:

  • Building this ~one complete feature at a time, e.g. if we add PutBlobOptions.Digests, then actually implement it in all transports immediately. We should never be failing to enforce ForceDigest… and reporting success. (In many cases, a Situation{CannotChangeDigestReason: "a sha256-only transport not updated yet", digest: sha256Value} in a transport would be fine, that’s what the tracker is for, as long as the enforcement of the rules is all there immediately.)
  • Not providing the public API until all pieces are in place. If we need something for testing, a temporary BrokenSetForceDigestAlgorithm /* Unstable API, may be changed or removed any time */?

Without a deep review or revisiting earlier PRs, I think missing pieces here are

  • multi-platform images: should not be too difficult, but is fairly orthogonal
  • TryReusingBlob: this + BlobInfoCache will be significant, it has ~the same prerequisite question “what layer reuse is valid cross-digest” as c/storage — and that’s a major reason why I’d prefer to build this a feature at a time, so that we can get plain PutBlob or configs done without blocking on this
  • The PutBlobPartial code path (plausibly should be just a failure because pull --force-digest= makes little sense, and would be difficult to implement, I’m not sure)
  • DiffID computation (maybe keep sha256 and fail on schema1??)
  • ZstdChunked compression using sha512 digests for all its data

if destInfo.Digest != srcInfo.Digest {
return fmt.Errorf("Internal error: copying uncompressed config blob %s changed digest to %s", srcInfo.Digest, destInfo.Digest)
// Allow digest algorithm changes when forcing a specific digest algorithm
forcingDifferentAlgo := ic.c.options.digestOptions.MustUseSet() != "" &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • See the similar check copyBlobFromStream
  • “MustUse” Is not the only reason a copy might choose a different algorithm (e.g. we might be pushing to a sha256-only-transport or registry), hard-coding a check only for MustUse here is overly specific
  • Compare the end of copyLayers, where the equality check is immediately paired with manifest edits [and see elsewhere about manifest edits], so that we always set up edits if we know they are needed, and we minimize the risk of the two getting out of sync. I’m not sure whether the check+update should live in copyConfig or copyUpdatedConfigAndManifest (guessing the latter), but they should be together.

}

ic.c.Printf("Writing manifest to image destination\n")
manifestDigest, err := manifest.Digest(man)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This is another part that will need updating, for multi-arch images.)

@mtrmac
Copy link
Contributor

mtrmac commented Dec 11, 2025

FWIW containers/buildah#6597 hits internal error: use of uninitialized digests.Options — entirely my fault, it’s already broken in #512, but, still a blocker to merging this as well.

@mtrmac
Copy link
Contributor

mtrmac commented Dec 11, 2025

FWIW containers/buildah#6597 hits internal error: use of uninitialized digests.Options

I think #512 has a fix, but I’m not going to stay around today to wait for containers/buildah#6552 to confirm.

@lsm5
Copy link
Member Author

lsm5 commented Dec 12, 2025

I think I’d prefer:
* Not providing the public API until all pieces are in place. If we need something for testing, a temporary BrokenSetForceDigestAlgorithm /* Unstable API, may be changed or removed any time */?

I can go ahead with this. Let me know if you'd much rather the other preference though.

@mtrmac
Copy link
Contributor

mtrmac commented Dec 12, 2025

I didn’t mean the two bullet points as alternatives, but as two parts of the same plan.


My main concern here is having some way of tracking what doesn’t work yet; all of that will need to either be implemented or at least be updated to cleanly fail. So I was thinking “one feature at a time, in all transports”. Doing it that way also means we immediately run into difficulties if some transports’ structures make a design difficult.

There may be other ways to track things or to structure the implementation. And maybe that whole discussion is premature, and we need a prototype PR, or a few, sort of like this, testing the full workflow to just run into things and see what are the things that will need to be done.

The goal is to allow building the digest-choice-dependent machinery
from the bottom up without committing to an API while we don't understand
the problem space; for now, we don't expose any way for users to
actually make a choice.

This code is never called at this point, so this should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
for now, purely to let us internally work on implementing digest
choice / conversions; there is no way for a caller to set it.

Should not change behavior, the field has no users.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This is:
- UNUSABLE (no external API)
- INCOMPLETE (manifest digests, TryReusingBlob, ...)
- BROKEN (various code assumes that digest returned from upload matches input)
- INCONSISTENT (PutBlobOptions.Digests is only implemented in one transport,
  and silently ignored elsewhere)

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
@lsm5 lsm5 force-pushed the skopeo-copy-force-digest branch from 8a75ccc to f15717e Compare December 26, 2025 21:11
podmanbot pushed a commit to podmanbot/buildah that referenced this pull request Dec 26, 2025
lsm5 added 4 commits December 26, 2025 16:16
Add UpdateConfigDigest() method to the manifest interface to support
updating config descriptor digests when forcing different digest algorithms.

This foundational change allows callers to update config digests uniformly
across different manifest formats (OCI, Docker schema2) while properly
failing for unsupported formats (Docker schema1, which lacks a separate
config descriptor).

Implementations:
- OCI and Docker schema2: Update config digest directly
- Docker schema1: Return error (no separate config descriptor)

This is needed to support forcing digest algorithms in image copy operations,
where the config blob may be re-hashed with a different algorithm.

Signed-off-by: Lokesh Mandvekar <lsm5@redhat.com>
Add public API to force a specific digest algorithm when copying images.
This allows users to override the default sha256 digest algorithm.

The API is marked as "Broken" (unstable) because:
- Not all transports support it yet
- Multi-arch images are not yet handled
- Some edge cases still need work

See containers#552 for status.

Key behaviors:
- Validates algorithm is available on the system
- Prevents reconfiguration (fails if already set)
- Respects existing digestOptions if already configured

This is the user-facing entry point; actual implementation follows
in subsequent commits.

Signed-off-by: Lokesh Mandvekar <lsm5@redhat.com>
Implement the core logic to force digest algorithms when copying images:

1. Config digest handling:
   - Modify copyConfig() to return new digest when algorithm changes
   - Add updateManifestConfigDigest() helper to update config digest in manifest
   - Use manifest abstraction layer for uniform digest updates

2. Manifest digest selection:
   - Use digestOptions.Choose() to select manifest digest algorithm
   - Apply selected algorithm via manifest.DigestWithAlgorithm()

3. Digest mismatch handling:
   - Allow digest changes only when algorithms differ
   - Detect corruption when same algorithm produces different digest

This builds on the manifest abstraction layer and API from previous commits
to enable the actual digest forcing functionality.

Note: Guards for unsupported scenarios (multi-arch, schema1, etc.) are added
in the next commit to keep this focused on core logic.

Signed-off-by: Lokesh Mandvekar <lsm5@redhat.com>
Add guards and limitations for cases where digest forcing is not yet
supported or incompatible with certain features:

1. Multi-arch images (multiple.go):
   - Not yet implemented; needs per-instance config digest updates
   - Fail cleanly with clear error message
   - See containers#552 (comment)

2. Docker schema1 manifests (single.go):
   - Lack separate config descriptor, only support sha256
   - Fail early when forcing non-sha256 algorithms

3. zstd:chunked compression (single.go):
   - Uses sha256 for internal TOC and chunk digests
   - Fall back to plain zstd when forcing non-sha256
   - Fail if requireCompressionFormatMatch is set

4. PutBlobPartial optimization (single.go):
   - Incompatible with forcing different digest (needs full layer)
   - Disable partial pull when algorithms don't match

5. Blob reuse optimization (single.go):
   - Skip blob reuse when digest algorithms don't match
   - Future: could support cross-digest reuse via BlobInfoCache

These guards ensure clean failures and fallbacks rather than silent
corruption or confusing errors.

Signed-off-by: Lokesh Mandvekar <lsm5@redhat.com>
@lsm5 lsm5 force-pushed the skopeo-copy-force-digest branch from f15717e to eca922f Compare December 26, 2025 21:29
podmanbot pushed a commit to podmanbot/buildah that referenced this pull request Dec 26, 2025
@TomSweeneyRedHat
Copy link
Member

@lsm5 should this still have a draft tag?

@lsm5 lsm5 marked this pull request as ready for review January 9, 2026 14:35
Copy link
Contributor

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A review primarily of “manifest: Add UpdateConfigDigest() to manifest abstraction layer”.


// UpdateConfigDigest updates the config descriptor's digest in the manifest.
func (m *manifestSchema1) UpdateConfigDigest(newDigest digest.Digest) error {
return fmt.Errorf("cannot update config digest for schema1 manifest")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not really “cannot update”, the digest doesn’t exist at all. Something like “internal error: caller asked to update non-existent config digest in schema1 manifest”?

Similarly in manifest/.


func TestSchema1UpdateConfigDigestFails(t *testing.T) {
m := manifestSchema1FromFixture(t, "schema2-to-schema1-by-docker.json")
newDigest := digest.Digest("sha512:abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef1234567890ab")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Using a sha256 digest would be a tiny bit more representative — sha512 is … unreasonable for schema1.)


err := m.UpdateConfigDigest(newDigest)
require.Error(t, err)
assert.Contains(t, err.Error(), "cannot update config digest for schema1 manifest")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I’m generally not much of a fan of tests for error messages, but it doesn’t really hurt. [It’s definitely not worth defining a new error type just for this purpose.])


// UpdateConfigDigest updates the config descriptor's digest in the manifest.
// This returns an error if the manifest does not support config digest updates (e.g., schema1 manifests).
UpdateConfigDigest(newDigest digest.Digest) error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

manifest.Manifest is a public type, so adding new methods to it would be an API break :-|

The image/ callers use specific types and not this interface, so it should be fine to just not add the method to the interface.

}

// UpdateConfigDigest updates the config descriptor's digest in the manifest.
func (m *manifestSchema2) UpdateConfigDigest(newDigest digest.Digest) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a general principle, I’d like the image/ package to have unit tests if at all possible. (It’s isolated and memory-only, so it is easily possible unlike copy/, and it’s frequently doing somewhat non-obvious things — and it’s easier to hold the line when it almost has full coverage already.)

}

// FIXME: Single image only; multi-arch needs per-instance config digest updates.
// See https://github.com/containers/container-libs/pull/552#discussion_r2611627578
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t see how this comment matches this code. (And, below, there is manifestDigestAlgo already trying to handle that…)

// FIXME: Single image only; multi-arch needs per-instance config digest updates.
// See https://github.com/containers/container-libs/pull/552#discussion_r2611627578
if newConfigDigest != nil {
man, err = ic.updateManifestConfigDigest(man, pendingImage, *newConfigDigest)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way updateManifestConfigDigest etc. works, by first serializing the in-memory data into man a bit above, then parsing it here, updating, and serializing again, is unnecessarily costly.

Instead, we can still edit pendingImage, and serialize into man only after this edit:

(Also, conceptually — we have types.Image.UpdatedImage with ManifestUpdateOptions.LayerInfos, so adding ManifestUpdateOptions.ConfigInfo *types.BlobInfo would be symmetrical, and useful for external callers.)

manifestUpdatesForConfig, err := ic.copyConfig() // see elsewhere about closely co-locating the code that changes values and that arranges for manifest updates

if ! /* generalization of noPendingManifestUpdates */(manifestUpdatesForConfig) {
    if ic.cannotModifyManifestReason != "" { /* fail, just to be sure */ }
    pendingImage, err = pendingImage.UpdatedImage(ctx, manifestUpdatesForConfig)
}

man, _, err := pendingImage.Manifest(ctx)

And to implement that, genericManifest.UpdateConfigDigest does not need to exist — add a ConfigInfo field to ManifestUpdateOptions and implement it by calling the manifests’ UpdateConfigInfo.

// UpdateConfigDigest updates the config descriptor's digest in the manifest.
// This returns an error if the manifest does not support config digest updates.
func (m *OCI1) UpdateConfigDigest(newDigest digest.Digest) error {
m.Config.Digest = newDigest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not sufficient alone — we might need to update (well, clear) .Data and URLs.

So [as much as the use of BlobInfo to mean both “information about a blob” and “edits to a blob” is unclean, and how we have the ugly CompressionOperation: PreserveOriginal, CompressionAlgorithm: notNil semantics there], I think it’s better if this is UpdateConfigInfo(types.BlobInfo), mirroring UpdateLayerInfos (and probably sharing a helper to do the single-descriptor update). Then we can have copyBlobFromStream worry about producing a correct BlobInfo in that situation.

Copy link
Contributor

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review of “image/copy: Add BrokenSetForceDestinationDigestAlgorithm() API”. ACK to having that API.

Comment on lines +179 to +181
if !algo.Available() {
return fmt.Errorf("digest algorithm %q is not available", algo.String())
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

digests.MustUse already contains this check.

// It currently only enforces the digest algorithm for a subset of transports and operations.
// See https://github.com/containers/container-libs/pull/552 for implementation status.
func (o *Options) BrokenSetForceDestinationDigestAlgorithm(algo digest.Algorithm) error {
if o.digestOptions.MustUseSet() != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if o.digestOptions != digests.Options{} would be more general for the future — if some part of a caller is setting any of the options using some future API, this one should not entirely overwrite that configuration.

}
// FIXME: digestsOptions exists to gradually build the feature. Provide public API once fully implemented.
// Set default only if not configured by BrokenSetForceDestinationDigestAlgorithm
if options.digestOptions.MustUseSet() == "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

digestOptions != digests.Options{}

assert.Error(t, err)
}

func TestBrokenSetForceDestinationDigestAlgorithm(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This belongs to copy_test.go, and TestOptionsBroken… to match the convention.

Copy link
Contributor

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A review of “image/copy: Implement core digest forcing logic” (though see earlier about the config digest uploads)

Comment on lines 140 to +142
if stream.info.Digest != "" && uploadedInfo.Digest != stream.info.Digest {
return types.BlobInfo{}, fmt.Errorf("Internal error writing blob %s, blob with digest %s saved with digest %s", srcInfo.Digest, stream.info.Digest, uploadedInfo.Digest)
// If algorithms match, the whole digest values must match
if stream.info.Digest.Algorithm() == uploadedInfo.Digest.Algorithm() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conceptually, this goes:

  • If stream.Info.Digest is set (otherwise N/A)
  • … and if the algorithms match (otherwise the comparison is meaningless)
  • then the full values should match.

Sure, it’s all within effectively a series of &&, so the ultimate behavior is the same either way, but I think it’s a good habit to only consume values when they are known to exist and to be relevant.

Comment on lines +611 to +616
// Choose the digest algorithm based on digest options
manifestDigestAlgo, err := ic.c.options.digestOptions.Choose(digests.Situation{})
if err != nil {
return nil, "", fmt.Errorf("choosing manifest digest algorithm: %w", err)
}
manifestDigest, err := manifest.DigestWithAlgorithm(man, manifestDigestAlgo)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️

This probably doesn’t work in general — with a sha256-only transport (e.g. an old registry), we would successfully upload the manifest, compute a sha512 digest, put the sha512 digest into a manifest list — but the sha512 digest would not be resolvable by the transport.

So we will need some cooperation with the transport to choose a digest, that’s easy enough (private.ImageDestination.PutManifestWithOptions(ctx, man, PutManifestOptions{IsNotTopLevel: instanceDigest != nil, Digests: …, CannotChangeManifestReason: …} and have that new method choose an algorithm and compute it. (Changing how we do IsNotTopLevel so that we don’t have to know the instanceDigest value in advance.)


The ⚠️ part is that for registries, we will need a “does the registry allow referring to manifests via $algorithm” API, and I’m not sure whether that is in scope of the existing proposals; we might need to raise that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have noted some of the registry concerns in https://issues.redhat.com/browse/PROJQUAY-10117 ; there is also pre-existing upstream discussion in opencontainers/distribution-spec#543 and opencontainers/distribution-spec#494 (although I didn’t carefully read either of them).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally, a review of “image/copy: Add guards for unsupported digest forcing scenarios” — looking only at the proposed changes, not really checking whether anything is missing.

Though the PutBlobPartial path probably needs a guard as well.

if srcInfo.Digest.Algorithm() != forcedAlgo {
logrus.Debugf("Skipping blob reuse for %s: digest algorithm %s doesn't match forced algorithm %s",
srcInfo.Digest, srcInfo.Digest.Algorithm(), forcedAlgo)
canTryReuse = false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably also check tocDigest.

return types.BlobInfo{}, "", fmt.Errorf("trying to reuse blob %s at destination: %w", srcInfo.Digest, err)
// FIXME: Blob reuse disabled when forcing different digest algorithm.
canTryReuse := true
if forcedAlgo := ic.c.options.digestOptions.MustUseSet(); forcedAlgo != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might check for “is not the default digestOptions” (not feeling too strongly about this) — and definitely Warnf if we skip this important performance aspect.


// FIXME: Multi-arch not supported yet; would need config digest updates for each instance.
// See https://github.com/containers/container-libs/pull/552#discussion_r2611627578
if c.options.digestOptions.MustUseSet() != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can happen at the very top of the function — and it should check for any non-default digestOptions.

@lsm5 lsm5 marked this pull request as draft January 13, 2026 11:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

image Related to "image" package

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants