cmd, commit: change --pause default to true#28801
Conversation
6a942fe to
8a8e25c
Compare
Luap99
left a comment
There was a problem hiding this comment.
pre existing but ContainerStateStopping is also technically a running container state so should that also cause a pause? I mean commit take the container lock so the other stop process will not kill it until commit will be done then which seems fine.
Also pause does error for containers with NoCgroups set, no idea how common this is but I guess breaking them by default is not nice? Or do we think it is best to make them pass --pause=false as explicit acknowledgement of the risks
| if err := c.pause(); err != nil { | ||
| return nil, fmt.Errorf("pausing container %q to commit: %w", c.ID(), err) | ||
| } | ||
| handlerName := fmt.Sprintf("commit-unpause-%s", c.ID()) |
There was a problem hiding this comment.
worth noting that the container is locked here and thus we should never have two caller getting here at the same time as part of the podman service?
| return nil, fmt.Errorf("pausing container %q to commit: %w", c.ID(), err) | ||
| } | ||
| handlerName := fmt.Sprintf("commit-unpause-%s", c.ID()) | ||
| if err := shutdown.Register(handlerName, func(sig os.Signal) error { |
There was a problem hiding this comment.
That is a race condition, the process can get killed between pause() and the call here, the handler must be registered before pause() is called.
There was a problem hiding this comment.
thanks, I'll fix this one
| Expect(ec).To(Equal(0)) | ||
| Expect(podmanTest.NumberOfContainers()).To(Equal(1)) | ||
|
|
||
| // Default is --pause=true, test explicit --pause=false |
There was a problem hiding this comment.
That is is not testing any relevant, there is nothing that ensure that the process was indeed paused or not paused so the flag might as well be a NOP.
But then I don't know of a good race free way to know if it was paused there either? Do we have something in the podman state we could show in inspect?
There was a problem hiding this comment.
something like:
python3 -c "
import select, sys
f = open(sys.argv[1])
f.read()
p = select.poll()
p.register(f, select.POLLPRI)
while True:
p.poll()
f.seek(0)
echo('something happened')
" /sys/fs/cgroup/cgroup.events
inside the container we could detect that something happened on the cgroup, but reading the value of the file is not race free. Maybe this is good enough?
There was a problem hiding this comment.
I am nor familiar with cgroups behaviours, if the event is proof enough that we got frozen then sure this sounds like a good idea.
giuseppe
left a comment
There was a problem hiding this comment.
Also pause does error for containers with NoCgroups set, no idea how common this is but I guess breaking them by default is not nice? Or do we think it is best to make them pass --pause=false as explicit acknowledgement of the risks
I think it is better to expect an explicit --pause=false in this case and understand there are some risks involved with this action.
| return nil, fmt.Errorf("pausing container %q to commit: %w", c.ID(), err) | ||
| } | ||
| handlerName := fmt.Sprintf("commit-unpause-%s", c.ID()) | ||
| if err := shutdown.Register(handlerName, func(sig os.Signal) error { |
There was a problem hiding this comment.
thanks, I'll fix this one
8a8e25c to
60044d2
Compare
|
I'm a NACK on requiring |
|
|
||
| ## DESCRIPTION | ||
| **podman commit** creates an image based on a changed *container*. The author of the image can be set using the **--author** OPTION. Various image instructions can be configured with the **--change** OPTION and a commit message can be set using the **--message** OPTION. The *container* and its processes aren't paused while the image is committed. If this is not desired, the **--pause** OPTION can be set to *true*. When the commit is complete, Podman prints out the ID of the new image. | ||
| **podman commit** creates an image based on a changed *container*. The author of the image can be set using the **--author** OPTION. Various image instructions can be configured with the **--change** OPTION and a commit message can be set using the **--message** OPTION. The *container* and its processes are paused while the image is committed. If this is not desired, the **--pause** OPTION can be set to *false*. When the commit is complete, Podman prints out the ID of the new image. |
There was a problem hiding this comment.
I would recommend adding something about security in here, give a reason for why this is useful
how to make explicit that it is a security risk that doesn't happen with a sane configuration? That defeats this change, because in this way we make explicit that the user accepted the risk and |
I mean logically using commit is rare, using no cgroups mode is rare, using both at the same time is does even rarer so I think requiring an explicit arg is fine in practise, we should just make sure the error/docs tells the user what do. And if we decide to merge it like this/next week before 6.0 we can call it a breaking change if you want. |
|
For reference, I am LGTM on the rest of the PR |
|
@Luap99 I feel like supported configurations should not require outside intervention to function, and |
f63eabf to
0b60208
Compare
how can I differentiate between an explicit |
ContainerStateStopping is still a running state where processes are active, so it should also be paused during commit. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
When --pause defaults to true, a Ctrl-C during commit would leave the container paused. Register a shutdown handler that unpauses the container on SIGINT/SIGTERM so it is always restored to its running state. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
0b60208 to
4769c4c
Compare
|
@mtrmac PTAL |
4769c4c to
f4f14ae
Compare
|
I am not happy with the special handling for nocgroups, it makes us unsafe in certain cases and I've reverted to the previous version. I see that Docker does pause the container by default on commit and there is a |
We are working on a a different kind of change that should, eventually, make the trade-off of features vs. performance (vs. compatibility) ~moot. Certainly, that code will be unproven and risky for a while, but maybe we don’t need to be so drastic as to break existing scripts in the default configuration. |
nocgroups is used rarely, so I don't think we will really break many scripts out there |
|
IMO this is about consistency, tar'ing all files while the container process can write to them can lead to "corrupted" exports as you end up with partial states between files. pause on its own would not safe us from reading half written files if we pause the process during writes, but at least files wont change while we read them which greatly reduces risk. |
| # Start a background watcher before commit. | ||
| _cgroup_watch "$events_file" 10000 > "$watch_log" & | ||
| local watcher_pid=$! | ||
| sleep 0.5 |
There was a problem hiding this comment.
Looking at my new CI work 0.5 is not nearly good enough,
a better race free way would be to add a print between
p.register(fd, select.POLLPRI | select.POLLERR)
print("poll ready")
events = p.poll(int(sys.argv[2]))
and then loop cat the log file for that print before running podman commit
| type -p python3 &>/dev/null || skip "python3 needed for cgroup poll(2)" | ||
| if is_rootless && ! is_cgroupsv2; then | ||
| skip "'podman pause' (rootless) only works with cgroups v2" | ||
| fi |
There was a problem hiding this comment.
I though we removed all the cgroups conditions? As we not longer support v1 we should not bother with these checks and just drop them
|
|
||
| # bats test_tags=ci:parallel | ||
| @test "podman commit --pause default" { | ||
| skip_if_remote "commit --pause via api does not change the cgroup freeze state locally" |
There was a problem hiding this comment.
That is not true though?! Remote tests always run on the same system and must be. So you can read the cgroup info just fine.
I think testing remote is generally important so we know the API side works the same.
Pause the container by default during commit. It is safer as it avoids conflicts, and potentially security issues, when another process is accessing the container rootfs. Originally this was not done because it was a breaking change and rootless containers weren't able to use the freezer cgroup controller. Now that we support only cgroup v2, there is no gap anymore with root (exotic configurations can still use --pause=false). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
f4f14ae to
69139f3
Compare
|
I said this in Matrix, but I think I've been outvoted and am dropping my NACK |
Pause the container by default during commit. It is safer as it avoids conflicts, and potentially security issues, when another process is accessing the container rootfs.
Originally this was not done because it was a breaking change and rootless containers weren't able to use the freezer cgroup controller. Now that we support only cgroup v2, there is no gap anymore with root (exotic configurations can still use --pause=false).
Checklist
Ensure you have completed the following checklist for your pull request to be reviewed:
commits. (
git commit -s). (If needed, usegit commit -s --amend). The author email must matchthe sign-off email address. See CONTRIBUTING.md
for more information.
Fixes: #00000in commit message (if applicable)make validatepr(format/lint checks)Noneif no user-facing changes)Does this PR introduce a user-facing change?