fix(install): set DM_DISABLE_UDEV=1 to prevent dm semaphore deadlock in container IPC namespace#2090
Conversation
There was a problem hiding this comment.
Code Review
This pull request addresses a deadlock issue with cryptsetup in containerized environments by setting the DM_DISABLE_UDEV=1 environment variable. The fix is well-explained and seems correct. My main feedback is to suggest a safer, unsafe-free way to set the environment variable for the child processes, which would improve the robustness of the implementation.
cgwalters
left a comment
There was a problem hiding this comment.
I reopened and commented on #421 which is about the larger problem here - if we did that then there'd be no devmapper usage at all at installation time by bootc.
The container's default IPC namespace is isolated from the host.
I'd rather change that via changing our default install flow to use --ipc=host. We already propagate /run so this really makes sense.
As far as setenv - we already have global_init() that is the place for this.
812ff6a to
23d3fbe
Compare
23d3fbe to
3636925
Compare
|
Reworked per your feedback. Added I kept |
3636925 to
f64cc0f
Compare
2ce942d to
c5a751a
Compare
crates/lib/src/cli.rs
Outdated
| let pid1ipcns = std::fs::File::open("/proc/1/ns/ipc").context("open pid1 ipcns")?; | ||
| rustix::thread::move_into_link_name_space( | ||
| pid1ipcns.as_fd(), | ||
| Some(rustix::thread::LinkNameSpaceType::Ipc), |
There was a problem hiding this comment.
This doesn't compile, the enum member has a longer name
crates/lib/src/cli.rs
Outdated
| // cryptsetup operations to deadlock on semop(). The primary fix is to | ||
| // run the install container with --ipc=host; this is defense-in-depth | ||
| // for cases where the caller forgets that flag. | ||
| if let Ok(ns_pid1) = std::fs::read_link("/proc/1/ns/ipc") { |
There was a problem hiding this comment.
As a general rule avoid "error swallowing" - let's just propagate the error if we can't open. It shouldn't happen and if it does we want to know.
Inside a container with an isolated IPC namespace (the podman/docker default), udevd on the host cannot see the container's semaphores, causing cryptsetup luksOpen/luksClose to deadlock on semop(). The primary fix is adding --ipc=host to the documented podman invocations. As defense-in-depth, call setns() into /proc/1/ns/ipc at the very start of global_init() when the process is in a different IPC namespace than pid 1, so that devmapper's udev synchronization works correctly even if the caller omits --ipc=host. Signed-off-by: Andrew Dunn <andrew@dunn.dev>
c5a751a to
eb332db
Compare
|
CI failures are all quay.io/registry 502 Bad Gateway errors during container image pulls, not code issues. Should pass on re-run. |
Summary
Prevent the libdevmapper udev cookie semaphore deadlock that causes
cryptsetup luksOpen/luksCloseto hang inside containers with isolated IPC namespaces.Root cause
When
bootc installruns inside a container (the standard invocation), the container has its own IPC namespace by default. libdevmapper creates System V semaphores ("udev cookies") to synchronize with udevd, but udevd runs in the host IPC namespace and can't see the container's semaphores. Thesemop()call blocks forever.Full analysis: #2089
Fix
Two changes, as suggested in review:
--ipc=hostin documented invocations (primary fix): Add--ipc=hostto thepodman runcommands inbootc-install.md. This shares the host IPC namespace so libdevmapper's semaphores reach udevd. We already pass--pid=hostand propagate/run, so--ipc=hostis consistent.DM_DISABLE_UDEV=1inglobal_init()(defense-in-depth): Set the env var early in process initialization, alongside the existingHOMEworkaround. This tells libdevmapper to skip udev synchronization entirely, catching cases where IPC sharing is not configured.Fixes: #2089
Related: #421