Conversation
Add a method in BLSConfig and Grub Menuconfig to get the boot artifact name, i.e. get the name of the UKI or the name of the directory containing the Kernel + Initrd. The names are stripped of all our custom prefixes and suffixes, so basically they return the verity digest part of the name. This is useful for GC-ing Kernel + Initrd that are shared among multiple deployments since we can't rely on the composefs= parameter in the options as the cmdline verity digest might be different than the verity digest of the shared Kernel + Initrd. Tests written by Claude Code (Opus) Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com>
There was a problem hiding this comment.
Code Review
This pull request refactors the handling of shared boot binaries in bootc-composefs by dynamically hashing binaries to identify duplicates and introducing a BootloaderEntry struct for better artifact management. It also updates garbage collection logic and expands integration tests. Key feedback includes a critical regression where boot binaries are no longer written during fresh installations, potential fragility in the duplicate detection logic when encountering incomplete directories, and performance concerns regarding repeated binary hashing.
| pub(crate) fn find_vmlinuz_initrd_duplicate( | ||
| storage: &Storage, | ||
| digest: &str, | ||
| ) -> Result<Option<String>> { | ||
| let boot_dir = storage.bls_boot_binaries_dir()?; | ||
|
|
||
| for entry in boot_dir.entries_utf8()? { | ||
| let entry = entry?; | ||
| let dir_name = entry.file_name()?; | ||
|
|
||
| if !entry.file_type()?.is_dir() { | ||
| continue; | ||
| } | ||
|
|
||
| let config = depl | ||
| .open_dir() | ||
| .with_context(|| format!("Opening {depl_file_name}"))? | ||
| .read_to_string(format!("{depl_file_name}.origin")) | ||
| .context("Reading origin file")?; | ||
| let Some(..) = dir_name.strip_prefix(TYPE1_BOOT_DIR_PREFIX) else { | ||
| continue; | ||
| }; | ||
|
|
||
| let ini = tini::Ini::from_string(&config) | ||
| .with_context(|| format!("Failed to parse file {depl_file_name}.origin as ini"))?; | ||
| let entry_digest = compute_boot_digest_type1(&boot_dir.open_dir(&dir_name)?)?; | ||
|
|
||
| match ini.get::<String>(ORIGIN_KEY_BOOT, ORIGIN_KEY_BOOT_DIGEST) { | ||
| Some(hash) => { | ||
| if hash == digest { | ||
| match symlink_to { | ||
| Some(ref mut prev) => prev.push(depl_file_name.to_string()), | ||
| None => symlink_to = Some(vec![depl_file_name.to_string()]), | ||
| } | ||
| } | ||
| } | ||
|
|
||
| // No SHASum recorded in origin file | ||
| // `symlink_to` is already none, but being explicit here | ||
| None => symlink_to = None, | ||
| }; | ||
| if entry_digest == digest { | ||
| return Ok(Some(dir_name)); | ||
| } | ||
| } |
There was a problem hiding this comment.
The find_vmlinuz_initrd_duplicate function is fragile because it calls compute_boot_digest_type1 which fails if any directory matching the TYPE1_BOOT_DIR_PREFIX is missing the expected vmlinuz or initrd files. If a previous operation left a stray or corrupted directory in /boot, this function will cause bootc upgrade to fail entirely. It would be safer to skip directories that cannot be hashed.
There was a problem hiding this comment.
This would be fixed by the GC, but the GC would need to be run manually. Maybe we update the error message, or should we run the GC before and after the upgrade? @cgwalters I believe ostree does a GC before the upgrade operation?
There was a problem hiding this comment.
Hmm well one thing here is we should ensure we atomically generate the dir by using the "create tempdir + rename_exchange" model.
But yes we should also be robust to partially written dirs - and sure we could GC those before upgrade too, but it seems like it'd also work to ensure we ignore corrupted/partial ones instead of erroring out.
a2f5216 to
2acf235
Compare
|
Currently btw ostree never shares entries between deployments, but it makes sense to do so for karg differences! |
| /// Returns name of UKI in case of EFI config | ||
| /// Returns name of the directory containing Kernel + Initrd in case of Non-EFI config | ||
| /// | ||
| /// The names are stripped of our custom prefix and suffixes, so this basically returns |
There was a problem hiding this comment.
I don't like the "basically" here. Either it's a verity digest or it's not.
| /// | ||
| /// The names are stripped of our custom prefix and suffixes, so this basically returns | ||
| /// the verity digest part of the name | ||
| pub(crate) fn boot_artifact_name(&self) -> Result<&str> { |
There was a problem hiding this comment.
So in that vein...and yes this is not a new problem but I'd like to start using a formal type for string digests. https://docs.rs/oci-spec/latest/oci_spec/image/struct.Digest.html is one candidate.
There was a problem hiding this comment.
I put up composefs/composefs-rs#278 related to this
| } | ||
|
|
||
| #[test] | ||
| fn test_boot_artifact_name_efi_success() -> Result<()> { |
There was a problem hiding this comment.
Let's rework these to be table-driven tests, it really helps reduce the amount of code.
Instead of looking in the ".origin" files and trying to match the boot_digest to digests in other origin files, we now simply re-compute the sha256sum for vmlinuz + initrd for all boot entries present. This fixes the bug that arises after mutiple upgrades where the original deployment that created the boot entry has been garbage collected, so we end up linking to another deployment that does have the same boot digest, but the verity digest doesn't match the verity digest used for the name of the directory where we store the kernel + initrd pair. Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com>
This fixes a bug where a shared Type1 entry would get GCd even when it's in use due to the original image that created it being deleted. Combined with the fact that we were comparing the fsverity digest in the options field of the BLS config (which will be different than the name of the directory containing the vmlinuz + initrd pair). Now, we compare against the directory name when GC-ing boot binaries Fixes: bootc-dev#2102 Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com>
Add a test to make sure we do not GC shared Type1 entries when they're still referenced Remove openh264 from repos to speed up installs Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com>
boot/config: Add method to get boot artifact name
Add a method in BLSConfig and Grub Menuconfig to get the boot artifact
name, i.e. get the name of the UKI or the name of the directory
containing the Kernel + Initrd. The names are stripped of all our custom
prefixes and suffixes, so basically they return the verity digest part
of the name.
This is useful for GC-ing Kernel + Initrd that are shared among multiple
deployments since we can't rely on the composefs= parameter in the
options as the cmdline verity digest might be different than the verity
digest of the shared Kernel + Initrd.
Tests written by Claude Code (Opus)
composefs/boot: Change the way we find sharable vmlinuz + initrd
Instead of looking in the ".origin" files and trying to match the
boot_digest to digests in other origin files, we now simply re-compute
the sha256sum for vmlinuz + initrd for all boot entries present.
This fixes the bug that arises after mutiple upgrades where the original
deployment that created the boot entry has been garbage collected, so we
end up linking to another deployment that does have the same boot
digest, but the verity digest doesn't match the verity digest used for
the name of the directory where we store the kernel + initrd pair.
composefs/gc: Fix shared entry GC even when they're in use
This fixes a bug where a shared Type1 entry would get GCd even when it's
in use due to the original image that created it being deleted. Combined
with the fact that we were comparing the fsverity digest in the options
field of the BLS config (which will be different than the name of the
directory containing the vmlinuz + initrd pair).
Now, we compare against the directory name when GC-ing boot binaries
Fixes: #2102
test: Add test for GC-ing shared Type1 entries
Add a test to make sure we do not GC shared Type1 entries when they're
still referenced