
DNM - for testing OCPBUGS-65645#6016

Draft
isabella-janssen wants to merge 3 commits into
openshift:mainfrom
isabella-janssen:ocpbugs-65645-claude-testing

Conversation

@isabella-janssen
Member

@isabella-janssen isabella-janssen commented May 7, 2026

- What I did

- How to verify it

- Description for the changelog

Summary by CodeRabbit

  • New Features
    • On CoreOS (EL-based) systems, extension packages are now verified during first-boot/node initialization. If configured extension packages are missing or verification fails, the node will not complete the Done transition and will report the validation failure.

isabella-janssen and others added 2 commits May 7, 2026 13:42
Resolves: OCPBUGS-65645

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@openshift-merge-bot
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests are triggered either automatically or after the lgtm label is added, depending on the repository configuration. The pipeline controller automatically detects which contexts are required and uses /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 7, 2026
@openshift-ci
Contributor

openshift-ci Bot commented May 7, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@coderabbitai

coderabbitai Bot commented May 7, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c16627aa-f2ef-42c1-acfb-eadc7bff7c83

📥 Commits

Reviewing files that changed from the base of the PR and between e718454 and 2a7bf4a.

📒 Files selected for processing (1)
  • pkg/daemon/update.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/daemon/update.go

Walkthrough

Adds an EL-only verification step during first-boot "Done" transition: the daemon checks that MachineConfig.Spec.Extensions packages exist in the RPM DB and aborts the Done transition if verification fails; non-EL systems skip the check.

Changes

Extension package verification + Done transition gate

| Layer / File(s) | Summary |
|---|---|
| Extension verification implementation: pkg/daemon/update.go | Added (*CoreOSDaemon).verifyExtensionPackages(config *mcfgv1.MachineConfig) error. Returns early for non-EL hosts or when no extensions are configured. Maps configured extensions to RPM package names, appends "sysstat" (testing), runs rpm -q <pkg> per package, accumulates missing-package results (rpm exit code 1) into a single error, and treats other rpm failures as immediate errors. Logs success or failure. |
| First-boot Done transition gate: pkg/daemon/daemon.go | In updateConfigAndState, when node is at desired configuration and about to perform resumed/Done sync logic, the flow now conditionally calls CoreOSDaemon.verifyExtensionPackages(...) for CoreOS (EL) variants and aborts the Done transition if verification returns an error. Non-EL nodes skip this check. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ❓ Inconclusive | The title 'DNM - for testing OCPBUGS-65645' is vague and generic, using the abbreviation 'DNM' (Do Not Merge) without describing the actual technical changes being tested. | Replace with a clear, descriptive title that explains the main change, such as 'Add extension package verification to daemon startup' or 'Verify extension packages before marking node as Done'. |
✅ Passed checks (11 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | This PR only adds non-test code (verifyExtensionPackages function in pkg/daemon/update.go and updates to pkg/daemon/daemon.go). No new Ginkgo test files or test declarations are introduced in this PR. |
| Test Structure And Quality | ✅ Passed | Custom check requires Ginkgo test code review. PR adds standard Go unit tests (func Test* with testify), not Ginkgo tests. Check is not applicable. |
| Microshift Test Compatibility | ✅ Passed | This PR does not add any new Ginkgo e2e tests. Changes are only in pkg/daemon implementation files containing a new verifyExtensionPackages method. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | This check is for SNO compatibility of new Ginkgo e2e tests. The PR adds standard Go tests, not Ginkgo e2e tests, so the check is not applicable. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | This PR modifies node-daemon code to verify RPM packages. It does not add deployment manifests, operator code, or scheduling constraints. The topology-aware scheduling check does not apply. |
| Ote Binary Stdout Contract | ✅ Passed | No OTE Binary Stdout Contract violations. New verifyExtensionPackages() method uses klog configured to stderr, executes in daemon worker loop (not process-level code). |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | No Ginkgo e2e tests were added. Changes are production daemon code (verifyExtensionPackages) only. Check not applicable. |


Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci
Contributor

openshift-ci Bot commented May 7, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: isabella-janssen

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 7, 2026

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/daemon/update.go`:
- Around line 1876-1891: The MCO_TEST_EXTENSION_FAILURE injection is only
checked inside the non-ExitCode==1 error branch so it never triggers on
successful rpm queries; after running exec.Command("rpm","-q",
pkg).CombinedOutput() (i.e., after obtaining out and err) move the
os.Getenv("MCO_TEST_EXTENSION_FAILURE") == "true" check out of the error-only
branch so it runs for the healthy path as well—if set, append "test-failing" to
missingPackages (and optionally klog.Warningf) before continuing; keep the
existing exitErr/ExitCode()==1 handling and the error return for other failures
unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 576b5f79-08d6-484b-acd5-b5ba4389ef55

📥 Commits

Reviewing files that changed from the base of the PR and between f9d91f6 and e718454.

📒 Files selected for processing (2)
  • pkg/daemon/daemon.go
  • pkg/daemon/update.go

Comment thread pkg/daemon/update.go
Comment on lines +1876 to +1891
out, err := exec.Command("rpm", "-q", pkg).CombinedOutput()
if err == nil {
	continue
}
// Check if this is exit code 1 (package not installed) vs other errors
if errors.As(err, &exitErr) && exitErr.ExitCode() == 1 {
	missingPackages = append(missingPackages, pkg)
	klog.Warningf("Extension package %s not found in RPM database", pkg)
	continue
}
if os.Getenv("MCO_TEST_EXTENSION_FAILURE") == "true" {
	missingPackages = append(missingPackages, "test-failing")
}

// Other errors (execution failure, permission issues, etc.) should fail immediately
return fmt.Errorf("failed to query RPM database for package %q: %v: %s", pkg, err, strings.TrimSpace(string(out)))

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Test failure injection never triggers in the healthy path.

MCO_TEST_EXTENSION_FAILURE is only checked after rpm -q has already failed with a non-ExitCode()==1 error, so a normal successful verification can never be force-failed. If this hook is meant to exercise the new post-reboot guard, move the env-var check outside this error branch.

Suggested fix
 func (dn *CoreOSDaemon) verifyExtensionPackages(config *mcfgv1.MachineConfig) error {
 	// Only verify on RHCOS/SCOS nodes
 	if !dn.os.IsEL() {
 		return nil
 	}
+
+	if os.Getenv("MCO_TEST_EXTENSION_FAILURE") == "true" {
+		return fmt.Errorf("injected extension verification failure")
+	}
 
 	// Get the list of extensions from the config
 	extensions := config.Spec.Extensions
@@
-		if os.Getenv("MCO_TEST_EXTENSION_FAILURE") == "true" {
-			missingPackages = append(missingPackages, "test-failing")
-		}
-
 		// Other errors (execution failure, permission issues, etc.) should fail immediately
 		return fmt.Errorf("failed to query RPM database for package %q: %v: %s", pkg, err, strings.TrimSpace(string(out)))
 	}
