MCO-2257: make rhel-10 the default for new installs #5999

Open
cheesesashimi wants to merge 4 commits into openshift:main from cheesesashimi:zzlotnik/rhel10-by-default

Conversation

@cheesesashimi
Member

@cheesesashimi cheesesashimi commented May 5, 2026

- What I did

This updates the MCO's Dockerfiles to set the RHEL 10 image as the default OS image for new installs.

- How to verify it

Bring up a new cluster with this PR and it should be running RHEL 10.

- Description for the changelog

Use RHEL 10 by default

Summary by CodeRabbit

  • Bug Fixes

    • Fixed image selection and rewrite logic so correct OS image tags are chosen across deployment variants; updated default bootstrap image tag to RHEL CoreOS 10.
  • Tests

    • Updated test container images and repo rewrites to Stream 10; adjusted test helpers to treat EL10 like EL9/SCOS/FCOS.
    • Updated embedded test container builds to use dnf installs and added test cleanup/assertion and password-hash extraction improvements.
  • Chores

    • Standardized RHEL CoreOS image reference names for consistency.

@openshift-merge-bot
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci
Contributor

openshift-ci Bot commented May 5, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 5, 2026
@coderabbitai

coderabbitai Bot commented May 5, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 10ac30b1-1ab3-4e42-a090-84a6e3cf967b

📥 Commits

Reviewing files that changed from the base of the PR and between ea6f93f and af59c59.

📒 Files selected for processing (1)
  • test/e2e-1of2/mcd_test.go

Walkthrough

Default RHEL CoreOS image tag changed to rhel-coreos-10. Dockerfile and Dockerfile.rhel7 RUN logic reorganized to ensure SCOS rewrites to stream-coreos and non-SCOS rewrites go to rhel-coreos-10. CentOS stream test bases moved from stream9→stream10. Tests adjust SSH-path expectations and password-hash comparisons.

Changes

RHEL CoreOS tag + build/test rewrites

| Layer / File(s) | Summary |
|---|---|
| Image reference data: `install/image-references` | Rename image keys to `rhel-coreos-10` / `rhel-coreos-10-extensions` and add the updated 10 entry. |
| Default bootstrap config: `cmd/machine-config-operator/bootstrap.go` | Change default `baseOSContainerImageTag` from `rhel-coreos` to `rhel-coreos-10`; FCOS/SCOS conditionals remain. |
| Dockerfile conditional rewrites: `Dockerfile` | Reorder RUN conditional so the scos branch performs `rhel-coreos` → `stream-coreos` and add an else branch that rewrites `rhel-coreos` → `rhel-coreos-10` for non-scos cases. |
| RHEL7 Dockerfile adjustments: `Dockerfile.rhel7` | Relocate `fi`; keep scos rewrite to `stream-coreos` across manifests and add explicit else branch rewriting `rhel-coreos` → `rhel-coreos-10` in `/manifests/0000_80_machine-config_05_osimageurl.yaml` before package installation. |
| CentOS stream test images & repo rewrites: `test/e2e-ocl-shared/Containerfile.cowsay`, `test/extended-priv/mco_ocb.go`, `test/e2e-ocl-shared/helpers.go` | Switch CentOS base images from `quay.io/centos/centos:stream9` → `:stream10`; update sed repo rewrites for Stream 10; adjust repo/key extraction helper to use stream10; change final-stage package installs to use `dnf install -y` (and include ripgrep in OCB test). |
| SSH path expectations in tests: `test/helpers/utils.go` | Extend `GetSSHPaths` so EL10 uses the same expected SSH key path as EL9/SCOS/FCOS (`constants.RHCOS9SSHKeyPath`) and the not-expected path is `constants.RHCOS8SSHKeyPath`. |

Test: MCP cleanup naming and shadow password-hash checks

| Layer / File(s) | Summary |
|---|---|
| MCP cleanup wiring in tests: `test/e2e-1of2/mcd_test.go` | Replace local `delete` variable with `deleteMCP := helpers.CreateMCP(...)` and call `deleteMCP()` in `t.Cleanup()` across multiple tests; assert deletion caller returns nil for old-infra cleanup. |
| Rendered config error handling: `test/e2e-1of2/mcd_test.go` | Use `require.NoError(...)` when waiting for rendered config (`helpers.WaitForRenderedConfig`). |
| Password-hash comparison & helper: `test/e2e-1of2/mcd_test.go` | Replace full `/etc/shadow` equality checks with extraction and comparison of the password-hash field; add `extractPasswordHashFromShadowLine(shadowLine string) string` to parse the hash (returns `""` on parse failure). |
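
The helper named in the last row is not shown on this page, so the following is only a minimal sketch of what such a parser could look like. It assumes the standard colon-separated `/etc/shadow` layout, where the second field is the password hash and the third is the last-change date. The sample shadow lines are made up for illustration, and the PR's real implementation may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// extractPasswordHashFromShadowLine returns the second colon-separated field
// of an /etc/shadow entry (the password hash), or "" when the line has fewer
// than two fields. Illustrative sketch only, not the PR's code.
func extractPasswordHashFromShadowLine(shadowLine string) string {
	fields := strings.Split(strings.TrimSpace(shadowLine), ":")
	if len(fields) < 2 {
		return ""
	}
	return fields[1]
}

func main() {
	// Hypothetical entries: same hash, different last-change field (third column).
	before := "core:$6$salt$hash:19000:0:99999:7:::"
	after := "core:$6$salt$hash:20500:0:99999:7:::"

	// Whole-line equality fails even though the password is unchanged.
	fmt.Println(before == after) // false

	// Comparing only the hash field succeeds.
	fmt.Println(extractPasswordHashFromShadowLine(before) ==
		extractPasswordHashFromShadowLine(after)) // true
}
```

Comparing only the hash field keeps the test insensitive to the other shadow columns, which is the behavior the summary describes.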

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 9 | ❌ 3

❌ Failed checks (3 warnings)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 54.55% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Test Structure And Quality | ⚠️ Warning | Tests have 61 assertions without failure messages vs 7 with messages. Requirement #4 mandates meaningful diagnostic messages. Violates codebase consistency with Ginkgo tests. | Add diagnostic messages to all require.Nil/NoError/Equal assertions. Change `require.Nil(t, err)` to `require.Nil(t, err, "failed to create MC")`. |
| Ipv6 And Disconnected Network Test Compatibility | ⚠️ Warning | TestInstallRPMAndCheckMCDMetrics has hardcoded IPv4 localhost and requires external registries. Incompatible with IPv6-only and disconnected CI. | Replace 127.0.0.1:8797 with IPv6-safe `net.JoinHostPort`. Mirror external pulls from quay.io/fedoraproject.org or skip on disconnected. Verify with ipv6 payload job. |
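
As a rough illustration of the `net.JoinHostPort` suggestion in the last row: the standard-library function brackets IPv6 literals, so the same call site works for both address families. The `metricsAddr` helper name is hypothetical and not taken from the test; only the port 8797 comes from the check above. This sketch does not address the external-registry part of the warning.

```go
package main

import (
	"fmt"
	"net"
)

// metricsAddr is a hypothetical helper that builds a host:port string which
// is valid for IPv4 addresses, IPv6 literals, and hostnames alike.
func metricsAddr(host string) string {
	return net.JoinHostPort(host, "8797")
}

func main() {
	fmt.Println(metricsAddr("127.0.0.1")) // 127.0.0.1:8797
	fmt.Println(metricsAddr("::1"))       // [::1]:8797
	fmt.Println(metricsAddr("localhost")) // localhost:8797
}
```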
✅ Passed checks (9 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'MCO-2257: make rhel-10 the default for new installs' accurately reflects the primary change: updating MCO to default to RHEL 10 for new cluster installations. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | All Ginkgo test titles in modified test files are static and deterministic. They use Polarion IDs and descriptive text without dynamic values like pod names, timestamps, UUIDs, or node names. |
| Microshift Test Compatibility | ✅ Passed | No new Ginkgo e2e tests were added. Changes consist only of modifications to existing tests and helper functions for RHEL 10 support. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | No new Ginkgo e2e tests are added in this PR. All changes are to existing test functions and helper utilities. Existing tests already handle SNO topology checks. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | No new scheduling constraints, affinity rules, or topology-derived logic introduced. Changes are only image tag selection (rhel-coreos-10) and test infrastructure updates. |
| Ote Binary Stdout Contract | ✅ Passed | No OTE Binary Stdout Contract violations found. All klog properly redirects to stderr via `flag.Set("logtostderr", "true")`, and no uncontrolled stdout writes exist in process-level code. |

@openshift-ci
Contributor

openshift-ci Bot commented May 5, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cheesesashimi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 5, 2026
@cheesesashimi cheesesashimi changed the title make rhel-10 the default for new installs MCO-2257: make rhel-10 the default for new installs May 5, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 5, 2026
@openshift-ci-robot
Contributor

openshift-ci-robot commented May 5, 2026

@cheesesashimi: This pull request references MCO-2257 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

- What I did

This updates the MCO's Dockerfiles to set the RHEL 10 image as the default OS image for new installs.

- How to verify it

Bring up a new cluster with this PR and it should be running RHEL 10.

- Description for the changelog
Use RHEL 10 by default

Summary by CodeRabbit

Release Notes

  • Bug Fixes
  • Fixed build configuration logic to correctly select base OS container images for different deployment variants (Stream CoreOS, Fedora CoreOS, and RHEL CoreOS).
  • Corrected bootstrap image tag defaults to ensure proper container image selection across all build scenarios.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@cheesesashimi cheesesashimi marked this pull request as ready for review May 6, 2026 13:20
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 6, 2026
@cheesesashimi
Member Author

/test e2e-aws-ovn e2e-aws-ovn-upgrade

@openshift-ci openshift-ci Bot requested review from djoshy and umohnani8 May 6, 2026 13:22
@cheesesashimi
Member Author

/jira refresh

@openshift-ci-robot
Contributor

openshift-ci-robot commented May 6, 2026

@cheesesashimi: This pull request references MCO-2257 which is a valid jira issue.

Details

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@cheesesashimi cheesesashimi force-pushed the zzlotnik/rhel10-by-default branch from 7f8dd4f to 411e4be Compare May 6, 2026 21:24
@cheesesashimi
Member Author

/test e2e-aws-ovn e2e-aws-ovn-upgrade

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
Dockerfile (1)

33-35: 💤 Low value

The else branch safely rewrites osimageurl.yaml today, but consider anchoring the sed to YAML keys for resilience against future edits.

The sed s/rhel-coreos/rhel-coreos-10/g is currently safe—the target file contains only rhel-coreos and rhel-coreos-extensions, with no pre-existing rhel-coreos-10 substring. However, the pattern is non-idempotent: a future manual edit introducing a literal rhel-coreos-10 in the file would cause it to rewrite to rhel-coreos-10-10 and silently break image resolution. Consider anchoring the rewrite to specific YAML keys for safety:

♻️ Optional: anchor the rewrite to specific YAML keys for idempotency
-    sed -i 's/rhel-coreos/rhel-coreos-10/g' /manifests/0000_80_machine-config_05_osimageurl.yaml; fi && \
+    sed -i -e 's|^\(\s*baseOSContainerImage:\s*\)rhel-coreos$|\1rhel-coreos-10|' \
+           -e 's|^\(\s*baseOSExtensionsContainerImage:\s*\)rhel-coreos-extensions$|\1rhel-coreos-10-extensions|' \
+        /manifests/0000_80_machine-config_05_osimageurl.yaml; fi && \

Note: The else branch now makes empty/unknown TAGS values default to rhel-coreos-10 (previously untouched). This is intentional behavior — any out-of-band builds with non-fcos/scos TAGS will see manifests rewritten accordingly.
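
To make the idempotency concern concrete, here is a small Go sketch (not part of the PR) that mimics both rewrites on an in-memory string. The two key names are taken from the suggested sed expressions above; the sample manifest content and the function names are assumptions for illustration, and the real `osimageurl.yaml` may format its values differently.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// globalRewrite mirrors sed 's/rhel-coreos/rhel-coreos-10/g': every
// occurrence is rewritten, including ones that already carry the -10 suffix.
func globalRewrite(s string) string {
	return strings.ReplaceAll(s, "rhel-coreos", "rhel-coreos-10")
}

// anchored matches a "key: value" line whose value is exactly rhel-coreos or
// rhel-coreos-extensions, so already-rewritten values no longer match.
var anchored = regexp.MustCompile(`(?m)^(\s*\w+:\s*)rhel-coreos(-extensions)?$`)

// anchoredRewrite inserts -10 after rhel-coreos and keeps any -extensions suffix.
func anchoredRewrite(s string) string {
	return anchored.ReplaceAllString(s, "${1}rhel-coreos-10${2}")
}

func main() {
	// Hypothetical manifest fragment, for illustration only.
	manifest := "baseOSContainerImage: rhel-coreos\n" +
		"baseOSExtensionsContainerImage: rhel-coreos-extensions\n"

	// Applying the global rewrite twice produces rhel-coreos-10-10.
	fmt.Print(globalRewrite(globalRewrite(manifest)))

	// Applying the anchored rewrite twice gives the same result as once.
	fmt.Print(anchoredRewrite(anchoredRewrite(manifest)))
}
```

In a Dockerfile the rewrite runs once per build, so the global form is fine today, as the comment notes. The anchored form only matters if the manifest ever ships with `rhel-coreos-10` already in place.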

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile` around lines 33 - 35, The sed replacement in the else branch (the
sed -i 's/rhel-coreos/rhel-coreos-10/g' invocation targeting
/manifests/0000_80_machine-config_05_osimageurl.yaml) is not idempotent; change
it to only rewrite the specific YAML key(s) that carry the OS image identifier
(e.g., the osImageURL or image fields in that manifest) so that literal
occurrences elsewhere are not mangled. Update the Dockerfile command to match
and replace the value on the YAML key (anchor the pattern to the key name and
its value) instead of a global s///, ensuring the rewrite is safe to run
multiple times and will not produce rhel-coreos-10-10 if the file already
contains rhel-coreos-10.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5aee0aad-7c70-434c-bd88-069e411d76fb

📥 Commits

Reviewing files that changed from the base of the PR and between 7f8dd4f and 411e4be.

📒 Files selected for processing (4)
  • Dockerfile
  • Dockerfile.rhel7
  • cmd/machine-config-operator/bootstrap.go
  • install/image-references

@cheesesashimi
Member Author

/test e2e-gcp-op-part1 e2e-gcp-op-part2 e2e-gcp-op-single-node

@cheesesashimi
Member Author

/retest-required

2 similar comments
@cheesesashimi
Member Author

/retest-required

@cheesesashimi
Member Author

/retest-required

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e-ocl-shared/helpers.go`:
- Around line 875-877: The test now pulls quay.io/centos/centos:stream10 but
ConvertFilesFromContainerImageToBytesMap contains hardcoded repo rewrite logic
that forces repo names to "9-stream"; update the rewrite logic in
ConvertFilesFromContainerImageToBytesMap to derive the stream version from the
centosPullspec (or from the repo file contents) instead of always substituting
"9-stream" — specifically detect tags like "stream10" and rewrite to "10-stream"
(or preserve the original stream token) so repo entries produced for
centosPullspec match the pulled image; ensure the change touches the rewrite
branch that references "9-stream" and keeps tests using centosPullspec
unchanged.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e671115b-5ac5-450a-9bd0-7862f453d91d

📥 Commits

Reviewing files that changed from the base of the PR and between 411e4be and ea6f93f.

📒 Files selected for processing (4)
  • test/e2e-ocl-shared/Containerfile.cowsay
  • test/e2e-ocl-shared/helpers.go
  • test/extended-priv/mco_ocb.go
  • test/helpers/utils.go

Comment on lines +875 to 877
centosPullspec := "quay.io/centos/centos:stream10"
yumReposContents := ConvertFilesFromContainerImageToBytesMap(t, centosPullspec, "/etc/yum.repos.d/")
rpmGpgContents := ConvertFilesFromContainerImageToBytesMap(t, centosPullspec, "/etc/pki/rpm-gpg/")
⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Stream10 pullspec now conflicts with hardcoded 9-stream rewrite logic

After switching to quay.io/centos/centos:stream10 (Line 875), the extracted repo content is still rewritten to 9-stream in ConvertFilesFromContainerImageToBytesMap. That mismatch can point builds at the wrong repos and break package resolution.

Suggested fix
 func ConvertFilesFromContainerImageToBytesMap(t *testing.T, pullspec, containerFilepath string) map[string][]byte {
@@
-	isCentosImage := strings.Contains(pullspec, "centos")
+	isCentosImage := strings.Contains(pullspec, "centos")
+	streamValue := "10-stream"
+	if strings.Contains(pullspec, "stream9") {
+		streamValue = "9-stream"
+	}
@@
-		if isCentosImage {
-			contents = bytes.ReplaceAll(contents, []byte("$stream"), []byte("9-stream"))
-		}
-
-		// Replace $stream with 9-stream in any of the Centos repo content we pulled.
+		if isCentosImage {
+			contents = bytes.ReplaceAll(contents, []byte("$stream"), []byte(streamValue))
+		}
+
+		// Replace $stream with the matching stream value in any of the CentOS repo content.
 		out[filepath.Base(path)] = contents
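
An alternative to the two-way `stream9` check in the suggested fix is to derive the value from the image tag itself. The sketch below is one possible shape for that, not the PR's code: `streamValueFromPullspec` and `rewriteStream` are hypothetical names, the repo line in `main` is made up, and a pullspec pinned by digest (no `:streamN` tag) is simply left untouched.

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
)

// streamTag matches a trailing ":streamN" image tag such as ":stream10".
var streamTag = regexp.MustCompile(`:stream(\d+)$`)

// streamValueFromPullspec maps "quay.io/centos/centos:stream10" to
// "10-stream". The boolean is false when no streamN tag is present.
func streamValueFromPullspec(pullspec string) (string, bool) {
	m := streamTag.FindStringSubmatch(pullspec)
	if m == nil {
		return "", false
	}
	return m[1] + "-stream", true
}

// rewriteStream replaces the literal "$stream" token in repo file contents
// with the value derived from the pullspec, or returns contents unchanged.
func rewriteStream(contents []byte, pullspec string) []byte {
	value, ok := streamValueFromPullspec(pullspec)
	if !ok {
		return contents
	}
	return bytes.ReplaceAll(contents, []byte("$stream"), []byte(value))
}

func main() {
	repo := []byte("name=example-baseos-$stream\n") // made-up repo line
	fmt.Print(string(rewriteStream(repo, "quay.io/centos/centos:stream10")))
	fmt.Print(string(rewriteStream(repo, "quay.io/centos/centos:stream9")))
}
```

Deriving the number from the tag avoids another hardcoded default the next time the stream version changes, at the cost of depending on the tag naming convention.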

@cheesesashimi
Member Author

/retest-required

@cheesesashimi
Member Author

/test e2e-aws-ovn e2e-aws-ovn-upgrade e2e-gcp-op-part1 e2e-gcp-op-part2 e2e-gcp-op-single-node e2e-gcp-op-ocl-part1 e2e-gcp-op-ocl-part2

@cheesesashimi
Member Author

/payload-job periodic-ci-openshift-machine-config-operator-release-4.22-arm64-periodics-e2e-aws-mco-disruptive-techpreview-3of3

@openshift-ci
Contributor

openshift-ci Bot commented May 12, 2026

@cheesesashimi: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-machine-config-operator-release-4.22-arm64-periodics-e2e-aws-mco-disruptive-techpreview-3of3

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/7b2a0d00-4e11-11f1-9785-b0e986132b42-0

@openshift-ci
Contributor

openshift-ci Bot commented May 12, 2026

@cheesesashimi: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/e2e-gcp-op-part1 | ea6f93f | link | true | /test e2e-gcp-op-part1 |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Golang defines a built-in delete() function which is used to remove
items from maps. These tests overwrite this built-in function by
assigning a different function to it. Using a different name for our
cleanup functions is preferred due to the potential for side-effects.
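
A short Go sketch of the shadowing described in the commit message above. The function names and the no-op closures are illustrative stand-ins; the sketch assumes the cleanup function returns an error, which is how the review summary describes the value returned by `helpers.CreateMCP`.

```go
package main

import "fmt"

// shadowed declares a local variable named delete. For the rest of this
// function the built-in delete(map, key) is no longer reachable.
func shadowed() error {
	delete := func() error { return nil } // shadows the built-in
	// Calling delete(someMap, "key") here would not compile, because the
	// identifier now refers to the func() error declared above.
	return delete()
}

// distinct gives the cleanup closure its own name, so the built-in stays
// usable in the same scope.
func distinct() (int, error) {
	pools := map[string]bool{"infra": true}
	deleteMCP := func() error { return nil } // stand-in for the real cleanup
	delete(pools, "infra")                   // built-in still works
	return len(pools), deleteMCP()
}

func main() {
	fmt.Println(shadowed()) // <nil>
	n, err := distinct()
	fmt.Println(n, err) // 0 <nil>
}
```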

When usermod -P is used, the last password change field is changed. When
the MachineConfig is rolled back, this value remains the same even
though the actual password value has been changed back to its previous
value. Consequently, the test should only use the password hash to
determine whether it was successful or not.

Assisted-By: Claude Opus 4.6
@cheesesashimi
Member Author

/test e2e-gcp-op-part1

1 similar comment
@cheesesashimi
Member Author

/test e2e-gcp-op-part1

@cheesesashimi
Member Author

/test unit

@cheesesashimi
Member Author

/payload-job periodic-ci-openshift-machine-config-operator-release-5.0-periodics-e2e-aws-mco-disruptive-techpreview-3of3

@openshift-ci
Contributor

openshift-ci Bot commented May 12, 2026

@cheesesashimi: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-machine-config-operator-release-5.0-periodics-e2e-aws-mco-disruptive-techpreview-3of3

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/62101d20-4e41-11f1-9aa1-f596df5ceec0-0

Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.

2 participants