
MCO-2200: Simplify CoreOS fetching code#10534

Closed
andfasano wants to merge 5 commits intoopenshift:mainfrom
andfasano:abi-remove-custom-stream-getter


@andfasano
Contributor

@andfasano andfasano commented May 6, 2026

This patch is meant to be a preliminary simplification to facilitate future osImageStream (rhel-9/rhel-10) adoption (see #10481).

Previously, a streamGetter func was introduced mainly to distinguish between the ABI Install and AddNodes workflows: in the latter, the stream data is fetched directly from the target cluster rather than from the embedded installer metadata, as is usual. However, this design did not make it any easier to introduce a new field (to select the source stream), and it was used only for the case above.

Changes

  • Removed the CoreOSBuildFetcher function type and replaced it with a direct *stream.Stream pointer throughout the base ISO fetching and CoreOS metadata resolution paths
  • When the stream is nil (Install workflow, baremetal bootstrap, unconfigured ignition), the code falls back to fetching from embedded metadata; when non-nil (AddNodes workflow), the provided stream from the target cluster is used directly
  • Renamed GetMetalArtifact to GetMetalArtifactWithStream to make the nil-handling behavior explicit
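
The nil/non-nil contract described in the list above can be sketched roughly as follows. This is a minimal illustration, not the installer's code: `Stream` is a local stand-in for `*stream.Stream`, and `fetchEmbedded` and `resolveStream` are hypothetical names introduced only for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// Stream is a hypothetical stand-in for the CoreOS stream metadata type
// (*stream.Stream in the PR); only the field needed here is modelled.
type Stream struct {
	Name string
}

// fetchEmbedded is a hypothetical stand-in for reading the stream metadata
// embedded in the installer binary.
func fetchEmbedded() (*Stream, error) {
	return &Stream{Name: "embedded"}, nil
}

// resolveStream returns the supplied stream when it is non-nil and otherwise
// falls back to the given fallback (the embedded metadata in this sketch).
func resolveStream(st *Stream, fallback func() (*Stream, error)) (*Stream, error) {
	if st != nil {
		return st, nil
	}
	if fallback == nil {
		return nil, errors.New("no stream supplied and no fallback available")
	}
	return fallback()
}

func main() {
	// Install-like path: no stream supplied, so the embedded one is used.
	st, err := resolveStream(nil, fetchEmbedded)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("install path uses:", st.Name)

	// AddNodes-like path: the stream provided by the target cluster wins.
	st, err = resolveStream(&Stream{Name: "cluster"}, fetchEmbedded)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("add-nodes path uses:", st.Name)
}
```

The trade-off is that every callee has to remember what nil means, which is the point picked up later in the review.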

Summary by CodeRabbit

  • Refactor
    • Improved internal architecture for artifact management and simplified dependency handling across image generation workflows.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 6, 2026
@openshift-ci-robot
Contributor

openshift-ci-robot commented May 6, 2026

@andfasano: This pull request references MCO-2200 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented May 6, 2026

Walkthrough

This PR refactors how CoreOS streams are passed throughout the codebase, replacing a function-type fetcher pattern (CoreOSBuildFetcher) with direct *stream.Stream pointer parameters. The changes update type signatures, struct fields, interface contracts, implementation logic, and call sites across the agent and imagebased asset packages.

Changes

CoreOS Stream Parameter Refactoring

| Layer | File(s) | Summary |
| --- | --- | --- |
| Core Type & API Definition | pkg/asset/rhcos/iso.go | BaseIso struct replaces streamGetter CoreOSBuildFetcher field with st *stream.Stream. Constructor NewBaseISOFetcher now accepts stream directly. Old GetMetalArtifact function removed; new GetMetalArtifactWithStream added for stream-first artifact retrieval. |
| Interface Contract Update | pkg/asset/rhcos/releaseextract.go | ReleasePayload interface method GetBaseIso signature changed from CoreOSBuildFetcher to *stream.Stream. Implementation and helper functions (verifyCacheFile, getHashFromInstaller) updated to accept and use the stream parameter; lazy fallback to rhcos.FetchCoreOSBuild added for nil streams. |
| Core Implementation Updates | pkg/asset/agent/image/baseiso.go | getRootFSURL signature simplified to accept *stream.Stream instead of agentWorkflow and clusterInfo; now uses GetMetalArtifactWithStream. Removed internal customStreamGetter helper. |
| | pkg/asset/imagebased/image/baseiso.go | |
| | pkg/asset/agent/image/ignition.go | |
| Call Site Wiring | pkg/asset/agent/image/agentimage.go | baseIso.getRootFSURL call updated to pass clusterInfo.OSImage instead of workflow and cluster context objects. |
| Tests & Test Doubles | pkg/asset/rhcos/iso_test.go | Mock's GetBaseIso signature updated to accept *stream.Stream. Test wiring passes concrete stream descriptor to NewBaseISOFetcher. |
| | pkg/asset/imagebased/image/baseiso_test.go | |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 28.57% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (11 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'MCO-2200: Simplify CoreOS fetching code' accurately and concisely summarizes the main objective of the changeset: simplifying CoreOS/base ISO fetching by removing the CoreOSBuildFetcher abstraction and replacing it with direct stream pointers. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | This PR does not use Ginkgo framework. Test files use standard Go testing with static test names containing no dynamic values. |
| Test Structure And Quality | ✅ Passed | The custom check requests Ginkgo test quality review. The modified test files use standard Go testing (testing.T), not Ginkgo (Describe/It blocks). The check is not applicable to this PR. |
| Microshift Test Compatibility | ✅ Passed | No new Ginkgo e2e tests are added in this PR. The changes are refactoring of installer asset packages (CoreOS/base ISO fetching) and unit tests. MicroShift compatibility check is not applicable. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | No new Ginkgo e2e tests are added in this PR. Changes are limited to asset fetching and CoreOS base ISO handling code with only unit test modifications. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR refactors CoreOS artifact fetching infrastructure only. No changes to deployment manifests, operators, or scheduling constraints. Not applicable to topology-aware scheduling check. |
| Ote Binary Stdout Contract | ✅ Passed | The OTE Binary Stdout Contract check is not applicable. This PR modifies only asset/library packages (CoreOS fetching and ISO building) with no process-level functions or stdout writes. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | PR contains no Ginkgo e2e tests. Changes are refactoring of asset handling code and unit tests using Go's standard testing package. Check is not applicable. |


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.1)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions


Comment @coderabbitai help to get the list of available commands and usage tips.


@openshift-ci
Contributor

openshift-ci Bot commented May 6, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign jhixson74 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot requested review from jhixson74 and pawanpinjarkar May 6, 2026 15:02

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/asset/rhcos/iso.go (1)

134-149: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Prefer the supplied stream before extracting from the release payload.

When i.st is non-nil, this branch still calls i.ocRelease.GetBaseIso(...) first. That means the AddNodes path can still return the ISO from the installer release payload instead of the target cluster stream passed from pkg/asset/agent/image/baseiso.go Line 75, which breaks the new nil/non-nil contract and can pick the wrong base ISO once the cluster stream diverges from the installer release.

Suggested fix
 func (i *BaseIso) retrieveBaseIso(ctx context.Context, archName string) (string, error) {
 	// Default iso archName to x86_64.
 	if archName == "" {
 		archName = arch.RpmArch(types.ArchitectureAMD64)
 	}

+	if i.st != nil {
+		logrus.Info("Downloading base ISO from provided stream")
+		if err := workflowreport.GetReport(ctx).SubStage(workflow.StageFetchBaseISODownload); err != nil {
+			return "", err
+		}
+		return i.downloadIso(ctx, archName)
+	}
+
 	if i.ocRelease != nil {
 		// If we have the image registry location and 'oc' command is available then get from release payload
 		logrus.Info("Extracting base ISO from release payload")
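
The ordering this suggestion argues for can be sketched in isolation as below. It models only the precedence (provided stream first, then release payload, then embedded metadata). The third branch and all names here (`isoSource`, `pickISOSource`, the local `Stream` type) are assumptions for illustration, not code from pkg/asset/rhcos/iso.go.

```go
package main

import "fmt"

// Stream is a hypothetical stand-in for *stream.Stream.
type Stream struct {
	Name string
}

// isoSource names where the base ISO would be taken from.
type isoSource string

const (
	fromProvidedStream isoSource = "provided-stream"
	fromReleasePayload isoSource = "release-payload"
	fromEmbeddedStream isoSource = "embedded-stream"
)

// pickISOSource applies the precedence proposed in the review comment:
// a caller-supplied stream is honoured before the release payload is tried.
func pickISOSource(st *Stream, releaseAvailable bool) isoSource {
	if st != nil {
		return fromProvidedStream
	}
	if releaseAvailable {
		return fromReleasePayload
	}
	// Assumed last resort for this sketch: download using embedded metadata.
	return fromEmbeddedStream
}

func main() {
	fmt.Println(pickISOSource(&Stream{Name: "cluster"}, true))
	fmt.Println(pickISOSource(nil, true))
	fmt.Println(pickISOSource(nil, false))
}
```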
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/asset/rhcos/iso.go` around lines 134 - 149, The code is calling
i.ocRelease.GetBaseIso(...) even when a stream (i.st) was supplied; change the
branch to prefer the supplied stream: if i.st != nil, use the provided stream to
determine the base ISO (the stream passed from the agent/baseiso logic) and
return that result without calling i.ocRelease.GetBaseIso; only call
i.ocRelease.GetBaseIso and checkReleasePayloadBaseISOVersion when i.st is nil.
Ensure you still invoke the workflowreport SubStage transitions
(StageFetchBaseISOExtract and StageFetchBaseISOVerify) in the same places for
both paths.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@pkg/asset/rhcos/iso.go`:
- Around line 134-149: The code is calling i.ocRelease.GetBaseIso(...) even when
a stream (i.st) was supplied; change the branch to prefer the supplied stream:
if i.st != nil, use the provided stream to determine the base ISO (the stream
passed from the agent/baseiso logic) and return that result without calling
i.ocRelease.GetBaseIso; only call i.ocRelease.GetBaseIso and
checkReleasePayloadBaseISOVersion when i.st is nil. Ensure you still invoke the
workflowreport SubStage transitions (StageFetchBaseISOExtract and
StageFetchBaseISOVerify) in the same places for both paths.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: aa494340-460e-4f96-9368-6db3ebf13406

📥 Commits

Reviewing files that changed from the base of the PR and between 8ec4836 and 7ee5482.

📒 Files selected for processing (8)
  • pkg/asset/agent/image/agentimage.go
  • pkg/asset/agent/image/baseiso.go
  • pkg/asset/agent/image/ignition.go
  • pkg/asset/imagebased/image/baseiso.go
  • pkg/asset/imagebased/image/baseiso_test.go
  • pkg/asset/rhcos/iso.go
  • pkg/asset/rhcos/iso_test.go
  • pkg/asset/rhcos/releaseextract.go

@openshift-ci
Contributor

openshift-ci Bot commented May 6, 2026

@andfasano: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-agent-two-node-fencing-ipv4 | 7ee5482 | link | false | /test e2e-agent-two-node-fencing-ipv4 |
| ci/prow/okd-scos-images | 7ee5482 | link | true | /test okd-scos-images |
| ci/prow/e2e-agent-compact-ipv4-rhel10-techpreview | 7ee5482 | link | false | /test e2e-agent-compact-ipv4-rhel10-techpreview |
| ci/prow/e2e-agent-compact-ipv4-appliance-diskimage | 7ee5482 | link | false | /test e2e-agent-compact-ipv4-appliance-diskimage |
| ci/prow/e2e-agent-5control-ipv4 | 7ee5482 | link | false | /test e2e-agent-5control-ipv4 |
| ci/prow/e2e-agent-ha-dualstack | 7ee5482 | link | false | /test e2e-agent-ha-dualstack |
| ci/prow/e2e-agent-compact-ipv4-iso-no-registry | 7ee5482 | link | false | /test e2e-agent-compact-ipv4-iso-no-registry |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Member

@zaneb zaneb left a comment


Previously a streamGetter func was introduced mainly to support a distinction between the ABI Install and AddNodes workflow

Given that in a production build the stream data in the cluster ConfigMap is guaranteed to be the same as the one embedded in the node-joiner/installer binaries, what if we just... didn't? We could use the same code for everything and remove heaps of complexity.

I'm struggling to think of a CI scenario in which this would prevent us from pre-merge testing something either.

If we're going to pass a whole Stream (rather than just a name like rhcos9 or rhcos10) then I'd prefer that we do it everywhere rather than have this nil vs. non-nil API. See the comment from CodeRabbit.
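
One way to read the "do it everywhere" preference is to make every caller resolve the stream up front, so the fetcher never sees a nil. The sketch below takes the stream by value to rule out the nil case. `fetcher`, `newFetcher`, and the local `Stream` type are hypothetical and only illustrate the shape of such an API.

```go
package main

import (
	"errors"
	"fmt"
)

// Stream is a hypothetical stand-in for the CoreOS stream metadata type.
type Stream struct {
	Name string
}

// fetcher always carries a resolved stream.
type fetcher struct {
	st Stream
}

// newFetcher takes the stream by value, so callers in every workflow must
// resolve it first and there is no nil case to interpret downstream.
func newFetcher(st Stream) (*fetcher, error) {
	if st.Name == "" {
		return nil, errors.New("stream must be resolved before constructing the fetcher")
	}
	return &fetcher{st: st}, nil
}

// describe reports which stream the fetcher would use.
func (f *fetcher) describe() string {
	return "base ISO from stream " + f.st.Name
}

func main() {
	f, err := newFetcher(Stream{Name: "rhcos9"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(f.describe())
}
```

With this shape, the Install and AddNodes call sites differ only in where they obtain the `Stream` value, not in how the fetcher treats it.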

Member


Somebody should probably refactor IBI so that it uses the same pkg/asset/rhcos code instead of copying-and-pasting an old version of the ABI code from before it was moved out of the pkg/asset/agent/image package.

@andfasano
Contributor Author

Previously a streamGetter func was introduced mainly to support a distinction between the ABI Install and AddNodes workflow

Given that in a production build the stream data in the cluster ConfigMap is guaranteed to be the same as the one embedded in the node-joiner/installer binaries, what if we just... didn't? We could use the same code for everything and remove heaps of complexity.

I'm struggling to think of a CI scenario in which this would prevent us from pre-merge testing something either.

🤔 I think this approach could work fine for this specific use case. Even though we have usually considered the target cluster the source of truth, the node-joiner binary is always in sync with the current target cluster's release (no skew). I'll try it out in another PR; if it works, we could peel another layer of complexity off the current code.

@andfasano
Contributor Author

New refactor available in #10537

@andfasano andfasano closed this May 7, 2026

3 participants