138 changes: 138 additions & 0 deletions docs/design/aws-marketplace.md
@@ -0,0 +1,138 @@
# AWS Marketplace RHCOS Boot Image Updates (MCO)

## Overview

This document describes how the Machine Config Operator (MCO) resolves the correct RHCOS AMI when updating nodes that were provisioned from an AWS Marketplace image.

AWS Marketplace offers RHCOS images under several OpenShift product tiers. Each tier has a distinct product ID embedded in the AMI name. The MCO uses this product ID as a stable identifier to find the updated AMI for a given release, without requiring any stored state about which offering the cluster was originally installed from.

## Flavors

Each offering has a unique product ID per architecture embedded in the AMI name, e.g.:

```
RHEL-9.4-RHCOS-4.18_HVM_GA-20251119-x86_64-0-59ead7de-2540-4653-a8b0-fa7926d5c845
```
Comment on lines +13 to +15

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add language tags to the fenced code blocks.

These five fences are currently untyped, which is already tripping markdownlint (MD040). text fits the AMI/name examples, and bash fits the DescribeImages snippet.

Also applies to: 55-57, 102-104, 112-114, 122-125

🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 13-13: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/design/aws-marketplace.md` around lines 13 - 15, Several fenced code
blocks are missing language tags (causing MD040); add appropriate languages: for
the AMI/name example blocks such as the string
"RHEL-9.4-RHCOS-4.18_HVM_GA-20251119-x86_64-0-59ead7de-2540-4653-a8b0-fa7926d5c845"
mark the fence as ```text, and for the AWS DescribeImages command block mark the
fence as ```bash (apply the same change to the other untyped fences noted around
lines with similar AMI/name examples and the DescribeImages snippet).


### x86_64

| Offering | Product ID |
|---|---|
| OCP | `59ead7de-2540-4653-a8b0-fa7926d5c845` |
| OKE | `963b36c3-de6f-48ed-b802-2b38b2a2cdeb` |
| OPP | `f5da01a6-d046-487c-9072-42fe53b1cad4` |

### arm64

| Offering | Product ID |
|---|---|
| OCP | `abc249f8-7440-45f7-a4b1-c026baff64c1` |
| OKE | `d2d3ebcd-c1ca-43d8-bf0a-530433200f35` |
| OPP | `be6d3e94-c8dc-4a3e-9218-4b449b11f06f` |

### x86_64 EMEA (EU, Middle East, Africa)

| Offering | Product ID |
|---|---|
| OCP EMEA | `962791c7-3ae5-46d1-ba62-c7a5ebac54fd` |
| OKE EMEA | `7026c8d7-392c-4010-b93c-f93f7bc5495f` |
| OPP EMEA | `628c9df3-0254-4f91-bc1f-8619d1b8eaa8` |

EMEA has no arm64 variants.

### ROSA

| Offering | Product ID |
|---|---|
| ROSA | `34850061-abaf-402d-92df-94325c9e947f` |

ROSA Classic is planned to be sunset for 5.0. It is supported here because its resolution logic is identical to that of the other flavors listed above.

## AMI Name Format

This feature targets OCP 4.22+, where RHCOS uses RHEL-aligned versioning. Marketplace AMIs follow this naming convention:

```
RHEL-{rhel-version}-RHCOS-{version-token}_HVM_GA-{date}-{arch}-{index}-{uuid}
```

The `{version-token}` is derived from the leading two segments of the RHCOS release string in the stream configmap:

| RHCOS release string (configmap) | Version token in AMI name |
|---|---|
| `9.6.20260210-0` | `9.6` |

Pre-4.19 clusters used a different RHCOS release string format (`418.94.202511191518-0`) and a different AMI naming scheme that predates marketplace boot image support. Because this enhancement lands only in 4.22+, the older format is not considered when parsing the configmap's release version value.

## AMI Description Field

Each marketplace AMI carries a `Description` field that embeds the version token. Two formats are in use depending on the product line:

| Product line | Description format | Example |
|---|---|---|
| OCP / OKE / OPP (RHEL marketplace) | `RHEL CoreOS {N.M} {release-string} {arch}` | `RHEL CoreOS 9.6 9.6.20260210-0 x86_64` |
| ROSA | `rhcos-{N.M}.{date}-{index}-{arch}` | `rhcos-9.6.20250701-0-x86_64` |

The version token (e.g. `9.6`) is the primary field used for version matching at runtime. Because the surrounding characters differ between formats (spaces for RHEL marketplace, dash/dot for ROSA), the MCO checks both boundaries when filtering results.
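
The boundary check described above can be sketched as follows. This is a minimal illustration, not the MCO's actual code: the function name `descriptionHasToken` is hypothetical, and the exact boundary characters (space on both sides for RHEL marketplace, leading dash and trailing dot for ROSA) are an assumption read off the two example descriptions in the table.

```go
package main

import "strings"

// descriptionHasToken reports whether an AMI Description carries the given
// version token (e.g. "9.6") in one of the two known formats.
// Hypothetical helper; boundary characters are assumed from the examples:
//   "RHEL CoreOS 9.6 9.6.20260210-0 x86_64"  -> token is space-bounded
//   "rhcos-9.6.20250701-0-x86_64"            -> token is dash/dot-bounded
func descriptionHasToken(description, token string) bool {
	// Checking both boundaries keeps "9.6" from matching "19.6" or "9.60".
	spaceBounded := strings.Contains(description, " "+token+" ")
	dashDotBounded := strings.Contains(description, "-"+token+".")
	return spaceBounded || dashDotBounded
}
```

A plain substring match on the token alone would accept `rhcos-19.6...` for the token `9.6`, which is why both sides of the token are anchored.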

Marketplace AMIs are distinguished from standard RHCOS AMIs by owner account `679593333241` (`aws-marketplace`). Standard RHCOS AMIs are owned by `531415883065` and have no UUID in their name.

## AMI Detection

All AWS boot image updates begin with a `DescribeImages` call on the MachineSet's current AMI ID. The MCO branches on `OwnerId`:

- `531415883065` (Red Hat) → standard RHCOS — update target comes from stream configmap by region and architecture
- `679593333241` (AWS Marketplace) → marketplace RHCOS — use the marketplace resolution flow below
- anything else → custom/unknown image, skip and log
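
The owner-based branch above could look roughly like this. The type and function names (`imageKind`, `classifyByOwner`) are hypothetical and chosen only for this sketch; the two account IDs are the ones quoted in this document.

```go
package main

// Owner account IDs quoted in this document.
const (
	redHatOwnerID      = "531415883065"
	marketplaceOwnerID = "679593333241"
)

// imageKind is a hypothetical classification type used only in this sketch.
type imageKind int

const (
	kindUnknown     imageKind = iota // custom or unrecognized image: skip and log
	kindStandard                     // standard RHCOS: resolve from the stream configmap
	kindMarketplace                  // marketplace RHCOS: use the marketplace resolution flow
)

// classifyByOwner maps the OwnerId returned by DescribeImages for the
// MachineSet's current AMI to one of the three update paths.
func classifyByOwner(ownerID string) imageKind {
	switch ownerID {
	case redHatOwnerID:
		return kindStandard
	case marketplaceOwnerID:
		return kindMarketplace
	default:
		return kindUnknown
	}
}
```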

## Runtime Flow

### Standard path

The target AMI ID is read directly from the stream configmap by region and architecture. No additional EC2 API calls are needed beyond the initial `DescribeImages` on the current AMI.

### Marketplace path

AMI IDs are region-scoped. The MCO resolves the correct marketplace AMI at update time using a `DescribeImages` name-pattern filter anchored to the product ID. No stored flavor state is required.

#### Step 1: extract the product ID

The current AMI is already known from the MachineSet's provider spec. Its name contains the product ID as the trailing segment:

```
RHEL-9.4-RHCOS-4.18_HVM_GA-20251119-x86_64-0-{product-id}
```

Extract the product ID by splitting on `-` and taking the last five groups, then validating the result as a UUID (`8-4-4-4-12` hex format).
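
A minimal sketch of this extraction, assuming the name has already been read from the `DescribeImages` response. The function name `extractProductID` and its error messages are illustrative, not taken from the implementation.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// uuidPattern matches the 8-4-4-4-12 hexadecimal layout.
var uuidPattern = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// extractProductID returns the trailing UUID of a marketplace AMI name.
// Hypothetical helper: it splits on "-", rejoins the last five groups,
// and validates the result as a UUID.
func extractProductID(amiName string) (string, error) {
	parts := strings.Split(amiName, "-")
	if len(parts) < 5 {
		return "", fmt.Errorf("AMI name %q has fewer than five dash-separated groups", amiName)
	}
	candidate := strings.Join(parts[len(parts)-5:], "-")
	if !uuidPattern.MatchString(candidate) {
		return "", fmt.Errorf("trailing segment %q of AMI name %q is not a UUID", candidate, amiName)
	}
	return candidate, nil
}
```

The UUID validation matters because the last five dash-separated groups of a name without a product ID would otherwise be returned as a bogus identifier.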

#### Step 2: derive the target version token

From the stream configmap's RHCOS release string for the target OCP version:

```
9.6.20260210-0 → token "9.6"
```

Take the first two dot-separated segments.
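
As a rough sketch, the token derivation could be written like this. The function name `versionTokenFromRelease` is hypothetical, and the numeric check on the first two segments is an added assumption rather than something this document specifies.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// versionTokenFromRelease derives the version token from an RHCOS release
// string in the RHEL-aligned format, e.g. "9.6.20260210-0" yields "9.6".
// Hypothetical helper. The pre-4.19 format ("418.94.202511191518-0") is
// out of scope for this feature and is not specially handled here.
func versionTokenFromRelease(release string) (string, error) {
	segments := strings.Split(release, ".")
	if len(segments) < 3 {
		return "", fmt.Errorf("release string %q has fewer than three dot-separated segments", release)
	}
	// Assumption for this sketch: major and minor must both be integers.
	for _, s := range segments[:2] {
		if _, err := strconv.Atoi(s); err != nil {
			return "", fmt.Errorf("release string %q has a non-numeric version segment %q", release, s)
		}
	}
	return segments[0] + "." + segments[1], nil
}
```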

#### Step 3: find the updated AMI

Call `DescribeImages` in the node's region:

```
--owners aws-marketplace
--filters "Name=name,Values=*{product-id}*"
```

From the results, filter to images whose `Description` field contains the version token derived in step 2, matching against both description formats (space-bounded for RHEL marketplace, dash/dot-bounded for ROSA). Among those, select the AMI with the latest `CreationDate`.

If no matching AMI is found in the region (e.g. replication lag), the MCO skips and retries on the next reconcile cycle rather than falling back to a different version.
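
The final selection can be sketched as below, assuming the candidates have already been filtered by `Description`. The `candidateImage` struct and `latestImage` function are hypothetical stand-ins for the EC2 response types; `CreationDate` is treated as the ISO 8601 string that `DescribeImages` returns.

```go
package main

import "time"

// candidateImage is a hypothetical, trimmed-down view of a DescribeImages
// result that has already passed the Description filter.
type candidateImage struct {
	ID           string
	CreationDate string // e.g. "2026-02-10T12:00:00.000Z"
}

// latestImage returns the candidate with the most recent CreationDate.
// The boolean is false when nothing usable was found, which corresponds to
// the "skip and retry on the next reconcile" case rather than a fallback.
func latestImage(candidates []candidateImage) (candidateImage, bool) {
	var best candidateImage
	var bestTime time.Time
	found := false
	for _, c := range candidates {
		t, err := time.Parse(time.RFC3339, c.CreationDate)
		if err != nil {
			// Assumption for this sketch: ignore images with unparseable dates.
			continue
		}
		if !found || t.After(bestTime) {
			best, bestTime, found = c, t, true
		}
	}
	return best, found
}
```

Returning a boolean instead of a zero-value image keeps the "no match in this region" case explicit for the caller.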
Contributor Author

I'm a bit conflicted on this. If a RHEL version match isn't found in the marketplace, it likely means that the latest RHEL version hasn't been mirrored yet.

Would it make sense to just use the latest published value for that flavor? The only issue I can see is if the marketplace skips a RHEL minor for some reason, in which case an older y-stream could accidentally do a boot image update to a newer RHEL release than appropriate. Still, I think a RHEL minor not being published would be quite unlikely?

For now, the controller just skips the update if the bootimage for that RHEL version isn't found in the marketplace.


### Total EC2 calls per MachineSet reconcile

- **All AWS clusters**: 1 `DescribeImages` call on the current AMI (owner-based classification)
- **Marketplace clusters** (additional): 1 `DescribeImages` call with the product ID filter

## Credentials

The MCO reads AWS credentials from the `aws-cloud-credentials` secret in the `openshift-machine-api` namespace. This secret is provisioned by the machine-api-operator's `CredentialsRequest` and already includes `ec2:DescribeImages` with `resource: "*"`. No new `CredentialsRequest` or RBAC changes are required — the MCO's existing clusterrole grants cluster-wide secret read access.
15 changes: 15 additions & 0 deletions go.mod
@@ -90,6 +90,21 @@ require (
github.com/armon/circbuf v0.0.0-20190214190532-5111143e8da2 // indirect
github.com/asaskevich/govalidator v0.0.0-20230301143203-a9d515a09cc2 // indirect
github.com/aws/aws-sdk-go v1.55.6 // indirect
github.com/aws/aws-sdk-go-v2 v1.41.7 // indirect
github.com/aws/aws-sdk-go-v2/config v1.32.17 // indirect
github.com/aws/aws-sdk-go-v2/credentials v1.19.16 // indirect
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.23 // indirect
github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.23 // indirect
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.23 // indirect
github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.24 // indirect
github.com/aws/aws-sdk-go-v2/service/ec2 v1.300.0 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.9 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.23 // indirect
github.com/aws/aws-sdk-go-v2/service/signin v1.0.11 // indirect
github.com/aws/aws-sdk-go-v2/service/sso v1.30.17 // indirect
github.com/aws/aws-sdk-go-v2/service/ssooidc v1.35.21 // indirect
github.com/aws/aws-sdk-go-v2/service/sts v1.42.1 // indirect
github.com/aws/smithy-go v1.25.1 // indirect
github.com/bombsimon/wsl/v4 v4.5.0 // indirect
github.com/butuzov/mirror v1.3.0 // indirect
github.com/catenacyber/perfsprint v0.7.1 // indirect
30 changes: 30 additions & 0 deletions go.sum
@@ -86,6 +86,36 @@ github.com/ashcrow/osrelease v0.0.0-20180626175927-9b292693c55c/go.mod h1:BRljTy
github.com/aws/aws-sdk-go v1.19.11/go.mod h1:KmX6BPdI08NWTb3/sm4ZGu5ShLoqVDhKgpiN924inxo=
github.com/aws/aws-sdk-go v1.55.6 h1:cSg4pvZ3m8dgYcgqB97MrcdjUmZ1BeMYKUxMMB89IPk=
github.com/aws/aws-sdk-go v1.55.6/go.mod h1:eRwEWoyTWFMVYVQzKMNHWP5/RV4xIUGMQfXQHfHkpNU=
github.com/aws/aws-sdk-go-v2 v1.41.7 h1:DWpAJt66FmnnaRIOT/8ASTucrvuDPZASqhhLey6tLY8=
github.com/aws/aws-sdk-go-v2 v1.41.7/go.mod h1:4LAfZOPHNVNQEckOACQx60Y8pSRjIkNZQz1w92xpMJc=
github.com/aws/aws-sdk-go-v2/config v1.32.17 h1:FpL4/758/diKwqbytU0prpuiu60fgXKUWCpDJtApclU=
github.com/aws/aws-sdk-go-v2/config v1.32.17/go.mod h1:OXqUMzgXytfoF9JaKkhrOYsyh72t9G+MJH8mMRaexOE=
github.com/aws/aws-sdk-go-v2/credentials v1.19.16 h1:r3RJBuU7X9ibt8RHbMjWE6y60QbKBiII6wSrXnapxSU=
github.com/aws/aws-sdk-go-v2/credentials v1.19.16/go.mod h1:6cx7zqDENJDbBIIWX6P8s0h6hqHC8Avbjh9Dseo27ug=
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.23 h1:UuSfcORqNSz/ey3VPRS8TcVH2Ikf0/sC+Hdj400QI6U=
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.23/go.mod h1:+G/OSGiOFnSOkYloKj/9M35s74LgVAdJBSD5lsFfqKg=
github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.23 h1:GpT/TrnBYuE5gan2cZbTtvP+JlHsutdmlV2YfEyNde0=
github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.23/go.mod h1:xYWD6BS9ywC5bS3sz9Xh04whO/hzK2plt2Zkyrp4JuA=
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.23 h1:bpd8vxhlQi2r1hiueOw02f/duEPTMK59Q4QMAoTTtTo=
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.23/go.mod h1:15DfR2nw+CRHIk0tqNyifu3G1YdAOy68RftkhMDDwYk=
github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.24 h1:OQqn11BtaYv1WLUowvcA30MpzIu8Ti4pcLPIIyoKZrA=
github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.24/go.mod h1:X5ZJyfwVrWA96GzPmUCWFQaEARPR7gCrpq2E92PJwAE=
github.com/aws/aws-sdk-go-v2/service/ec2 v1.300.0 h1:HgOfUy9Sm2Q9UQAyj9I/7NZhIaymTEakGA/FnLw65lw=
github.com/aws/aws-sdk-go-v2/service/ec2 v1.300.0/go.mod h1:Y95W0Hm6FYLPa6o0hbnJ+sWgmdc4ifcLFjGkdobWVhY=
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.9 h1:FLudkZLt5ci0ozzgkVo8BJGwvqNaZbTWb3UcucAateA=
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.9/go.mod h1:w7wZ/s9qK7c8g4al+UyoF1Sp/Z45UwMGcqIzLWVQHWk=
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.23 h1:pbrxO/kuIwgEsOPLkaHu0O+m4fNgLU8B3vxQ+72jTPw=
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.23/go.mod h1:/CMNUqoj46HpS3MNRDEDIwcgEnrtZlKRaHNaHxIFpNA=
github.com/aws/aws-sdk-go-v2/service/signin v1.0.11 h1:TdJ+HdzOBhU8+iVAOGUTU63VXopcumCOF1paFulHWZc=
github.com/aws/aws-sdk-go-v2/service/signin v1.0.11/go.mod h1:R82ZRExE/nheo0N+T8zHPcLRTcH8MGsnR3BiVGX0TwI=
github.com/aws/aws-sdk-go-v2/service/sso v1.30.17 h1:7byT8HUWrgoRp6sXjxtZwgOKfhss5fW6SkLBtqzgRoE=
github.com/aws/aws-sdk-go-v2/service/sso v1.30.17/go.mod h1:xNWknVi4Ezm1vg1QsB/5EWpAJURq22uqd38U8qKvOJc=
github.com/aws/aws-sdk-go-v2/service/ssooidc v1.35.21 h1:+1Kl1zx6bWi4X7cKi3VYh29h8BvsCoHQEQ6ST9X8w7w=
github.com/aws/aws-sdk-go-v2/service/ssooidc v1.35.21/go.mod h1:4vIRDq+CJB2xFAXZ+YgGUTiEft7oAQlhIs71xcSeuVg=
github.com/aws/aws-sdk-go-v2/service/sts v1.42.1 h1:F/M5Y9I3nwr2IEpshZgh1GeHpOItExNM9L1euNuh/fk=
github.com/aws/aws-sdk-go-v2/service/sts v1.42.1/go.mod h1:mTNxImtovCOEEuD65mKW7DCsL+2gjEH+RPEAexAzAio=
github.com/aws/smithy-go v1.25.1 h1:J8ERsGSU7d+aCmdQur5Txg6bVoYelvQJgtZehD12GkI=
github.com/aws/smithy-go v1.25.1/go.mod h1:YE2RhdIuDbA5E5bTdciG9KrW3+TiEONeUWCqxX9i1Fc=
github.com/beorn7/perks v0.0.0-20180321164747-3a771d992973/go.mod h1:Dwedo/Wpr24TaqPxmxbtue+5NUziq4I4S80YR8gNf3Q=
github.com/beorn7/perks v1.0.1 h1:VlbKKnNfV8bJzeqoa4cOKqO6bYr3WgKZxO8Z16+hsOM=
github.com/beorn7/perks v1.0.1/go.mod h1:G2ZrVWU2WbWT9wwq4/hrbKbnv/1ERSJQ0ibhJ6rlkpw=