[WIP] Parsing current RHEL Streams from Product Pages #424

Open

mruprich wants to merge 12 commits into packit:main from mruprich:product-pages

Conversation

mruprich (Collaborator) commented Apr 27, 2026

TODO:

  • Write new tests or update the old ones to cover new functionality.
  • Update doc-strings where appropriate.
  • Update or write new documentation in packit/packit.dev.
  • ‹fill in›

The Product Pages should be the source of truth for this, but getting current and upcoming z-streams and current y-streams is tricky since they overlap during the DevDocTest phases. My logic is as follows:

  1. If there is more than one active y-stream branch per major RHEL at the moment, the lower stream is currently in the very late phase of DevDocTest and is being prepared for GA. That means it will very soon become an active z-stream, but right now it is the upcoming z-stream (see the sketch after this list).
  2. I did not find a way to get active z-streams other than to look for "(GA/ZStream)" in the long name, which marks the current z-stream.
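
A minimal sketch of the grouping logic in point 1. The shortname format rhel-X-Y and the helper name are assumptions for illustration only; the real Product Pages shortnames may differ:

import re
from collections import defaultdict

# Assumed shortname format, purely for illustration.
_SHORTNAME_RE = re.compile(r"^rhel-(?P<major>\d+)-(?P<minor>\d+)$")

def split_active_streams(active_releases: list[dict]) -> tuple[list[str], list[str]]:
    """Split active releases into (current_y_streams, upcoming_z_streams)."""
    by_major: dict[int, list[tuple[int, str]]] = defaultdict(list)
    for item in active_releases:
        match = _SHORTNAME_RE.match(item.get("shortname", ""))
        if match:
            by_major[int(match["major"])].append((int(match["minor"]), item["shortname"]))

    current_y_streams: list[str] = []
    upcoming_z_streams: list[str] = []
    for streams in by_major.values():
        streams.sort()  # ascending by minor version
        # The highest active minor is the main y-stream; any lower active
        # minor is finishing DevDocTest and about to become a z-stream.
        current_y_streams.append(streams[-1][1])
        upcoming_z_streams.extend(name for _, name in streams[:-1])
    return current_y_streams, upcoming_z_streams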

Anyway, the fetch_rhel_streams_snapshot() function should return a structure similar to what the load_rhel_config() function returns now, except for the package_instructions part. It will only work with a valid Kerberos ticket; otherwise it will raise a ToolError.
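
For reference, the snapshot would look something like this; the three keys come from the code in this PR, while the stream values here are invented:

# Hypothetical return value of fetch_rhel_streams_snapshot(); values invented.
{
    "current_y_streams": ["rhel-9-7", "rhel-10-1"],
    "current_z_streams": ["rhel-9-6", "rhel-10-0"],
    "upcoming_z_streams": [],
}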

I am marking this as WIP for now, so let me have it. Let me know if I chose a good place for the code.

RELEASE NOTES BEGIN

Packit now supports automatic parsing of all the current RHEL streams.

RELEASE NOTES END

gemini-code-assist (Bot) left a comment


Code Review

This pull request introduces a new module product_pages.py to fetch and process RHEL release stream data from the internal Product Pages API. The implementation includes logic for identifying current y-streams, current z-streams, and upcoming z-streams. Feedback was provided to improve the reliability and efficiency of the API interaction by adding timeouts, using a context manager for the session, consolidating multiple API calls into a single request, and enhancing error handling.

Comment thread: ymir/common/product_pages.py (Outdated)
Comment on lines +169 to +216
s = requests.Session()
auth = requests_gssapi.HTTPSPNEGOAuth(mutual_authentication=requests_gssapi.OPTIONAL)
auth_resp = s.post(_OIDC_AUTHENTICATE_URL, auth=auth)
_require_ok(auth_resp, "OIDC authenticate")

# Multiple active releases per major: lower stream is finishing; higher is main y-stream.
response_active = s.get(
    _RELEASES_API_URL,
    params={
        "fields": "shortname",
        "active": "",
        "product__shortname": "rhel",
    },
)
_require_ok(response_active, "active releases")
active_data = response_active.json()

current_y_streams = _build_current_y_streams(active_data)
upcoming_z_streams = _build_upcoming_z_streams(active_data)

response_zstream = s.get(
    _RELEASES_API_URL,
    params={
        "fields": "shortname,name_incl_maint,name",
        "product__shortname": "rhel",
    },
)
_require_ok(response_zstream, "releases for z-stream filtering")
z_data = response_zstream.json()

fields = [
    "shortname",
    "name_incl_maint",
    "name",
]
filtered = [
    {k: item[k] for k in fields}
    for item in z_data
    if _GA_ZSTREAM_RE.search(item.get("name_incl_maint") or "")
]

current_z_streams = _build_current_z_streams_ga_zstream(filtered)

return {
    "current_y_streams": current_y_streams,
    "current_z_streams": current_z_streams,
    "upcoming_z_streams": upcoming_z_streams,
}
gemini-code-assist (Bot) commented:

Severity: medium

The current implementation of _fetch_rhel_streams_snapshot_sync can be improved for reliability, efficiency, and resource management:

  1. Timeouts: Network requests using requests should always specify a timeout (e.g., 30 seconds) to prevent the application from hanging indefinitely if the server is unresponsive.
  2. Efficiency: Instead of making two separate calls to the releases API, you can fetch all RHEL releases in a single call by requesting the active field and then filtering the results in memory. This reduces network overhead and load on the Product Pages API.
  3. Resource Management: Using requests.Session as a context manager (with requests.Session() as s:) ensures that the underlying TCP connections are properly closed when the function exits.
  4. Error Handling: Wrapping the network calls and JSON parsing in a try-except block (catching requests.RequestException and ValueError) allows for more robust error reporting by wrapping these low-level errors into a ToolError with a descriptive message.
    with requests.Session() as s:
        auth = requests_gssapi.HTTPSPNEGOAuth(mutual_authentication=requests_gssapi.OPTIONAL)
        try:
            # Always specify a timeout for network requests to avoid hanging indefinitely.
            auth_resp = s.post(_OIDC_AUTHENTICATE_URL, auth=auth, timeout=30)
            _require_ok(auth_resp, "OIDC authenticate")

            # Fetch all RHEL releases in a single call and filter in memory to reduce network overhead.
            response = s.get(
                _RELEASES_API_URL,
                params={
                    "fields": "shortname,name_incl_maint,name,active",
                    "product__shortname": "rhel",
                },
                timeout=30,
            )
            _require_ok(response, "fetch releases")
            all_data = response.json()
        except (requests.RequestException, ValueError) as e:
            raise ToolError(f"Failed to fetch RHEL streams from Product Pages: {e}") from e

        active_data = [item for item in all_data if item.get("active")]
        current_y_streams = _build_current_y_streams(active_data)
        upcoming_z_streams = _build_upcoming_z_streams(active_data)

        filtered_z = [
            item for item in all_data
            if _GA_ZSTREAM_RE.search(item.get("name_incl_maint") or "")
        ]
        current_z_streams = _build_current_z_streams_ga_zstream(filtered_z)

        return {
            "current_y_streams": current_y_streams,
            "current_z_streams": current_z_streams,
            "upcoming_z_streams": upcoming_z_streams,
        }

mruprich (Collaborator, Author) replied:

  1. added
  2. I don't think that is possible. I did not find a way to get all that data in a single API call. The active API call gives very different results from the second one that grabs everything.
  3. added
  4. added

TomasTomecek (Member) left a comment

very nice!! it would be lovely to get at least a single test case that has mocked data with what PP returns (fake data obviously so we're not leaking anything here)
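
For what it's worth, a minimal sketch of such a test, assuming the module lives at ymir/common/product_pages.py as in this PR. The fake payload values and the exact shortname/name formats are guesses, and the helpers would have to tolerate them:

from unittest import mock

from ymir.common import product_pages  # module path taken from this PR

# Entirely fake Product Pages payload; the field names match the queries
# in the diff above, the values are made up.
FAKE_RELEASES = [
    {"shortname": "rhel-9-7", "name": "RHEL 9.7",
     "name_incl_maint": "RHEL 9.7", "active": True},
    {"shortname": "rhel-9-6", "name": "RHEL 9.6",
     "name_incl_maint": "RHEL 9.6 (GA/ZStream)", "active": False},
]

def test_fetch_rhel_streams_snapshot(monkeypatch):
    fake_response = mock.MagicMock(status_code=200)
    fake_response.json.return_value = FAKE_RELEASES
    session = mock.MagicMock()
    session.post.return_value = mock.MagicMock(status_code=200)
    session.get.return_value = fake_response
    session_cls = mock.MagicMock()
    session_cls.return_value.__enter__.return_value = session
    # Stub the network layer; if the code also initializes Kerberos,
    # that call would need patching here as well.
    monkeypatch.setattr(product_pages.requests, "Session", session_cls)

    snapshot = product_pages._fetch_rhel_streams_snapshot_sync()

    # The three keys come from the function's return statement in the diff.
    assert set(snapshot) == {
        "current_y_streams",
        "current_z_streams",
        "upcoming_z_streams",
    }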

timeout = _PRODUCT_PAGES_TIMEOUT
try:
    with requests.Session() as s:
        auth = requests_gssapi.HTTPSPNEGOAuth(mutual_authentication=requests_gssapi.OPTIONAL)
TomasTomecek (Member):

we need to make sure this works okay with our Kerberos setup

mruprich (Collaborator, Author) replied:

This should work with an active and valid krb ticket on your system, but I realize that packit might have a different way to get a ticket, right?

A collaborator replied:

Take a look at the init_kerberos_ticket function. That is the one used to obtain a Kerberos ticket or to verify that one already exists.
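
Something along these lines, presumably; the import path and the signature of init_kerberos_ticket are assumptions here:

# Hypothetical call site; the real import path and signature may differ.
from ymir.utils import init_kerberos_ticket  # assumed location

init_kerberos_ticket()  # obtains a ticket, or verifies one already exists
snapshot = fetch_rhel_streams_snapshot()  # now safe to hit Product Pages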

mruprich (Collaborator, Author) replied:

Pushing another round of changes. I used init_kerberos_ticket to get a ticket. I hope I am using it right.

I also added a unit test for the product pages.

TomasTomecek (Member) commented:

Just a heads-up, a related commit was merged recently: a0c211a

mruprich (Collaborator, Author) commented May 5, 2026

One question: I see internal_repos_host being pulled from rhel-config.json in ymir/tools/privileged/copr.py. What should I do about that? That is not possible to pull from Product Pages...

Maybe @nforro or @opohorel are aware of that option? Any idea whether this is actually used somewhere?

opohorel (Collaborator) commented May 5, 2026

> One question: I see internal_repos_host being pulled from rhel-config.json in ymir/tools/privileged/copr.py. What should I do about that? That is not possible to pull from Product Pages...
>
> Maybe @nforro or @opohorel are aware of that option? Any idea whether this is actually used somewhere?

AFAIK it is used for z-stream builds, where the repo is added into the Copr project. If I remember correctly, we had some issues with Copr not having all the latest packages that were in the z-stream repo. With internal_repos_host defined, you always get the latest z-stream packages to build against.

@nforro knows more about it, as he implemented it in #347.

nforro (Member) commented May 5, 2026

It's the location of Brew repos, and it has nothing to do with Product Pages; it's in rhel-config.json only because it's an internal URL (and yes, it changes over time, so it can't be hardcoded).

mruprich (Collaborator, Author) commented May 5, 2026

Ok, thanks. So one solution would be to keep rhel-config.json, at least for copr.py to use. That would require minimal changes in copr.py.

Maybe rhel-config.json could be used in the future as some sort of cache, so that you don't have to contact PP every time you do something? It might save some bandwidth, and you would be able to work even when PP is unavailable. There would have to be a mechanism to make sure it is up to date, though...
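
To illustrate the idea (not part of this PR): a simple mtime-based cache around fetch_rhel_streams_snapshot(), falling back to the stale copy when PP is unreachable. The cache path and TTL below are made up:

import json
import time
from pathlib import Path

# Made-up location and TTL, purely for illustration.
_CACHE_PATH = Path("~/.cache/ymir/rhel-streams.json").expanduser()
_CACHE_TTL_SECONDS = 24 * 3600  # refresh at most once a day

def cached_rhel_streams() -> dict:
    """Return the streams snapshot, contacting Product Pages only when stale."""
    if _CACHE_PATH.exists():
        age = time.time() - _CACHE_PATH.stat().st_mtime
        if age < _CACHE_TTL_SECONDS:
            return json.loads(_CACHE_PATH.read_text())
    try:
        snapshot = fetch_rhel_streams_snapshot()  # the function from this PR
    except Exception:
        # PP unavailable: fall back to the stale copy if we have one.
        if _CACHE_PATH.exists():
            return json.loads(_CACHE_PATH.read_text())
        raise
    _CACHE_PATH.parent.mkdir(parents=True, exist_ok=True)
    _CACHE_PATH.write_text(json.dumps(snapshot))
    return snapshot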

nforro (Member) left a comment

Shouldn't this be a privileged MCP tool as it accesses internal data?

nforro (Member) commented May 5, 2026

> Ok, thanks. So one solution would be to keep rhel-config.json, at least for copr.py to use. That would require minimal changes in copr.py.

Yes, we can rename the file later or use an environment variable for the URL instead, but that's out of scope of this PR.

> Maybe rhel-config.json could be used in the future as some sort of cache, so that you don't have to contact PP every time you do something? It might save some bandwidth, and you would be able to work even when PP is unavailable. There would have to be a mechanism to make sure it is up to date, though...

Some caching mechanism would definitely be welcome, but I don't think it should be a file in .secrets.

mruprich (Collaborator, Author) commented May 5, 2026

@nforro the placement of product_pages.py is really up to you guys; I will move it if necessary.

nforro (Member) commented May 6, 2026

> @nforro the placement of product_pages.py is really up to you guys; I will move it if necessary.

It's more about not calling fetch_rhel_streams_snapshot() from anywhere except a privileged MCP tool than about the location of the file. There are plenty of examples of such tools in the code, showing how they are registered and how they are called.

Perhaps it would be easier for you to contribute just product_pages.py in the form of a standalone "library" and leave the rest to us. Kerberos ticket initialization would be the tool's responsibility, and raising ToolErrors from fetch_rhel_streams_snapshot() seems wrong anyway.
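
A sketch of the split being suggested: the library raises its own exception and knows nothing about MCP, while the privileged tool owns the ticket and the ToolError translation. Everything here except fetch_rhel_streams_snapshot() is a placeholder name:

# Library side (product_pages.py): no MCP concepts, no Kerberos handling.
class ProductPagesError(Exception):
    """Raised on Product Pages network/auth failures."""

def fetch_rhel_streams_snapshot() -> dict:
    ...  # performs the requests shown above, raising ProductPagesError

# Tool side (a privileged MCP tool): owns the ticket and the error mapping.
def rhel_streams_tool() -> dict:
    init_kerberos_ticket()  # ticket handling is the tool's responsibility
    try:
        return fetch_rhel_streams_snapshot()
    except ProductPagesError as e:
        raise ToolError(f"Failed to query Product Pages: {e}") from e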
