Merged
2 changes: 2 additions & 0 deletions .claude/commands/ci/analyze-failures.md
Original file line number Diff line number Diff line change
@@ -33,6 +33,8 @@ https://jenkins-csb-openshift-qe-mastern.dno.corp.redhat.com/job/image-consisten
https://jenkins-csb-openshift-qe-mastern.dno.corp.redhat.com/job/zstreams/job/Stage-Pipeline/1413/
```

> **Note:** Image consistency check has been migrated from Jenkins to Prow. Prow job URLs follow the Prow URL pattern above.

## Action to Take

**Step 1**: Analyze the provided URL against the detection rules above.
7 changes: 4 additions & 3 deletions .claude/commands/ci/analyze-jenkins-failures.md
@@ -4,6 +4,8 @@ description: Analyze Jenkins job failures (image-consistency-check, stage-testin

You are helping the user analyze failures from a Jenkins job run (typically image-consistency-check or stage-testing jobs used in z-stream release testing).

> **Note:** Image consistency check has been migrated from Jenkins to Prow. For new Prow-based image-consistency-check jobs, use `/ci:analyze-failures` instead. This command still supports analyzing legacy Jenkins image-consistency-check runs.

The user has provided a Jenkins job URL: {{args}}

## Overview
@@ -995,7 +997,6 @@ No critical actions required. Monitor for:
**Context**:
- These jobs are part of OAR z-stream release workflow
- Triggered by commands like:
- `oar -r 4.19.1 image-consistency-check`
- `oar -r 4.19.1 stage-testing`
- Failures may block release approval

@@ -1008,11 +1009,11 @@ No critical actions required. Monitor for:
## Example Usage

```bash
/ci:analyze-jenkins-failures https://jenkins-csb-openshift-qe-mastern.dno.corp.redhat.com/job/image-consistency-check/3436/
/ci:analyze-jenkins-failures https://jenkins-csb-openshift-qe-mastern.dno.corp.redhat.com/job/zstreams/job/Stage-Pipeline/1413/
```

The command will:
1. Fetch console log from public endpoint
2. Detect it's an image-consistency-check job
2. Detect it's a stage-testing job
3. Parse the structured output sections
4. Provide analysis and recommendation
50 changes: 26 additions & 24 deletions .claude/commands/release/drive.md
@@ -252,7 +252,7 @@ For complete step-by-step logic, refer to the **`release-workflow` skill** (`.cl

**Async Task Monitoring:**
- Re-execute the same MCP tool to check status
- Example: `oar_image_consistency_check(release, build_number=123)` to check progress
- Example: `oar_image_consistency_check(release, job_id="uuid")` to check progress

## Key Decision Points

@@ -383,7 +383,7 @@ state = oar_get_release_status(release="4.20.1")
"status": "In Progress",
"started_at": "2025-01-15T14:00:00Z",
"completed_at": null,
"result": "Jenkins job #123 triggered..."
"result": "Prow job triggered..."
}
],
"issues": [
@@ -458,17 +458,19 @@ elif task["status"] == "Pass":
Continue to next task

elif task["status"] == "In Progress":
# Check if async task (Jenkins jobs)
# Check if async task (Prow/Jenkins jobs)
if task_name in ["image-consistency-check", "stage-testing"]:
# Extract build number from task result
build_number = extract_from_result(task["result"], r"Build number: (\d+)")
# Extract job ID from task result
# image-consistency-check: "Triggered image consistency check Prow job: {job_id}"
# stage-testing: "Build number: {build_number}"
job_id = extract_from_result(task["result"], r"Prow job: (\S+)") or extract_from_result(task["result"], r"job ID: (\S+)") or extract_from_result(task["result"], r"Build number: (\d+)")

if not build_number:
Log: f"⚠ {task_name} in progress but no build number found, retrying..."
if not job_id:
Log: f"⚠ {task_name} in progress but no job ID found, retrying..."
Execute task_name
else:
# Query Jenkins job status
result = execute_mcp_tool(task_name, build_number=build_number)
# Query job status (Prow or Jenkins depending on task)
result = execute_mcp_tool(task_name, job_id=job_id)

if "status is changed to [Pass]" in result:
Log: f"✓ {task_name} completed successfully"
@@ -477,7 +479,7 @@ elif task["status"] == "In Progress":
Log: f"✗ {task_name} failed"
STOP pipeline
else:
Log: f"⏳ {task_name} still running (job #{build_number})"
Log: f"⏳ {task_name} still running (job {job_id})"
Ask user to check back later
RETURN
else:
@@ -505,36 +507,36 @@ elif task["status"] == "Fail":

### Async Task Monitoring

**For long-running Jenkins tasks:**
**For long-running async tasks:**

```python
# Initial trigger (when task doesn't exist or has no build number)
# Initial trigger (when task doesn't exist or has no job ID)
result = oar_image_consistency_check(release=release)

if "Build number:" in result:
build_number = extract_build_number(result)
Log: f"⏳ Jenkins job #{build_number} triggered"
if "Prow job" in result:
job_id = extract_job_id(result)
Log: f"⏳ Prow job {job_id} triggered"
Log: "Check back in 20-30 minutes with: /release:drive {release}"
RETURN

# Status check on resume (when task has build number in result)
result = oar_image_consistency_check(release=release, build_number=build_number)
# Status check on resume (when task has job ID in result)
result = oar_image_consistency_check(release=release, job_id=job_id)

if "status is changed to [Pass]" in result:
Log: f"✓ Job #{build_number} completed successfully"
Log: f"✓ Job {job_id} completed successfully"
Continue to next task
elif "status is changed to [Fail]" in result:
# Add issue to StateBox
oar_add_issue(
release=release,
issue=f"image-consistency-check job #{build_number} failed: {extract_failure_reason(result)}",
issue=f"image-consistency-check Prow job {job_id} failed: {extract_failure_reason(result)}",
blocker=True,
related_tasks=["image-consistency-check"]
)
Log: "✗ Job failed, blocker added to StateBox"
STOP pipeline
else:
Log: f"⏳ Job #{build_number} still running..."
Log: f"⏳ Job {job_id} still running..."
RETURN
```
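
The pseudocode above calls `extract_from_result` and `extract_job_id` without defining them. A minimal sketch of such a helper follows, assuming the task result is a plain string and that the three patterns quoted in the diff are the only formats in use. The body of `extract_job_id` here is hypothetical and is not taken from the repository.

```python
import re
from typing import Optional

# Patterns quoted in the drive.md pseudocode. They are tried in order and the
# first match wins, so the Prow formats take priority over the Jenkins one.
_JOB_ID_PATTERNS = (
    r"Prow job: (\S+)",
    r"job ID: (\S+)",
    r"Build number: (\d+)",
)


def extract_job_id(result: str) -> Optional[str]:
    """Return the first job identifier found in a task result, or None.

    Hypothetical helper: the real implementation may differ.
    """
    for pattern in _JOB_ID_PATTERNS:
        match = re.search(pattern, result)
        if match:
            return match.group(1)
    return None
```

Because `\S+` captures everything up to the next whitespace, a result such as `Prow job: abc-123.` would include the trailing period. A stricter character class may be preferable if the ID format is known.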

@@ -644,8 +646,8 @@ AI: Resuming from PHASE 2...
AI: ✓ Skipping 2 completed tasks (take-ownership, check-cve-tracker-bug)
AI: ⏳ push-to-cdn-staging still running (job #456)
AI: ✓ Build promoted! Phase: PHASE 3 - Test Evaluation
AI: ⏳ image-consistency-check triggered (job #789)
AI: ⏳ stage-testing triggered (job #790)
AI: ⏳ image-consistency-check triggered (Prow job abc-123-def)
AI: ⏳ stage-testing triggered (Jenkins job #790)
AI: Waiting for test results, check back in 1 hour
```

@@ -656,8 +658,8 @@ AI: Loading StateBox state for 4.20.1...
AI: Resuming from PHASE 4...
AI: ✓ Skipping 4 completed tasks
AI: ✓ push-to-cdn-staging completed (job #456)
AI: ✓ image-consistency-check completed (job #789)
AI: ✓ stage-testing completed (job #790)
AI: ✓ image-consistency-check completed (Prow job abc-123-def)
AI: ✓ stage-testing completed (Jenkins job #790)
AI: Analyzing promoted build test results...
AI: ✓ All tests passed, proceeding to PHASE 5
AI: ✓ image-signed-check completed
18 changes: 9 additions & 9 deletions .claude/skills/release-workflow/SKILL.md
@@ -565,13 +565,13 @@ ELSE IF file.accepted == false:

### 8. image-consistency-check (Async Task)

**Purpose:** Verify image consistency across architectures
**Purpose:** Verify payload images match shipment MR

**MCP Tool:** `oar_image_consistency_check(release, build_number=None)`
**MCP Tool:** `oar_image_consistency_check(release, job_id=None)`

**Input:**
- `release`: Z-stream version
- `build_number`: Optional Jenkins build number (for status check)
- `job_id`: Optional Prow job ID (for status check)

**Prerequisites:**
- Build promotion detected (phase == "Accepted")
@@ -583,9 +583,9 @@ ELSE IF file.accepted == false:
```python
Execute: oar_image_consistency_check(release)

# Success - Jenkins job triggered:
# Success - Prow job triggered:
stdout contains: "task [Image consistency check] status is changed to [In Progress]"
AND capture Jenkins build number from stdout
AND capture Prow job ID from stdout (pattern: "Triggered image consistency check Prow job: {job_id}")

# Blocked - Stage-release pipeline not succeeded:
IF stage-release pipeline error detected:
@@ -595,7 +595,7 @@ IF stage-release pipeline error detected:

**Phase 2 - Check Status:**
```python
Execute: oar_image_consistency_check(release, build_number={captured_build_number})
Execute: oar_image_consistency_check(release, job_id={captured_job_id})
```

**Phase 3 - Complete:**
@@ -608,7 +608,7 @@ Failure: stdout contains: "task [Image consistency check] status is changed to [

**Failure Handling:**
- Stage-release pipeline not ready: Report to user, ask to work with ART, wait for user to re-invoke
- Jenkins job failure: Mark overall status "Red", notify owner
- Prow job failure: Mark overall status "Red", notify owner
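
The three phases all depend on spotting the `status is changed to [...]` marker in stdout. A minimal sketch of that check follows, assuming the marker text is stable and that the last occurrence in the output is the authoritative one. `parse_task_status` is a hypothetical name, not a function from the repository.

```python
import re
from typing import Optional

# Matches the marker quoted in the skill, e.g.
# "task [Image consistency check] status is changed to [In Progress]".
_STATUS_RE = re.compile(r"status is changed to \[([^\]]+)\]")


def parse_task_status(stdout: str) -> Optional[str]:
    """Return the last status reported in stdout, or None if no marker is present.

    Assumption: if several markers appear, the most recent one wins.
    """
    matches = _STATUS_RE.findall(stdout)
    return matches[-1] if matches else None
```

A caller could then branch on `"Pass"`, `"Fail"`, or `"In Progress"` and treat `None` as "no status change reported".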

---

@@ -796,11 +796,11 @@ ELSE:
```
WHEN trigger phase:
Execute command
Capture Jenkins build_number from stdout (if applicable)
Capture job ID from stdout (Prow job ID or Jenkins build number)
Report to user: "Task triggered, check status in X minutes"

WHEN user re-invokes /release:drive:
Execute command with build_number parameter (if applicable)
Execute command with job ID (Prow job ID or Jenkins build number)
Parse stdout for status

IF status == "In Progress":
26 changes: 13 additions & 13 deletions AGENTS.md
@@ -403,23 +403,24 @@ oar -r <release-version> [OPTIONS] COMMAND [ARGS]
**Command:**
```bash
oar -r <release> image-consistency-check
oar -r <release> image-consistency-check -n <build-number>
oar -r <release> image-consistency-check -i <job-id>
```

**Purpose:** Verifies that images in the release payload are consistent with advisory contents.
**Purpose:** Verifies that images in the release payload are consistent with images in the shipment.

**Options:**
- `-n, --build-number` - Jenkins build number to check status (for subsequent runs)
- `-i, --job-id` - Prow job ID to check status (for subsequent runs)

**What it does:**
- Triggers a Jenkins job to verify image consistency
- Compares images in release payload with images in advisories
- Returns build number on first run
- Can check job status on subsequent runs with build number
- Triggers a Prow job via Gangway API to verify image consistency
- Compares images in release payload with images in shipment MR
- Returns Prow job ID on first run
- Can check job status on subsequent runs with job ID
- Requires `APITOKEN` environment variable for Prow authentication

**Workflow:**
1. First run: Triggers job, returns build number
2. Subsequent runs: Check status using `-n <build-number>`
1. First run: Triggers Prow job, returns job ID
2. Subsequent runs: Check status using `-i <job-id>`
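
The two-step workflow can be sketched as a small argument builder. It mirrors the `-i` handling shown for `mcp_server/server.py` in this diff. The function name, the early `APITOKEN` check, and the error message are assumptions for illustration: the source only states that the variable is required, not where or how the real CLI enforces it.

```python
import os
from typing import List, Mapping, Optional


def build_image_consistency_argv(
    release: str,
    job_id: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Build the oar command line for a trigger (no job_id) or a status check.

    Hypothetical helper: it fails early when APITOKEN is missing, which may
    not match how the real CLI reports the problem.
    """
    env = os.environ if env is None else env
    if not env.get("APITOKEN"):
        raise RuntimeError("APITOKEN is not set; Prow authentication needs it")

    argv = ["oar", "-r", release, "image-consistency-check"]
    if job_id:  # None and "" both mean "start a new check"
        argv.extend(["-i", job_id])
    return argv
```

The resulting list could be passed to `subprocess.run`. The first call omits `job_id`, and later calls supply the ID printed by the first run.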

---

@@ -792,18 +793,17 @@ All core modules follow a consistent pattern:

**Key Functionality:**
- Trigger stage testing pipeline
- Trigger image consistency check jobs
- Monitor job queue and execution
- Validate job parameters match release version
- Get build status with detailed error handling

**Supported Jobs:**
- `stage-pipeline` - Stage environment testing
- `image-consistency-check` - Verify payload images match advisories

**Note:** Image consistency check has been migrated to Prow (see `prow/job/job.py` `run_image_consistency_check`).

**Key Methods:**
- `call_stage_job()` - Trigger stage testing
- `call_image_consistency_job()` - Trigger image consistency validation
- `get_build_status()` - Check job status by build number
- `is_job_enqueue()` - Check if job is queued

@@ -1124,5 +1124,5 @@ When adding support for new OpenShift versions, update:
2. Job registry configurations
3. Test report templates
4. Add new ci-profile for stage-testing pipeline
5. Add new release version to parameter `VERSION` of image-consistency-check job
5. Update Prow job configuration for image-consistency-check
6. Update configstore config to add new test template doc ID and slack group alias for release leads
9 changes: 5 additions & 4 deletions CLAUDE.md
@@ -201,7 +201,7 @@ The MCP (Model Context Protocol) server (`mcp_server/server.py`) exposes OAR com

**Categories of tools:**
1. **Read-only tools** - Safe query operations (check-greenwave-cvp-tests, check-cve-tracker-bug, image-signed-check, is-release-shipped)
2. **Status check tools** - Query job status (image-consistency-check -n, stage-testing -n)
2. **Status check tools** - Query job status (image-consistency-check -i, stage-testing -n)
3. **Write operations** - Modify state (create-test-report, update-bug-list, take-ownership)
4. **Critical operations** - Production impact (push-to-cdn-staging, change-advisory-status)
5. **Controller tools** - Background agents (start-release-detector, jira-notificator)
@@ -463,7 +463,7 @@ oar -r 4.19.1 update-bug-list

# 4. Verify payload images
oar -r 4.19.1 image-consistency-check
oar -r 4.19.1 image-consistency-check -n <build-number> # Check status
oar -r 4.19.1 image-consistency-check -i <job-id> # Check status

# 5. Validate CVP tests
oar -r 4.19.1 check-greenwave-cvp-tests
@@ -510,8 +510,9 @@ When adding new version support, update:
1. Jira query filters (`oar/notificator/jira_notificator.py`)
2. Job registry configurations
3. Test report templates
4. Jenkins job parameters (stage-testing, image-consistency-check)
5. ConfigStore config (test template doc ID, Slack group alias)
4. Jenkins job parameters (stage-testing)
5. Prow job configuration (image-consistency-check)
6. ConfigStore config (test template doc ID, Slack group alias)

## Authentication Notes

6 changes: 4 additions & 2 deletions Dockerfile
@@ -41,8 +41,10 @@ RUN case ${TARGETARCH} in \
# Install OAR CLI
WORKDIR /usr/src/release-tests
COPY . .
RUN uv pip install --python ${PY_BIN} --system . && \
RUN uv pip install --python ${PY_BIN} --system . ./prow && \
oar --help && \
oarctl --help
oarctl --help && \
job --help && \
jobctl --help

CMD [ "/bin/bash" ]
2 changes: 1 addition & 1 deletion Makefile
@@ -1,5 +1,5 @@
install:
pip3 install --use-pep517 -e . && oar -h
pip3 install --use-pep517 -e . && pip3 install -e prow/ && oar -h

uninstall:
pip3 uninstall -y oar artcommon pyartcd rh-elliott rh-doozer
17 changes: 10 additions & 7 deletions mcp_server/server.py
@@ -690,23 +690,26 @@ async def oar_image_signed_check(release: str) -> str:
# ============================================================================

@mcp.tool()
async def oar_image_consistency_check(release: str, build_number: str = None) -> str:
async def oar_image_consistency_check(release: str, job_id: str = None) -> str:
"""
Check status of image consistency check or start new check.

If build_number is provided, queries existing job status (READ-ONLY).
If build_number is not provided, starts new consistency check (WRITE).
Triggers a Prow job via Gangway API to verify payload images match shipment.
Requires APITOKEN environment variable for Prow authentication.

If job_id is provided, queries existing Prow job status (READ-ONLY).
If job_id is not provided, starts new consistency check (WRITE).

Args:
release: Z-stream release version (e.g., "4.19.1")
build_number: Optional specific build number to check status
job_id: Optional Prow job ID to check status

Returns:
Job status information or new job details
"""
args = []
if build_number is not None and build_number != "":
args.extend(["-n", build_number])
if job_id is not None and job_id != "":
args.extend(["-i", job_id])

result = await invoke_oar_command_async(release, "image-consistency-check", args)
return format_result(result)
@@ -1480,7 +1483,7 @@ async def oar_add_issue(
oar_add_issue("4.19.1", "ART build pipeline down - ETA: 2025-01-16", True, None)

# Non-blocking issue (automation improvement)
oar_add_issue("4.19.1", "Jenkins job timeout - retry succeeded", False, "image-consistency-check")
oar_add_issue("4.19.1", "Prow job timeout - retry succeeded", False, "image-consistency-check")
"""
try:
cs = get_cached_configstore(release)