CCIP-10030 CCIP-10029 Retrying failed jobs CLI #962
mateusz-sekara wants to merge 5 commits into main
Pull request overview
Adds a new ccv job-queue CLI surface to inspect and retry permanently failed verifier jobs stored in Postgres archive tables, complementing the existing ccv chain-statuses CLI.
Changes:
- Adds `job-queue` CLI commands (`list`, `reschedule`) with parsing/rendering helpers and unit tests.
- Implements a Postgres-backed `Store` for listing failed archived jobs and rescheduling them back into active queues.
- Adds an E2E smoke test for the new CLI and wires it into the smoke workflow matrix.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| verifier/cmd/run_ccv_cli.go | Registers the new ccv job-queue subcommand and lazily constructs DB-backed deps. |
| cli/jobqueue/store.go | Defines queue types, archived job model, and Store interface for CLI operations. |
| cli/jobqueue/postgres_store.go | Postgres implementation for listing failed jobs and rescheduling from archive to active tables. |
| cli/jobqueue/commands.go | Implements list and reschedule CLI commands, flag parsing, and table output. |
| cli/jobqueue/commands_test.go | Unit tests for command behavior, parsing, and output rendering. |
| cli/jobqueue/mocks/mock_Store.go | Mockery-generated mock for the Store interface used in unit tests. |
| build/devenv/tests/e2e/smoke_jobqueue_cli_test.go | E2E smoke test that seeds an archived failed job and verifies list/reschedule behavior. |
| .mockery.yaml | Adds mockery config for generating cli/jobqueue mocks. |
| .github/workflows/test-smoke.yaml | Adds a new smoke-test matrix entry intended to run the job-queue E2E test. |

    		args = append(args, ownerID)
    	}

    	query += " ORDER BY created_at DESC"
ListFailed orders archive rows by created_at DESC, but the archive tables are indexed by completed_at and the CLI renders this as the archive timestamp. Ordering by completed_at DESC would better match the semantics (most recently failed first) and align with existing indexes to avoid slow sorts on large archives.
Suggested change:

    -	query += " ORDER BY created_at DESC"
    +	query += " ORDER BY completed_at DESC"
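
To make the suggestion concrete, here is a minimal sketch of how a list query with an optional owner filter and `completed_at` ordering might be assembled. The helper name `buildListFailedQuery`, the table name `jobs_archive`, and the `job_id`, `owner_id`, and `status` columns are assumptions for illustration; only `created_at` and `completed_at` appear in the review. It is not the code from `cli/jobqueue/postgres_store.go`.

```go
package main

import (
	"fmt"
	"strings"
)

// buildListFailedQuery is a hypothetical helper, not the PR's implementation.
// The table name must come from a fixed allowlist in real code, because it is
// interpolated into the SQL text rather than passed as a bind parameter.
func buildListFailedQuery(table, ownerID string) (string, []any) {
	var sb strings.Builder
	args := []any{}

	// Column names here (job_id, owner_id, status) are assumed for the sketch.
	fmt.Fprintf(&sb, "SELECT job_id, owner_id, completed_at FROM %s WHERE status = 'failed'", table)

	// Optional owner filter, added as a numbered Postgres placeholder.
	if ownerID != "" {
		args = append(args, ownerID)
		fmt.Fprintf(&sb, " AND owner_id = $%d", len(args))
	}

	// Most recently failed first; matches the column the review says is indexed.
	sb.WriteString(" ORDER BY completed_at DESC")
	return sb.String(), args
}

func main() {
	query, args := buildListFailedQuery("jobs_archive", "verifier-1")
	fmt.Println(query)
	fmt.Println(args...)
}
```

Whether the sort actually uses an index depends on the real schema, so running `EXPLAIN` against a populated archive table would be the way to confirm it.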

    	archivedAt := "-"
    	if j.ArchivedAt != nil {
    		archivedAt = j.ArchivedAt.Format("2006-01-02T15:04:05Z")
    	}
The time formatting layout 2006-01-02T15:04:05Z hardcodes a literal Z suffix even if the time.Time value isn’t UTC, which can misrepresent timestamps coming back from Postgres. Consider formatting as RFC3339 with offset (e.g., time.RFC3339 / 2006-01-02T15:04:05Z07:00) or converting to UTC before formatting; cli/chainstatuses/commands.go already uses Z07:00 for this reason.
…the insert ON CONFLICT DO NOTHING would delete the archive row but leave the active table unchanged, losing the job with no trace. Removing the clause lets PostgreSQL surface a unique-constraint violation so the operator can fix the state manually. Made-with: Cursor
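
The failure mode described in this commit message can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: the maps stand in for the archive and active tables, the job ID plays the role of the unique key, and every name below is hypothetical. It does not reproduce the PR's SQL or its transaction handling.

```go
package main

import (
	"errors"
	"fmt"
)

// errDuplicate stands in for a PostgreSQL unique-constraint violation.
var errDuplicate = errors.New("unique-constraint violation: job already active")

// queues is a toy stand-in for the archive and active tables (job ID -> payload).
type queues struct {
	archive map[string]string
	active  map[string]string
}

// newQueues seeds a conflicting state: job-1 exists in both tables.
func newQueues() *queues {
	return &queues{
		archive: map[string]string{"job-1": "archived payload"},
		active:  map[string]string{"job-1": "stale payload"},
	}
}

// rescheduleIgnoringConflict mimics INSERT ... ON CONFLICT DO NOTHING followed
// by deleting the archive row: the insert is silently skipped, the delete is not.
func (q *queues) rescheduleIgnoringConflict(id string) error {
	payload, ok := q.archive[id]
	if !ok {
		return fmt.Errorf("job %s not found in archive", id)
	}
	if _, exists := q.active[id]; !exists {
		q.active[id] = payload
	}
	delete(q.archive, id)
	return nil
}

// rescheduleStrict mimics a plain INSERT in a transaction: a conflict returns
// an error and the archive row is left untouched.
func (q *queues) rescheduleStrict(id string) error {
	payload, ok := q.archive[id]
	if !ok {
		return fmt.Errorf("job %s not found in archive", id)
	}
	if _, exists := q.active[id]; exists {
		return errDuplicate
	}
	q.active[id] = payload
	delete(q.archive, id)
	return nil
}

func main() {
	lossy := newQueues()
	fmt.Println(lossy.rescheduleIgnoringConflict("job-1"), len(lossy.archive))

	strict := newQueues()
	fmt.Println(strict.rescheduleStrict("job-1"), len(strict.archive))
}
```

In the first variant the call reports success while the archived payload disappears; in the second the operator sees an error and both rows remain available for manual inspection.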
No description provided.