Add agentic deploy scripts, CRDs, and operator integration #1582

Draft
harche wants to merge 1 commit into openshift:main from harche:wt/e2e-testing

@harche harche commented Apr 30, 2026

Adds agentic stack deploy scripts (hack/agentic/), agentic CRDs, LightspeedAgents feature gate, and operator integration for console/sandbox image management.

🤖 Generated with Claude Code

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Apr 30, 2026

openshift-ci Bot commented Apr 30, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci Bot added the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) Apr 30, 2026

openshift-ci Bot commented Apr 30, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign raptorsun for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Comment thread hack/agentic/lib.sh Outdated
update_crds_and_rbac() {
step "Updating CRDs and RBAC"
cd "${OPERATOR_DIR}"
make manifests kustomize >/dev/null 2>&1

Developer experience hint: many commands discard their output, so the script fails without saying why.

My agent suggests something like the following, and I tend to agree:

_run() {
    local _out
    _out=$(mktemp)
    if "$@" >"${_out}" 2>&1; then
        rm -f "${_out}"
    else
        local _rc=$?
        echo -e "    ${RED}${NC} Command failed: $*" >&2
        cat "${_out}" >&2
        rm -f "${_out}"
        return ${_rc}
    fi
}

Author

Done — added _run() wrapper (lines 13-25) that captures stdout/stderr to a tempfile and only surfaces it on failure. All oc apply, oc patch, and make calls that previously redirected to /dev/null now go through _run.
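
To illustrate the behaviour described in this thread, here is a minimal, self-contained sketch of such a wrapper with two sample calls. The colour codes and the `sh -c` stand-in command are assumptions for demonstration only; the real lib.sh may define and use them differently.

```shell
#!/usr/bin/env bash
# Colour codes are assumptions; lib.sh may define them differently.
RED='\033[0;31m'
NC='\033[0m'

# Run a command quietly; replay its captured output only if it fails.
_run() {
    local _out _rc
    _out=$(mktemp)
    if "$@" >"${_out}" 2>&1; then
        rm -f "${_out}"
    else
        _rc=$?    # exit status of the wrapped command
        printf '    %bCommand failed:%b %s\n' "${RED}" "${NC}" "$*" >&2
        cat "${_out}" >&2
        rm -f "${_out}"
        return "${_rc}"
    fi
}

# A passing command prints nothing.
_run true

# A failing command replays its output on stderr and keeps its exit code.
# (sh -c stands in for an oc or make call.)
_run sh -c 'echo "boom: missing CRD"; exit 3' || echo "exit code: $?"
```

Because the wrapper returns the original exit status, callers can still use `||` or `set -e` to stop the script at the failing step.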

Comment thread hack/agentic/lib.sh Outdated
# Step 3: AgenticConfig — Cluster Admin
# Approval policy, console/sandbox images, concurrency limits.
###########################################################################
local AGENT_IMAGE="${INTERNAL_REG}/${NS_OPERATOR}/lightspeed-agentic-sandbox:${TAG}"

Suggested change
- local AGENT_IMAGE="${INTERNAL_REG}/${NS_OPERATOR}/lightspeed-agentic-sandbox:${TAG}"
+ local AGENT_IMAGE="${INTERNAL_REG}/${NS_OPERATOR}/${BC_AGENT}:${TAG}"

Author

Good catch — removed the standalone AGENT_IMAGE and switched to AGENT_IMG which is centralized in lib.sh (line 66), built from ${BC_AGENT} + ${TAG}. All four image vars are now defined once in lib.sh.

Comment thread hack/agentic/lib.sh Outdated
Comment on lines +373 to +374
local OPERATOR_IMG="${INTERNAL_REG}/${NS_OPERATOR}/lightspeed-operator:${TAG}"
local CONSOLE_IMG="${INTERNAL_REG}/${NS_OPERATOR}/lightspeed-console-plugin:${TAG}"

Suggested change
- local OPERATOR_IMG="${INTERNAL_REG}/${NS_OPERATOR}/lightspeed-operator:${TAG}"
- local CONSOLE_IMG="${INTERNAL_REG}/${NS_OPERATOR}/lightspeed-console-plugin:${TAG}"
+ local OPERATOR_IMG="${INTERNAL_REG}/${NS_OPERATOR}/${BC_OPERATOR}:${TAG}"
+ local CONSOLE_IMG="${INTERNAL_REG}/${NS_OPERATOR}/${BC_CONSOLE}:${TAG}"

Given that all the other env vars are set in one place, it might make sense to do the same for the _IMG vars. I also see SKILLS_IMAGE in redeploy.sh; the consistency could help a bit.

I know it's just a dev script, but still…

Author

Done — OPERATOR_IMG, CONSOLE_IMG, AGENT_IMG, and SKILLS_IMG are now centralized globals in lib.sh (lines 64-67), all built from their respective BC_* constants. No more local definitions inside individual functions.

Author

Done — removed the local SKILLS_IMAGE from redeploy-agent.sh and switched to SKILLS_IMG from lib.sh. All four image vars are now defined once in lib.sh and shared across all scripts.
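
As a rough sketch of the centralization discussed in this thread, the image references can be derived once from the BuildConfig names. The BC_OPERATOR, BC_CONSOLE, and BC_AGENT values are inferred from the literals replaced in the suggestions above; the BC_SKILLS value and the registry, namespace, and tag defaults are assumptions for illustration and are not copied from lib.sh.

```shell
#!/usr/bin/env bash
# Defaults below are illustrative assumptions, not the values from lib.sh.
INTERNAL_REG="${INTERNAL_REG:-image-registry.openshift-image-registry.svc:5000}"
NS_OPERATOR="${NS_OPERATOR:-openshift-lightspeed}"
TAG="${TAG:-dev}"

# BuildConfig names double as image names.
BC_OPERATOR="lightspeed-operator"
BC_CONSOLE="lightspeed-console-plugin"
BC_AGENT="lightspeed-agentic-sandbox"
BC_SKILLS="lightspeed-agentic-skills"   # hypothetical name

# Image references, defined once and shared by every script that sources this file.
OPERATOR_IMG="${INTERNAL_REG}/${NS_OPERATOR}/${BC_OPERATOR}:${TAG}"
CONSOLE_IMG="${INTERNAL_REG}/${NS_OPERATOR}/${BC_CONSOLE}:${TAG}"
AGENT_IMG="${INTERNAL_REG}/${NS_OPERATOR}/${BC_AGENT}:${TAG}"
SKILLS_IMG="${INTERNAL_REG}/${NS_OPERATOR}/${BC_SKILLS}:${TAG}"

printf '%s\n' "${OPERATOR_IMG}" "${CONSOLE_IMG}" "${AGENT_IMG}" "${SKILLS_IMG}"
```

With this layout, renaming a BuildConfig or changing the tag touches a single line instead of every function that builds an image reference.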

Comment thread hack/agentic/lib.sh Outdated
-p "{\"spec\":{\"output\":{\"to\":{\"name\":\"${bc_name}:${TAG}\"}}}}" >/dev/null 2>&1
echo " Building ${label} on cluster (uploading source)..."
oc start-build "${bc_name}" -n "${NS_OPERATOR}" \
--from-dir="${from_dir}" --follow \

Needs --wait as well; otherwise it returns 0 even when the build fails.

Author

Done — added --wait alongside --follow in the unified _build sync path (line 234).
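
A minimal sketch of the synchronous call discussed here, assuming a hypothetical helper named build_sync (the PR's own function is _build and is not reproduced in this thread). The --from-dir, --follow, and --wait flags are standard `oc start-build` options; the claim that --follow alone can exit 0 on a failed build comes from the review comment above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build_sync <buildconfig> <namespace> <source-dir>
build_sync() {
    local bc_name="$1" namespace="$2" from_dir="$3"
    # --follow streams the build log; --wait blocks until the build finishes
    # so that a failed build surfaces as a non-zero exit status.
    if ! oc start-build "${bc_name}" -n "${namespace}" \
            --from-dir="${from_dir}" --follow --wait; then
        echo "Build ${bc_name} failed" >&2
        return 1
    fi
}
```

A caller could then write `build_sync lightspeed-operator "${NS_OPERATOR}" . || exit 1` to stop the deploy as soon as a build fails.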

Comment thread hack/agentic/lib.sh Outdated
# Uploads the source directory to a builder pod, which runs the Dockerfile
# natively on amd64 and pushes to the internal registry — no local container
# engine, cross-compilation, registry route, or auth tokens needed.
build_on_cluster() {

Given that build_on_cluster and start_build_async are very similar, how about having just _build sync and _build async, and handling the differences in the specific places where they diverge?

Author

Done — merged build_on_cluster and start_build_async into a single _build function that takes sync or async as its first argument (lines 222-258), eliminating the duplicated patch/skip/build logic.
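
The mode dispatch described in this thread could look roughly like the sketch below. It shows only the sync/async branching; the shared patch and skip logic mentioned above is omitted, and the argument order and namespace default are assumptions rather than the PR's actual signature.

```shell
#!/usr/bin/env bash
NS_OPERATOR="${NS_OPERATOR:-openshift-lightspeed}"   # assumed default namespace

# Sketch: _build <sync|async> <buildconfig> <source-dir>
_build() {
    local mode="$1" bc_name="$2" from_dir="$3"

    # Arguments common to both modes.
    set -- start-build "${bc_name}" -n "${NS_OPERATOR}" "--from-dir=${from_dir}"

    case "${mode}" in
        sync)
            # Stream logs and block until the build finishes.
            set -- "$@" --follow --wait
            ;;
        async)
            # Start the build and return; the caller waits for it later.
            ;;
        *)
            echo "usage: _build sync|async <buildconfig> <source-dir>" >&2
            return 2
            ;;
    esac

    oc "$@"
}
```

Keeping the common arguments in one place means the two modes differ only in the flags appended inside the case statement.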

@openshift-ci openshift-ci Bot removed the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) May 12, 2026
@harche harche changed the title from "WIP: Agentic deploy scripts, Dockerfile.dev, and CRD fixes" to "Add agentic deploy scripts, CRDs, and operator integration" May 12, 2026

harche commented May 12, 2026

Great feedback, @iNecas, thanks. I will update the PR and test with those changes.

@harche harche force-pushed the wt/e2e-testing branch 5 times, most recently from 637a769 to d37f7ad Compare May 12, 2026 19:52
@openshift-ci openshift-ci Bot added the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) May 20, 2026
@harche harche force-pushed the wt/e2e-testing branch 4 times, most recently from 22cf8dc to 73e7884 Compare May 20, 2026 18:24
@openshift-ci openshift-ci Bot removed the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) May 20, 2026
@harche harche force-pushed the wt/e2e-testing branch 2 times, most recently from 3a4ae52 to a13e8ba Compare May 20, 2026 18:50
Comment thread Makefile
export IMG
# ENVTEST_K8S_VERSION refers to the version of kubebuilder assets to be downloaded by envtest binary.
- ENVTEST_K8S_VERSION = 1.27.1
+ ENVTEST_K8S_VERSION = 1.35.0
Author

Bumped from 1.27.1 to match the K8s dependency version (k8s.io/api v0.35.4). The agentic CRDs use CEL functions like format.dns1123Subdomain() that were introduced in K8s 1.31+. Envtest 1.27 rejects these CRDs during test setup.
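
One way to catch this kind of mismatch early is a small version guard run before the tests. The sketch below is hypothetical and not part of the PR; the 1.31.0 minimum is taken from the comment above, and the comparison relies on `sort -V`, which is assumed to be available (GNU coreutils and recent BSD sort provide it).

```shell
#!/usr/bin/env bash
# Assumed minimum, taken from the review comment about CEL format functions.
MIN_ENVTEST_K8S_VERSION="1.31.0"

# True when version $1 is greater than or equal to version $2.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical check: warn when the envtest assets are too old for the CRDs.
check_envtest_version() {
    local version="$1"
    if version_at_least "${version}" "${MIN_ENVTEST_K8S_VERSION}"; then
        echo "envtest ${version} is new enough"
    else
        echo "envtest ${version} is older than ${MIN_ENVTEST_K8S_VERSION}" >&2
        return 1
    fi
}

check_envtest_version "1.35.0"
```

Calling `check_envtest_version "1.27.1"` would return non-zero, which matches the failure mode described in the comment.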

Adds the agentic stack deployment infrastructure to lightspeed-operator:

Deploy scripts (hack/agentic/):
- deploy.sh: Full deploy with on-cluster builds (--provider=vertex|bedrock)
- redeploy-{operator,agent,console,skills,all}.sh: Fast iteration scripts
- undeploy.sh: Teardown with timeout + finalizer cleanup for stuck CRDs
- lib.sh: Shared build helpers with _run() error wrapper, unified _build
  sync|async, centralized image vars, --wait on oc start-build

Operator integration:
- cmd/main.go: Wire agentic controller with --agentic-console-image and
  --agentic-sandbox-image flags
- Add LightspeedAgents to FeatureGate enum in OLSConfig CRD
- Agentic CRDs: ApprovalPolicy, Agent, LLMProvider, Proposal, results
- config/rbac-agentic/: RBAC for agentic controller
- Dockerfile.dev: Local module builds for agentic-operator dependency
- Add FeatureGateLightspeedAgents and agentic image default constants

Demo proposals use find-token skill from quay.io/harpatil/agentic-skills
(TODO: replace with Konflux-built image when available).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Labels

do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress.

2 participants