Skip to content

CM-867: Implement trust-manager resource reconcilers#379

Open
chiragkyal wants to merge 6 commits intoopenshift:masterfrom
chiragkyal:trust-manager-reconcilers
Open

CM-867: Implement trust-manager resource reconcilers#379
chiragkyal wants to merge 6 commits intoopenshift:masterfrom
chiragkyal:trust-manager-reconcilers

Conversation

@chiragkyal
Copy link
Member

@chiragkyal chiragkyal commented Mar 13, 2026

Summary

Changes

  • Added reconcile logic for every trust-manager operand resource
  • SSA Patch will be applied after comparing actual vs expected resources.
  • Added RELATED_IMAGE_CERT_MANAGER_TRUST_MANAGER and
    TRUSTMANAGER_OPERAND_IMAGE_VERSION environment variables to the CSV and manager deployment
    manifest.
  • Added commented-out --unsupported-addon-features="TrustManager=true" flag to
    local-run for easy local testing.
  • Added Unit tests and E2E tests for all the managed resources.

@coderabbitai
Copy link

coderabbitai bot commented Mar 13, 2026

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Review skipped — only excluded labels are configured. (1)
  • do-not-merge/work-in-progress

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 273913ce-3792-47b3-acfa-a1c04e1e8205

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds TrustManager support: new image/version env vars and CSV entries; implements a TrustManager controller (service accounts, RBAC, services, deployment, issuer/certificate, validating webhook), shared validation/util helpers, scheme registration, RBAC role additions, and extensive unit and e2e tests.

Changes

Cohort / File(s) Summary
Build & manifests
Makefile, bundle/manifests/cert-manager-operator.clusterserviceversion.yaml, config/manager/manager.yaml
Add RELATED_IMAGE_CERT_MANAGER_TRUST_MANAGER and TRUSTMANAGER_OPERAND_IMAGE_VERSION; inject env vars into manager and CSV; add relatedImages and operator/container entries for trust-manager; extend CSV RBAC rules.
TrustManager controller implementation
pkg/controller/trustmanager/...
pkg/controller/trustmanager/constants.go, .../controller.go, .../install_trustmanager.go, .../deployments.go, .../rbacs.go, .../serviceaccounts.go, .../services.go, .../certificates.go, .../webhooks.go, .../utils.go
New controller: resource constants and asset wiring; orchestrated reconciliation to create/apply ServiceAccounts, RBAC, Services, Issuer/Certificate, Deployment, ValidatingWebhookConfiguration; SSA-aware apply/patch, metadata & annotation handling, namespace checks, and per-resource modification detection.
Controller tests & test helpers
pkg/controller/trustmanager/*_test.go, pkg/controller/trustmanager/test_utils.go, pkg/controller/trustmanager/*.go (tests)
Add extensive unit tests for deployments, RBAC, services, serviceaccounts, certificates/issuers, webhooks, utils and test utilities supporting those tests.
E2E tests & builders
test/e2e/trustmanager_test.go, test/e2e/utils_test.go
Add broad end-to-end tests covering resource lifecycle, drift reconciliation, deployment configuration, RBAC/webhook verification, and a TrustManager CR builder for test setup.
Common utilities & validation
pkg/controller/common/utils.go, pkg/controller/common/validation.go, pkg/controller/common/core_validation_helpers.go
Add UpdateName and generic DecodeObjBytes[T]; remove HasObjectChanged; introduce centralized validation helpers (nodeSelector, tolerations, resources, affinity, labels, annotations); rename/move package comments.
Operator setup & schemes
pkg/operator/setup_manager.go
Register admissionregistrationv1 and cert-manager types to the scheme; extend managed resources list to include TrustManager-related kinds.
RBAC manifest
config/rbac/role.yaml
Add ClusterRole rules for trust.cert-manager.io resources: bundles, bundles/finalizers, and bundles/status.

Sequence Diagram(s)

sequenceDiagram
    participant Reconciler as TrustManager<br/>Reconciler
    participant Controller as Kubernetes<br/>API
    participant ServiceAccount as ServiceAccount
    participant RBAC as RBAC Resources
    participant Deployment as Deployment
    participant Services as Services
    participant Issuer as Issuer/<br/>Certificate
    participant Webhook as Webhook<br/>Config

    Reconciler->>Reconciler: reconcileTrustManagerDeployment()
    Reconciler->>Reconciler: Validate trust namespace

    Reconciler->>ServiceAccount: createOrApplyServiceAccounts()
    ServiceAccount->>Controller: Apply/Patch ServiceAccount

    Reconciler->>RBAC: createOrApplyRBACResources()
    RBAC->>Controller: Apply ClusterRole/Binding
    RBAC->>Controller: Apply Namespaced Role/Binding
    RBAC->>Controller: Apply LeaderElection Role/Binding

    Reconciler->>Services: createOrApplyServices()
    Services->>Controller: Apply Webhook Service
    Services->>Controller: Apply Metrics Service

    Reconciler->>Issuer: createOrApplyIssuer()
    Issuer->>Controller: Apply Issuer

    Reconciler->>Issuer: createOrApplyCertificate()
    Issuer->>Controller: Apply Certificate

    Reconciler->>Deployment: createOrApplyDeployment()
    Deployment->>Controller: Apply Deployment with config

    Reconciler->>Webhook: createOrApplyValidatingWebhookConfiguration()
    Webhook->>Controller: Apply WebhookConfiguration

    Reconciler->>Controller: updateStatusObservedState()
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 32.73% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Test Structure And Quality ⚠️ Warning Test suite lacks unit tests for 1581 lines of new controller code, combines unrelated behaviors in single test, and 83% of assertions lack diagnostic messages. Add unit tests for all controller files following IstioCSR pattern, split first E2E test into separate tests by responsibility, and add meaningful messages to all assertions.
✅ Passed checks (3 passed)
Check name Status Explanation
Stable And Deterministic Test Names ✅ Passed No Ginkgo test files or test declarations found in PR. All changes are implementation files without test code.
Title check ✅ Passed The title clearly and specifically describes the main change: implementing resource reconcilers for the trust-manager component across multiple Kubernetes resource types.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
📝 Coding Plan
  • Generate coding plan for human review comments

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot requested review from lunarwhite and mytreya-rh March 13, 2026 05:52
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 13, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 13

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
Makefile (1)

62-62: ⚠️ Potential issue | 🟡 Minor

Remove extra quotes from TRUST_MANAGER_VERSION.

The version value has unnecessary quotes that are inconsistent with other version definitions (e.g., CERT_MANAGER_VERSION, ISTIO_CSR_VERSION). This may cause issues when the variable is expanded in shell commands.

🔧 Proposed fix
-TRUST_MANAGER_VERSION ?= "v0.20.3"
+TRUST_MANAGER_VERSION ?= v0.20.3
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` at line 62, The TRUST_MANAGER_VERSION Makefile variable is defined
with extra quotes; update the TRUST_MANAGER_VERSION assignment in the Makefile
so it matches other version vars (e.g., CERT_MANAGER_VERSION, ISTIO_CSR_VERSION)
by removing the surrounding quotes and using TRUST_MANAGER_VERSION ?= v0.20.3 so
shell expansions and comparisons behave consistently.
🧹 Nitpick comments (7)
pkg/controller/common/client.go (1)

100-104: Reflection-based type creation may not work for all object types.

The reflection logic creates a new instance via reflect.New(reflect.TypeOf(obj).Elem()). This assumes obj is always a pointer to a struct. If obj is a non-pointer type or an interface without an underlying struct, this could panic or produce unexpected behavior.

Consider adding a nil check or documenting the constraint that obj must be a non-nil pointer to a struct.

💡 Suggested defensive check
 func (c *ctrlClientImpl) UpdateWithRetry(
 	ctx context.Context, obj client.Object, opts ...client.UpdateOption,
 ) error {
 	key := client.ObjectKeyFromObject(obj)
 	if err := retry.RetryOnConflict(retry.DefaultRetry, func() error {
+		if obj == nil || reflect.TypeOf(obj).Kind() != reflect.Ptr {
+			return fmt.Errorf("obj must be a non-nil pointer")
+		}
 		currentInterface := reflect.New(reflect.TypeOf(obj).Elem()).Interface()
 		current, ok := currentInterface.(client.Object)
 		if !ok {
 			return fmt.Errorf("failed to convert to client.Object")
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/common/client.go` around lines 100 - 104, The reflection code
using reflect.New(reflect.TypeOf(obj).Elem()) can panic or behave unexpectedly
if obj is nil, not a pointer, or not pointing to a struct; update the logic
around currentInterface/current to validate obj first (ensure obj != nil,
TypeOf(obj).Kind() == reflect.Ptr and TypeOf(obj).Elem().Kind() ==
reflect.Struct) and return a clear error if the constraint is violated, or
document that obj must be a non-nil pointer to a struct; keep the existing
conversion to client.Object (current, ok) after these checks.
pkg/controller/common/errors.go (2)

45-55: Simplify redundant string conversion.

fmt.Sprint(err.Error()) is equivalent to err.Error(). The fmt.Sprint call is unnecessary.

♻️ Proposed fix
 func NewMultipleInstanceError(err error) *ReconcileError {
 	if err == nil {
 		return nil
 	}
 	return &ReconcileError{
 		Reason:  MultipleInstanceError,
-		Message: fmt.Sprint(err.Error()),
+		Message: err.Error(),
 		Err:     err,
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/common/errors.go` around lines 45 - 55, The
NewMultipleInstanceError function redundantly wraps err.Error() with fmt.Sprint;
replace fmt.Sprint(err.Error()) with err.Error() in the returned ReconcileError
(in function NewMultipleInstanceError) so Message is set directly from
err.Error(); keep the existing nil-check and preserve Reason:
MultipleInstanceError and Err: err.

24-31: Consider adding Unwrap() method for proper error chain support.

The ReconcileError struct wraps an underlying error but doesn't implement Unwrap(). This means errors.Is() and errors.As() won't traverse to the wrapped error, which may limit error inspection capabilities.

♻️ Proposed addition

Add after the Error() method:

// Unwrap returns the underlying error for error chain traversal.
func (e *ReconcileError) Unwrap() error {
	return e.Err
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/common/errors.go` around lines 24 - 31, ReconcileError wraps
an underlying error but lacks an Unwrap method, preventing
errors.Is()/errors.As() from traversing the wrapped error; add a method func (e
*ReconcileError) Unwrap() error that returns e.Err (place it after the existing
Error() method) so standard error chain functions can inspect the underlying
error; ensure the receiver is a pointer to ReconcileError and it simply returns
the Err field.
pkg/controller/istiocsr/utils.go (1)

136-230: Consider migrating decode functions to use common.DecodeObjBytes.

The local decode*ObjBytes functions (e.g., decodeDeploymentObjBytes, decodeClusterRoleObjBytes, etc.) follow the same pattern and could potentially be replaced with calls to common.DecodeObjBytes[T]. This would further reduce duplication.

However, this is a non-blocking suggestion since the current implementation works correctly.

♻️ Example migration for decodeDeploymentObjBytes
 func decodeDeploymentObjBytes(objBytes []byte) *appsv1.Deployment {
-	obj, err := runtime.Decode(codecs.UniversalDecoder(appsv1.SchemeGroupVersion), objBytes)
-	if err != nil {
-		panic(err)
-	}
-	deployment, ok := obj.(*appsv1.Deployment)
-	if !ok {
-		panic("failed to convert to *appsv1.Deployment")
-	}
-	return deployment
+	return common.DecodeObjBytes[*appsv1.Deployment](codecs, appsv1.SchemeGroupVersion, objBytes)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/istiocsr/utils.go` around lines 136 - 230, The decodeXObjBytes
functions (decodeDeploymentObjBytes, decodeClusterRoleObjBytes,
decodeClusterRoleBindingObjBytes, decodeRoleObjBytes, decodeRoleBindingObjBytes,
decodeServiceObjBytes, decodeServiceAccountObjBytes, decodeCertificateObjBytes)
are duplicative; replace their bodies with calls to the generic helper
common.DecodeObjBytes[T] (e.g.,
common.DecodeObjBytes[appsv1.Deployment](objBytes)) or remove the redundant
functions and call common.DecodeObjBytes[T] where they’re used, ensure you
import the common package, and update callers to the generic version so
duplication is eliminated while preserving panic/error behavior.
pkg/controller/common/utils.go (2)

37-40: Function name is misleading - only compares labels, not full metadata.

ObjectMetadataModified suggests it compares all object metadata, but it only compares labels. Consider renaming to ObjectLabelsModified for clarity, or expanding to compare other metadata fields (annotations, etc.) if that's the intended behavior.

♻️ Option 1: Rename for clarity
-// ObjectMetadataModified compares the labels of two objects and returns true if they differ.
-func ObjectMetadataModified(desired, fetched client.Object) bool {
+// ObjectLabelsModified compares the labels of two objects and returns true if they differ.
+func ObjectLabelsModified(desired, fetched client.Object) bool {
 	return !reflect.DeepEqual(desired.GetLabels(), fetched.GetLabels())
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/common/utils.go` around lines 37 - 40, The function
ObjectMetadataModified only compares labels but the name implies full metadata
comparison; rename ObjectMetadataModified to ObjectLabelsModified and update all
callers to use ObjectLabelsModified (keeping the implementation that returns
!reflect.DeepEqual(desired.GetLabels(), fetched.GetLabels())); alternatively, if
you intended to compare more metadata, expand the function (e.g., add
comparisons for desired.GetAnnotations()/fetched.GetAnnotations() and any other
metadata fields) and keep or rename to ObjectMetadataModified accordingly—ensure
to update references to the function name
(ObjectMetadataModified/ObjectLabelsModified) and adjust tests or callers to
match the chosen change.

63-75: Panic-on-failure pattern is acceptable for static asset decoding but should be documented.

The function panics on decode failure or type mismatch. While this is appropriate for static assets loaded at initialization, callers should be aware of this behavior.

The panic message on line 68 uses *new(T) which works but could be clearer.

♻️ Suggested doc enhancement
 // DecodeObjBytes decodes raw YAML/JSON bytes into a typed Kubernetes object.
-// Panics on decode failure or type mismatch.
+// Panics on decode failure or type mismatch. Use only for static assets
+// loaded at initialization time where failure indicates a programmer error.
 func DecodeObjBytes[T runtime.Object](codecs serializer.CodecFactory, gv schema.GroupVersion, objBytes []byte) T {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/common/utils.go` around lines 63 - 75, The DecodeObjBytes
function panics on decode/type mismatch which is OK for static asset decoding
but needs explicit documentation and clearer panic messages; update the function
comment for DecodeObjBytes to state that it will panic on decode failure or type
mismatch and is intended for static/initialization use, and replace the current
panic messages that use *new(T) with a clearer type representation (use a
reflect-based or fmt-based type string for T) so the error text unambiguously
shows the expected type in both decode and conversion failure cases.
pkg/controller/trustmanager/install_trustmanager.go (1)

59-61: Consider using common.FromClientError for consistency.

The error from updateStatusObservedState is wrapped with fmt.Errorf, while other reconciliation errors in this file use common.FromClientError or common.NewIrrecoverableError. Consider using the common error utilities for consistent error classification and retry behavior.

♻️ Suggested change
 	if err := r.updateStatusObservedState(trustManager); err != nil {
-		return fmt.Errorf("failed to update status observed state: %w", err)
+		return common.FromClientError(err, "failed to update status observed state")
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/trustmanager/install_trustmanager.go` around lines 59 - 61,
The current return wraps the updateStatusObservedState error with fmt.Errorf;
replace that wrapping with the project's error helper so the error is classified
consistently and preserves retry semantics—call common.FromClientError(err,
"failed to update status observed state") (or common.NewIrrecoverableError if
the error should not be retried) where updateStatusObservedState is invoked so
the reconciler uses the same error utilities as other handlers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/operator/v1alpha1/features.go`:
- Line 28: The PreRelease values are inconsistent: FeatureTrustManager is set
with PreRelease: "TechPreview" while another feature uses featuregate.GA; make
them consistent by either (a) introducing a package-level constant (e.g., const
PreReleaseTechPreview = "TechPreview") and use PreRelease: PreReleaseTechPreview
for FeatureTrustManager, or (b) change the other feature that currently uses
featuregate.GA to use the equivalent string literal so all PreRelease fields use
the same string style; update references to the PreRelease value in features.go
(look for FeatureTrustManager and the feature using featuregate.GA) accordingly.

In `@pkg/controller/common/utils.go`:
- Around line 28-35: The inconsistency is that HasObjectChanged(desired,
fetched) in pkg/controller/common/utils.go returns false when types differ,
while the local hasObjectChanged in istiocsr/utils.go panics on type mismatch;
fix by aligning behavior: update hasObjectChanged in istiocsr/utils.go to
delegate to the common HasObjectChanged (or perform the same reflect.TypeOf
check and return false) instead of panicking, so both functions have identical
type-mismatch handling and avoid duplicated logic.

In `@pkg/controller/istiocsr/controller.go`:
- Around line 84-85: The predicate uses common.ManagedResourceLabelKey (which
resolves to the generic "app" key) to detect operator-owned resources; replace
that with a new operator-specific label constant (e.g., IstioCSRManagedLabelKey
or ManagedByIstioCSRKey) defined in the constants file and use it everywhere you
currently check objLabels[common.ManagedResourceLabelKey] ==
RequestEnqueueLabelValue (including the labelOk closure and the other predicate
block around RequestEnqueueLabelValue). Add the new constant to the constants
file, update the checks in controller.go to reference the new constant (and
RequestEnqueueLabelValue), and run/update any tests that assert label-based
behavior to use the new operator-specific label.

In `@pkg/controller/istiocsr/deployments.go`:
- Around line 376-385: The type-assertion failures for clusterIssuer and issuer
are internal invariant violations and should return an irrecoverable error
instead of a retryable client error; update the two branches that check
clusterIssuer (variable clusterIssuer) and issuer (variable issuer, case
issuerKind) to return an irrecoverable error (e.g., use the project's
irrecoverable error constructor such as common.NewIrrecoverableError or
equivalent) rather than common.FromClientError so these failures won't be
treated as retryable.
- Around line 321-324: r.Get can fail transiently so don't wrap every error as
irrecoverable; instead, after calling r.Get(ctx, sourceConfigMapKey,
sourceConfigMap) check the returned err and only return
common.NewIrrecoverableError for the actual misconfiguration case (e.g., when
the ConfigMap exists but is malformed/missing expected keys) and for all other
client/API errors return common.FromClientError(err, "failed to fetch CA
certificate ConfigMap %s/%s", sourceConfigMapKey.Namespace,
sourceConfigMapKey.Name); in short: replace the unconditional
NewIrrecoverableError usage with a branch that detects the misconfiguration
condition and uses NewIrrecoverableError there, and uses common.FromClientError
for transient/client errors from r.Get.

In `@pkg/controller/trustmanager/deployments.go`:
- Around line 233-249: The helper containerPortsMatch currently only checks
ContainerPort and must be updated to compare the full corev1.ContainerPort spec
to detect drift; modify containerPortsMatch to require same length and for each
desired port find an existing port with equal Name, ContainerPort, HostPort,
Protocol (and HostIP if used) — or simply perform a full struct equality (e.g.
reflect.DeepEqual) between desired and matching existing ports — so
name/protocol/hostPort differences no longer get ignored.
- Around line 55-60: The code currently silently no-ops when the baked-in
container named by trustManagerContainerName is missing, so change the update
helpers (updateDeploymentArgs, updateTLSSecretVolume, updateImage) and/or add a
pre-check in getDeploymentObject to explicitly detect whether a container with
trustManagerContainerName exists and return an error if not; locate the loop or
lookup that references containers in the
Deployment.Spec.Template.Spec.Containers, and if you cannot find a container
matching trustManagerContainerName, return a descriptive error (propagate it up
so getDeploymentObject fails) instead of silently continuing, ensuring a missing
asset container surfaces as a hard error and prevents applying the Deployment
with defaults.

In `@pkg/controller/trustmanager/utils.go`:
- Around line 184-199: The function currently always returns
common.NewMultipleInstanceError(statusMessage) when ignoreProcessing is false,
which incorrectly rejects the winning singleton; change the control flow so that
when this TrustManager instance is the chosen/oldest (i.e., the winning
singleton) you return nil instead of NewMultipleInstanceError. Concretely, after
the ignoreProcessing block check the condition that determines this instance is
the active singleton (use the same logic/flags already computed around
ignoreProcessing/trustManager) and return nil for that case; only return
common.NewMultipleInstanceError(fmt.Errorf("%s", statusMessage)) for non-winning
instances. Ensure the successful update path (updateCondition/updateWithRetry)
remains unchanged and keep the references to trustManager, statusMessage,
updateCondition, addProcessingRejectedAnnotation, and UpdateWithRetry.
- Around line 129-133: The validateTrustManagerConfig function currently rejects
a zero-valued TrustManagerConfig which breaks reconciliation and e2e tests;
update the code so empty/zero TrustManagerConfig is considered valid—either
remove the call to validateTrustManagerConfig from
reconcileTrustManagerDeployment or change validateTrustManagerConfig to always
return nil (accept empty config) and keep any other semantic validations if
needed; ensure the change targets the validateTrustManagerConfig function and
the reconcileTrustManagerDeployment call site so an empty
TrustManager.Spec.TrustManagerConfig passes validation.

In `@pkg/operator/setup_manager.go`:
- Around line 196-230: The map key uses client.Object pointer instances in
addControllerCacheConfig so identical resource types from different controllers
don't match (different pointer addresses), causing label selectors to be
overwritten instead of merged; change the keying to be type-based (e.g., use
reflect.TypeOf(res) or a GVK string) so the lookup/merge logic operates on the
resource type rather than the object pointer, update objectList's key type and
all access sites in addControllerCacheConfig to use that type-based key, and
preserve the existing merge logic (existing.Label.Requirements(),
labels.NewRequirement(...), etc.) so multiple controllers' label values are
combined.

In `@pkg/operator/starter.go`:
- Around line 156-160: The manager is being started with its own signal handler
via ctrl.SetupSignalHandler(), which creates a separate context and can cause
shutdown inconsistencies; change the goroutine that calls manager.Start so it
uses the passed ctx (from controllercmd.ControllerContext) instead of
ctrl.SetupSignalHandler() — i.e., call manager.Start(ctx) (or the appropriate
context variable name) so the unified controller manager respects the parent
context cancellation, and remove/avoid creating an independent signal handler.
Ensure the goroutine and any related shutdown logic reference the same ctx
variable.

In `@test/e2e/trustmanager_test.go`:
- Around line 206-210: The test should fail fast when the controller reports
Degraded=True instead of continuing to poll; replace the current early-return
(return false, nil) inside the degradedCondition check with a failing return
(return false, error) that includes the condition's Reason and Message so CI
shows the cause; use the existing symbols meta.FindStatusCondition,
trustManagerStatus.Conditions, v1alpha1.Degraded and metav1.ConditionTrue and
construct an error (e.g. via fmt.Errorf) containing degradedCondition.Reason and
degradedCondition.Message and add the fmt import if missing.
- Around line 44-58: BeforeAll currently patches the Subscription but never
restores the original UNSUPPORTED_ADDON_FEATURES value; capture the current
value before calling patchSubscriptionWithEnvVars (e.g., read the Subscription
and store its UNSUPPORTED_ADDON_FEATURES into a package-level var) and add a
matching AfterAll that calls patchSubscriptionWithEnvVars to restore that saved
value (or remove the key if it was empty) and then call
waitForDeploymentEnvVarAndRollout to wait for the operator to return to the
original env state; reference BeforeAll, patchSubscriptionWithEnvVars and
waitForDeploymentEnvVarAndRollout when locating where to add the capture and
restore logic.

---

Outside diff comments:
In `@Makefile`:
- Line 62: The TRUST_MANAGER_VERSION Makefile variable is defined with extra
quotes; update the TRUST_MANAGER_VERSION assignment in the Makefile so it
matches other version vars (e.g., CERT_MANAGER_VERSION, ISTIO_CSR_VERSION) by
removing the surrounding quotes and using TRUST_MANAGER_VERSION ?= v0.20.3 so
shell expansions and comparisons behave consistently.

---

Nitpick comments:
In `@pkg/controller/common/client.go`:
- Around line 100-104: The reflection code using
reflect.New(reflect.TypeOf(obj).Elem()) can panic or behave unexpectedly if obj
is nil, not a pointer, or not pointing to a struct; update the logic around
currentInterface/current to validate obj first (ensure obj != nil,
TypeOf(obj).Kind() == reflect.Ptr and TypeOf(obj).Elem().Kind() ==
reflect.Struct) and return a clear error if the constraint is violated, or
document that obj must be a non-nil pointer to a struct; keep the existing
conversion to client.Object (current, ok) after these checks.

In `@pkg/controller/common/errors.go`:
- Around line 45-55: The NewMultipleInstanceError function redundantly wraps
err.Error() with fmt.Sprint; replace fmt.Sprint(err.Error()) with err.Error() in
the returned ReconcileError (in function NewMultipleInstanceError) so Message is
set directly from err.Error(); keep the existing nil-check and preserve Reason:
MultipleInstanceError and Err: err.
- Around line 24-31: ReconcileError wraps an underlying error but lacks an
Unwrap method, preventing errors.Is()/errors.As() from traversing the wrapped
error; add a method func (e *ReconcileError) Unwrap() error that returns e.Err
(place it after the existing Error() method) so standard error chain functions
can inspect the underlying error; ensure the receiver is a pointer to
ReconcileError and it simply returns the Err field.

In `@pkg/controller/common/utils.go`:
- Around line 37-40: The function ObjectMetadataModified only compares labels
but the name implies full metadata comparison; rename ObjectMetadataModified to
ObjectLabelsModified and update all callers to use ObjectLabelsModified (keeping
the implementation that returns !reflect.DeepEqual(desired.GetLabels(),
fetched.GetLabels())); alternatively, if you intended to compare more metadata,
expand the function (e.g., add comparisons for
desired.GetAnnotations()/fetched.GetAnnotations() and any other metadata fields)
and keep or rename to ObjectMetadataModified accordingly—ensure to update
references to the function name (ObjectMetadataModified/ObjectLabelsModified)
and adjust tests or callers to match the chosen change.
- Around line 63-75: The DecodeObjBytes function panics on decode/type mismatch
which is OK for static asset decoding but needs explicit documentation and
clearer panic messages; update the function comment for DecodeObjBytes to state
that it will panic on decode failure or type mismatch and is intended for
static/initialization use, and replace the current panic messages that use
*new(T) with a clearer type representation (use a reflect-based or fmt-based
type string for T) so the error text unambiguously shows the expected type in
both decode and conversion failure cases.

In `@pkg/controller/istiocsr/utils.go`:
- Around line 136-230: The decodeXObjBytes functions (decodeDeploymentObjBytes,
decodeClusterRoleObjBytes, decodeClusterRoleBindingObjBytes, decodeRoleObjBytes,
decodeRoleBindingObjBytes, decodeServiceObjBytes, decodeServiceAccountObjBytes,
decodeCertificateObjBytes) are duplicative; replace their bodies with calls to
the generic helper common.DecodeObjBytes[T] (e.g.,
common.DecodeObjBytes[appsv1.Deployment](objBytes)) or remove the redundant
functions and call common.DecodeObjBytes[T] where they’re used, ensure you
import the common package, and update callers to the generic version so
duplication is eliminated while preserving panic/error behavior.

In `@pkg/controller/trustmanager/install_trustmanager.go`:
- Around line 59-61: The current return wraps the updateStatusObservedState
error with fmt.Errorf; replace that wrapping with the project's error helper so
the error is classified consistently and preserves retry semantics—call
common.FromClientError(err, "failed to update status observed state") (or
common.NewIrrecoverableError if the error should not be retried) where
updateStatusObservedState is invoked so the reconciler uses the same error
utilities as other handlers.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6efa00d3-a92f-40fc-bc0f-c86732c30476

📥 Commits

Reviewing files that changed from the base of the PR and between 8eb5d9e and b48171b.

⛔ Files ignored due to path filters (9)
  • go.work.sum is excluded by !**/*.sum
  • tools/go.sum is excluded by !**/*.sum
  • vendor/github.com/maxbrunsfeld/counterfeiter/v6/.gitignore is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/maxbrunsfeld/counterfeiter/v6/README.md is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/maxbrunsfeld/counterfeiter/v6/arguments/parser.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/maxbrunsfeld/counterfeiter/v6/generator/fake.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/maxbrunsfeld/counterfeiter/v6/generator/interface_template.go is excluded by !**/vendor/**, !vendor/**
  • vendor/github.com/maxbrunsfeld/counterfeiter/v6/generator/loader.go is excluded by !**/vendor/**, !vendor/**
  • vendor/modules.txt is excluded by !**/vendor/**, !vendor/**
📒 Files selected for processing (50)
  • Makefile
  • api/operator/v1alpha1/features.go
  • bundle/manifests/cert-manager-operator.clusterserviceversion.yaml
  • config/manager/manager.yaml
  • config/rbac/role.yaml
  • pkg/controller/common/client.go
  • pkg/controller/common/constants.go
  • pkg/controller/common/errors.go
  • pkg/controller/common/fakes/fake_ctrl_client.go
  • pkg/controller/common/utils.go
  • pkg/controller/istiocsr/certificates.go
  • pkg/controller/istiocsr/certificates_test.go
  • pkg/controller/istiocsr/controller.go
  • pkg/controller/istiocsr/controller_test.go
  • pkg/controller/istiocsr/deployments.go
  • pkg/controller/istiocsr/deployments_test.go
  • pkg/controller/istiocsr/install_instiocsr_test.go
  • pkg/controller/istiocsr/install_istiocsr.go
  • pkg/controller/istiocsr/networkpolicies.go
  • pkg/controller/istiocsr/rbacs.go
  • pkg/controller/istiocsr/rbacs_test.go
  • pkg/controller/istiocsr/serviceaccounts.go
  • pkg/controller/istiocsr/serviceaccounts_test.go
  • pkg/controller/istiocsr/services.go
  • pkg/controller/istiocsr/services_test.go
  • pkg/controller/istiocsr/utils.go
  • pkg/controller/trustmanager/certificates.go
  • pkg/controller/trustmanager/certificates_test.go
  • pkg/controller/trustmanager/constants.go
  • pkg/controller/trustmanager/controller.go
  • pkg/controller/trustmanager/controller_test.go
  • pkg/controller/trustmanager/deployments.go
  • pkg/controller/trustmanager/deployments_test.go
  • pkg/controller/trustmanager/install_trustmanager.go
  • pkg/controller/trustmanager/rbacs.go
  • pkg/controller/trustmanager/rbacs_test.go
  • pkg/controller/trustmanager/serviceaccounts.go
  • pkg/controller/trustmanager/serviceaccounts_test.go
  • pkg/controller/trustmanager/services.go
  • pkg/controller/trustmanager/services_test.go
  • pkg/controller/trustmanager/test_utils.go
  • pkg/controller/trustmanager/utils.go
  • pkg/controller/trustmanager/utils_test.go
  • pkg/controller/trustmanager/webhooks.go
  • pkg/controller/trustmanager/webhooks_test.go
  • pkg/features/features_test.go
  • pkg/operator/setup_manager.go
  • pkg/operator/starter.go
  • test/e2e/trustmanager_test.go
  • tools/go.mod

@chiragkyal chiragkyal force-pushed the trust-manager-reconcilers branch from b48171b to 198894c Compare March 13, 2026 08:00
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 13, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
Makefile (1)

62-62: ⚠️ Potential issue | 🟡 Minor

Remove quotes around version string for consistency.

TRUST_MANAGER_VERSION has quotes around the value ("v0.20.3") while other version variables like CERT_MANAGER_VERSION and ISTIO_CSR_VERSION don't. This inconsistency may cause the literal quotes to be included in image tags when the variable is expanded.

🐛 Proposed fix
-TRUST_MANAGER_VERSION ?= "v0.20.3"
+TRUST_MANAGER_VERSION ?= v0.20.3
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` at line 62, The TRUST_MANAGER_VERSION variable in the Makefile is
set with surrounding quotes which can inject literal quotes into expansions;
change TRUST_MANAGER_VERSION ?= "v0.20.3" to an unquoted value
TRUST_MANAGER_VERSION ?= v0.20.3 to match other variables (e.g.,
CERT_MANAGER_VERSION, ISTIO_CSR_VERSION) and ensure image tags or string
concatenations don't include stray quotes, then run a quick make target that
uses TRUST_MANAGER_VERSION to verify correct expansion.
🧹 Nitpick comments (4)
pkg/controller/trustmanager/test_utils.go (1)

55-62: Consider documenting that tests must inject a client separately.

The testReconciler factory creates a Reconciler without initializing the CtrlClient field. Tests calling methods like r.Exists() or r.Patch() will panic unless they separately inject a client. This appears intentional for flexibility, but a brief comment would help future maintainers understand the expected usage pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/trustmanager/test_utils.go` around lines 55 - 62, Add a short
comment to the testReconciler factory explaining that it intentionally does not
initialize the Reconciler.CtrlClient and that tests must inject a fake or real
client before calling client-dependent methods (e.g., Reconciler.Exists,
Reconciler.Patch); place the comment immediately above the testReconciler
function and reference the Reconciler type and CtrlClient field so future
maintainers understand the expected usage pattern.
pkg/controller/trustmanager/services_test.go (1)

105-116: Reduce call-order coupling in Exists stubs.

Multiple cases rely on existsCall sequencing to choose webhook vs metrics objects. Prefer branching by key.Name so tests stay stable if reconciliation call order changes.

Refactor example
- existsCall := 0
  m.ExistsCalls(func(ctx context.Context, key client.ObjectKey, obj client.Object) (bool, error) {
-   existsCall++
-   var svc *corev1.Service
-   if existsCall == 1 {
-     svc = getWebhookServiceObject(testResourceLabels(), testResourceAnnotations())
-   } else {
-     svc = getMetricsServiceObject(testResourceLabels(), testResourceAnnotations())
-   }
+   var svc *corev1.Service
+   switch key.Name {
+   case trustManagerServiceName:
+     svc = getWebhookServiceObject(testResourceLabels(), testResourceAnnotations())
+   case trustManagerMetricsServiceName:
+     svc = getMetricsServiceObject(testResourceLabels(), testResourceAnnotations())
+   default:
+     return false, nil
+   }
    svc.DeepCopyInto(obj.(*corev1.Service))
    return true, nil
  })

Also applies to: 124-136, 149-160, 169-180, 189-200

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/trustmanager/services_test.go` around lines 105 - 116, The
current Exists stub in m.ExistsCalls uses an incremental existsCall counter to
choose between webhook vs metrics Service fixtures (existsCall variable and the
anonymous function passed to m.ExistsCalls), which couples tests to call order;
change the stub to inspect the incoming client.ObjectKey (key.Name) and branch
on that (e.g., if key.Name == "<webhook-service-name>" use
getWebhookServiceObject(...), else if key.Name == "<metrics-service-name>" use
getMetricsServiceObject(...)) so it returns the correct Service into
obj.(*corev1.Service) regardless of call sequencing; apply the same
key.Name-based branching fix to the other Exists stubs referenced (the blocks
around lines 124-136, 149-160, 169-180, 189-200) so all tests are stable when
reconciliation call order changes.
pkg/controller/trustmanager/deployments_test.go (1)

475-477: Make the “missing image” case deterministic by explicitly clearing the env var.

The setImage=false branch currently relies on ambient process env state. Explicitly unset/clear trustManagerImageNameEnvVarName in that branch to avoid environment-dependent failures.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/trustmanager/deployments_test.go` around lines 475 - 477, The
test relies on tt.setImage to decide whether to set the env var
trustManagerImageNameEnvVarName, but when setImage is false it leaves the
process env untouched which can cause nondeterministic failures; in the else
branch of that if (where tt.setImage is false) explicitly clear the environment
by calling os.Unsetenv(trustManagerImageNameEnvVarName) (or otherwise ensure the
variable is removed) so the test no longer depends on ambient process env;
update deployments_test.go around the tt.setImage block that currently sets
testImage to use this explicit unset when not setting the image.
pkg/controller/trustmanager/rbacs_test.go (1)

587-607: Consider failing fast if multiple ServiceAccount subjects exist.

The helper iterates through all subjects but only validates the first ServiceAccount found. If multiple ServiceAccount subjects exist unexpectedly, the test would pass on the first match without flagging potential issues.

🔧 Suggested improvement
 func assertSubjects(t *testing.T, subjects []rbacv1.Subject, expectedName, expectedNamespace string) {
 	t.Helper()
 	if len(subjects) == 0 {
 		t.Fatal("expected at least one subject")
 	}
-	found := false
+	var found *rbacv1.Subject
 	for _, s := range subjects {
 		if s.Kind == "ServiceAccount" {
-			found = true
-			if s.Name != expectedName {
-				t.Errorf("expected subject name %q, got %q", expectedName, s.Name)
-			}
-			if s.Namespace != expectedNamespace {
-				t.Errorf("expected subject namespace %q, got %q", expectedNamespace, s.Namespace)
+			if found != nil {
+				t.Error("expected exactly one ServiceAccount subject, found multiple")
 			}
+			found = &s
 		}
 	}
-	if !found {
+	if found == nil {
 		t.Error("expected to find a ServiceAccount subject")
+		return
+	}
+	if found.Name != expectedName {
+		t.Errorf("expected subject name %q, got %q", expectedName, found.Name)
+	}
+	if found.Namespace != expectedNamespace {
+		t.Errorf("expected subject namespace %q, got %q", expectedNamespace, found.Namespace)
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/trustmanager/rbacs_test.go` around lines 587 - 607, The helper
assertSubjects currently scans subjects and validates only the first
ServiceAccount it finds, which lets tests pass if multiple ServiceAccount
subjects exist; update assertSubjects to detect and fail when more than one
ServiceAccount is present (or alternatively require exactly one) and validate
all ServiceAccount fields deterministically: count ServiceAccount occurrences
while iterating the subjects slice, if count == 0 call t.Error as before, if
count > 1 call t.Fatalf or t.Errorf to fail fast, and continue to assert that
the (single) ServiceAccount's Name and Namespace match
expectedName/expectedNamespace; reference the assertSubjects function and its
subjects parameter to locate where to add the counting and failure logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/controller/trustmanager/controller.go`:
- Around line 105-119: The current use of withIgnoreStatusUpdatePredicates
(which is builder.WithPredicates(predicate.GenerationChangedPredicate{},
controllerManagedResources)) on the Deployment, Certificate, and Issuer watches
suppresses metadata-only updates; replace the third argument in Watches for
&appsv1.Deployment{}, &certmanagerv1.Certificate{}, and &certmanagerv1.Issuer{}
to use controllerManagedResourcePredicates (or a custom predicate that also
reacts to label/annotation changes) instead of withIgnoreStatusUpdatePredicates
so metadata-only drift will enqueue reconciles; modify the Watches lines
referencing withIgnoreStatusUpdatePredicates accordingly (or implement and
reference a new predicate function that detects metadata changes if preferred).

---

Outside diff comments:
In `@Makefile`:
- Line 62: The TRUST_MANAGER_VERSION variable in the Makefile is set with
surrounding quotes which can inject literal quotes into expansions; change
TRUST_MANAGER_VERSION ?= "v0.20.3" to an unquoted value TRUST_MANAGER_VERSION ?=
v0.20.3 to match other variables (e.g., CERT_MANAGER_VERSION, ISTIO_CSR_VERSION)
and ensure image tags or string concatenations don't include stray quotes, then
run a quick make target that uses TRUST_MANAGER_VERSION to verify correct
expansion.

---

Nitpick comments:
In `@pkg/controller/trustmanager/deployments_test.go`:
- Around line 475-477: The test relies on tt.setImage to decide whether to set
the env var trustManagerImageNameEnvVarName, but when setImage is false it
leaves the process env untouched which can cause nondeterministic failures; in
the else branch of that if (where tt.setImage is false) explicitly clear the
environment by calling os.Unsetenv(trustManagerImageNameEnvVarName) (or
otherwise ensure the variable is removed) so the test no longer depends on
ambient process env; update deployments_test.go around the tt.setImage block
that currently sets testImage to use this explicit unset when not setting the
image.

In `@pkg/controller/trustmanager/rbacs_test.go`:
- Around line 587-607: The helper assertSubjects currently scans subjects and
validates only the first ServiceAccount it finds, which lets tests pass if
multiple ServiceAccount subjects exist; update assertSubjects to detect and fail
when more than one ServiceAccount is present (or alternatively require exactly
one) and validate all ServiceAccount fields deterministically: count
ServiceAccount occurrences while iterating the subjects slice, if count == 0
call t.Error as before, if count > 1 call t.Fatalf or t.Errorf to fail fast, and
continue to assert that the (single) ServiceAccount's Name and Namespace match
expectedName/expectedNamespace; reference the assertSubjects function and its
subjects parameter to locate where to add the counting and failure logic.

In `@pkg/controller/trustmanager/services_test.go`:
- Around line 105-116: The current Exists stub in m.ExistsCalls uses an
incremental existsCall counter to choose between webhook vs metrics Service
fixtures (existsCall variable and the anonymous function passed to
m.ExistsCalls), which couples tests to call order; change the stub to inspect
the incoming client.ObjectKey (key.Name) and branch on that (e.g., if key.Name
== "<webhook-service-name>" use getWebhookServiceObject(...), else if key.Name
== "<metrics-service-name>" use getMetricsServiceObject(...)) so it returns the
correct Service into obj.(*corev1.Service) regardless of call sequencing; apply
the same key.Name-based branching fix to the other Exists stubs referenced (the
blocks around lines 124-136, 149-160, 169-180, 189-200) so all tests are stable
when reconciliation call order changes.

In `@pkg/controller/trustmanager/test_utils.go`:
- Around line 55-62: Add a short comment to the testReconciler factory
explaining that it intentionally does not initialize the Reconciler.CtrlClient
and that tests must inject a fake or real client before calling client-dependent
methods (e.g., Reconciler.Exists, Reconciler.Patch); place the comment
immediately above the testReconciler function and reference the Reconciler type
and CtrlClient field so future maintainers understand the expected usage
pattern.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d1eb27f9-4095-4c34-8155-591dc092dfc6

📥 Commits

Reviewing files that changed from the base of the PR and between b48171b and 198894c.

📒 Files selected for processing (24)
  • Makefile
  • bundle/manifests/cert-manager-operator.clusterserviceversion.yaml
  • config/manager/manager.yaml
  • pkg/controller/common/utils.go
  • pkg/controller/trustmanager/certificates.go
  • pkg/controller/trustmanager/certificates_test.go
  • pkg/controller/trustmanager/constants.go
  • pkg/controller/trustmanager/controller.go
  • pkg/controller/trustmanager/controller_test.go
  • pkg/controller/trustmanager/deployments.go
  • pkg/controller/trustmanager/deployments_test.go
  • pkg/controller/trustmanager/install_trustmanager.go
  • pkg/controller/trustmanager/rbacs.go
  • pkg/controller/trustmanager/rbacs_test.go
  • pkg/controller/trustmanager/serviceaccounts.go
  • pkg/controller/trustmanager/serviceaccounts_test.go
  • pkg/controller/trustmanager/services.go
  • pkg/controller/trustmanager/services_test.go
  • pkg/controller/trustmanager/test_utils.go
  • pkg/controller/trustmanager/utils.go
  • pkg/controller/trustmanager/utils_test.go
  • pkg/controller/trustmanager/webhooks.go
  • pkg/controller/trustmanager/webhooks_test.go
  • pkg/operator/setup_manager.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • pkg/controller/trustmanager/webhooks_test.go
  • pkg/controller/trustmanager/serviceaccounts_test.go
  • pkg/operator/setup_manager.go
  • pkg/controller/trustmanager/webhooks.go

@chiragkyal chiragkyal force-pushed the trust-manager-reconcilers branch from 198894c to 53cb39e Compare March 13, 2026 08:36
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
Makefile (1)

318-319: Make local-run debug/feature flags configurable instead of commented.

Line 318 and Line 319 are useful but currently require file edits. Consider routing them through a variable so developers can toggle at invocation time.

♻️ Suggested tweak
-#		--unsupported-addon-features="TrustManager=true" \
-#		--v=4
+		$(LOCAL_RUN_ARGS)

Add once near this target (outside this range):

LOCAL_RUN_ARGS ?=
# Example:
# make local-run LOCAL_RUN_ARGS='--unsupported-addon-features=TrustManager=true --v=4'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 318 - 319, The commented debug flags are currently
hard-coded and require editing the Makefile; introduce a configurable variable
(e.g., LOCAL_RUN_ARGS) and update the local-run target to append this variable
when invoking the command so developers can pass flags at invocation time (for
example: make local-run
LOCAL_RUN_ARGS='--unsupported-addon-features=TrustManager=true --v=4'); add a
default declaration like LOCAL_RUN_ARGS ?= near the top of the target and
replace the commented flags in the local-run invocation with $(LOCAL_RUN_ARGS)
so toggling is possible without file edits.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@Makefile`:
- Around line 318-319: The commented debug flags are currently hard-coded and
require editing the Makefile; introduce a configurable variable (e.g.,
LOCAL_RUN_ARGS) and update the local-run target to append this variable when
invoking the command so developers can pass flags at invocation time (for
example: make local-run
LOCAL_RUN_ARGS='--unsupported-addon-features=TrustManager=true --v=4'); add a
default declaration like LOCAL_RUN_ARGS ?= near the top of the target and
replace the commented flags in the local-run invocation with $(LOCAL_RUN_ARGS)
so toggling is possible without file edits.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2ea7a87d-3bd2-4897-a939-01bdee005e40

📥 Commits

Reviewing files that changed from the base of the PR and between 198894c and 53cb39e.

📒 Files selected for processing (3)
  • Makefile
  • bundle/manifests/cert-manager-operator.clusterserviceversion.yaml
  • config/manager/manager.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
  • config/manager/manager.yaml
  • bundle/manifests/cert-manager-operator.clusterserviceversion.yaml

@chiragkyal chiragkyal changed the title Implement trust-manager resource reconcilers CM-867: Implement trust-manager resource reconcilers Mar 13, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 13, 2026
@openshift-ci-robot
Copy link

openshift-ci-robot commented Mar 13, 2026

@chiragkyal: This pull request references CM-867 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target either version "4.22." or "openshift-4.22.", but it targets "cert-manager-1.19" instead.

Details

In response to this:

Summary

This PR builds on the trust-manager controller framework introduced in #371 by implementing the resource
reconcilers that create and maintain the lifecycle to run the trust-manager operand.

Changes

  • Added reconcile logic for every trust-manager operand resource
  • SSA Patch will be applied after comparing actual vs expected resources.
  • Added RELATED_IMAGE_CERT_MANAGER_TRUST_MANAGER and
    TRUSTMANAGER_OPERAND_IMAGE_VERSION environment variables to the CSV and manager deployment
    manifest.
  • Added commented-out --unsupported-addon-features="TrustManager=true" flag to
    local-run for easy local testing.

Summary by CodeRabbit

Release Notes

  • New Features
  • Added Trust Manager v0.20.3 integration support with automated resource management
  • Operator now manages Trust Manager deployments, services, RBAC policies, and webhook configurations
  • Extended controller to monitor and reconcile Trust Manager-related Kubernetes resources

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Copy link

openshift-ci-robot commented Mar 13, 2026

@chiragkyal: This pull request references CM-867 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target either version "4.22." or "openshift-4.22.", but it targets "cert-manager-1.19" instead.

Details

In response to this:

Summary

Changes

  • Added reconcile logic for every trust-manager operand resource
  • SSA Patch will be applied after comparing actual vs expected resources.
  • Added RELATED_IMAGE_CERT_MANAGER_TRUST_MANAGER and
    TRUSTMANAGER_OPERAND_IMAGE_VERSION environment variables to the CSV and manager deployment
    manifest.
  • Added commented-out --unsupported-addon-features="TrustManager=true" flag to
    local-run for easy local testing.

Summary by CodeRabbit

Release Notes

  • New Features
  • Added Trust Manager v0.20.3 integration support with automated resource management
  • Operator now manages Trust Manager deployments, services, RBAC policies, and webhook configurations
  • Extended controller to monitor and reconcile Trust Manager-related Kubernetes resources

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Copy link

openshift-ci-robot commented Mar 13, 2026

@chiragkyal: This pull request references CM-867 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target either version "4.22." or "openshift-4.22.", but it targets "cert-manager-1.19" instead.

Details

In response to this:

Summary

Changes

  • Added reconcile logic for every trust-manager operand resource
  • SSA Patch will be applied after comparing actual vs expected resources.
  • Added RELATED_IMAGE_CERT_MANAGER_TRUST_MANAGER and
    TRUSTMANAGER_OPERAND_IMAGE_VERSION environment variables to the CSV and manager deployment
    manifest.
  • Added commented-out --unsupported-addon-features="TrustManager=true" flag to
    local-run for easy local testing.

Summary by CodeRabbit

Release Notes

  • New Features
  • Added Trust Manager v0.20.3 integration support with automated resource management
  • Operator now manages Trust Manager deployments, services, RBAC policies, and webhook configurations
  • Extended controller to monitor and reconcile Trust Manager-related Kubernetes resources

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@chiragkyal
Copy link
Member Author

/cc @bharath-b-rh @emmajiafan

@chiragkyal
Copy link
Member Author

/retest

@openshift-ci-robot
Copy link

@chiragkyal: An error was encountered searching for bug CM-867 on the Jira server at https://issues.redhat.com. No known errors were detected, please see the full error message for details.

Full error message. request failed. Please analyze the request body for more details. Status code: 403:

Please contact an administrator to resolve this issue, then request a bug refresh with /jira refresh.

Details

In response to this:

Summary

Changes

  • Added reconcile logic for every trust-manager operand resource
  • SSA Patch will be applied after comparing actual vs expected resources.
  • Added RELATED_IMAGE_CERT_MANAGER_TRUST_MANAGER and
    TRUSTMANAGER_OPERAND_IMAGE_VERSION environment variables to the CSV and manager deployment
    manifest.
  • Added commented-out --unsupported-addon-features="TrustManager=true" flag to
    local-run for easy local testing.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot removed the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 16, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/controller/trustmanager/deployments_test.go`:
- Around line 475-477: The test is flaky because when tt.setImage is false it
leaves the environment variable uncontrolled; modify the test to explicitly
clear the env var in that branch so the "missing image" case is
deterministic—replace the current conditional with: when tt.setImage is true
call t.Setenv(trustManagerImageNameEnvVarName, testImage), otherwise call
t.Setenv(trustManagerImageNameEnvVarName, "") (or explicitly unset via
os.Unsetenv with proper cleanup) so trustManagerImageNameEnvVarName is
guaranteed absent/empty during the test.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f8d1a114-b2d3-4129-a4f6-fc6552195a93

📥 Commits

Reviewing files that changed from the base of the PR and between 53cb39e and 70558d2.

📒 Files selected for processing (9)
  • pkg/controller/trustmanager/certificates_test.go
  • pkg/controller/trustmanager/controller_test.go
  • pkg/controller/trustmanager/deployments_test.go
  • pkg/controller/trustmanager/rbacs_test.go
  • pkg/controller/trustmanager/serviceaccounts_test.go
  • pkg/controller/trustmanager/services_test.go
  • pkg/controller/trustmanager/test_utils.go
  • pkg/controller/trustmanager/utils_test.go
  • pkg/controller/trustmanager/webhooks_test.go

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
test/e2e/trustmanager_test.go (2)

69-83: ⚠️ Potential issue | 🟠 Major

Missing cleanup: Restore feature gate state after test suite.

The BeforeAll modifies the cluster-scoped Subscription to enable TrustManager=true, but there's no corresponding AfterAll to restore the original UNSUPPORTED_ADDON_FEATURES value. This leaks state into subsequent e2e specs.

Consider capturing the original value before patching and restoring it in AfterAll:

var originalUnsupportedAddonFeatures string

BeforeAll(func() {
    // Capture original value first
    // ... then patch
})

AfterAll(func() {
    // Restore original value using patchSubscriptionWithEnvVars
})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/trustmanager_test.go` around lines 69 - 83, Capture the current
value of UNSUPPORTED_ADDON_FEATURES into a package-level variable (e.g.,
originalUnsupportedAddonFeatures) inside BeforeAll before calling
patchSubscriptionWithEnvVars, then add an AfterAll that calls
patchSubscriptionWithEnvVars to restore the original value and uses
waitForDeploymentEnvVarAndRollout to ensure the operator rolls out with the
restored env var; reference the existing BeforeAll,
patchSubscriptionWithEnvVars, waitForDeploymentEnvVarAndRollout and add an
AfterAll to perform the restore and verification.

812-815: ⚠️ Potential issue | 🟡 Minor

Return error on Degraded=True for faster failure diagnosis.

When the controller reports Degraded=True, continuing to poll until timeout delays CI feedback. Returning an error with the degraded reason would help diagnose failures faster.

 		degradedCondition := meta.FindStatusCondition(trustManagerStatus.Conditions, v1alpha1.Degraded)
 		if degradedCondition != nil && degradedCondition.Status == metav1.ConditionTrue {
-			return false, nil
+			return false, fmt.Errorf("TrustManager is degraded: %s - %s", degradedCondition.Reason, degradedCondition.Message)
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/trustmanager_test.go` around lines 812 - 815, The test currently
treats a Degraded=True condition as a non-fatal state; change the logic so that
when meta.FindStatusCondition(trustManagerStatus.Conditions, v1alpha1.Degraded)
returns a condition with Status==metav1.ConditionTrue the function returns an
error instead of (false, nil). Construct the error including the degraded
condition's Reason and/or Message (e.g., via fmt.Errorf("controller degraded: %s
- %s", degradedCondition.Reason, degradedCondition.Message)) so callers fail
fast and get diagnostic info; update the code path using degradedCondition,
meta.FindStatusCondition, trustManagerStatus.Conditions and v1alpha1.Degraded
accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@test/e2e/trustmanager_test.go`:
- Around line 69-83: Capture the current value of UNSUPPORTED_ADDON_FEATURES
into a package-level variable (e.g., originalUnsupportedAddonFeatures) inside
BeforeAll before calling patchSubscriptionWithEnvVars, then add an AfterAll that
calls patchSubscriptionWithEnvVars to restore the original value and uses
waitForDeploymentEnvVarAndRollout to ensure the operator rolls out with the
restored env var; reference the existing BeforeAll,
patchSubscriptionWithEnvVars, waitForDeploymentEnvVarAndRollout and add an
AfterAll to perform the restore and verification.
- Around line 812-815: The test currently treats a Degraded=True condition as a
non-fatal state; change the logic so that when
meta.FindStatusCondition(trustManagerStatus.Conditions, v1alpha1.Degraded)
returns a condition with Status==metav1.ConditionTrue the function returns an
error instead of (false, nil). Construct the error including the degraded
condition's Reason and/or Message (e.g., via fmt.Errorf("controller degraded: %s
- %s", degradedCondition.Reason, degradedCondition.Message)) so callers fail
fast and get diagnostic info; update the code path using degradedCondition,
meta.FindStatusCondition, trustManagerStatus.Conditions and v1alpha1.Degraded
accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e5fae2f5-dd94-4fbc-bacd-025ec1420a92

📥 Commits

Reviewing files that changed from the base of the PR and between 70558d2 and 42609c6.

📒 Files selected for processing (2)
  • test/e2e/trustmanager_test.go
  • test/e2e/utils_test.go

@openshift-ci-robot
Copy link

@chiragkyal: An error was encountered searching for bug CM-867 on the Jira server at https://issues.redhat.com. No known errors were detected, please see the full error message for details.

Full error message. request failed. Please analyze the request body for more details. Status code: 403:

Please contact an administrator to resolve this issue, then request a bug refresh with /jira refresh.

Details

In response to this:

Summary

Changes

  • Added reconcile logic for every trust-manager operand resource
  • SSA Patch will be applied after comparing actual vs expected resources.
  • Added RELATED_IMAGE_CERT_MANAGER_TRUST_MANAGER and
    TRUSTMANAGER_OPERAND_IMAGE_VERSION environment variables to the CSV and manager deployment
    manifest.
  • Added commented-out --unsupported-addon-features="TrustManager=true" flag to
    local-run for easy local testing.
  • Added Unit tests and E2E tests for all the managed resources.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@chiragkyal
Copy link
Member Author

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Mar 16, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@chiragkyal
Copy link
Member Author

@coderabbitai resume

@coderabbitai
Copy link

coderabbitai bot commented Mar 16, 2026

✅ Actions performed

Reviews resumed.

@chiragkyal
Copy link
Member Author

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Mar 16, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@chiragkyal chiragkyal force-pushed the trust-manager-reconcilers branch from 2d56512 to 689b09c Compare March 17, 2026 10:18
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (3)
pkg/controller/common/utils.go (1)

54-66: ⚠️ Potential issue | 🔴 Critical

Replace panic-based decoding with error returns.

Line 56 through Line 64 can panic on decode/type mismatch. In controller paths, this can terminate the operator process instead of returning a reconcile error.

Proposed fix
-func DecodeObjBytes[T runtime.Object](codecs serializer.CodecFactory, gv schema.GroupVersion, objBytes []byte) T {
+func DecodeObjBytes[T runtime.Object](codecs serializer.CodecFactory, gv schema.GroupVersion, objBytes []byte) (T, error) {
+	var zero T
 	obj, err := runtime.Decode(codecs.UniversalDecoder(gv), objBytes)
 	if err != nil {
-		panic(fmt.Sprintf("failed to decode object bytes for %T: %v", *new(T), err))
+		return zero, fmt.Errorf("failed to decode object bytes for %T: %w", *new(T), err)
 	}
 	typed, ok := obj.(T)
 	if !ok {
-		panic(fmt.Sprintf("failed to convert decoded object to %T", *new(T)))
+		return zero, fmt.Errorf("failed to convert decoded object to %T", *new(T))
 	}
-	return typed
+	return typed, nil
 }

As per coding guidelines, **: -Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/common/utils.go` around lines 54 - 66, The DecodeObjBytes
function currently panics on decode/type-mismatch; change its signature to
return (T, error) instead of panicking, i.e. DecodeObjBytes[T
runtime.Object](codecs serializer.CodecFactory, gv schema.GroupVersion, objBytes
[]byte) (T, error); on decode failure return the zero value of T and an error
created with fmt.Errorf that includes context (e.g. the target type from *new(T)
and the original err), and on failed type assertion return zero T with a
descriptive error instead of panic; update all callers of DecodeObjBytes to
handle the error and propagate/reconcile appropriately.
test/e2e/trustmanager_test.go (2)

813-816: ⚠️ Potential issue | 🟠 Major

Return the degraded reason instead of timing out.

return false, nil here turns an irrecoverable controller failure into a generic highTimeout expiry and drops the condition details. Fail fast with the degraded Reason and Message.

Proposed fix
 		degradedCondition := meta.FindStatusCondition(trustManagerStatus.Conditions, v1alpha1.Degraded)
 		if degradedCondition != nil && degradedCondition.Status == metav1.ConditionTrue {
-			return false, nil
+			return false, fmt.Errorf(
+				"TrustManager degraded: %s: %s",
+				degradedCondition.Reason,
+				degradedCondition.Message,
+			)
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/trustmanager_test.go` around lines 813 - 816, The code currently
treats a Degraded condition as a generic timeout by returning (false, nil);
instead detect the degradedCondition found via
meta.FindStatusCondition(trustManagerStatus.Conditions, v1alpha1.Degraded) and
fail fast by returning an error that includes degradedCondition.Reason and
degradedCondition.Message so the test surface the actual controller failure
details rather than expiring with highTimeout.

69-84: ⚠️ Potential issue | 🟠 Major

Restore the patched Subscription env after this suite.

BeforeAll mutates the cluster-scoped Subscription but nothing restores the previous UNSUPPORTED_ADDON_FEATURES and OPERATOR_LOG_LEVEL values. That leaks state into later e2e specs and makes the suite order-dependent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/trustmanager_test.go` around lines 69 - 84, The BeforeAll block
mutates cluster Subscription env vars but never restores them; capture the
current UNSUPPORTED_ADDON_FEATURES and OPERATOR_LOG_LEVEL values before calling
patchSubscriptionWithEnvVars in BeforeAll and add an AfterAll that calls
patchSubscriptionWithEnvVars to restore the original values (or remove keys if
they were absent). Modify the test to use the existing BeforeAll,
patchSubscriptionWithEnvVars, and waitForDeploymentEnvVarAndRollout symbols and
ensure AfterAll waits for the operator rollout to return to the original env
state (using waitForDeploymentEnvVarAndRollout) so cluster-scoped state is
cleaned up after the suite.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/controller/trustmanager/certificates.go`:
- Around line 21-37: The helper functions are using r.ctx (a background context)
for client calls which decouples them from the reconcile cancellation; change
the signatures of processReconcileRequest, reconcileTrustManagerDeployment,
createOrApplyIssuer, createOrApplyCertificate and any other helper that issues
client API calls to accept a context parameter (ctx context.Context) and replace
uses of r.ctx with that ctx when calling r.Exists, r.Patch, r.Create, r.Update,
etc.; then thread the reconcile-scoped ctx passed into Reconcile through
processReconcileRequest -> reconcileTrustManagerDeployment ->
createOrApplyIssuer/createOrApplyCertificate so all client operations use the
reconcile context and obey cancellation/timeouts.

In `@pkg/controller/trustmanager/controller.go`:
- Around line 109-118: The current predicate setup
(withIgnoreStatusUpdatePredicates using builder.WithPredicates and
controllerManagedResources) ANDs checks so removal of the managed label gets
filtered; change the logic so controllerManagedResources checks both old and new
objects (i.e. “managed in either state”) before being combined with
LabelChangedPredicate/AnnotationChangedPredicate/GenerationChangedPredicate, and
update the associated mapFunc to inspect both old and new object labels when
deciding to enqueue/restore the managed label; apply this change to all watch
registrations that currently pass controllerManagedResources so the controller
triggers reconciliation when a managed label is removed.
- Around line 120-121: The current primary watch uses
predicate.GenerationChangedPredicate in the
ctrl.NewControllerManagedBy(...).For(&v1alpha1.TrustManager{},
builder.WithPredicates(predicate.GenerationChangedPredicate{})) which filters
out deletion-timestamp updates so finalizer removal never runs; fix by replacing
that predicate with one that also allows deletion events (either remove the
predicate entirely or use predicate.Or to combine GenerationChangedPredicate
with a predicate that returns true for Delete and for updates where
metadata.deletionTimestamp changed, or add a predicate.Funcs that explicitly
allows Delete and update events when DeletionTimestamp transitions) so
reconciler runs on finalizer deletions for TrustManager.

In `@pkg/controller/trustmanager/deployments.go`:
- Around line 34-44: deploymentModified currently enforces exact equality on
NodeSelector, Volumes, Affinity, Tolerations and container order which conflicts
with the Server-Side Apply path (r.Patch(..., client.Apply,
client.FieldOwner(...), client.ForceOwnership)) and causes perpetual patches
when other field managers add fields; update deploymentModified (the function
referenced in the if check) to only compare fields that this controller manages
(the SSA-managed subset) or relax comparisons (e.g., ignore extra containers,
unordered volumes, and keys added by other managers) so the equality check
reflects SSA semantics, or alternatively change the apply logic to not use
ForceOwnership if exact ownership is intended; locate usages around
deploymentModified, the r.Patch call, and the desired/existing deployment
variables to implement the narrower comparison/ignoring logic.

In `@pkg/operator/setup_manager.go`:
- Around line 57-67: The trustManagerManagedResources slice is missing
&certmanagerv1.Issuer{}, causing Issuer to be excluded from the label-selector
based cache scoping used by addControllerCacheConfig; add
&certmanagerv1.Issuer{} into the trustManagerManagedResources slice (alongside
&certmanagerv1.Certificate{}) so the TrustManager controller (see
controller.go:131 that watches Issuer) will be filtered by the same
managed-label cache configuration and avoid cluster-wide caching.

---

Duplicate comments:
In `@pkg/controller/common/utils.go`:
- Around line 54-66: The DecodeObjBytes function currently panics on
decode/type-mismatch; change its signature to return (T, error) instead of
panicking, i.e. DecodeObjBytes[T runtime.Object](codecs serializer.CodecFactory,
gv schema.GroupVersion, objBytes []byte) (T, error); on decode failure return
the zero value of T and an error created with fmt.Errorf that includes context
(e.g. the target type from *new(T) and the original err), and on failed type
assertion return zero T with a descriptive error instead of panic; update all
callers of DecodeObjBytes to handle the error and propagate/reconcile
appropriately.

In `@test/e2e/trustmanager_test.go`:
- Around line 813-816: The code currently treats a Degraded condition as a
generic timeout by returning (false, nil); instead detect the degradedCondition
found via meta.FindStatusCondition(trustManagerStatus.Conditions,
v1alpha1.Degraded) and fail fast by returning an error that includes
degradedCondition.Reason and degradedCondition.Message so the test surface the
actual controller failure details rather than expiring with highTimeout.
- Around line 69-84: The BeforeAll block mutates cluster Subscription env vars
but never restores them; capture the current UNSUPPORTED_ADDON_FEATURES and
OPERATOR_LOG_LEVEL values before calling patchSubscriptionWithEnvVars in
BeforeAll and add an AfterAll that calls patchSubscriptionWithEnvVars to restore
the original values (or remove keys if they were absent). Modify the test to use
the existing BeforeAll, patchSubscriptionWithEnvVars, and
waitForDeploymentEnvVarAndRollout symbols and ensure AfterAll waits for the
operator rollout to return to the original env state (using
waitForDeploymentEnvVarAndRollout) so cluster-scoped state is cleaned up after
the suite.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 03762f5c-7d60-4948-9add-6671d394478f

📥 Commits

Reviewing files that changed from the base of the PR and between 2d56512 and 689b09c.

📒 Files selected for processing (27)
  • Makefile
  • bundle/manifests/cert-manager-operator.clusterserviceversion.yaml
  • config/manager/manager.yaml
  • config/rbac/role.yaml
  • pkg/controller/common/utils.go
  • pkg/controller/trustmanager/certificates.go
  • pkg/controller/trustmanager/certificates_test.go
  • pkg/controller/trustmanager/constants.go
  • pkg/controller/trustmanager/controller.go
  • pkg/controller/trustmanager/controller_test.go
  • pkg/controller/trustmanager/deployments.go
  • pkg/controller/trustmanager/deployments_test.go
  • pkg/controller/trustmanager/install_trustmanager.go
  • pkg/controller/trustmanager/rbacs.go
  • pkg/controller/trustmanager/rbacs_test.go
  • pkg/controller/trustmanager/serviceaccounts.go
  • pkg/controller/trustmanager/serviceaccounts_test.go
  • pkg/controller/trustmanager/services.go
  • pkg/controller/trustmanager/services_test.go
  • pkg/controller/trustmanager/test_utils.go
  • pkg/controller/trustmanager/utils.go
  • pkg/controller/trustmanager/utils_test.go
  • pkg/controller/trustmanager/webhooks.go
  • pkg/controller/trustmanager/webhooks_test.go
  • pkg/operator/setup_manager.go
  • test/e2e/trustmanager_test.go
  • test/e2e/utils_test.go
🚧 Files skipped from review as they are similar to previous changes (7)
  • pkg/controller/trustmanager/webhooks_test.go
  • pkg/controller/trustmanager/utils_test.go
  • pkg/controller/trustmanager/services_test.go
  • config/manager/manager.yaml
  • Makefile
  • test/e2e/utils_test.go
  • pkg/controller/trustmanager/serviceaccounts_test.go

Comment on lines +21 to +37
func (r *Reconciler) createOrApplyIssuer(trustManager *v1alpha1.TrustManager, resourceLabels, resourceAnnotations map[string]string) error {
desired := getIssuerObject(resourceLabels, resourceAnnotations)
resourceName := fmt.Sprintf("%s/%s", desired.GetNamespace(), desired.GetName())
r.log.V(4).Info("reconciling issuer resource", "name", resourceName)

existing := &certmanagerv1.Issuer{}
exists, err := r.Exists(r.ctx, client.ObjectKeyFromObject(desired), existing)
if err != nil {
return common.FromClientError(err, "failed to check if issuer %q exists", resourceName)
}
if exists && !issuerModified(desired, existing) {
r.log.V(4).Info("issuer already matches desired state, skipping apply", "name", resourceName)
return nil
}

if err := r.Patch(r.ctx, desired, client.Apply, client.FieldOwner(fieldOwner), client.ForceOwnership); err != nil {
return common.FromClientError(err, "failed to apply issuer %q", resourceName)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, find the controller.go file and check how r.ctx is initialized
find . -name "controller.go" -path "*/trustmanager/*" -type f

Repository: openshift/cert-manager-operator

Length of output: 118


🏁 Script executed:

# Check the context initialization in controller.go
grep -n "context.Background\|r.ctx\|\.ctx\s*=" pkg/controller/trustmanager/controller.go | head -20

Repository: openshift/cert-manager-operator

Length of output: 257


🏁 Script executed:

# Read the controller.go file to understand the Reconciler structure
wc -l pkg/controller/trustmanager/controller.go

Repository: openshift/cert-manager-operator

Length of output: 120


🏁 Script executed:

# Examine certificates.go to see the function signatures and context usage
wc -l pkg/controller/trustmanager/certificates.go

Repository: openshift/cert-manager-operator

Length of output: 122


🏁 Script executed:

# Read the createOrApplyIssuer function (lines 21-37 and nearby context)
sed -n '1,50p' pkg/controller/trustmanager/certificates.go | cat -n

Repository: openshift/cert-manager-operator

Length of output: 2617


🏁 Script executed:

# Read the Reconcile function signature in controller.go
sed -n '80,150p' pkg/controller/trustmanager/controller.go | cat -n

Repository: openshift/cert-manager-operator

Length of output: 4482


🏁 Script executed:

# Search for processReconcileRequest to see its signature
rg "processReconcileRequest|func.*Reconcile" pkg/controller/trustmanager/ -A 3

Repository: openshift/cert-manager-operator

Length of output: 46091


🏁 Script executed:

# Get the full processReconcileRequest signature
sed -n '104,120p' pkg/controller/trustmanager/controller.go | cat -n

Repository: openshift/cert-manager-operator

Length of output: 760


🏁 Script executed:

# Check all the createOrApply functions and their signatures in certificates.go
sed -n '20,80p' pkg/controller/trustmanager/certificates.go | cat -n

Repository: openshift/cert-manager-operator

Length of output: 3740


🏁 Script executed:

# Check what context is available and used in other similar helper functions
rg "createOrApply|func \(r \*Reconciler\)" pkg/controller/trustmanager/ --no-heading -A 1 | head -40

Repository: openshift/cert-manager-operator

Length of output: 3650


🏁 Script executed:

# Get the full processReconcileRequest function definition
sed -n '127,160p' pkg/controller/trustmanager/controller.go | cat -n

Repository: openshift/cert-manager-operator

Length of output: 2509


🏁 Script executed:

# Check how createOrApplyIssuer and createOrApplyCertificate are called from processReconcileRequest
rg "createOrApplyIssuer|createOrApplyCertificate" pkg/controller/trustmanager/controller.go -B 2 -A 2

Repository: openshift/cert-manager-operator

Length of output: 57


🏁 Script executed:

# Check utils.go to understand the pattern - how other methods accept ctx
sed -n '1,30p' pkg/controller/trustmanager/utils.go | cat -n

Repository: openshift/cert-manager-operator

Length of output: 1077


🏁 Script executed:

# Search for where these functions are called in the codebase
rg "createOrApplyIssuer\(|createOrApplyCertificate\(" pkg/controller/trustmanager/ -B 3 -A 1

Repository: openshift/cert-manager-operator

Length of output: 2891


🏁 Script executed:

# Get more context on processReconcileRequest - the full function
sed -n '127,180p' pkg/controller/trustmanager/controller.go | cat -n

Repository: openshift/cert-manager-operator

Length of output: 3416


🏁 Script executed:

# Check the full processReconcileRequest function and see how it calls the install functions
sed -n '127,208p' pkg/controller/trustmanager/controller.go | cat -n

Repository: openshift/cert-manager-operator

Length of output: 4937


🏁 Script executed:

# Check install_trustmanager.go to see where createOrApplyIssuer/Certificate are called
sed -n '1,50p' pkg/controller/trustmanager/install_trustmanager.go | cat -n

Repository: openshift/cert-manager-operator

Length of output: 2296


🏁 Script executed:

# Get the reconcileTrustManagerDeployment function
rg "reconcileTrustManagerDeployment" pkg/controller/trustmanager/ -A 20 | head -40

Repository: openshift/cert-manager-operator

Length of output: 3165


🏁 Script executed:

# Verify that ctx is NOT being passed through the call chain
rg "processReconcileRequest|reconcileTrustManagerDeployment" pkg/controller/trustmanager/controller.go -A 2 | head -20

Repository: openshift/cert-manager-operator

Length of output: 690


🏁 Script executed:

# Check if there are other similar functions that DO pass ctx correctly for comparison
rg "createOrApplyService|createOrApplyValidatingWebhook" pkg/controller/trustmanager/ -A 8 | head -50

Repository: openshift/cert-manager-operator

Length of output: 4699


🏁 Script executed:

# Double-check the line numbers for the createOrApplyCertificate function
sed -n '55,80p' pkg/controller/trustmanager/certificates.go | cat -n

Repository: openshift/cert-manager-operator

Length of output: 1705


Pass the reconcile-scoped ctx through the helper functions instead of using r.ctx.

Lines 27, 36, 61, and 70 use r.ctx (initialized from context.Background() in controller.go) for client API calls. This detaches these requests from reconcile cancellation and allows work to continue running after the current reconcile is cancelled. Thread the ctx from the Reconcile method through processReconcileRequest, reconcileTrustManagerDeployment, and into createOrApplyIssuer, createOrApplyCertificate, and all other helper functions that perform client operations. This pattern should be applied consistently across all similar helper methods in the trustmanager controller.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/trustmanager/certificates.go` around lines 21 - 37, The helper
functions are using r.ctx (a background context) for client calls which
decouples them from the reconcile cancellation; change the signatures of
processReconcileRequest, reconcileTrustManagerDeployment, createOrApplyIssuer,
createOrApplyCertificate and any other helper that issues client API calls to
accept a context parameter (ctx context.Context) and replace uses of r.ctx with
that ctx when calling r.Exists, r.Patch, r.Create, r.Update, etc.; then thread
the reconcile-scoped ctx passed into Reconcile through processReconcileRequest
-> reconcileTrustManagerDeployment ->
createOrApplyIssuer/createOrApplyCertificate so all client operations use the
reconcile context and obey cancellation/timeouts.

Comment on lines +34 to +44
if exists && !deploymentModified(desired, existing) {
r.log.V(4).Info("deployment already matches desired state, skipping apply", "name", deploymentName)
return nil
}

if err := r.Patch(r.ctx, desired, client.Apply, client.FieldOwner(fieldOwner), client.ForceOwnership); err != nil {
return common.FromClientError(err, "failed to apply deployment %q", deploymentName)
}

r.eventRecorder.Eventf(trustManager, corev1.EventTypeNormal, "Reconciled", "deployment resource %s applied", deploymentName)
r.log.V(2).Info("applied deployment", "name", deploymentName)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '\n== SSA apply path ==\n'
rg -n -C2 'client\.Apply|ForceOwnership|deploymentModified\(' pkg/controller/trustmanager/deployments.go

printf '\n== exact equality checks inside deploymentModified ==\n'
rg -n -C2 'maps\.Equal\(desiredPodSpec\.NodeSelector|reflect\.DeepEqual\(desiredPodSpec\.Volumes|len\(desiredPodSpec\.Containers\)|containerModified\(&desiredPodSpec\.Containers' pkg/controller/trustmanager/deployments.go

Repository: openshift/cert-manager-operator

Length of output: 1645


Skip logic requires exact equality but apply path uses Server-Side Apply, causing perpetual patching when other field managers are present.

The deploymentModified function at line 34 checks for exact equality across NodeSelector, Volumes, Affinity, Tolerations, and container count/order (lines 226, 227, 228, 229, 233). However, line 39 applies the Deployment using Server-Side Apply (client.Apply + ForceOwnership), which allows unmanaged fields from other field managers to coexist. When a sidecar injector, admission webhook, or another controller adds a container, volume, or node-selector key, the equality checks return true and trigger unnecessary patches and Reconciled events on every reconciliation loop.

Either narrow deploymentModified to match only SSA-managed fields, or clarify the intent if exact state is required.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/trustmanager/deployments.go` around lines 34 - 44,
deploymentModified currently enforces exact equality on NodeSelector, Volumes,
Affinity, Tolerations and container order which conflicts with the Server-Side
Apply path (r.Patch(..., client.Apply, client.FieldOwner(...),
client.ForceOwnership)) and causes perpetual patches when other field managers
add fields; update deploymentModified (the function referenced in the if check)
to only compare fields that this controller manages (the SSA-managed subset) or
relax comparisons (e.g., ignore extra containers, unordered volumes, and keys
added by other managers) so the equality check reflects SSA semantics, or
alternatively change the apply logic to not use ForceOwnership if exact
ownership is intended; locate usages around deploymentModified, the r.Patch
call, and the desired/existing deployment variables to implement the narrower
comparison/ignoring logic.

@chiragkyal chiragkyal force-pushed the trust-manager-reconcilers branch from 689b09c to 2552e3a Compare March 17, 2026 13:14
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (3)
test/e2e/trustmanager_test.go (3)

813-816: ⚠️ Potential issue | 🟠 Major

Fail fast when Degraded=True during readiness polling.

Line 815 returns (false, nil) on degraded status, so the test can spin until timeout and lose the condition reason/message that explains the failure. Return an error including degradedCondition.Reason and degradedCondition.Message instead.

As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/trustmanager_test.go` around lines 813 - 816, The readiness poll
currently returns (false, nil) when a Degraded condition is found (via
meta.FindStatusCondition on trustManagerStatus.Conditions for
v1alpha1.Degraded), which hides failure details; replace that early return with
an error that includes degradedCondition.Reason and degradedCondition.Message so
the test fails fast and surfaces the condition reason/message instead of timing
out.

69-84: ⚠️ Potential issue | 🟠 Major

Restore patched Subscription env vars after this ordered suite.

Line 75 mutates a cluster-scoped Subscription, but there is no matching AfterAll to restore previous UNSUPPORTED_ADDON_FEATURES/OPERATOR_LOG_LEVEL. This leaks state into later e2e specs and makes runs order-dependent.

As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/trustmanager_test.go` around lines 69 - 84, BeforeAll mutates the
cluster Subscription via patchSubscriptionWithEnvVars setting
UNSUPPORTED_ADDON_FEATURES and OPERATOR_LOG_LEVEL but never restores them;
capture the current env vars before calling patchSubscriptionWithEnvVars in
BeforeAll (read existing UNSUPPORTED_ADDON_FEATURES and OPERATOR_LOG_LEVEL from
the Subscription) and add a matching AfterAll that calls
patchSubscriptionWithEnvVars to restore those saved values (or remove them if
they were absent), using the same ctx/loader and handling errors with
Expect(...).NotTo(HaveOccurred()); reference BeforeAll, AfterAll,
patchSubscriptionWithEnvVars, waitForDeploymentEnvVarAndRollout,
UNSUPPORTED_ADDON_FEATURES, and OPERATOR_LOG_LEVEL in your changes.

868-875: ⚠️ Potential issue | 🟠 Major

Recreation helper still allows false positives due to async delete race.

This helper only checks for eventual Get success after delete. If the object was never observed as deleted, the test can pass without verifying controller-driven recreation.

Proposed fix
 func verifyTrustManagerResourceRecreation(deleteFunc func() error, getFunc func() error) {
 	err := deleteFunc()
 	Expect(err).ShouldNot(HaveOccurred())
 
+	Eventually(func() bool {
+		return errors.IsNotFound(getFunc())
+	}, lowTimeout, fastPollInterval).Should(BeTrue(), "resource was not observed as deleted")
+
 	Eventually(func() error {
 		return getFunc()
 	}, lowTimeout, fastPollInterval).Should(Succeed(), "resource was not recreated by controller")
 }

As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/trustmanager_test.go` around lines 868 - 875, The helper
verifyTrustManagerResourceRecreation currently only asserts that getFunc
eventually succeeds after deleteFunc, which can false-positive if the delete was
never observed; change verifyTrustManagerResourceRecreation to first assert the
resource was actually deleted by polling getFunc until it returns a "not found"
(or an error indicating absence) before asserting recreation: call deleteFunc(),
then poll getFunc until it reports missing (using the existing polling
timeouts/intervals), and only after that use Eventually to assert getFunc
succeeds (recreated); reference verifyTrustManagerResourceRecreation,
deleteFunc, and getFunc when making this change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@test/e2e/trustmanager_test.go`:
- Around line 813-816: The readiness poll currently returns (false, nil) when a
Degraded condition is found (via meta.FindStatusCondition on
trustManagerStatus.Conditions for v1alpha1.Degraded), which hides failure
details; replace that early return with an error that includes
degradedCondition.Reason and degradedCondition.Message so the test fails fast
and surfaces the condition reason/message instead of timing out.
- Around line 69-84: BeforeAll mutates the cluster Subscription via
patchSubscriptionWithEnvVars setting UNSUPPORTED_ADDON_FEATURES and
OPERATOR_LOG_LEVEL but never restores them; capture the current env vars before
calling patchSubscriptionWithEnvVars in BeforeAll (read existing
UNSUPPORTED_ADDON_FEATURES and OPERATOR_LOG_LEVEL from the Subscription) and add
a matching AfterAll that calls patchSubscriptionWithEnvVars to restore those
saved values (or remove them if they were absent), using the same ctx/loader and
handling errors with Expect(...).NotTo(HaveOccurred()); reference BeforeAll,
AfterAll, patchSubscriptionWithEnvVars, waitForDeploymentEnvVarAndRollout,
UNSUPPORTED_ADDON_FEATURES, and OPERATOR_LOG_LEVEL in your changes.
- Around line 868-875: The helper verifyTrustManagerResourceRecreation currently
only asserts that getFunc eventually succeeds after deleteFunc, which can
false-positive if the delete was never observed; change
verifyTrustManagerResourceRecreation to first assert the resource was actually
deleted by polling getFunc until it returns a "not found" (or an error
indicating absence) before asserting recreation: call deleteFunc(), then poll
getFunc until it reports missing (using the existing polling
timeouts/intervals), and only after that use Eventually to assert getFunc
succeeds (recreated); reference verifyTrustManagerResourceRecreation,
deleteFunc, and getFunc when making this change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6bdc6f8f-8a5a-45ef-9292-74e885f47b9e

📥 Commits

Reviewing files that changed from the base of the PR and between 689b09c and 2552e3a.

📒 Files selected for processing (2)
  • test/e2e/trustmanager_test.go
  • test/e2e/utils_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/e2e/utils_test.go

@chiragkyal
Copy link
Member Author

/jira refresh

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 17, 2026
@openshift-ci-robot
Copy link

openshift-ci-robot commented Mar 17, 2026

@chiragkyal: This pull request references CM-867 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target either version "4.22." or "openshift-4.22.", but it targets "cert-manager-1.19" instead.

Details

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

certificate.Spec.SecretName = trustManagerTLSSecretName
certificate.Spec.IssuerRef = certmanagermetav1.ObjectReference{
Name: trustManagerIssuerName,
Kind: "Issuer",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need a tracker in GA to make this Issuer configurable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created https://redhat.atlassian.net/browse/CM-961 tracker for this.

Comment on lines +205 to +220
// managedAnnotationsModified checks whether all annotations present in desired
// exist with matching values in existing. Extra annotations on existing are
// allowed and do not count as modified.
func managedAnnotationsModified(desired, existing client.Object) bool {
existingAnnotations := existing.GetAnnotations()
for k, v := range desired.GetAnnotations() {
if existingAnnotations[k] != v {
return true
}
}
return false
}

// managedMetadataModified returns true if any managed label or annotation has drifted.
func managedMetadataModified(desired, existing client.Object) bool {
return managedLabelsModified(desired, existing) || managedAnnotationsModified(desired, existing)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is known and Mytreya had bought this up in a PR review(unable to recall which PR). We will need to handle this and as well restrict which labels and annotations a user can configure for GA.

Signed-off-by: chiragkyal <ckyal@redhat.com>
Signed-off-by: chiragkyal <ckyal@redhat.com>
@chiragkyal chiragkyal force-pushed the trust-manager-reconcilers branch from 520e13b to 78b0343 Compare March 18, 2026 08:34
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (6)
test/e2e/trustmanager_test.go (2)

868-875: ⚠️ Potential issue | 🟠 Major

Require an observed deletion before asserting recreation.

The helper can pass while the original resource is still present, so it may never prove controller-driven recreation.

Proposed fix
 func verifyTrustManagerResourceRecreation(deleteFunc func() error, getFunc func() error) {
 	err := deleteFunc()
 	Expect(err).ShouldNot(HaveOccurred())
 
+	Eventually(func() bool {
+		return errors.IsNotFound(getFunc())
+	}, lowTimeout, fastPollInterval).Should(BeTrue(), "resource was not observed as deleted")
+
 	Eventually(func() error {
 		return getFunc()
 	}, lowTimeout, fastPollInterval).Should(Succeed(), "resource was not recreated by controller")
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/trustmanager_test.go` around lines 868 - 875,
verifyTrustManagerResourceRecreation currently can succeed without first
observing the resource as deleted; update it to ensure a NotFound is observed
before asserting recreation. After calling deleteFunc() (in
verifyTrustManagerResourceRecreation), poll getFunc() until it returns a
NotFound error (use Eventually with an assertion that the returned error is a
NotFound), only then start the Existing Eventually check that waits for
getFunc() to succeed (resource recreated). Reference deleteFunc, getFunc and
verifyTrustManagerResourceRecreation when making the change.

69-84: ⚠️ Potential issue | 🟠 Major

Restore subscription env state after this Ordered suite.

BeforeAll mutates a cluster-scoped Subscription, but there is no matching AfterAll rollback. This leaks state into other e2e specs and makes outcomes suite-order dependent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/trustmanager_test.go` around lines 69 - 84, BeforeAll mutates
cluster-scoped Subscription via patchSubscriptionWithEnvVars but doesn't restore
it; capture the original env/state before patching (e.g., read the
Subscription's env map into a package-scoped variable using the same ctx/loader)
and add a matching AfterAll that calls patchSubscriptionWithEnvVars to restore
the original map (or remove the injected keys
"UNSUPPORTED_ADDON_FEATURES"/"OPERATOR_LOG_LEVEL") and then
waitForDeploymentEnvVarAndRollout to ensure operatorDeploymentName in
operatorNamespace returns to the previous state; use the same helper functions
(patchSubscriptionWithEnvVars, waitForDeploymentEnvVarAndRollout) and variables
(ctx, loader, operatorNamespace, operatorDeploymentName) to implement the
rollback.
pkg/controller/trustmanager/certificates.go (1)

21-38: ⚠️ Potential issue | 🟠 Major

Use reconcile-scoped ctx instead of r.ctx for API calls.

Lines 27/37/61/70 use r.ctx (context.Background()), so these operations are not tied to request cancellation/deadlines. Thread ctx from Reconcile through processReconcileRequest and reconciliation helpers, then use that ctx for Exists/Patch/status updates.

Also applies to: 55-71

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/trustmanager/certificates.go` around lines 21 - 38, The
createOrApplyIssuer function (and other reconciliation helpers called from
processReconcileRequest) is using r.ctx (a background context) for API calls
(Exists, Patch and status updates) which prevents request-scoped cancellation;
change the function signature(s) to accept a ctx context.Context threaded from
Reconcile/processReconcileRequest and replace uses of r.ctx with that ctx when
calling r.Exists, r.Patch and any status update methods (and update callers
accordingly so Reconcile passes its ctx down), ensuring all reconciliation
helpers (e.g., createOrApplyIssuer) and invocations use the reconcile-scoped ctx
rather than r.ctx.
pkg/controller/trustmanager/controller.go (2)

108-117: ⚠️ Potential issue | 🟠 Major

Managed-label removal events can still be filtered out before reconcile.

Lines 110-117 compose predicates with controllerManagedResources, which checks only the current object labels. When common.ManagedResourceLabelKey is removed, the update can be dropped before mapFunc enqueues, so label self-healing may not run for watched children.

Also applies to: 123-131

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/trustmanager/controller.go` around lines 108 - 117, The
current predicate composition uses controllerManagedResources (which only
inspects the current object labels) inside withIgnoreStatusUpdatePredicates and
the similar block at lines 123-131, so label-removal updates (e.g., removal of
common.ManagedResourceLabelKey) can be filtered out before mapFunc runs; change
the predicate logic to inspect the UpdateEvent's OldObject as well or use a
custom predicate that returns true when the managed-label was present on the
OldObject (or changed between OldObject and Object), ensuring updates where the
managed-label was removed still pass the predicate and allow the mapFunc and
reconciliation to run; update references in the code where
withIgnoreStatusUpdatePredicates and controllerManagedResources are built to use
that adjusted predicate or a new containsManagedLabelChange predicate that
checks OldObject labels.

119-121: ⚠️ Potential issue | 🔴 Critical

Primary TrustManager watch should also admit deletion-start updates.

Line 120 uses only predicate.GenerationChangedPredicate{}. Deletion with finalizers starts as a metadata update (deletionTimestamp) that may not bump generation, so reconcile can miss finalizer cleanup and leave the CR terminating.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/trustmanager/controller.go` around lines 119 - 121, The
controller currently only uses predicate.GenerationChangedPredicate in the
For(...) predicate which ignores metadata-only updates like deletionTimestamp
(so finalizer-start deletion can be missed); update the predicate to also admit
metadata/deletion updates—for example combine
predicate.GenerationChangedPredicate with
predicate.ResourceVersionChangedPredicate (or add a custom predicate that
returns true when old.ObjectMeta.DeletionTimestamp !=
new.ObjectMeta.DeletionTimestamp) in the
ctrl.NewControllerManagedBy(...).For(...) builder so TrustManager reconciles
when deletion starts and finalizers can be cleaned up.
pkg/controller/trustmanager/deployments.go (1)

232-265: ⚠️ Potential issue | 🟠 Major

deploymentModified is stricter than SSA semantics and may cause repeated apply loops.

With SSA at Line 39, unmanaged fields from other managers can legitimately coexist. The exact checks in Lines 242-265 (including full map/list equality and container count/order) can keep reporting drift and trigger unnecessary patches/events every reconcile.

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/controller/trustmanager/utils.go`:
- Around line 171-182: updateResourceAnnotations currently copies user
annotations over existing ones (maps.Copy(existing, annotations)), allowing
user-provided keys to overwrite controller-required annotations; change the
merging logic in updateResourceAnnotations so existing annotations (from
obj.GetAnnotations()) take precedence: create or use the existing map and for
each key in the incoming annotations map only add the key if it does not already
exist (i.e., skip or preserve when existing[key] != ""), then call
obj.SetAnnotations(existing); reference updateResourceAnnotations, existing,
annotations, obj.GetAnnotations, and obj.SetAnnotations to locate the change.

---

Duplicate comments:
In `@pkg/controller/trustmanager/certificates.go`:
- Around line 21-38: The createOrApplyIssuer function (and other reconciliation
helpers called from processReconcileRequest) is using r.ctx (a background
context) for API calls (Exists, Patch and status updates) which prevents
request-scoped cancellation; change the function signature(s) to accept a ctx
context.Context threaded from Reconcile/processReconcileRequest and replace uses
of r.ctx with that ctx when calling r.Exists, r.Patch and any status update
methods (and update callers accordingly so Reconcile passes its ctx down),
ensuring all reconciliation helpers (e.g., createOrApplyIssuer) and invocations
use the reconcile-scoped ctx rather than r.ctx.

In `@pkg/controller/trustmanager/controller.go`:
- Around line 108-117: The current predicate composition uses
controllerManagedResources (which only inspects the current object labels)
inside withIgnoreStatusUpdatePredicates and the similar block at lines 123-131,
so label-removal updates (e.g., removal of common.ManagedResourceLabelKey) can
be filtered out before mapFunc runs; change the predicate logic to inspect the
UpdateEvent's OldObject as well or use a custom predicate that returns true when
the managed-label was present on the OldObject (or changed between OldObject and
Object), ensuring updates where the managed-label was removed still pass the
predicate and allow the mapFunc and reconciliation to run; update references in
the code where withIgnoreStatusUpdatePredicates and controllerManagedResources
are built to use that adjusted predicate or a new containsManagedLabelChange
predicate that checks OldObject labels.
- Around line 119-121: The controller currently only uses
predicate.GenerationChangedPredicate in the For(...) predicate which ignores
metadata-only updates like deletionTimestamp (so finalizer-start deletion can be
missed); update the predicate to also admit metadata/deletion updates—for
example combine predicate.GenerationChangedPredicate with
predicate.ResourceVersionChangedPredicate (or add a custom predicate that
returns true when old.ObjectMeta.DeletionTimestamp !=
new.ObjectMeta.DeletionTimestamp) in the
ctrl.NewControllerManagedBy(...).For(...) builder so TrustManager reconciles
when deletion starts and finalizers can be cleaned up.

In `@test/e2e/trustmanager_test.go`:
- Around line 868-875: verifyTrustManagerResourceRecreation currently can
succeed without first observing the resource as deleted; update it to ensure a
NotFound is observed before asserting recreation. After calling deleteFunc() (in
verifyTrustManagerResourceRecreation), poll getFunc() until it returns a
NotFound error (use Eventually with an assertion that the returned error is a
NotFound), only then start the Existing Eventually check that waits for
getFunc() to succeed (resource recreated). Reference deleteFunc, getFunc and
verifyTrustManagerResourceRecreation when making the change.
- Around line 69-84: BeforeAll mutates cluster-scoped Subscription via
patchSubscriptionWithEnvVars but doesn't restore it; capture the original
env/state before patching (e.g., read the Subscription's env map into a
package-scoped variable using the same ctx/loader) and add a matching AfterAll
that calls patchSubscriptionWithEnvVars to restore the original map (or remove
the injected keys "UNSUPPORTED_ADDON_FEATURES"/"OPERATOR_LOG_LEVEL") and then
waitForDeploymentEnvVarAndRollout to ensure operatorDeploymentName in
operatorNamespace returns to the previous state; use the same helper functions
(patchSubscriptionWithEnvVars, waitForDeploymentEnvVarAndRollout) and variables
(ctx, loader, operatorNamespace, operatorDeploymentName) to implement the
rollback.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b94637a6-7675-439c-b5da-f8ea3296da3e

📥 Commits

Reviewing files that changed from the base of the PR and between 2552e3a and 78b0343.

📒 Files selected for processing (22)
  • pkg/controller/common/core_validation_helpers.go
  • pkg/controller/common/validation.go
  • pkg/controller/istiocsr/deployments.go
  • pkg/controller/trustmanager/certificates.go
  • pkg/controller/trustmanager/certificates_test.go
  • pkg/controller/trustmanager/constants.go
  • pkg/controller/trustmanager/controller.go
  • pkg/controller/trustmanager/controller_test.go
  • pkg/controller/trustmanager/deployments.go
  • pkg/controller/trustmanager/deployments_test.go
  • pkg/controller/trustmanager/install_trustmanager.go
  • pkg/controller/trustmanager/rbacs.go
  • pkg/controller/trustmanager/rbacs_test.go
  • pkg/controller/trustmanager/serviceaccounts_test.go
  • pkg/controller/trustmanager/services_test.go
  • pkg/controller/trustmanager/test_utils.go
  • pkg/controller/trustmanager/utils.go
  • pkg/controller/trustmanager/utils_test.go
  • pkg/controller/trustmanager/webhooks_test.go
  • pkg/operator/setup_manager.go
  • test/e2e/trustmanager_test.go
  • test/e2e/utils_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • pkg/controller/trustmanager/rbacs_test.go
  • pkg/controller/trustmanager/utils_test.go
  • pkg/operator/setup_manager.go
  • test/e2e/utils_test.go

Comment on lines +171 to +182
// updateResourceAnnotations merges user-provided annotations into the object's existing annotations.
// User-provided annotations take precedence over existing ones on key conflicts.
func updateResourceAnnotations(obj client.Object, annotations map[string]string) {
if len(annotations) == 0 {
return
}
existing := obj.GetAnnotations()
if existing == nil {
existing = make(map[string]string)
}
maps.Copy(existing, annotations)
obj.SetAnnotations(existing)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Prevent user annotations from overriding controller-required annotations.

updateResourceAnnotations currently lets user keys overwrite existing keys. On resources that carry required annotations from assets (for example webhook CA injection), a conflicting user annotation can break reconciliation/runtime behavior.

Proposed fix
 func updateResourceAnnotations(obj client.Object, annotations map[string]string) {
 	if len(annotations) == 0 {
 		return
 	}
 	existing := obj.GetAnnotations()
 	if existing == nil {
 		existing = make(map[string]string)
 	}
-	maps.Copy(existing, annotations)
+	for k, v := range annotations {
+		// Keep controller/asset-provided annotations authoritative on conflicts.
+		if _, alreadySet := existing[k]; alreadySet {
+			continue
+		}
+		existing[k] = v
+	}
 	obj.SetAnnotations(existing)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/trustmanager/utils.go` around lines 171 - 182,
updateResourceAnnotations currently copies user annotations over existing ones
(maps.Copy(existing, annotations)), allowing user-provided keys to overwrite
controller-required annotations; change the merging logic in
updateResourceAnnotations so existing annotations (from obj.GetAnnotations())
take precedence: create or use the existing map and for each key in the incoming
annotations map only add the key if it does not already exist (i.e., skip or
preserve when existing[key] != ""), then call obj.SetAnnotations(existing);
reference updateResourceAnnotations, existing, annotations, obj.GetAnnotations,
and obj.SetAnnotations to locate the change.

@chiragkyal
Copy link
Member Author

/retest

// certificateModified compares only the fields we manage via SSA.
// We compare individual spec fields rather than the full Spec because
// cert-manager's webhook may default fields we don't set (e.g. Duration).
func certificateModified(desired, existing *certmanagerv1.Certificate) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have had these *Modified() helpers in a common package and used for both istio-csr and trust-manager. If not possible now, we can pick it up for GA.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can do that during GA. I can create a tracker for that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, for this and for configuring the issuer both, thank you!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signed-off-by: chiragkyal <ckyal@redhat.com>
Signed-off-by: chiragkyal <ckyal@redhat.com>
@chiragkyal chiragkyal force-pushed the trust-manager-reconcilers branch from f6a97f2 to de5b043 Compare March 18, 2026 13:38
@bharath-b-rh
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 18, 2026
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 18, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bharath-b-rh, chiragkyal

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 18, 2026
@chiragkyal
Copy link
Member Author

/retest

@chiragkyal
Copy link
Member Author

/retest

2 similar comments
@chiragkyal
Copy link
Member Author

/retest

@chiragkyal
Copy link
Member Author

/retest

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 19, 2026

@chiragkyal: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-operator de5b043 link true /test e2e-operator

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants