CM-867: Implement trust-manager resource reconcilers#379
CM-867: Implement trust-manager resource reconcilers#379chiragkyal wants to merge 6 commits intoopenshift:masterfrom
Conversation
|
Important Review skippedAuto reviews are limited based on label configuration. 🚫 Review skipped — only excluded labels are configured. (1)
Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds TrustManager support: new image/version env vars and CSV entries; implements a TrustManager controller (service accounts, RBAC, services, deployment, issuer/certificate, validating webhook), shared validation/util helpers, scheme registration, RBAC role additions, and extensive unit and e2e tests. Changes
Sequence Diagram(s)sequenceDiagram
participant Reconciler as TrustManager<br/>Reconciler
participant Controller as Kubernetes<br/>API
participant ServiceAccount as ServiceAccount
participant RBAC as RBAC Resources
participant Deployment as Deployment
participant Services as Services
participant Issuer as Issuer/<br/>Certificate
participant Webhook as Webhook<br/>Config
Reconciler->>Reconciler: reconcileTrustManagerDeployment()
Reconciler->>Reconciler: Validate trust namespace
Reconciler->>ServiceAccount: createOrApplyServiceAccounts()
ServiceAccount->>Controller: Apply/Patch ServiceAccount
Reconciler->>RBAC: createOrApplyRBACResources()
RBAC->>Controller: Apply ClusterRole/Binding
RBAC->>Controller: Apply Namespaced Role/Binding
RBAC->>Controller: Apply LeaderElection Role/Binding
Reconciler->>Services: createOrApplyServices()
Services->>Controller: Apply Webhook Service
Services->>Controller: Apply Metrics Service
Reconciler->>Issuer: createOrApplyIssuer()
Issuer->>Controller: Apply Issuer
Reconciler->>Issuer: createOrApplyCertificate()
Issuer->>Controller: Apply Certificate
Reconciler->>Deployment: createOrApplyDeployment()
Deployment->>Controller: Apply Deployment with config
Reconciler->>Webhook: createOrApplyValidatingWebhookConfiguration()
Webhook->>Controller: Apply WebhookConfiguration
Reconciler->>Controller: updateStatusObservedState()
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes 🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Comment |
There was a problem hiding this comment.
Actionable comments posted: 13
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
Makefile (1)
62-62:⚠️ Potential issue | 🟡 MinorRemove extra quotes from TRUST_MANAGER_VERSION.
The version value has unnecessary quotes that are inconsistent with other version definitions (e.g.,
CERT_MANAGER_VERSION,ISTIO_CSR_VERSION). This may cause issues when the variable is expanded in shell commands.🔧 Proposed fix
-TRUST_MANAGER_VERSION ?= "v0.20.3" +TRUST_MANAGER_VERSION ?= v0.20.3🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` at line 62, The TRUST_MANAGER_VERSION Makefile variable is defined with extra quotes; update the TRUST_MANAGER_VERSION assignment in the Makefile so it matches other version vars (e.g., CERT_MANAGER_VERSION, ISTIO_CSR_VERSION) by removing the surrounding quotes and using TRUST_MANAGER_VERSION ?= v0.20.3 so shell expansions and comparisons behave consistently.
🧹 Nitpick comments (7)
pkg/controller/common/client.go (1)
100-104: Reflection-based type creation may not work for all object types.The reflection logic creates a new instance via
reflect.New(reflect.TypeOf(obj).Elem()). This assumesobjis always a pointer to a struct. Ifobjis a non-pointer type or an interface without an underlying struct, this could panic or produce unexpected behavior.Consider adding a nil check or documenting the constraint that
objmust be a non-nil pointer to a struct.💡 Suggested defensive check
func (c *ctrlClientImpl) UpdateWithRetry( ctx context.Context, obj client.Object, opts ...client.UpdateOption, ) error { key := client.ObjectKeyFromObject(obj) if err := retry.RetryOnConflict(retry.DefaultRetry, func() error { + if obj == nil || reflect.TypeOf(obj).Kind() != reflect.Ptr { + return fmt.Errorf("obj must be a non-nil pointer") + } currentInterface := reflect.New(reflect.TypeOf(obj).Elem()).Interface() current, ok := currentInterface.(client.Object) if !ok { return fmt.Errorf("failed to convert to client.Object") }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/common/client.go` around lines 100 - 104, The reflection code using reflect.New(reflect.TypeOf(obj).Elem()) can panic or behave unexpectedly if obj is nil, not a pointer, or not pointing to a struct; update the logic around currentInterface/current to validate obj first (ensure obj != nil, TypeOf(obj).Kind() == reflect.Ptr and TypeOf(obj).Elem().Kind() == reflect.Struct) and return a clear error if the constraint is violated, or document that obj must be a non-nil pointer to a struct; keep the existing conversion to client.Object (current, ok) after these checks.pkg/controller/common/errors.go (2)
45-55: Simplify redundant string conversion.
fmt.Sprint(err.Error())is equivalent toerr.Error(). Thefmt.Sprintcall is unnecessary.♻️ Proposed fix
func NewMultipleInstanceError(err error) *ReconcileError { if err == nil { return nil } return &ReconcileError{ Reason: MultipleInstanceError, - Message: fmt.Sprint(err.Error()), + Message: err.Error(), Err: err, } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/common/errors.go` around lines 45 - 55, The NewMultipleInstanceError function redundantly wraps err.Error() with fmt.Sprint; replace fmt.Sprint(err.Error()) with err.Error() in the returned ReconcileError (in function NewMultipleInstanceError) so Message is set directly from err.Error(); keep the existing nil-check and preserve Reason: MultipleInstanceError and Err: err.
24-31: Consider addingUnwrap()method for proper error chain support.The
ReconcileErrorstruct wraps an underlying error but doesn't implementUnwrap(). This meanserrors.Is()anderrors.As()won't traverse to the wrapped error, which may limit error inspection capabilities.♻️ Proposed addition
Add after the
Error()method:// Unwrap returns the underlying error for error chain traversal. func (e *ReconcileError) Unwrap() error { return e.Err }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/common/errors.go` around lines 24 - 31, ReconcileError wraps an underlying error but lacks an Unwrap method, preventing errors.Is()/errors.As() from traversing the wrapped error; add a method func (e *ReconcileError) Unwrap() error that returns e.Err (place it after the existing Error() method) so standard error chain functions can inspect the underlying error; ensure the receiver is a pointer to ReconcileError and it simply returns the Err field.pkg/controller/istiocsr/utils.go (1)
136-230: Consider migrating decode functions to usecommon.DecodeObjBytes.The local
decode*ObjBytesfunctions (e.g.,decodeDeploymentObjBytes,decodeClusterRoleObjBytes, etc.) follow the same pattern and could potentially be replaced with calls tocommon.DecodeObjBytes[T]. This would further reduce duplication.However, this is a non-blocking suggestion since the current implementation works correctly.
♻️ Example migration for decodeDeploymentObjBytes
func decodeDeploymentObjBytes(objBytes []byte) *appsv1.Deployment { - obj, err := runtime.Decode(codecs.UniversalDecoder(appsv1.SchemeGroupVersion), objBytes) - if err != nil { - panic(err) - } - deployment, ok := obj.(*appsv1.Deployment) - if !ok { - panic("failed to convert to *appsv1.Deployment") - } - return deployment + return common.DecodeObjBytes[*appsv1.Deployment](codecs, appsv1.SchemeGroupVersion, objBytes) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/istiocsr/utils.go` around lines 136 - 230, The decodeXObjBytes functions (decodeDeploymentObjBytes, decodeClusterRoleObjBytes, decodeClusterRoleBindingObjBytes, decodeRoleObjBytes, decodeRoleBindingObjBytes, decodeServiceObjBytes, decodeServiceAccountObjBytes, decodeCertificateObjBytes) are duplicative; replace their bodies with calls to the generic helper common.DecodeObjBytes[T] (e.g., common.DecodeObjBytes[appsv1.Deployment](objBytes)) or remove the redundant functions and call common.DecodeObjBytes[T] where they’re used, ensure you import the common package, and update callers to the generic version so duplication is eliminated while preserving panic/error behavior.pkg/controller/common/utils.go (2)
37-40: Function name is misleading - only compares labels, not full metadata.
ObjectMetadataModifiedsuggests it compares all object metadata, but it only compares labels. Consider renaming toObjectLabelsModifiedfor clarity, or expanding to compare other metadata fields (annotations, etc.) if that's the intended behavior.♻️ Option 1: Rename for clarity
-// ObjectMetadataModified compares the labels of two objects and returns true if they differ. -func ObjectMetadataModified(desired, fetched client.Object) bool { +// ObjectLabelsModified compares the labels of two objects and returns true if they differ. +func ObjectLabelsModified(desired, fetched client.Object) bool { return !reflect.DeepEqual(desired.GetLabels(), fetched.GetLabels()) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/common/utils.go` around lines 37 - 40, The function ObjectMetadataModified only compares labels but the name implies full metadata comparison; rename ObjectMetadataModified to ObjectLabelsModified and update all callers to use ObjectLabelsModified (keeping the implementation that returns !reflect.DeepEqual(desired.GetLabels(), fetched.GetLabels())); alternatively, if you intended to compare more metadata, expand the function (e.g., add comparisons for desired.GetAnnotations()/fetched.GetAnnotations() and any other metadata fields) and keep or rename to ObjectMetadataModified accordingly—ensure to update references to the function name (ObjectMetadataModified/ObjectLabelsModified) and adjust tests or callers to match the chosen change.
63-75: Panic-on-failure pattern is acceptable for static asset decoding but should be documented.The function panics on decode failure or type mismatch. While this is appropriate for static assets loaded at initialization, callers should be aware of this behavior.
The panic message on line 68 uses
*new(T)which works but could be clearer.♻️ Suggested doc enhancement
// DecodeObjBytes decodes raw YAML/JSON bytes into a typed Kubernetes object. -// Panics on decode failure or type mismatch. +// Panics on decode failure or type mismatch. Use only for static assets +// loaded at initialization time where failure indicates a programmer error. func DecodeObjBytes[T runtime.Object](codecs serializer.CodecFactory, gv schema.GroupVersion, objBytes []byte) T {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/common/utils.go` around lines 63 - 75, The DecodeObjBytes function panics on decode/type mismatch which is OK for static asset decoding but needs explicit documentation and clearer panic messages; update the function comment for DecodeObjBytes to state that it will panic on decode failure or type mismatch and is intended for static/initialization use, and replace the current panic messages that use *new(T) with a clearer type representation (use a reflect-based or fmt-based type string for T) so the error text unambiguously shows the expected type in both decode and conversion failure cases.pkg/controller/trustmanager/install_trustmanager.go (1)
59-61: Consider usingcommon.FromClientErrorfor consistency.The error from
updateStatusObservedStateis wrapped withfmt.Errorf, while other reconciliation errors in this file usecommon.FromClientErrororcommon.NewIrrecoverableError. Consider using the common error utilities for consistent error classification and retry behavior.♻️ Suggested change
if err := r.updateStatusObservedState(trustManager); err != nil { - return fmt.Errorf("failed to update status observed state: %w", err) + return common.FromClientError(err, "failed to update status observed state") }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/trustmanager/install_trustmanager.go` around lines 59 - 61, The current return wraps the updateStatusObservedState error with fmt.Errorf; replace that wrapping with the project's error helper so the error is classified consistently and preserves retry semantics—call common.FromClientError(err, "failed to update status observed state") (or common.NewIrrecoverableError if the error should not be retried) where updateStatusObservedState is invoked so the reconciler uses the same error utilities as other handlers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/operator/v1alpha1/features.go`:
- Line 28: The PreRelease values are inconsistent: FeatureTrustManager is set
with PreRelease: "TechPreview" while another feature uses featuregate.GA; make
them consistent by either (a) introducing a package-level constant (e.g., const
PreReleaseTechPreview = "TechPreview") and use PreRelease: PreReleaseTechPreview
for FeatureTrustManager, or (b) change the other feature that currently uses
featuregate.GA to use the equivalent string literal so all PreRelease fields use
the same string style; update references to the PreRelease value in features.go
(look for FeatureTrustManager and the feature using featuregate.GA) accordingly.
In `@pkg/controller/common/utils.go`:
- Around line 28-35: The inconsistency is that HasObjectChanged(desired,
fetched) in pkg/controller/common/utils.go returns false when types differ,
while the local hasObjectChanged in istiocsr/utils.go panics on type mismatch;
fix by aligning behavior: update hasObjectChanged in istiocsr/utils.go to
delegate to the common HasObjectChanged (or perform the same reflect.TypeOf
check and return false) instead of panicking, so both functions have identical
type-mismatch handling and avoid duplicated logic.
In `@pkg/controller/istiocsr/controller.go`:
- Around line 84-85: The predicate uses common.ManagedResourceLabelKey (which
resolves to the generic "app" key) to detect operator-owned resources; replace
that with a new operator-specific label constant (e.g., IstioCSRManagedLabelKey
or ManagedByIstioCSRKey) defined in the constants file and use it everywhere you
currently check objLabels[common.ManagedResourceLabelKey] ==
RequestEnqueueLabelValue (including the labelOk closure and the other predicate
block around RequestEnqueueLabelValue). Add the new constant to the constants
file, update the checks in controller.go to reference the new constant (and
RequestEnqueueLabelValue), and run/update any tests that assert label-based
behavior to use the new operator-specific label.
In `@pkg/controller/istiocsr/deployments.go`:
- Around line 376-385: The type-assertion failures for clusterIssuer and issuer
are internal invariant violations and should return an irrecoverable error
instead of a retryable client error; update the two branches that check
clusterIssuer (variable clusterIssuer) and issuer (variable issuer, case
issuerKind) to return an irrecoverable error (e.g., use the project's
irrecoverable error constructor such as common.NewIrrecoverableError or
equivalent) rather than common.FromClientError so these failures won't be
treated as retryable.
- Around line 321-324: r.Get can fail transiently so don't wrap every error as
irrecoverable; instead, after calling r.Get(ctx, sourceConfigMapKey,
sourceConfigMap) check the returned err and only return
common.NewIrrecoverableError for the actual misconfiguration case (e.g., when
the ConfigMap exists but is malformed/missing expected keys) and for all other
client/API errors return common.FromClientError(err, "failed to fetch CA
certificate ConfigMap %s/%s", sourceConfigMapKey.Namespace,
sourceConfigMapKey.Name); in short: replace the unconditional
NewIrrecoverableError usage with a branch that detects the misconfiguration
condition and uses NewIrrecoverableError there, and uses common.FromClientError
for transient/client errors from r.Get.
In `@pkg/controller/trustmanager/deployments.go`:
- Around line 233-249: The helper containerPortsMatch currently only checks
ContainerPort and must be updated to compare the full corev1.ContainerPort spec
to detect drift; modify containerPortsMatch to require same length and for each
desired port find an existing port with equal Name, ContainerPort, HostPort,
Protocol (and HostIP if used) — or simply perform a full struct equality (e.g.
reflect.DeepEqual) between desired and matching existing ports — so
name/protocol/hostPort differences no longer get ignored.
- Around line 55-60: The code currently silently no-ops when the baked-in
container named by trustManagerContainerName is missing, so change the update
helpers (updateDeploymentArgs, updateTLSSecretVolume, updateImage) and/or add a
pre-check in getDeploymentObject to explicitly detect whether a container with
trustManagerContainerName exists and return an error if not; locate the loop or
lookup that references containers in the
Deployment.Spec.Template.Spec.Containers, and if you cannot find a container
matching trustManagerContainerName, return a descriptive error (propagate it up
so getDeploymentObject fails) instead of silently continuing, ensuring a missing
asset container surfaces as a hard error and prevents applying the Deployment
with defaults.
In `@pkg/controller/trustmanager/utils.go`:
- Around line 184-199: The function currently always returns
common.NewMultipleInstanceError(statusMessage) when ignoreProcessing is false,
which incorrectly rejects the winning singleton; change the control flow so that
when this TrustManager instance is the chosen/oldest (i.e., the winning
singleton) you return nil instead of NewMultipleInstanceError. Concretely, after
the ignoreProcessing block check the condition that determines this instance is
the active singleton (use the same logic/flags already computed around
ignoreProcessing/trustManager) and return nil for that case; only return
common.NewMultipleInstanceError(fmt.Errorf("%s", statusMessage)) for non-winning
instances. Ensure the successful update path (updateCondition/updateWithRetry)
remains unchanged and keep the references to trustManager, statusMessage,
updateCondition, addProcessingRejectedAnnotation, and UpdateWithRetry.
- Around line 129-133: The validateTrustManagerConfig function currently rejects
a zero-valued TrustManagerConfig which breaks reconciliation and e2e tests;
update the code so empty/zero TrustManagerConfig is considered valid—either
remove the call to validateTrustManagerConfig from
reconcileTrustManagerDeployment or change validateTrustManagerConfig to always
return nil (accept empty config) and keep any other semantic validations if
needed; ensure the change targets the validateTrustManagerConfig function and
the reconcileTrustManagerDeployment call site so an empty
TrustManager.Spec.TrustManagerConfig passes validation.
In `@pkg/operator/setup_manager.go`:
- Around line 196-230: The map key uses client.Object pointer instances in
addControllerCacheConfig so identical resource types from different controllers
don't match (different pointer addresses), causing label selectors to be
overwritten instead of merged; change the keying to be type-based (e.g., use
reflect.TypeOf(res) or a GVK string) so the lookup/merge logic operates on the
resource type rather than the object pointer, update objectList's key type and
all access sites in addControllerCacheConfig to use that type-based key, and
preserve the existing merge logic (existing.Label.Requirements(),
labels.NewRequirement(...), etc.) so multiple controllers' label values are
combined.
In `@pkg/operator/starter.go`:
- Around line 156-160: The manager is being started with its own signal handler
via ctrl.SetupSignalHandler(), which creates a separate context and can cause
shutdown inconsistencies; change the goroutine that calls manager.Start so it
uses the passed ctx (from controllercmd.ControllerContext) instead of
ctrl.SetupSignalHandler() — i.e., call manager.Start(ctx) (or the appropriate
context variable name) so the unified controller manager respects the parent
context cancellation, and remove/avoid creating an independent signal handler.
Ensure the goroutine and any related shutdown logic reference the same ctx
variable.
In `@test/e2e/trustmanager_test.go`:
- Around line 206-210: The test should fail fast when the controller reports
Degraded=True instead of continuing to poll; replace the current early-return
(return false, nil) inside the degradedCondition check with a failing return
(return false, error) that includes the condition's Reason and Message so CI
shows the cause; use the existing symbols meta.FindStatusCondition,
trustManagerStatus.Conditions, v1alpha1.Degraded and metav1.ConditionTrue and
construct an error (e.g. via fmt.Errorf) containing degradedCondition.Reason and
degradedCondition.Message and add the fmt import if missing.
- Around line 44-58: BeforeAll currently patches the Subscription but never
restores the original UNSUPPORTED_ADDON_FEATURES value; capture the current
value before calling patchSubscriptionWithEnvVars (e.g., read the Subscription
and store its UNSUPPORTED_ADDON_FEATURES into a package-level var) and add a
matching AfterAll that calls patchSubscriptionWithEnvVars to restore that saved
value (or remove the key if it was empty) and then call
waitForDeploymentEnvVarAndRollout to wait for the operator to return to the
original env state; reference BeforeAll, patchSubscriptionWithEnvVars and
waitForDeploymentEnvVarAndRollout when locating where to add the capture and
restore logic.
---
Outside diff comments:
In `@Makefile`:
- Line 62: The TRUST_MANAGER_VERSION Makefile variable is defined with extra
quotes; update the TRUST_MANAGER_VERSION assignment in the Makefile so it
matches other version vars (e.g., CERT_MANAGER_VERSION, ISTIO_CSR_VERSION) by
removing the surrounding quotes and using TRUST_MANAGER_VERSION ?= v0.20.3 so
shell expansions and comparisons behave consistently.
---
Nitpick comments:
In `@pkg/controller/common/client.go`:
- Around line 100-104: The reflection code using
reflect.New(reflect.TypeOf(obj).Elem()) can panic or behave unexpectedly if obj
is nil, not a pointer, or not pointing to a struct; update the logic around
currentInterface/current to validate obj first (ensure obj != nil,
TypeOf(obj).Kind() == reflect.Ptr and TypeOf(obj).Elem().Kind() ==
reflect.Struct) and return a clear error if the constraint is violated, or
document that obj must be a non-nil pointer to a struct; keep the existing
conversion to client.Object (current, ok) after these checks.
In `@pkg/controller/common/errors.go`:
- Around line 45-55: The NewMultipleInstanceError function redundantly wraps
err.Error() with fmt.Sprint; replace fmt.Sprint(err.Error()) with err.Error() in
the returned ReconcileError (in function NewMultipleInstanceError) so Message is
set directly from err.Error(); keep the existing nil-check and preserve Reason:
MultipleInstanceError and Err: err.
- Around line 24-31: ReconcileError wraps an underlying error but lacks an
Unwrap method, preventing errors.Is()/errors.As() from traversing the wrapped
error; add a method func (e *ReconcileError) Unwrap() error that returns e.Err
(place it after the existing Error() method) so standard error chain functions
can inspect the underlying error; ensure the receiver is a pointer to
ReconcileError and it simply returns the Err field.
In `@pkg/controller/common/utils.go`:
- Around line 37-40: The function ObjectMetadataModified only compares labels
but the name implies full metadata comparison; rename ObjectMetadataModified to
ObjectLabelsModified and update all callers to use ObjectLabelsModified (keeping
the implementation that returns !reflect.DeepEqual(desired.GetLabels(),
fetched.GetLabels())); alternatively, if you intended to compare more metadata,
expand the function (e.g., add comparisons for
desired.GetAnnotations()/fetched.GetAnnotations() and any other metadata fields)
and keep or rename to ObjectMetadataModified accordingly—ensure to update
references to the function name (ObjectMetadataModified/ObjectLabelsModified)
and adjust tests or callers to match the chosen change.
- Around line 63-75: The DecodeObjBytes function panics on decode/type mismatch
which is OK for static asset decoding but needs explicit documentation and
clearer panic messages; update the function comment for DecodeObjBytes to state
that it will panic on decode failure or type mismatch and is intended for
static/initialization use, and replace the current panic messages that use
*new(T) with a clearer type representation (use a reflect-based or fmt-based
type string for T) so the error text unambiguously shows the expected type in
both decode and conversion failure cases.
In `@pkg/controller/istiocsr/utils.go`:
- Around line 136-230: The decodeXObjBytes functions (decodeDeploymentObjBytes,
decodeClusterRoleObjBytes, decodeClusterRoleBindingObjBytes, decodeRoleObjBytes,
decodeRoleBindingObjBytes, decodeServiceObjBytes, decodeServiceAccountObjBytes,
decodeCertificateObjBytes) are duplicative; replace their bodies with calls to
the generic helper common.DecodeObjBytes[T] (e.g.,
common.DecodeObjBytes[appsv1.Deployment](objBytes)) or remove the redundant
functions and call common.DecodeObjBytes[T] where they’re used, ensure you
import the common package, and update callers to the generic version so
duplication is eliminated while preserving panic/error behavior.
In `@pkg/controller/trustmanager/install_trustmanager.go`:
- Around line 59-61: The current return wraps the updateStatusObservedState
error with fmt.Errorf; replace that wrapping with the project's error helper so
the error is classified consistently and preserves retry semantics—call
common.FromClientError(err, "failed to update status observed state") (or
common.NewIrrecoverableError if the error should not be retried) where
updateStatusObservedState is invoked so the reconciler uses the same error
utilities as other handlers.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 6efa00d3-a92f-40fc-bc0f-c86732c30476
⛔ Files ignored due to path filters (9)
go.work.sumis excluded by!**/*.sumtools/go.sumis excluded by!**/*.sumvendor/github.com/maxbrunsfeld/counterfeiter/v6/.gitignoreis excluded by!**/vendor/**,!vendor/**vendor/github.com/maxbrunsfeld/counterfeiter/v6/README.mdis excluded by!**/vendor/**,!vendor/**vendor/github.com/maxbrunsfeld/counterfeiter/v6/arguments/parser.gois excluded by!**/vendor/**,!vendor/**vendor/github.com/maxbrunsfeld/counterfeiter/v6/generator/fake.gois excluded by!**/vendor/**,!vendor/**vendor/github.com/maxbrunsfeld/counterfeiter/v6/generator/interface_template.gois excluded by!**/vendor/**,!vendor/**vendor/github.com/maxbrunsfeld/counterfeiter/v6/generator/loader.gois excluded by!**/vendor/**,!vendor/**vendor/modules.txtis excluded by!**/vendor/**,!vendor/**
📒 Files selected for processing (50)
Makefileapi/operator/v1alpha1/features.gobundle/manifests/cert-manager-operator.clusterserviceversion.yamlconfig/manager/manager.yamlconfig/rbac/role.yamlpkg/controller/common/client.gopkg/controller/common/constants.gopkg/controller/common/errors.gopkg/controller/common/fakes/fake_ctrl_client.gopkg/controller/common/utils.gopkg/controller/istiocsr/certificates.gopkg/controller/istiocsr/certificates_test.gopkg/controller/istiocsr/controller.gopkg/controller/istiocsr/controller_test.gopkg/controller/istiocsr/deployments.gopkg/controller/istiocsr/deployments_test.gopkg/controller/istiocsr/install_instiocsr_test.gopkg/controller/istiocsr/install_istiocsr.gopkg/controller/istiocsr/networkpolicies.gopkg/controller/istiocsr/rbacs.gopkg/controller/istiocsr/rbacs_test.gopkg/controller/istiocsr/serviceaccounts.gopkg/controller/istiocsr/serviceaccounts_test.gopkg/controller/istiocsr/services.gopkg/controller/istiocsr/services_test.gopkg/controller/istiocsr/utils.gopkg/controller/trustmanager/certificates.gopkg/controller/trustmanager/certificates_test.gopkg/controller/trustmanager/constants.gopkg/controller/trustmanager/controller.gopkg/controller/trustmanager/controller_test.gopkg/controller/trustmanager/deployments.gopkg/controller/trustmanager/deployments_test.gopkg/controller/trustmanager/install_trustmanager.gopkg/controller/trustmanager/rbacs.gopkg/controller/trustmanager/rbacs_test.gopkg/controller/trustmanager/serviceaccounts.gopkg/controller/trustmanager/serviceaccounts_test.gopkg/controller/trustmanager/services.gopkg/controller/trustmanager/services_test.gopkg/controller/trustmanager/test_utils.gopkg/controller/trustmanager/utils.gopkg/controller/trustmanager/utils_test.gopkg/controller/trustmanager/webhooks.gopkg/controller/trustmanager/webhooks_test.gopkg/features/features_test.gopkg/operator/setup_manager.gopkg/operator/starter.gotest/e2e/trustmanager_test.gotools/go.mod
b48171b to
198894c
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
Makefile (1)
62-62:⚠️ Potential issue | 🟡 MinorRemove quotes around version string for consistency.
TRUST_MANAGER_VERSIONhas quotes around the value ("v0.20.3") while other version variables likeCERT_MANAGER_VERSIONandISTIO_CSR_VERSIONdon't. This inconsistency may cause the literal quotes to be included in image tags when the variable is expanded.🐛 Proposed fix
-TRUST_MANAGER_VERSION ?= "v0.20.3" +TRUST_MANAGER_VERSION ?= v0.20.3🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` at line 62, The TRUST_MANAGER_VERSION variable in the Makefile is set with surrounding quotes which can inject literal quotes into expansions; change TRUST_MANAGER_VERSION ?= "v0.20.3" to an unquoted value TRUST_MANAGER_VERSION ?= v0.20.3 to match other variables (e.g., CERT_MANAGER_VERSION, ISTIO_CSR_VERSION) and ensure image tags or string concatenations don't include stray quotes, then run a quick make target that uses TRUST_MANAGER_VERSION to verify correct expansion.
🧹 Nitpick comments (4)
pkg/controller/trustmanager/test_utils.go (1)
55-62: Consider documenting that tests must inject a client separately.The
testReconcilerfactory creates aReconcilerwithout initializing theCtrlClientfield. Tests calling methods liker.Exists()orr.Patch()will panic unless they separately inject a client. This appears intentional for flexibility, but a brief comment would help future maintainers understand the expected usage pattern.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/trustmanager/test_utils.go` around lines 55 - 62, Add a short comment to the testReconciler factory explaining that it intentionally does not initialize the Reconciler.CtrlClient and that tests must inject a fake or real client before calling client-dependent methods (e.g., Reconciler.Exists, Reconciler.Patch); place the comment immediately above the testReconciler function and reference the Reconciler type and CtrlClient field so future maintainers understand the expected usage pattern.pkg/controller/trustmanager/services_test.go (1)
105-116: Reduce call-order coupling inExistsstubs.Multiple cases rely on
existsCallsequencing to choose webhook vs metrics objects. Prefer branching bykey.Nameso tests stay stable if reconciliation call order changes.Refactor example
- existsCall := 0 m.ExistsCalls(func(ctx context.Context, key client.ObjectKey, obj client.Object) (bool, error) { - existsCall++ - var svc *corev1.Service - if existsCall == 1 { - svc = getWebhookServiceObject(testResourceLabels(), testResourceAnnotations()) - } else { - svc = getMetricsServiceObject(testResourceLabels(), testResourceAnnotations()) - } + var svc *corev1.Service + switch key.Name { + case trustManagerServiceName: + svc = getWebhookServiceObject(testResourceLabels(), testResourceAnnotations()) + case trustManagerMetricsServiceName: + svc = getMetricsServiceObject(testResourceLabels(), testResourceAnnotations()) + default: + return false, nil + } svc.DeepCopyInto(obj.(*corev1.Service)) return true, nil })Also applies to: 124-136, 149-160, 169-180, 189-200
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/trustmanager/services_test.go` around lines 105 - 116, The current Exists stub in m.ExistsCalls uses an incremental existsCall counter to choose between webhook vs metrics Service fixtures (existsCall variable and the anonymous function passed to m.ExistsCalls), which couples tests to call order; change the stub to inspect the incoming client.ObjectKey (key.Name) and branch on that (e.g., if key.Name == "<webhook-service-name>" use getWebhookServiceObject(...), else if key.Name == "<metrics-service-name>" use getMetricsServiceObject(...)) so it returns the correct Service into obj.(*corev1.Service) regardless of call sequencing; apply the same key.Name-based branching fix to the other Exists stubs referenced (the blocks around lines 124-136, 149-160, 169-180, 189-200) so all tests are stable when reconciliation call order changes.pkg/controller/trustmanager/deployments_test.go (1)
475-477: Make the “missing image” case deterministic by explicitly clearing the env var.The
setImage=falsebranch currently relies on ambient process env state. Explicitly unset/cleartrustManagerImageNameEnvVarNamein that branch to avoid environment-dependent failures.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/trustmanager/deployments_test.go` around lines 475 - 477, The test relies on tt.setImage to decide whether to set the env var trustManagerImageNameEnvVarName, but when setImage is false it leaves the process env untouched which can cause nondeterministic failures; in the else branch of that if (where tt.setImage is false) explicitly clear the environment by calling os.Unsetenv(trustManagerImageNameEnvVarName) (or otherwise ensure the variable is removed) so the test no longer depends on ambient process env; update deployments_test.go around the tt.setImage block that currently sets testImage to use this explicit unset when not setting the image.pkg/controller/trustmanager/rbacs_test.go (1)
587-607: Consider failing fast if multiple ServiceAccount subjects exist.The helper iterates through all subjects but only validates the first ServiceAccount found. If multiple ServiceAccount subjects exist unexpectedly, the test would pass on the first match without flagging potential issues.
🔧 Suggested improvement
func assertSubjects(t *testing.T, subjects []rbacv1.Subject, expectedName, expectedNamespace string) { t.Helper() if len(subjects) == 0 { t.Fatal("expected at least one subject") } - found := false + var found *rbacv1.Subject for _, s := range subjects { if s.Kind == "ServiceAccount" { - found = true - if s.Name != expectedName { - t.Errorf("expected subject name %q, got %q", expectedName, s.Name) - } - if s.Namespace != expectedNamespace { - t.Errorf("expected subject namespace %q, got %q", expectedNamespace, s.Namespace) + if found != nil { + t.Error("expected exactly one ServiceAccount subject, found multiple") } + found = &s } } - if !found { + if found == nil { t.Error("expected to find a ServiceAccount subject") + return + } + if found.Name != expectedName { + t.Errorf("expected subject name %q, got %q", expectedName, found.Name) + } + if found.Namespace != expectedNamespace { + t.Errorf("expected subject namespace %q, got %q", expectedNamespace, found.Namespace) } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/trustmanager/rbacs_test.go` around lines 587 - 607, The helper assertSubjects currently scans subjects and validates only the first ServiceAccount it finds, which lets tests pass if multiple ServiceAccount subjects exist; update assertSubjects to detect and fail when more than one ServiceAccount is present (or alternatively require exactly one) and validate all ServiceAccount fields deterministically: count ServiceAccount occurrences while iterating the subjects slice, if count == 0 call t.Error as before, if count > 1 call t.Fatalf or t.Errorf to fail fast, and continue to assert that the (single) ServiceAccount's Name and Namespace match expectedName/expectedNamespace; reference the assertSubjects function and its subjects parameter to locate where to add the counting and failure logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/controller/trustmanager/controller.go`:
- Around line 105-119: The current use of withIgnoreStatusUpdatePredicates
(which is builder.WithPredicates(predicate.GenerationChangedPredicate{},
controllerManagedResources)) on the Deployment, Certificate, and Issuer watches
suppresses metadata-only updates; replace the third argument in Watches for
&appsv1.Deployment{}, &certmanagerv1.Certificate{}, and &certmanagerv1.Issuer{}
to use controllerManagedResourcePredicates (or a custom predicate that also
reacts to label/annotation changes) instead of withIgnoreStatusUpdatePredicates
so metadata-only drift will enqueue reconciles; modify the Watches lines
referencing withIgnoreStatusUpdatePredicates accordingly (or implement and
reference a new predicate function that detects metadata changes if preferred).
---
Outside diff comments:
In `@Makefile`:
- Line 62: The TRUST_MANAGER_VERSION variable in the Makefile is set with
surrounding quotes which can inject literal quotes into expansions; change
TRUST_MANAGER_VERSION ?= "v0.20.3" to an unquoted value TRUST_MANAGER_VERSION ?=
v0.20.3 to match other variables (e.g., CERT_MANAGER_VERSION, ISTIO_CSR_VERSION)
and ensure image tags or string concatenations don't include stray quotes, then
run a quick make target that uses TRUST_MANAGER_VERSION to verify correct
expansion.
---
Nitpick comments:
In `@pkg/controller/trustmanager/deployments_test.go`:
- Around line 475-477: The test relies on tt.setImage to decide whether to set
the env var trustManagerImageNameEnvVarName, but when setImage is false it
leaves the process env untouched which can cause nondeterministic failures; in
the else branch of that if (where tt.setImage is false) explicitly clear the
environment by calling os.Unsetenv(trustManagerImageNameEnvVarName) (or
otherwise ensure the variable is removed) so the test no longer depends on
ambient process env; update deployments_test.go around the tt.setImage block
that currently sets testImage to use this explicit unset when not setting the
image.
In `@pkg/controller/trustmanager/rbacs_test.go`:
- Around line 587-607: The helper assertSubjects currently scans subjects and
validates only the first ServiceAccount it finds, which lets tests pass if
multiple ServiceAccount subjects exist; update assertSubjects to detect and fail
when more than one ServiceAccount is present (or alternatively require exactly
one) and validate all ServiceAccount fields deterministically: count
ServiceAccount occurrences while iterating the subjects slice, if count == 0
call t.Error as before, if count > 1 call t.Fatalf or t.Errorf to fail fast, and
continue to assert that the (single) ServiceAccount's Name and Namespace match
expectedName/expectedNamespace; reference the assertSubjects function and its
subjects parameter to locate where to add the counting and failure logic.
In `@pkg/controller/trustmanager/services_test.go`:
- Around line 105-116: The current Exists stub in m.ExistsCalls uses an
incremental existsCall counter to choose between webhook vs metrics Service
fixtures (existsCall variable and the anonymous function passed to
m.ExistsCalls), which couples tests to call order; change the stub to inspect
the incoming client.ObjectKey (key.Name) and branch on that (e.g., if key.Name
== "<webhook-service-name>" use getWebhookServiceObject(...), else if key.Name
== "<metrics-service-name>" use getMetricsServiceObject(...)) so it returns the
correct Service into obj.(*corev1.Service) regardless of call sequencing; apply
the same key.Name-based branching fix to the other Exists stubs referenced (the
blocks around lines 124-136, 149-160, 169-180, 189-200) so all tests are stable
when reconciliation call order changes.
In `@pkg/controller/trustmanager/test_utils.go`:
- Around line 55-62: Add a short comment to the testReconciler factory
explaining that it intentionally does not initialize the Reconciler.CtrlClient
and that tests must inject a fake or real client before calling client-dependent
methods (e.g., Reconciler.Exists, Reconciler.Patch); place the comment
immediately above the testReconciler function and reference the Reconciler type
and CtrlClient field so future maintainers understand the expected usage
pattern.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d1eb27f9-4095-4c34-8155-591dc092dfc6
📒 Files selected for processing (24)
Makefilebundle/manifests/cert-manager-operator.clusterserviceversion.yamlconfig/manager/manager.yamlpkg/controller/common/utils.gopkg/controller/trustmanager/certificates.gopkg/controller/trustmanager/certificates_test.gopkg/controller/trustmanager/constants.gopkg/controller/trustmanager/controller.gopkg/controller/trustmanager/controller_test.gopkg/controller/trustmanager/deployments.gopkg/controller/trustmanager/deployments_test.gopkg/controller/trustmanager/install_trustmanager.gopkg/controller/trustmanager/rbacs.gopkg/controller/trustmanager/rbacs_test.gopkg/controller/trustmanager/serviceaccounts.gopkg/controller/trustmanager/serviceaccounts_test.gopkg/controller/trustmanager/services.gopkg/controller/trustmanager/services_test.gopkg/controller/trustmanager/test_utils.gopkg/controller/trustmanager/utils.gopkg/controller/trustmanager/utils_test.gopkg/controller/trustmanager/webhooks.gopkg/controller/trustmanager/webhooks_test.gopkg/operator/setup_manager.go
🚧 Files skipped from review as they are similar to previous changes (4)
- pkg/controller/trustmanager/webhooks_test.go
- pkg/controller/trustmanager/serviceaccounts_test.go
- pkg/operator/setup_manager.go
- pkg/controller/trustmanager/webhooks.go
198894c to
53cb39e
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
Makefile (1)
318-319: Make local-run debug/feature flags configurable instead of commented.Line 318 and Line 319 are useful but currently require file edits. Consider routing them through a variable so developers can toggle at invocation time.
♻️ Suggested tweak
-# --unsupported-addon-features="TrustManager=true" \ -# --v=4 + $(LOCAL_RUN_ARGS)Add once near this target (outside this range):
LOCAL_RUN_ARGS ?= # Example: # make local-run LOCAL_RUN_ARGS='--unsupported-addon-features=TrustManager=true --v=4'🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` around lines 318 - 319, The commented debug flags are currently hard-coded and require editing the Makefile; introduce a configurable variable (e.g., LOCAL_RUN_ARGS) and update the local-run target to append this variable when invoking the command so developers can pass flags at invocation time (for example: make local-run LOCAL_RUN_ARGS='--unsupported-addon-features=TrustManager=true --v=4'); add a default declaration like LOCAL_RUN_ARGS ?= near the top of the target and replace the commented flags in the local-run invocation with $(LOCAL_RUN_ARGS) so toggling is possible without file edits.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@Makefile`:
- Around line 318-319: The commented debug flags are currently hard-coded and
require editing the Makefile; introduce a configurable variable (e.g.,
LOCAL_RUN_ARGS) and update the local-run target to append this variable when
invoking the command so developers can pass flags at invocation time (for
example: make local-run
LOCAL_RUN_ARGS='--unsupported-addon-features=TrustManager=true --v=4'); add a
default declaration like LOCAL_RUN_ARGS ?= near the top of the target and
replace the commented flags in the local-run invocation with $(LOCAL_RUN_ARGS)
so toggling is possible without file edits.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 2ea7a87d-3bd2-4897-a939-01bdee005e40
📒 Files selected for processing (3)
Makefilebundle/manifests/cert-manager-operator.clusterserviceversion.yamlconfig/manager/manager.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
- config/manager/manager.yaml
- bundle/manifests/cert-manager-operator.clusterserviceversion.yaml
|
@chiragkyal: This pull request references CM-867 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target either version "4.22." or "openshift-4.22.", but it targets "cert-manager-1.19" instead. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@chiragkyal: This pull request references CM-867 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target either version "4.22." or "openshift-4.22.", but it targets "cert-manager-1.19" instead. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@chiragkyal: This pull request references CM-867 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target either version "4.22." or "openshift-4.22.", but it targets "cert-manager-1.19" instead. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/retest |
|
@chiragkyal: An error was encountered searching for bug CM-867 on the Jira server at https://issues.redhat.com. No known errors were detected, please see the full error message for details. Full error message.
request failed. Please analyze the request body for more details. Status code: 403:
Please contact an administrator to resolve this issue, then request a bug refresh with DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/controller/trustmanager/deployments_test.go`:
- Around line 475-477: The test is flaky because when tt.setImage is false it
leaves the environment variable uncontrolled; modify the test to explicitly
clear the env var in that branch so the "missing image" case is
deterministic—replace the current conditional with: when tt.setImage is true
call t.Setenv(trustManagerImageNameEnvVarName, testImage), otherwise call
t.Setenv(trustManagerImageNameEnvVarName, "") (or explicitly unset via
os.Unsetenv with proper cleanup) so trustManagerImageNameEnvVarName is
guaranteed absent/empty during the test.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f8d1a114-b2d3-4129-a4f6-fc6552195a93
📒 Files selected for processing (9)
pkg/controller/trustmanager/certificates_test.gopkg/controller/trustmanager/controller_test.gopkg/controller/trustmanager/deployments_test.gopkg/controller/trustmanager/rbacs_test.gopkg/controller/trustmanager/serviceaccounts_test.gopkg/controller/trustmanager/services_test.gopkg/controller/trustmanager/test_utils.gopkg/controller/trustmanager/utils_test.gopkg/controller/trustmanager/webhooks_test.go
There was a problem hiding this comment.
♻️ Duplicate comments (2)
test/e2e/trustmanager_test.go (2)
69-83:⚠️ Potential issue | 🟠 MajorMissing cleanup: Restore feature gate state after test suite.
The
BeforeAllmodifies the cluster-scoped Subscription to enableTrustManager=true, but there's no correspondingAfterAllto restore the originalUNSUPPORTED_ADDON_FEATURESvalue. This leaks state into subsequent e2e specs.Consider capturing the original value before patching and restoring it in
AfterAll:var originalUnsupportedAddonFeatures string BeforeAll(func() { // Capture original value first // ... then patch }) AfterAll(func() { // Restore original value using patchSubscriptionWithEnvVars })🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/trustmanager_test.go` around lines 69 - 83, Capture the current value of UNSUPPORTED_ADDON_FEATURES into a package-level variable (e.g., originalUnsupportedAddonFeatures) inside BeforeAll before calling patchSubscriptionWithEnvVars, then add an AfterAll that calls patchSubscriptionWithEnvVars to restore the original value and uses waitForDeploymentEnvVarAndRollout to ensure the operator rolls out with the restored env var; reference the existing BeforeAll, patchSubscriptionWithEnvVars, waitForDeploymentEnvVarAndRollout and add an AfterAll to perform the restore and verification.
812-815:⚠️ Potential issue | 🟡 MinorReturn error on
Degraded=Truefor faster failure diagnosis.When the controller reports
Degraded=True, continuing to poll until timeout delays CI feedback. Returning an error with the degraded reason would help diagnose failures faster.degradedCondition := meta.FindStatusCondition(trustManagerStatus.Conditions, v1alpha1.Degraded) if degradedCondition != nil && degradedCondition.Status == metav1.ConditionTrue { - return false, nil + return false, fmt.Errorf("TrustManager is degraded: %s - %s", degradedCondition.Reason, degradedCondition.Message) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/trustmanager_test.go` around lines 812 - 815, The test currently treats a Degraded=True condition as a non-fatal state; change the logic so that when meta.FindStatusCondition(trustManagerStatus.Conditions, v1alpha1.Degraded) returns a condition with Status==metav1.ConditionTrue the function returns an error instead of (false, nil). Construct the error including the degraded condition's Reason and/or Message (e.g., via fmt.Errorf("controller degraded: %s - %s", degradedCondition.Reason, degradedCondition.Message)) so callers fail fast and get diagnostic info; update the code path using degradedCondition, meta.FindStatusCondition, trustManagerStatus.Conditions and v1alpha1.Degraded accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@test/e2e/trustmanager_test.go`:
- Around line 69-83: Capture the current value of UNSUPPORTED_ADDON_FEATURES
into a package-level variable (e.g., originalUnsupportedAddonFeatures) inside
BeforeAll before calling patchSubscriptionWithEnvVars, then add an AfterAll that
calls patchSubscriptionWithEnvVars to restore the original value and uses
waitForDeploymentEnvVarAndRollout to ensure the operator rolls out with the
restored env var; reference the existing BeforeAll,
patchSubscriptionWithEnvVars, waitForDeploymentEnvVarAndRollout and add an
AfterAll to perform the restore and verification.
- Around line 812-815: The test currently treats a Degraded=True condition as a
non-fatal state; change the logic so that when
meta.FindStatusCondition(trustManagerStatus.Conditions, v1alpha1.Degraded)
returns a condition with Status==metav1.ConditionTrue the function returns an
error instead of (false, nil). Construct the error including the degraded
condition's Reason and/or Message (e.g., via fmt.Errorf("controller degraded: %s
- %s", degradedCondition.Reason, degradedCondition.Message)) so callers fail
fast and get diagnostic info; update the code path using degradedCondition,
meta.FindStatusCondition, trustManagerStatus.Conditions and v1alpha1.Degraded
accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e5fae2f5-dd94-4fbc-bacd-025ec1420a92
📒 Files selected for processing (2)
test/e2e/trustmanager_test.gotest/e2e/utils_test.go
|
@chiragkyal: An error was encountered searching for bug CM-867 on the Jira server at https://issues.redhat.com. No known errors were detected, please see the full error message for details. Full error message.
request failed. Please analyze the request body for more details. Status code: 403:
Please contact an administrator to resolve this issue, then request a bug refresh with DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai resume |
✅ Actions performedReviews resumed. |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
2d56512 to
689b09c
Compare
There was a problem hiding this comment.
Actionable comments posted: 5
♻️ Duplicate comments (3)
pkg/controller/common/utils.go (1)
54-66:⚠️ Potential issue | 🔴 CriticalReplace panic-based decoding with error returns.
Line 56 through Line 64 can panic on decode/type mismatch. In controller paths, this can terminate the operator process instead of returning a reconcile error.
Proposed fix
-func DecodeObjBytes[T runtime.Object](codecs serializer.CodecFactory, gv schema.GroupVersion, objBytes []byte) T { +func DecodeObjBytes[T runtime.Object](codecs serializer.CodecFactory, gv schema.GroupVersion, objBytes []byte) (T, error) { + var zero T obj, err := runtime.Decode(codecs.UniversalDecoder(gv), objBytes) if err != nil { - panic(fmt.Sprintf("failed to decode object bytes for %T: %v", *new(T), err)) + return zero, fmt.Errorf("failed to decode object bytes for %T: %w", *new(T), err) } typed, ok := obj.(T) if !ok { - panic(fmt.Sprintf("failed to convert decoded object to %T", *new(T))) + return zero, fmt.Errorf("failed to convert decoded object to %T", *new(T)) } - return typed + return typed, nil }As per coding guidelines,
**: -Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/common/utils.go` around lines 54 - 66, The DecodeObjBytes function currently panics on decode/type-mismatch; change its signature to return (T, error) instead of panicking, i.e. DecodeObjBytes[T runtime.Object](codecs serializer.CodecFactory, gv schema.GroupVersion, objBytes []byte) (T, error); on decode failure return the zero value of T and an error created with fmt.Errorf that includes context (e.g. the target type from *new(T) and the original err), and on failed type assertion return zero T with a descriptive error instead of panic; update all callers of DecodeObjBytes to handle the error and propagate/reconcile appropriately.test/e2e/trustmanager_test.go (2)
813-816:⚠️ Potential issue | 🟠 MajorReturn the degraded reason instead of timing out.
return false, nilhere turns an irrecoverable controller failure into a generichighTimeoutexpiry and drops the condition details. Fail fast with the degradedReasonandMessage.Proposed fix
degradedCondition := meta.FindStatusCondition(trustManagerStatus.Conditions, v1alpha1.Degraded) if degradedCondition != nil && degradedCondition.Status == metav1.ConditionTrue { - return false, nil + return false, fmt.Errorf( + "TrustManager degraded: %s: %s", + degradedCondition.Reason, + degradedCondition.Message, + ) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/trustmanager_test.go` around lines 813 - 816, The code currently treats a Degraded condition as a generic timeout by returning (false, nil); instead detect the degradedCondition found via meta.FindStatusCondition(trustManagerStatus.Conditions, v1alpha1.Degraded) and fail fast by returning an error that includes degradedCondition.Reason and degradedCondition.Message so the test surface the actual controller failure details rather than expiring with highTimeout.
69-84:⚠️ Potential issue | 🟠 MajorRestore the patched Subscription env after this suite.
BeforeAllmutates the cluster-scoped Subscription but nothing restores the previousUNSUPPORTED_ADDON_FEATURESandOPERATOR_LOG_LEVELvalues. That leaks state into later e2e specs and makes the suite order-dependent.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/trustmanager_test.go` around lines 69 - 84, The BeforeAll block mutates cluster Subscription env vars but never restores them; capture the current UNSUPPORTED_ADDON_FEATURES and OPERATOR_LOG_LEVEL values before calling patchSubscriptionWithEnvVars in BeforeAll and add an AfterAll that calls patchSubscriptionWithEnvVars to restore the original values (or remove keys if they were absent). Modify the test to use the existing BeforeAll, patchSubscriptionWithEnvVars, and waitForDeploymentEnvVarAndRollout symbols and ensure AfterAll waits for the operator rollout to return to the original env state (using waitForDeploymentEnvVarAndRollout) so cluster-scoped state is cleaned up after the suite.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/controller/trustmanager/certificates.go`:
- Around line 21-37: The helper functions are using r.ctx (a background context)
for client calls which decouples them from the reconcile cancellation; change
the signatures of processReconcileRequest, reconcileTrustManagerDeployment,
createOrApplyIssuer, createOrApplyCertificate and any other helper that issues
client API calls to accept a context parameter (ctx context.Context) and replace
uses of r.ctx with that ctx when calling r.Exists, r.Patch, r.Create, r.Update,
etc.; then thread the reconcile-scoped ctx passed into Reconcile through
processReconcileRequest -> reconcileTrustManagerDeployment ->
createOrApplyIssuer/createOrApplyCertificate so all client operations use the
reconcile context and obey cancellation/timeouts.
In `@pkg/controller/trustmanager/controller.go`:
- Around line 109-118: The current predicate setup
(withIgnoreStatusUpdatePredicates using builder.WithPredicates and
controllerManagedResources) ANDs checks so removal of the managed label gets
filtered; change the logic so controllerManagedResources checks both old and new
objects (i.e. “managed in either state”) before being combined with
LabelChangedPredicate/AnnotationChangedPredicate/GenerationChangedPredicate, and
update the associated mapFunc to inspect both old and new object labels when
deciding to enqueue/restore the managed label; apply this change to all watch
registrations that currently pass controllerManagedResources so the controller
triggers reconciliation when a managed label is removed.
- Around line 120-121: The current primary watch uses
predicate.GenerationChangedPredicate in the
ctrl.NewControllerManagedBy(...).For(&v1alpha1.TrustManager{},
builder.WithPredicates(predicate.GenerationChangedPredicate{})) which filters
out deletion-timestamp updates so finalizer removal never runs; fix by replacing
that predicate with one that also allows deletion events (either remove the
predicate entirely or use predicate.Or to combine GenerationChangedPredicate
with a predicate that returns true for Delete and for updates where
metadata.deletionTimestamp changed, or add a predicate.Funcs that explicitly
allows Delete and update events when DeletionTimestamp transitions) so
reconciler runs on finalizer deletions for TrustManager.
In `@pkg/controller/trustmanager/deployments.go`:
- Around line 34-44: deploymentModified currently enforces exact equality on
NodeSelector, Volumes, Affinity, Tolerations and container order which conflicts
with the Server-Side Apply path (r.Patch(..., client.Apply,
client.FieldOwner(...), client.ForceOwnership)) and causes perpetual patches
when other field managers add fields; update deploymentModified (the function
referenced in the if check) to only compare fields that this controller manages
(the SSA-managed subset) or relax comparisons (e.g., ignore extra containers,
unordered volumes, and keys added by other managers) so the equality check
reflects SSA semantics, or alternatively change the apply logic to not use
ForceOwnership if exact ownership is intended; locate usages around
deploymentModified, the r.Patch call, and the desired/existing deployment
variables to implement the narrower comparison/ignoring logic.
In `@pkg/operator/setup_manager.go`:
- Around line 57-67: The trustManagerManagedResources slice is missing
&certmanagerv1.Issuer{}, causing Issuer to be excluded from the label-selector
based cache scoping used by addControllerCacheConfig; add
&certmanagerv1.Issuer{} into the trustManagerManagedResources slice (alongside
&certmanagerv1.Certificate{}) so the TrustManager controller (see
controller.go:131 that watches Issuer) will be filtered by the same
managed-label cache configuration and avoid cluster-wide caching.
---
Duplicate comments:
In `@pkg/controller/common/utils.go`:
- Around line 54-66: The DecodeObjBytes function currently panics on
decode/type-mismatch; change its signature to return (T, error) instead of
panicking, i.e. DecodeObjBytes[T runtime.Object](codecs serializer.CodecFactory,
gv schema.GroupVersion, objBytes []byte) (T, error); on decode failure return
the zero value of T and an error created with fmt.Errorf that includes context
(e.g. the target type from *new(T) and the original err), and on failed type
assertion return zero T with a descriptive error instead of panic; update all
callers of DecodeObjBytes to handle the error and propagate/reconcile
appropriately.
In `@test/e2e/trustmanager_test.go`:
- Around line 813-816: The code currently treats a Degraded condition as a
generic timeout by returning (false, nil); instead detect the degradedCondition
found via meta.FindStatusCondition(trustManagerStatus.Conditions,
v1alpha1.Degraded) and fail fast by returning an error that includes
degradedCondition.Reason and degradedCondition.Message so the test surface the
actual controller failure details rather than expiring with highTimeout.
- Around line 69-84: The BeforeAll block mutates cluster Subscription env vars
but never restores them; capture the current UNSUPPORTED_ADDON_FEATURES and
OPERATOR_LOG_LEVEL values before calling patchSubscriptionWithEnvVars in
BeforeAll and add an AfterAll that calls patchSubscriptionWithEnvVars to restore
the original values (or remove keys if they were absent). Modify the test to use
the existing BeforeAll, patchSubscriptionWithEnvVars, and
waitForDeploymentEnvVarAndRollout symbols and ensure AfterAll waits for the
operator rollout to return to the original env state (using
waitForDeploymentEnvVarAndRollout) so cluster-scoped state is cleaned up after
the suite.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 03762f5c-7d60-4948-9add-6671d394478f
📒 Files selected for processing (27)
Makefilebundle/manifests/cert-manager-operator.clusterserviceversion.yamlconfig/manager/manager.yamlconfig/rbac/role.yamlpkg/controller/common/utils.gopkg/controller/trustmanager/certificates.gopkg/controller/trustmanager/certificates_test.gopkg/controller/trustmanager/constants.gopkg/controller/trustmanager/controller.gopkg/controller/trustmanager/controller_test.gopkg/controller/trustmanager/deployments.gopkg/controller/trustmanager/deployments_test.gopkg/controller/trustmanager/install_trustmanager.gopkg/controller/trustmanager/rbacs.gopkg/controller/trustmanager/rbacs_test.gopkg/controller/trustmanager/serviceaccounts.gopkg/controller/trustmanager/serviceaccounts_test.gopkg/controller/trustmanager/services.gopkg/controller/trustmanager/services_test.gopkg/controller/trustmanager/test_utils.gopkg/controller/trustmanager/utils.gopkg/controller/trustmanager/utils_test.gopkg/controller/trustmanager/webhooks.gopkg/controller/trustmanager/webhooks_test.gopkg/operator/setup_manager.gotest/e2e/trustmanager_test.gotest/e2e/utils_test.go
🚧 Files skipped from review as they are similar to previous changes (7)
- pkg/controller/trustmanager/webhooks_test.go
- pkg/controller/trustmanager/utils_test.go
- pkg/controller/trustmanager/services_test.go
- config/manager/manager.yaml
- Makefile
- test/e2e/utils_test.go
- pkg/controller/trustmanager/serviceaccounts_test.go
| func (r *Reconciler) createOrApplyIssuer(trustManager *v1alpha1.TrustManager, resourceLabels, resourceAnnotations map[string]string) error { | ||
| desired := getIssuerObject(resourceLabels, resourceAnnotations) | ||
| resourceName := fmt.Sprintf("%s/%s", desired.GetNamespace(), desired.GetName()) | ||
| r.log.V(4).Info("reconciling issuer resource", "name", resourceName) | ||
|
|
||
| existing := &certmanagerv1.Issuer{} | ||
| exists, err := r.Exists(r.ctx, client.ObjectKeyFromObject(desired), existing) | ||
| if err != nil { | ||
| return common.FromClientError(err, "failed to check if issuer %q exists", resourceName) | ||
| } | ||
| if exists && !issuerModified(desired, existing) { | ||
| r.log.V(4).Info("issuer already matches desired state, skipping apply", "name", resourceName) | ||
| return nil | ||
| } | ||
|
|
||
| if err := r.Patch(r.ctx, desired, client.Apply, client.FieldOwner(fieldOwner), client.ForceOwnership); err != nil { | ||
| return common.FromClientError(err, "failed to apply issuer %q", resourceName) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, find the controller.go file and check how r.ctx is initialized
find . -name "controller.go" -path "*/trustmanager/*" -type fRepository: openshift/cert-manager-operator
Length of output: 118
🏁 Script executed:
# Check the context initialization in controller.go
grep -n "context.Background\|r.ctx\|\.ctx\s*=" pkg/controller/trustmanager/controller.go | head -20Repository: openshift/cert-manager-operator
Length of output: 257
🏁 Script executed:
# Read the controller.go file to understand the Reconciler structure
wc -l pkg/controller/trustmanager/controller.goRepository: openshift/cert-manager-operator
Length of output: 120
🏁 Script executed:
# Examine certificates.go to see the function signatures and context usage
wc -l pkg/controller/trustmanager/certificates.goRepository: openshift/cert-manager-operator
Length of output: 122
🏁 Script executed:
# Read the createOrApplyIssuer function (lines 21-37 and nearby context)
sed -n '1,50p' pkg/controller/trustmanager/certificates.go | cat -nRepository: openshift/cert-manager-operator
Length of output: 2617
🏁 Script executed:
# Read the Reconcile function signature in controller.go
sed -n '80,150p' pkg/controller/trustmanager/controller.go | cat -nRepository: openshift/cert-manager-operator
Length of output: 4482
🏁 Script executed:
# Search for processReconcileRequest to see its signature
rg "processReconcileRequest|func.*Reconcile" pkg/controller/trustmanager/ -A 3Repository: openshift/cert-manager-operator
Length of output: 46091
🏁 Script executed:
# Get the full processReconcileRequest signature
sed -n '104,120p' pkg/controller/trustmanager/controller.go | cat -nRepository: openshift/cert-manager-operator
Length of output: 760
🏁 Script executed:
# Check all the createOrApply functions and their signatures in certificates.go
sed -n '20,80p' pkg/controller/trustmanager/certificates.go | cat -nRepository: openshift/cert-manager-operator
Length of output: 3740
🏁 Script executed:
# Check what context is available and used in other similar helper functions
rg "createOrApply|func \(r \*Reconciler\)" pkg/controller/trustmanager/ --no-heading -A 1 | head -40Repository: openshift/cert-manager-operator
Length of output: 3650
🏁 Script executed:
# Get the full processReconcileRequest function definition
sed -n '127,160p' pkg/controller/trustmanager/controller.go | cat -nRepository: openshift/cert-manager-operator
Length of output: 2509
🏁 Script executed:
# Check how createOrApplyIssuer and createOrApplyCertificate are called from processReconcileRequest
rg "createOrApplyIssuer|createOrApplyCertificate" pkg/controller/trustmanager/controller.go -B 2 -A 2Repository: openshift/cert-manager-operator
Length of output: 57
🏁 Script executed:
# Check utils.go to understand the pattern - how other methods accept ctx
sed -n '1,30p' pkg/controller/trustmanager/utils.go | cat -nRepository: openshift/cert-manager-operator
Length of output: 1077
🏁 Script executed:
# Search for where these functions are called in the codebase
rg "createOrApplyIssuer\(|createOrApplyCertificate\(" pkg/controller/trustmanager/ -B 3 -A 1Repository: openshift/cert-manager-operator
Length of output: 2891
🏁 Script executed:
# Get more context on processReconcileRequest - the full function
sed -n '127,180p' pkg/controller/trustmanager/controller.go | cat -nRepository: openshift/cert-manager-operator
Length of output: 3416
🏁 Script executed:
# Check the full processReconcileRequest function and see how it calls the install functions
sed -n '127,208p' pkg/controller/trustmanager/controller.go | cat -nRepository: openshift/cert-manager-operator
Length of output: 4937
🏁 Script executed:
# Check install_trustmanager.go to see where createOrApplyIssuer/Certificate are called
sed -n '1,50p' pkg/controller/trustmanager/install_trustmanager.go | cat -nRepository: openshift/cert-manager-operator
Length of output: 2296
🏁 Script executed:
# Get the reconcileTrustManagerDeployment function
rg "reconcileTrustManagerDeployment" pkg/controller/trustmanager/ -A 20 | head -40Repository: openshift/cert-manager-operator
Length of output: 3165
🏁 Script executed:
# Verify that ctx is NOT being passed through the call chain
rg "processReconcileRequest|reconcileTrustManagerDeployment" pkg/controller/trustmanager/controller.go -A 2 | head -20Repository: openshift/cert-manager-operator
Length of output: 690
🏁 Script executed:
# Check if there are other similar functions that DO pass ctx correctly for comparison
rg "createOrApplyService|createOrApplyValidatingWebhook" pkg/controller/trustmanager/ -A 8 | head -50Repository: openshift/cert-manager-operator
Length of output: 4699
🏁 Script executed:
# Double-check the line numbers for the createOrApplyCertificate function
sed -n '55,80p' pkg/controller/trustmanager/certificates.go | cat -nRepository: openshift/cert-manager-operator
Length of output: 1705
Pass the reconcile-scoped ctx through the helper functions instead of using r.ctx.
Lines 27, 36, 61, and 70 use r.ctx (initialized from context.Background() in controller.go) for client API calls. This detaches these requests from reconcile cancellation and allows work to continue running after the current reconcile is cancelled. Thread the ctx from the Reconcile method through processReconcileRequest, reconcileTrustManagerDeployment, and into createOrApplyIssuer, createOrApplyCertificate, and all other helper functions that perform client operations. This pattern should be applied consistently across all similar helper methods in the trustmanager controller.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/controller/trustmanager/certificates.go` around lines 21 - 37, The helper
functions are using r.ctx (a background context) for client calls which
decouples them from the reconcile cancellation; change the signatures of
processReconcileRequest, reconcileTrustManagerDeployment, createOrApplyIssuer,
createOrApplyCertificate and any other helper that issues client API calls to
accept a context parameter (ctx context.Context) and replace uses of r.ctx with
that ctx when calling r.Exists, r.Patch, r.Create, r.Update, etc.; then thread
the reconcile-scoped ctx passed into Reconcile through processReconcileRequest
-> reconcileTrustManagerDeployment ->
createOrApplyIssuer/createOrApplyCertificate so all client operations use the
reconcile context and obey cancellation/timeouts.
| if exists && !deploymentModified(desired, existing) { | ||
| r.log.V(4).Info("deployment already matches desired state, skipping apply", "name", deploymentName) | ||
| return nil | ||
| } | ||
|
|
||
| if err := r.Patch(r.ctx, desired, client.Apply, client.FieldOwner(fieldOwner), client.ForceOwnership); err != nil { | ||
| return common.FromClientError(err, "failed to apply deployment %q", deploymentName) | ||
| } | ||
|
|
||
| r.eventRecorder.Eventf(trustManager, corev1.EventTypeNormal, "Reconciled", "deployment resource %s applied", deploymentName) | ||
| r.log.V(2).Info("applied deployment", "name", deploymentName) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf '\n== SSA apply path ==\n'
rg -n -C2 'client\.Apply|ForceOwnership|deploymentModified\(' pkg/controller/trustmanager/deployments.go
printf '\n== exact equality checks inside deploymentModified ==\n'
rg -n -C2 'maps\.Equal\(desiredPodSpec\.NodeSelector|reflect\.DeepEqual\(desiredPodSpec\.Volumes|len\(desiredPodSpec\.Containers\)|containerModified\(&desiredPodSpec\.Containers' pkg/controller/trustmanager/deployments.goRepository: openshift/cert-manager-operator
Length of output: 1645
Skip logic requires exact equality but apply path uses Server-Side Apply, causing perpetual patching when other field managers are present.
The deploymentModified function at line 34 checks for exact equality across NodeSelector, Volumes, Affinity, Tolerations, and container count/order (lines 226, 227, 228, 229, 233). However, line 39 applies the Deployment using Server-Side Apply (client.Apply + ForceOwnership), which allows unmanaged fields from other field managers to coexist. When a sidecar injector, admission webhook, or another controller adds a container, volume, or node-selector key, the equality checks return true and trigger unnecessary patches and Reconciled events on every reconciliation loop.
Either narrow deploymentModified to match only SSA-managed fields, or clarify the intent if exact state is required.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/controller/trustmanager/deployments.go` around lines 34 - 44,
deploymentModified currently enforces exact equality on NodeSelector, Volumes,
Affinity, Tolerations and container order which conflicts with the Server-Side
Apply path (r.Patch(..., client.Apply, client.FieldOwner(...),
client.ForceOwnership)) and causes perpetual patches when other field managers
add fields; update deploymentModified (the function referenced in the if check)
to only compare fields that this controller manages (the SSA-managed subset) or
relax comparisons (e.g., ignore extra containers, unordered volumes, and keys
added by other managers) so the equality check reflects SSA semantics, or
alternatively change the apply logic to not use ForceOwnership if exact
ownership is intended; locate usages around deploymentModified, the r.Patch
call, and the desired/existing deployment variables to implement the narrower
comparison/ignoring logic.
689b09c to
2552e3a
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (3)
test/e2e/trustmanager_test.go (3)
813-816:⚠️ Potential issue | 🟠 MajorFail fast when
Degraded=Trueduring readiness polling.Line 815 returns
(false, nil)on degraded status, so the test can spin until timeout and lose the condition reason/message that explains the failure. Return an error includingdegradedCondition.ReasonanddegradedCondition.Messageinstead.As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/trustmanager_test.go` around lines 813 - 816, The readiness poll currently returns (false, nil) when a Degraded condition is found (via meta.FindStatusCondition on trustManagerStatus.Conditions for v1alpha1.Degraded), which hides failure details; replace that early return with an error that includes degradedCondition.Reason and degradedCondition.Message so the test fails fast and surfaces the condition reason/message instead of timing out.
69-84:⚠️ Potential issue | 🟠 MajorRestore patched Subscription env vars after this ordered suite.
Line 75 mutates a cluster-scoped Subscription, but there is no matching
AfterAllto restore previousUNSUPPORTED_ADDON_FEATURES/OPERATOR_LOG_LEVEL. This leaks state into later e2e specs and makes runs order-dependent.As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/trustmanager_test.go` around lines 69 - 84, BeforeAll mutates the cluster Subscription via patchSubscriptionWithEnvVars setting UNSUPPORTED_ADDON_FEATURES and OPERATOR_LOG_LEVEL but never restores them; capture the current env vars before calling patchSubscriptionWithEnvVars in BeforeAll (read existing UNSUPPORTED_ADDON_FEATURES and OPERATOR_LOG_LEVEL from the Subscription) and add a matching AfterAll that calls patchSubscriptionWithEnvVars to restore those saved values (or remove them if they were absent), using the same ctx/loader and handling errors with Expect(...).NotTo(HaveOccurred()); reference BeforeAll, AfterAll, patchSubscriptionWithEnvVars, waitForDeploymentEnvVarAndRollout, UNSUPPORTED_ADDON_FEATURES, and OPERATOR_LOG_LEVEL in your changes.
868-875:⚠️ Potential issue | 🟠 MajorRecreation helper still allows false positives due to async delete race.
This helper only checks for eventual
Getsuccess after delete. If the object was never observed as deleted, the test can pass without verifying controller-driven recreation.Proposed fix
func verifyTrustManagerResourceRecreation(deleteFunc func() error, getFunc func() error) { err := deleteFunc() Expect(err).ShouldNot(HaveOccurred()) + Eventually(func() bool { + return errors.IsNotFound(getFunc()) + }, lowTimeout, fastPollInterval).Should(BeTrue(), "resource was not observed as deleted") + Eventually(func() error { return getFunc() }, lowTimeout, fastPollInterval).Should(Succeed(), "resource was not recreated by controller") }As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/trustmanager_test.go` around lines 868 - 875, The helper verifyTrustManagerResourceRecreation currently only asserts that getFunc eventually succeeds after deleteFunc, which can false-positive if the delete was never observed; change verifyTrustManagerResourceRecreation to first assert the resource was actually deleted by polling getFunc until it returns a "not found" (or an error indicating absence) before asserting recreation: call deleteFunc(), then poll getFunc until it reports missing (using the existing polling timeouts/intervals), and only after that use Eventually to assert getFunc succeeds (recreated); reference verifyTrustManagerResourceRecreation, deleteFunc, and getFunc when making this change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@test/e2e/trustmanager_test.go`:
- Around line 813-816: The readiness poll currently returns (false, nil) when a
Degraded condition is found (via meta.FindStatusCondition on
trustManagerStatus.Conditions for v1alpha1.Degraded), which hides failure
details; replace that early return with an error that includes
degradedCondition.Reason and degradedCondition.Message so the test fails fast
and surfaces the condition reason/message instead of timing out.
- Around line 69-84: BeforeAll mutates the cluster Subscription via
patchSubscriptionWithEnvVars setting UNSUPPORTED_ADDON_FEATURES and
OPERATOR_LOG_LEVEL but never restores them; capture the current env vars before
calling patchSubscriptionWithEnvVars in BeforeAll (read existing
UNSUPPORTED_ADDON_FEATURES and OPERATOR_LOG_LEVEL from the Subscription) and add
a matching AfterAll that calls patchSubscriptionWithEnvVars to restore those
saved values (or remove them if they were absent), using the same ctx/loader and
handling errors with Expect(...).NotTo(HaveOccurred()); reference BeforeAll,
AfterAll, patchSubscriptionWithEnvVars, waitForDeploymentEnvVarAndRollout,
UNSUPPORTED_ADDON_FEATURES, and OPERATOR_LOG_LEVEL in your changes.
- Around line 868-875: The helper verifyTrustManagerResourceRecreation currently
only asserts that getFunc eventually succeeds after deleteFunc, which can
false-positive if the delete was never observed; change
verifyTrustManagerResourceRecreation to first assert the resource was actually
deleted by polling getFunc until it returns a "not found" (or an error
indicating absence) before asserting recreation: call deleteFunc(), then poll
getFunc until it reports missing (using the existing polling
timeouts/intervals), and only after that use Eventually to assert getFunc
succeeds (recreated); reference verifyTrustManagerResourceRecreation,
deleteFunc, and getFunc when making this change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6bdc6f8f-8a5a-45ef-9292-74e885f47b9e
📒 Files selected for processing (2)
test/e2e/trustmanager_test.gotest/e2e/utils_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- test/e2e/utils_test.go
|
/jira refresh |
|
@chiragkyal: This pull request references CM-867 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target either version "4.22." or "openshift-4.22.", but it targets "cert-manager-1.19" instead. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
| certificate.Spec.SecretName = trustManagerTLSSecretName | ||
| certificate.Spec.IssuerRef = certmanagermetav1.ObjectReference{ | ||
| Name: trustManagerIssuerName, | ||
| Kind: "Issuer", |
There was a problem hiding this comment.
We will need a tracker in GA to make this Issuer configurable.
There was a problem hiding this comment.
Created https://redhat.atlassian.net/browse/CM-961 tracker for this.
| // managedAnnotationsModified checks whether all annotations present in desired | ||
| // exist with matching values in existing. Extra annotations on existing are | ||
| // allowed and do not count as modified. | ||
| func managedAnnotationsModified(desired, existing client.Object) bool { | ||
| existingAnnotations := existing.GetAnnotations() | ||
| for k, v := range desired.GetAnnotations() { | ||
| if existingAnnotations[k] != v { | ||
| return true | ||
| } | ||
| } | ||
| return false | ||
| } | ||
|
|
||
| // managedMetadataModified returns true if any managed label or annotation has drifted. | ||
| func managedMetadataModified(desired, existing client.Object) bool { | ||
| return managedLabelsModified(desired, existing) || managedAnnotationsModified(desired, existing) |
There was a problem hiding this comment.
This is known and Mytreya had bought this up in a PR review(unable to recall which PR). We will need to handle this and as well restrict which labels and annotations a user can configure for GA.
Signed-off-by: chiragkyal <ckyal@redhat.com>
Signed-off-by: chiragkyal <ckyal@redhat.com>
520e13b to
78b0343
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (6)
test/e2e/trustmanager_test.go (2)
868-875:⚠️ Potential issue | 🟠 MajorRequire an observed deletion before asserting recreation.
The helper can pass while the original resource is still present, so it may never prove controller-driven recreation.
Proposed fix
func verifyTrustManagerResourceRecreation(deleteFunc func() error, getFunc func() error) { err := deleteFunc() Expect(err).ShouldNot(HaveOccurred()) + Eventually(func() bool { + return errors.IsNotFound(getFunc()) + }, lowTimeout, fastPollInterval).Should(BeTrue(), "resource was not observed as deleted") + Eventually(func() error { return getFunc() }, lowTimeout, fastPollInterval).Should(Succeed(), "resource was not recreated by controller") }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/trustmanager_test.go` around lines 868 - 875, verifyTrustManagerResourceRecreation currently can succeed without first observing the resource as deleted; update it to ensure a NotFound is observed before asserting recreation. After calling deleteFunc() (in verifyTrustManagerResourceRecreation), poll getFunc() until it returns a NotFound error (use Eventually with an assertion that the returned error is a NotFound), only then start the Existing Eventually check that waits for getFunc() to succeed (resource recreated). Reference deleteFunc, getFunc and verifyTrustManagerResourceRecreation when making the change.
69-84:⚠️ Potential issue | 🟠 MajorRestore subscription env state after this Ordered suite.
BeforeAllmutates a cluster-scoped Subscription, but there is no matchingAfterAllrollback. This leaks state into other e2e specs and makes outcomes suite-order dependent.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/trustmanager_test.go` around lines 69 - 84, BeforeAll mutates cluster-scoped Subscription via patchSubscriptionWithEnvVars but doesn't restore it; capture the original env/state before patching (e.g., read the Subscription's env map into a package-scoped variable using the same ctx/loader) and add a matching AfterAll that calls patchSubscriptionWithEnvVars to restore the original map (or remove the injected keys "UNSUPPORTED_ADDON_FEATURES"/"OPERATOR_LOG_LEVEL") and then waitForDeploymentEnvVarAndRollout to ensure operatorDeploymentName in operatorNamespace returns to the previous state; use the same helper functions (patchSubscriptionWithEnvVars, waitForDeploymentEnvVarAndRollout) and variables (ctx, loader, operatorNamespace, operatorDeploymentName) to implement the rollback.pkg/controller/trustmanager/certificates.go (1)
21-38:⚠️ Potential issue | 🟠 MajorUse reconcile-scoped
ctxinstead ofr.ctxfor API calls.Lines 27/37/61/70 use
r.ctx(context.Background()), so these operations are not tied to request cancellation/deadlines. ThreadctxfromReconcilethroughprocessReconcileRequestand reconciliation helpers, then use thatctxforExists/Patch/status updates.Also applies to: 55-71
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/trustmanager/certificates.go` around lines 21 - 38, The createOrApplyIssuer function (and other reconciliation helpers called from processReconcileRequest) is using r.ctx (a background context) for API calls (Exists, Patch and status updates) which prevents request-scoped cancellation; change the function signature(s) to accept a ctx context.Context threaded from Reconcile/processReconcileRequest and replace uses of r.ctx with that ctx when calling r.Exists, r.Patch and any status update methods (and update callers accordingly so Reconcile passes its ctx down), ensuring all reconciliation helpers (e.g., createOrApplyIssuer) and invocations use the reconcile-scoped ctx rather than r.ctx.pkg/controller/trustmanager/controller.go (2)
108-117:⚠️ Potential issue | 🟠 MajorManaged-label removal events can still be filtered out before reconcile.
Lines 110-117 compose predicates with
controllerManagedResources, which checks only the current object labels. Whencommon.ManagedResourceLabelKeyis removed, the update can be dropped beforemapFuncenqueues, so label self-healing may not run for watched children.Also applies to: 123-131
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/trustmanager/controller.go` around lines 108 - 117, The current predicate composition uses controllerManagedResources (which only inspects the current object labels) inside withIgnoreStatusUpdatePredicates and the similar block at lines 123-131, so label-removal updates (e.g., removal of common.ManagedResourceLabelKey) can be filtered out before mapFunc runs; change the predicate logic to inspect the UpdateEvent's OldObject as well or use a custom predicate that returns true when the managed-label was present on the OldObject (or changed between OldObject and Object), ensuring updates where the managed-label was removed still pass the predicate and allow the mapFunc and reconciliation to run; update references in the code where withIgnoreStatusUpdatePredicates and controllerManagedResources are built to use that adjusted predicate or a new containsManagedLabelChange predicate that checks OldObject labels.
119-121:⚠️ Potential issue | 🔴 CriticalPrimary
TrustManagerwatch should also admit deletion-start updates.Line 120 uses only
predicate.GenerationChangedPredicate{}. Deletion with finalizers starts as a metadata update (deletionTimestamp) that may not bump generation, so reconcile can miss finalizer cleanup and leave the CR terminating.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/trustmanager/controller.go` around lines 119 - 121, The controller currently only uses predicate.GenerationChangedPredicate in the For(...) predicate which ignores metadata-only updates like deletionTimestamp (so finalizer-start deletion can be missed); update the predicate to also admit metadata/deletion updates—for example combine predicate.GenerationChangedPredicate with predicate.ResourceVersionChangedPredicate (or add a custom predicate that returns true when old.ObjectMeta.DeletionTimestamp != new.ObjectMeta.DeletionTimestamp) in the ctrl.NewControllerManagedBy(...).For(...) builder so TrustManager reconciles when deletion starts and finalizers can be cleaned up.pkg/controller/trustmanager/deployments.go (1)
232-265:⚠️ Potential issue | 🟠 Major
deploymentModifiedis stricter than SSA semantics and may cause repeated apply loops.With SSA at Line 39, unmanaged fields from other managers can legitimately coexist. The exact checks in Lines 242-265 (including full map/list equality and container count/order) can keep reporting drift and trigger unnecessary patches/events every reconcile.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/controller/trustmanager/utils.go`:
- Around line 171-182: updateResourceAnnotations currently copies user
annotations over existing ones (maps.Copy(existing, annotations)), allowing
user-provided keys to overwrite controller-required annotations; change the
merging logic in updateResourceAnnotations so existing annotations (from
obj.GetAnnotations()) take precedence: create or use the existing map and for
each key in the incoming annotations map only add the key if it does not already
exist (i.e., skip or preserve when existing[key] != ""), then call
obj.SetAnnotations(existing); reference updateResourceAnnotations, existing,
annotations, obj.GetAnnotations, and obj.SetAnnotations to locate the change.
---
Duplicate comments:
In `@pkg/controller/trustmanager/certificates.go`:
- Around line 21-38: The createOrApplyIssuer function (and other reconciliation
helpers called from processReconcileRequest) is using r.ctx (a background
context) for API calls (Exists, Patch and status updates) which prevents
request-scoped cancellation; change the function signature(s) to accept a ctx
context.Context threaded from Reconcile/processReconcileRequest and replace uses
of r.ctx with that ctx when calling r.Exists, r.Patch and any status update
methods (and update callers accordingly so Reconcile passes its ctx down),
ensuring all reconciliation helpers (e.g., createOrApplyIssuer) and invocations
use the reconcile-scoped ctx rather than r.ctx.
In `@pkg/controller/trustmanager/controller.go`:
- Around line 108-117: The current predicate composition uses
controllerManagedResources (which only inspects the current object labels)
inside withIgnoreStatusUpdatePredicates and the similar block at lines 123-131,
so label-removal updates (e.g., removal of common.ManagedResourceLabelKey) can
be filtered out before mapFunc runs; change the predicate logic to inspect the
UpdateEvent's OldObject as well or use a custom predicate that returns true when
the managed-label was present on the OldObject (or changed between OldObject and
Object), ensuring updates where the managed-label was removed still pass the
predicate and allow the mapFunc and reconciliation to run; update references in
the code where withIgnoreStatusUpdatePredicates and controllerManagedResources
are built to use that adjusted predicate or a new containsManagedLabelChange
predicate that checks OldObject labels.
- Around line 119-121: The controller currently only uses
predicate.GenerationChangedPredicate in the For(...) predicate which ignores
metadata-only updates like deletionTimestamp (so finalizer-start deletion can be
missed); update the predicate to also admit metadata/deletion updates—for
example combine predicate.GenerationChangedPredicate with
predicate.ResourceVersionChangedPredicate (or add a custom predicate that
returns true when old.ObjectMeta.DeletionTimestamp !=
new.ObjectMeta.DeletionTimestamp) in the
ctrl.NewControllerManagedBy(...).For(...) builder so TrustManager reconciles
when deletion starts and finalizers can be cleaned up.
In `@test/e2e/trustmanager_test.go`:
- Around line 868-875: verifyTrustManagerResourceRecreation currently can
succeed without first observing the resource as deleted; update it to ensure a
NotFound is observed before asserting recreation. After calling deleteFunc() (in
verifyTrustManagerResourceRecreation), poll getFunc() until it returns a
NotFound error (use Eventually with an assertion that the returned error is a
NotFound), only then start the Existing Eventually check that waits for
getFunc() to succeed (resource recreated). Reference deleteFunc, getFunc and
verifyTrustManagerResourceRecreation when making the change.
- Around line 69-84: BeforeAll mutates cluster-scoped Subscription via
patchSubscriptionWithEnvVars but doesn't restore it; capture the original
env/state before patching (e.g., read the Subscription's env map into a
package-scoped variable using the same ctx/loader) and add a matching AfterAll
that calls patchSubscriptionWithEnvVars to restore the original map (or remove
the injected keys "UNSUPPORTED_ADDON_FEATURES"/"OPERATOR_LOG_LEVEL") and then
waitForDeploymentEnvVarAndRollout to ensure operatorDeploymentName in
operatorNamespace returns to the previous state; use the same helper functions
(patchSubscriptionWithEnvVars, waitForDeploymentEnvVarAndRollout) and variables
(ctx, loader, operatorNamespace, operatorDeploymentName) to implement the
rollback.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b94637a6-7675-439c-b5da-f8ea3296da3e
📒 Files selected for processing (22)
pkg/controller/common/core_validation_helpers.gopkg/controller/common/validation.gopkg/controller/istiocsr/deployments.gopkg/controller/trustmanager/certificates.gopkg/controller/trustmanager/certificates_test.gopkg/controller/trustmanager/constants.gopkg/controller/trustmanager/controller.gopkg/controller/trustmanager/controller_test.gopkg/controller/trustmanager/deployments.gopkg/controller/trustmanager/deployments_test.gopkg/controller/trustmanager/install_trustmanager.gopkg/controller/trustmanager/rbacs.gopkg/controller/trustmanager/rbacs_test.gopkg/controller/trustmanager/serviceaccounts_test.gopkg/controller/trustmanager/services_test.gopkg/controller/trustmanager/test_utils.gopkg/controller/trustmanager/utils.gopkg/controller/trustmanager/utils_test.gopkg/controller/trustmanager/webhooks_test.gopkg/operator/setup_manager.gotest/e2e/trustmanager_test.gotest/e2e/utils_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
- pkg/controller/trustmanager/rbacs_test.go
- pkg/controller/trustmanager/utils_test.go
- pkg/operator/setup_manager.go
- test/e2e/utils_test.go
| // updateResourceAnnotations merges user-provided annotations into the object's existing annotations. | ||
| // User-provided annotations take precedence over existing ones on key conflicts. | ||
| func updateResourceAnnotations(obj client.Object, annotations map[string]string) { | ||
| if len(annotations) == 0 { | ||
| return | ||
| } | ||
| existing := obj.GetAnnotations() | ||
| if existing == nil { | ||
| existing = make(map[string]string) | ||
| } | ||
| maps.Copy(existing, annotations) | ||
| obj.SetAnnotations(existing) |
There was a problem hiding this comment.
Prevent user annotations from overriding controller-required annotations.
updateResourceAnnotations currently lets user keys overwrite existing keys. On resources that carry required annotations from assets (for example webhook CA injection), a conflicting user annotation can break reconciliation/runtime behavior.
Proposed fix
func updateResourceAnnotations(obj client.Object, annotations map[string]string) {
if len(annotations) == 0 {
return
}
existing := obj.GetAnnotations()
if existing == nil {
existing = make(map[string]string)
}
- maps.Copy(existing, annotations)
+ for k, v := range annotations {
+ // Keep controller/asset-provided annotations authoritative on conflicts.
+ if _, alreadySet := existing[k]; alreadySet {
+ continue
+ }
+ existing[k] = v
+ }
obj.SetAnnotations(existing)
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/controller/trustmanager/utils.go` around lines 171 - 182,
updateResourceAnnotations currently copies user annotations over existing ones
(maps.Copy(existing, annotations)), allowing user-provided keys to overwrite
controller-required annotations; change the merging logic in
updateResourceAnnotations so existing annotations (from obj.GetAnnotations())
take precedence: create or use the existing map and for each key in the incoming
annotations map only add the key if it does not already exist (i.e., skip or
preserve when existing[key] != ""), then call obj.SetAnnotations(existing);
reference updateResourceAnnotations, existing, annotations, obj.GetAnnotations,
and obj.SetAnnotations to locate the change.
b7b8f43 to
f6a97f2
Compare
|
/retest |
| // certificateModified compares only the fields we manage via SSA. | ||
| // We compare individual spec fields rather than the full Spec because | ||
| // cert-manager's webhook may default fields we don't set (e.g. Duration). | ||
| func certificateModified(desired, existing *certmanagerv1.Certificate) bool { |
There was a problem hiding this comment.
We should have had these *Modified() helpers in a common package and used for both istio-csr and trust-manager. If not possible now, we can pick it up for GA.
There was a problem hiding this comment.
I think we can do that during GA. I can create a tracker for that.
There was a problem hiding this comment.
Yeah, for this and for configuring the issuer both, thank you!
There was a problem hiding this comment.
Signed-off-by: chiragkyal <ckyal@redhat.com>
Signed-off-by: chiragkyal <ckyal@redhat.com>
f6a97f2 to
de5b043
Compare
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: bharath-b-rh, chiragkyal The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/retest |
|
/retest |
2 similar comments
|
/retest |
|
/retest |
|
@chiragkyal: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary
reconcilers that create and maintain the lifecycle to run the trust-manager operand.
Changes
RELATED_IMAGE_CERT_MANAGER_TRUST_MANAGERandTRUSTMANAGER_OPERAND_IMAGE_VERSIONenvironment variables to the CSV and manager deploymentmanifest.
--unsupported-addon-features="TrustManager=true"flag tolocal-runfor easy local testing.