Problem
When a SolrCloud CR is deleted and its providedConfigMap has already been removed from the cluster (e.g., by Helm during an upgrade or uninstall), the reconciler enters an infinite requeue loop and the SolrCloud CR can never be deleted.
The root cause is in controllers/solrcloud_controller.go:
if instance.Spec.CustomSolrKubeOptions.ConfigMapOptions != nil && instance.Spec.CustomSolrKubeOptions.ConfigMapOptions.ProvidedConfigMap != "" {
    providedConfigMapName := instance.Spec.CustomSolrKubeOptions.ConfigMapOptions.ProvidedConfigMap
    foundConfigMap := &corev1.ConfigMap{}
    nn := types.NamespacedName{Name: providedConfigMapName, Namespace: instance.Namespace}
    err = r.Get(ctx, nn, foundConfigMap)
    if err != nil {
        return requeueOrNot, err // if they passed a providedConfigMap name, then it must exist
    }
The comment says "it must exist", but this assumption doesn't hold during deletion. The Reconcile() function has no early exit when DeletionTimestamp is set - it runs through ZooKeeper reconciliation, Service creation, node services, headless service, and then hits the ConfigMap lookup before it can ever reach the storage finalizer logic at line ~448. If the ConfigMap is gone, the error causes an immediate return and requeue, so the finalizer is never removed and the CR is stuck.
How to reproduce
- Create a SolrCloud CR with spec.customSolrKubeOptions.configMapOptions.providedConfigMap pointing to a ConfigMap
- Delete the ConfigMap
- Delete the SolrCloud CR
- Observe: the CR gets a deletionTimestamp but is never finalized; the operator logs show a recurring NotFound error for the ConfigMap on every reconciliation cycle
This also occurs in Helm-managed deployments: when a Helm upgrade removes SolrCloud workloads from values, Helm deletes the ConfigMap (no longer rendered) while the SolrCloud CR still exists, and the operator cannot complete the CR's deletion.
Affected versions
Confirmed in the controller code as of v0.9.1 (latest stable). The code path is unchanged from v0.7.0 through v0.9.1.