reconciliation logic for connection poolers #1713

Draft

limak9182 wants to merge 5 commits into feature/database-controllers from db-poolers-reconciliation-logic

Conversation

@limak9182

Description

What does this PR include?

Key Changes

Highlight the updates in specific files

Testing and Verification

How did you test these changes? What automated tests are added?

Related Issues

Jira tickets, GitHub issues, Support tickets...

PR Checklist

  • Code changes adhere to the project's coding standards.
  • Relevant unit and integration tests are included.
  • Documentation has been updated accordingly.
  • All tests pass locally.
  • The PR description follows the project's guidelines.

@limak9182 limak9182 marked this pull request as draft February 20, 2026 09:39
@github-actions
Contributor

github-actions bot commented Feb 20, 2026

CLA Assistant Lite bot:
Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by posting a pull request comment in the format below.


I have read the CLA Document and I hereby sign the CLA


1 out of 2 committers have signed the CLA.
@DmytroPI-dev
@limak9182
You can retrigger this bot by commenting recheck in this Pull Request

@github-actions
Contributor

CLA Assistant Lite bot: All contributors have NOT signed the COC Document


I have read the Code of Conduct and I hereby accept the Terms


You can retrigger this bot by commenting recheck in this Pull Request

```diff
 if getClusterClassErr := r.Get(ctx, client.ObjectKey{Name: postgresCluster.Spec.Class}, postgresClusterClass); getClusterClassErr != nil {
 	logger.Error(getClusterClassErr, "Unable to fetch referenced PostgresClusterClass", "className", postgresCluster.Spec.Class)
-	r.setCondition(postgresCluster, metav1.ConditionFalse, "ClusterClassNotFound", getClusterClassErr.Error())
+	r.setCondition(postgresCluster, "Ready", metav1.ConditionFalse, "ClusterClassNotFound", getClusterClassErr.Error())
```
Collaborator:

we should probably use r.syncStatus everywhere

```diff
 // It ensures that we always attempt to sync the status of the PostgresCluster based on the final state of the CNPG Cluster and any errors that may have occurred.
 defer func() {
-	if syncErr := r.syncStatus(ctx, postgresCluster, cnpgCluster, err); syncErr != nil {
+	if syncErr := r.syncStatus(ctx, postgresCluster, cnpgCluster, poolerEnabled, err); syncErr != nil {
```
Collaborator:

this defer function is probably not needed if we map our state properly

```diff
 if buildCNPGClusterErr := r.Create(ctx, cnpgCluster); buildCNPGClusterErr != nil {
 	logger.Error(buildCNPGClusterErr, "Failed to create CNPG Cluster")
-	r.setCondition(postgresCluster, metav1.ConditionFalse, "ClusterBuildFailed", buildCNPGClusterErr.Error())
+	r.setCondition(postgresCluster, "Ready", metav1.ConditionFalse, "ClusterBuildFailed", buildCNPGClusterErr.Error())
```
Collaborator:

we are setting conditions everywhere but we are not syncing them via an update

```go
requeuePooler, poolerErr := r.reconcileConnectionPooler(ctx, postgresCluster, postgresClusterClass, cnpgCluster)
if poolerErr != nil {
	logger.Error(poolerErr, "Failed to reconcile connection pooler")
	r.setCondition(postgresCluster, "Ready", metav1.ConditionFalse, "PoolerReconciliationFailed", poolerErr.Error())
```
Collaborator:

we should update the state after we set the condition


```go
// isConnectionPoolerEnabled determines if connection pooler should be active.
func (r *PostgresClusterReconciler) isConnectionPoolerEnabled(class *enterprisev4.PostgresClusterClass, cluster *enterprisev4.PostgresCluster) bool {
	if cluster.Spec.ConnectionPoolerEnabled != nil {
```

Collaborator:

do we really need this? we have a function that merges the class and cluster specs; we can rely on it to get the effective pooler value as well
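As a reference point for that merge-function suggestion, here is a minimal, self-contained sketch of the override-vs-default resolution. The `ClassSpec`/`ClusterSpec` types and `effectivePoolerEnabled` name are hypothetical stand-ins for the real `enterprisev4` types, not the project's actual API:

```go
package main

import "fmt"

// Hypothetical, stripped-down stand-ins for the enterprisev4 spec types.
type ClassSpec struct {
	ConnectionPoolerEnabled bool // class-level default
}

type ClusterSpec struct {
	ConnectionPoolerEnabled *bool // nil means "inherit from the class"
}

// effectivePoolerEnabled resolves the flag the way a generic class/cluster
// spec merge would: a cluster-level override wins when set, otherwise the
// class default applies.
func effectivePoolerEnabled(class ClassSpec, cluster ClusterSpec) bool {
	if cluster.ConnectionPoolerEnabled != nil {
		return *cluster.ConnectionPoolerEnabled
	}
	return class.ConnectionPoolerEnabled
}

func main() {
	off := false
	fmt.Println(effectivePoolerEnabled(ClassSpec{ConnectionPoolerEnabled: true}, ClusterSpec{}))                              // true (class default)
	fmt.Println(effectivePoolerEnabled(ClassSpec{ConnectionPoolerEnabled: true}, ClusterSpec{ConnectionPoolerEnabled: &off})) // false (cluster override)
}
```

If the merge function already produces a spec in this shape, a dedicated `isConnectionPoolerEnabled` helper indeed becomes redundant.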

```go
// reconcileConnectionPooler creates or deletes CNPG Pooler resources based on the effective enabled state.
// Returns (requeue, error) — requeue is true when poolers were just created and may not be ready yet.
```
Collaborator:

this should be the reconciler's decision: requeue if a new connection pooler was created. Before running reconcileConnectionPooler we should also check the diff and the status; if the status says it is still reconciling, we should requeue one more time

```go
return false, nil
}

if cnpgCluster.Status.Phase != cnpgv1.PhaseHealthy {
```
@mploski (Collaborator) commented Feb 23, 2026:

I think checking the pooler status should be a separate function from creation, if only from a testing perspective
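A minimal sketch of that separation, using a hypothetical simplified status type (the real cnpgv1 Pooler status carries more fields than this):

```go
package main

import "fmt"

// PoolerStatus is a hypothetical, simplified mirror of the fields a CNPG
// Pooler reports; only what the readiness check needs.
type PoolerStatus struct {
	Instances      int32
	ReadyInstances int32
}

// poolerReady is the readiness check factored out of the create path, so it
// can be unit-tested and called by the reconciler independently of creation.
func poolerReady(desired int32, status PoolerStatus) bool {
	return status.Instances == desired && status.ReadyInstances == desired
}

func main() {
	fmt.Println(poolerReady(2, PoolerStatus{Instances: 2, ReadyInstances: 1})) // false: still provisioning
	fmt.Println(poolerReady(2, PoolerStatus{Instances: 2, ReadyInstances: 2})) // true: fully ready
}
```

With this split, creation tests and readiness tests no longer need each other's fixtures.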

```go
return false, fmt.Errorf("failed to reconcile RO pooler: %w", err)
}

// Check if poolers are ready — requeue if they're still provisioning.
```
@mploski (Collaborator) commented Feb 23, 2026:

I'm not sure this works: k8s caches resources, so if you call the get operation immediately you will get cached information without the updated status, and it takes more than a millisecond to provision poolers. We should have a workflow similar to this:

Reconciliation loop iteration 1:
├── Check if Pooler exists func → No
├── Create Pooler func
└── return RequeueAfter: 15s in reconcillation loop

Reconciliation loop iteration 2 (15s later):
├── Check if Pooler exists func → Yes
├── r.Get(Pooler) → read status from cache (now populated by CNPG)
├── Pooler ready? → No → return RequeueAfter: 15s
└── Pooler ready? → Yes → continue
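The two-iteration flow above boils down to a small requeue decision. A sketch, where `Result` and `decideRequeue` are hypothetical stand-ins (controller-runtime's real `reconcile.Result` has more fields):

```go
package main

import (
	"fmt"
	"time"
)

// Result is a stripped-down stand-in for controller-runtime's reconcile.Result.
type Result struct {
	RequeueAfter time.Duration
}

// decideRequeue encodes the workflow: if the pooler was just created, or
// exists but is not yet ready, back off and requeue rather than re-reading
// the (possibly stale) cache within the same iteration.
func decideRequeue(exists, ready bool) (Result, bool) {
	if !exists || !ready {
		return Result{RequeueAfter: 15 * time.Second}, true
	}
	return Result{}, false // pooler ready: continue reconciling
}

func main() {
	_, requeue := decideRequeue(false, false) // iteration 1: pooler just created
	fmt.Println(requeue)                      // true
	_, requeue = decideRequeue(true, true)    // later iteration: pooler ready
	fmt.Println(requeue)                      // false
}
```

Returning RequeueAfter instead of polling in-loop also lets the informer cache catch up with the status CNPG writes.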

```go
logger := logs.FromContext(ctx)

if !r.isConnectionPoolerEnabled(class, postgresCluster) {
	// Skip deletion if the cluster is not healthy — owner references handle cleanup via GC.
```
@mploski (Collaborator) commented Feb 23, 2026:

the pooler is separate from the cluster, correct? So why do we link removal to the cluster state? Shouldn't we instead check first whether the pooler exists, and return quickly if it doesn't?

Reply:

But do we need to let the pooler stay after the cluster is removed? We may get orphaned resources in k8s, no?

```go
	UID: cnpgCluster.UID,
}

if poolerEnabled {
```
Collaborator:

the status sync function shouldn't have domain-specific switches to do smarter logic. Maybe the pooler here can simply be another set of phases and conditions that the controller needs to trigger?

Reply:

We can add it to our Confluence PostgresCluster Controller Design doc, I assume?
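One way to realize the suggestion above: each feature (cluster, pooler, ...) contributes conditions, and the status sync derives the phase generically with no pooler-specific branch. A sketch with a hypothetical `Condition` type modeled loosely on `metav1.Condition`:

```go
package main

import "fmt"

// Condition is a hypothetical record modeled loosely on metav1.Condition.
type Condition struct {
	Type   string
	Status bool
}

// derivePhase keeps the status sync free of domain-specific switches:
// any False condition, whichever feature set it, means the resource is
// still reconciling; all True means Ready.
func derivePhase(conds []Condition) string {
	for _, c := range conds {
		if !c.Status {
			return "Reconciling"
		}
	}
	return "Ready"
}

func main() {
	fmt.Println(derivePhase([]Condition{
		{Type: "ClusterHealthy", Status: true},
		{Type: "PoolerReady", Status: false}, // pooler is just another condition
	})) // Reconciling
	fmt.Println(derivePhase([]Condition{
		{Type: "ClusterHealthy", Status: true},
		{Type: "PoolerReady", Status: true},
	})) // Ready
}
```

Under this shape, syncStatus never needs a poolerEnabled parameter: disabling the pooler just means its condition is absent.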

@DmytroPI-dev force-pushed the db-poolers-reconciliation-logic branch from 72cfc39 to d661e84 on February 26, 2026 12:14

3 participants