Skip to content

James colo sync#103

Open
JPadovano1483 wants to merge 4 commits into
skupperproject:mainfrom
JPadovano1483:james-colo-sync
Open

James colo sync#103
JPadovano1483 wants to merge 4 commits into
skupperproject:mainfrom
JPadovano1483:james-colo-sync

Conversation

@JPadovano1483
Copy link
Copy Markdown
Contributor

@JPadovano1483 JPadovano1483 commented May 11, 2026

This module is responsible for managing the deployment and deletion of colocated namespaces/sites, tying their lifecycle to the lifecycle of the parent backbone network. It currently runs on MC startup, during backbone network creation and deletion, and is set to run every minute after startup to ensure up to date state reconciliation. Added a column on backboneaccesspoint to store routeraccess accesstype ('local' for colocated site's 'manage' accesspoint).

Fixes #91

Copy link
Copy Markdown
Member

@ted-ross ted-ross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many of my comments don't need to be addressed to merge this PR. Some of them could be deferred to later issues.

- skupper.io
resources:
- sites
- links
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't need access to Links, I think.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the mc creating networkaccesses for colocated sites, or this will be done through the sync process?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is done through the sync process. However, in order to give the site controller the permissions it needs to do that, the management controller itself needs those permissions. Though if there is a better way of handling that issue, please let me know.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. Since it is creating the deployment for the system-controller, it must have the same permissions granted to it! Thank you!

name: {{ include "management-server.serviceAccountName" . }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will want to squeeze the permissions in this ClusterRole down to the minimum possible set.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... not a blocker for merging, but something we will need to eventually do.

import * as sync from "./sync-management.js"
import * as common from "@skupperx/modules/common"

let client
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a module-global variable for client is different from what we do elsewhere. Is this significantly better than allocating and freeing the client on an as-needed basis?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think my original thought was that since the main function polls database regularly that using the same connection pool made sense, but I will switch to the way we do it elsewhere.

*/
export async function processColoBackbones() {
// get all backbones with colo namespaces
const coloBackbones = await client.query(`SELECT Id, CoLocatedNamespace FROM Backbones WHERE NULLIF(CoLocatedNamespace, '') IS NOT NULL`).then(res => res.rows)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is my understanding that we are not permitting CoLocatedNamespace to be the empty string.

*/
async function deploySite(backboneId, ns) {
const siteId = await client.query(`SELECT Id FROM InteriorSites WHERE Backbone = $1 AND CoLocated = true`, [backboneId]).then(res => res.rows[0]?.id)
// Poll the db against the InteriorSites table for this siteId, waiting for lifecycle='ready'
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than polling here and hanging up the whole process, wouldn't it be better to only sync colo sites that are in the Ready state? i.e. a colo site that is not Ready is treated as though it doesn't yet exist.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that makes sense. Originally I wasn't running the main process backbones function with the settimeout, so I was concerned the deployment wouldn't get reconciled until the MC restarted or another backbone was created. This should now just get picked up the next time the function is set to run.

}

const site = result.rows[0];
if (site.deploymentstate == 'deployed') {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure these checks are appropriate in this case. The "already deployed" error might get thrown if we are re-establishing an already established colo site.

resourceTemplates.BackboneSite(site.name, siteId),
resourceTemplates.NetworkCR('mbone'),
];
const links = await sync.GetBackboneLinks_TX(client, siteId);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't apply the links here. That will be handled later by the sync protocol.

for (const [linkId, linkData] of Object.entries(links)) {
output.push(resourceTemplates.LinkCR(linkId, linkData, `skx-site-${siteId}`));
}
const accessPoints = await sync.GetBackboneAccessPoints_TX(client, siteId, true);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only apply the manage access point here. Any others will be added by the sync protocol

@JPadovano1483
Copy link
Copy Markdown
Contributor Author

Changes:

  • Removed a couple of permissions on the namespace resource. Most of the permissions in the ClusterRole are currently needed in order to deploy the site with the permissions it needs.
  • moved client from module scope to function scope
  • moved from polling for site readiness to only pulling sites that are in the ready state
  • moved the creation of the colocated namespace to the same block as the creation of the site to avoid situations where the first pass creates the namespace before the site is ready and the second pass skips the creation of the site due to the namespace already existing

Owner UUID REFERENCES Users,
OwnerGroup text
OwnerGroup text,
AccessType text
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if AccessType column is actually needed here?
Considering that a CoLocated Backbone Site can only have a single manage endpoint,
if the referenced InteriorSite record has Colocated field set as true, and the Kind field here is set to manage, then the access type must be local.
Unless we are anticipating future needs or scenarios.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would leave it there in case we need to influence the accessType attribute of the resulting CR. However, it should be considered optional (or have a default value) that is the equivalent to omitting it from the CR.

- skupper.io
resources:
- sites
- links
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the mc creating networkaccesses for colocated sites, or this will be done through the sync process?

if (coloBackbones.length > 0) {
await reconcileNamespaces(coloBackbones)
}
setTimeout(processColoBackbones, 60000)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor suggestion, but setTimeout could be moved to the finally block only.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NetworkAccesses will be created only via the sync protocol.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Co-located sites: Reconcile co-located deployments to the configured backbones

3 participants