James colo sync #103
Conversation
ted-ross left a comment:
Many of my comments don't need to be addressed to merge this PR. Some of them could be deferred to later issues.
```yaml
- skupper.io
resources:
- sites
- links
```
Shouldn't need access to Links, I think.
Is the MC creating NetworkAccesses for colocated sites, or will this be done through the sync process?
It is done through the sync process. However, in order to give the site controller the permissions it needs to do that, the management controller itself needs those permissions. Though if there is a better way of handling that issue, please let me know.
You're right. Since it is creating the deployment for the system-controller, it must have the same permissions granted to it! Thank you!
```yaml
name: {{ include "management-server.serviceAccountName" . }}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
```
We will want to squeeze the permissions in this ClusterRole down to the minimum possible set.
... not a blocker for merging, but something we will need to eventually do.
```js
import * as sync from "./sync-management.js"
import * as common from "@skupperx/modules/common"

let client
```
Having a module-global variable for client is different from what we do elsewhere. Is this significantly better than allocating and freeing the client on an as-needed basis?
No; my original thought was that, since the main function polls the database regularly, reusing the same connection made sense, but I will switch to the way we do it elsewhere.
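For reference, a minimal sketch of the allocate-per-call pattern discussed above. The pool/client shape mirrors node-postgres (`connect()` yields a client with `query()` and `release()`), but `withClient` itself is a hypothetical helper, not existing project code:

```javascript
// Hypothetical helper sketching per-call client allocation instead of a
// module-global client. Assumes a node-postgres-style pool: connect()
// yields a client exposing query() and release().
async function withClient(pool, fn) {
    const client = await pool.connect();  // borrow a connection
    try {
        return await fn(client);          // run the caller's queries
    } finally {
        client.release();                 // always return it to the pool
    }
}
```

The `finally` guarantees the connection goes back to the pool even when a query throws, which is the main failure mode of ad-hoc acquire/release code.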
```js
*/
export async function processColoBackbones() {
    // get all backbones with colo namespaces
    const coloBackbones = await client.query(`SELECT Id, CoLocatedNamespace FROM Backbones WHERE NULLIF(CoLocatedNamespace, '') IS NOT NULL`).then(res => res.rows)
```
It is my understanding that we are not permitting CoLocatedNamespace to be the empty string.
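To illustrate the simplification this implies: if the schema already forbids empty strings, the `NULLIF` guard in the quoted query is redundant and a plain `IS NOT NULL` test selects the same rows (the constant name below is illustrative):

```javascript
// With empty strings disallowed at the schema level, the NULLIF(col, '')
// wrapper is unnecessary; IS NOT NULL alone selects the same rows.
const COLO_BACKBONES_QUERY =
    `SELECT Id, CoLocatedNamespace FROM Backbones WHERE CoLocatedNamespace IS NOT NULL`;
```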
```js
*/
async function deploySite(backboneId, ns) {
    const siteId = await client.query(`SELECT Id FROM InteriorSites WHERE Backbone = $1 AND CoLocated = true`, [backboneId]).then(res => res.rows[0]?.id)
    // Poll the db against the InteriorSites table for this siteId, waiting for lifecycle='ready'
```
Rather than polling here and hanging up the whole process, wouldn't it be better to only sync colo sites that are in the Ready state? i.e. a colo site that is not Ready is treated as though it doesn't yet exist.
Yes, that makes sense. Originally I wasn't running the main processColoBackbones function with the setTimeout, so I was concerned the deployment wouldn't get reconciled until the MC restarted or another backbone was created. Now it should just get picked up the next time the function runs.
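A sketch of the "treat non-Ready as nonexistent" approach suggested above: rather than blocking on a poll, filter each pass down to sites already in the Ready state and let the next scheduled run pick up the rest (the row shape and field names are assumptions about the query result):

```javascript
// Keep only colo sites already in the 'ready' lifecycle state; anything
// else is skipped on this pass and picked up by a later scheduled run.
function readyColoSites(rows) {
    return rows.filter(site => site.lifecycle === 'ready');
}
```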
```js
}

const site = result.rows[0];
if (site.deploymentstate == 'deployed') {
```
I'm not sure these checks are appropriate in this case. The "already deployed" error might get thrown if we are re-establishing an already established colo site.
```js
    resourceTemplates.BackboneSite(site.name, siteId),
    resourceTemplates.NetworkCR('mbone'),
];
const links = await sync.GetBackboneLinks_TX(client, siteId);
```
Don't apply the links here. That will be handled later by the sync protocol.
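To illustrate, a sketch of the initial resource list with link CRs deliberately omitted. The `resourceTemplates` calls mirror the quoted diff, but the wrapping helper is hypothetical:

```javascript
// Build only the resources needed for initial deployment; LinkCR entries
// are intentionally omitted and assumed to arrive via the sync protocol.
function initialSiteResources(resourceTemplates, site, siteId) {
    return [
        resourceTemplates.BackboneSite(site.name, siteId),
        resourceTemplates.NetworkCR('mbone'),
        // no LinkCR entries here
    ];
}
```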
```js
for (const [linkId, linkData] of Object.entries(links)) {
    output.push(resourceTemplates.LinkCR(linkId, linkData, `skx-site-${siteId}`));
}
const accessPoints = await sync.GetBackboneAccessPoints_TX(client, siteId, true);
```
Only apply the manage access point here. Any others will be added by the sync protocol.
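A sketch of filtering the fetched access points down to just the manage one (the `kind` field on each entry is an assumption about the data shape returned by `GetBackboneAccessPoints_TX`):

```javascript
// Keep only access points of kind 'manage'; any others are assumed to be
// applied later by the sync protocol rather than at initial deployment.
function manageAccessPointsOnly(accessPoints) {
    return Object.fromEntries(
        Object.entries(accessPoints).filter(([, ap]) => ap.kind === 'manage')
    );
}
```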
Changes:
```diff
 Owner UUID REFERENCES Users,
-OwnerGroup text
+OwnerGroup text,
+AccessType text
```
I am wondering if the AccessType column is actually needed here. Considering that a colocated backbone site can only have a single manage endpoint: if the referenced InteriorSite record has its CoLocated field set to true, and the Kind field here is set to manage, then the access type must be local. Unless we are anticipating future needs or scenarios.
I would leave it there in case we need to influence the accessType attribute of the resulting CR. However, it should be considered optional (or have a default value) that is equivalent to omitting it from the CR.
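A sketch of treating the column as optional when building the CR spec: when the row's value is NULL or empty, the attribute is simply left off, which is equivalent to omitting it from the CR (field and helper names are assumptions):

```javascript
// Include accessType in the CR spec only when the column has a value;
// NULL/empty is treated the same as not setting the attribute at all.
function routerAccessSpecFields(row) {
    const fields = {};
    if (row.accesstype) {
        fields.accessType = row.accesstype;  // e.g. 'local' for a colo manage endpoint
    }
    return fields;
}
```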
```yaml
- skupper.io
resources:
- sites
- links
```
Is the MC creating NetworkAccesses for colocated sites, or will this be done through the sync process?
```js
if (coloBackbones.length > 0) {
    await reconcileNamespaces(coloBackbones)
}
setTimeout(processColoBackbones, 60000)
```
Minor suggestion, but the setTimeout could be moved into a finally block.
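A sketch of the finally-based re-arm: `schedule` stands in for `() => setTimeout(processColoBackbones, 60000)`, so the timer is reset even when a reconcile pass throws (names are illustrative):

```javascript
// Run one reconcile pass and always re-arm the timer afterwards, so a
// single failing pass does not stop future reconciliation.
async function runReconcilePass(task, schedule) {
    try {
        await task();
    } catch (err) {
        console.error('reconcile pass failed:', err);  // log, don't rethrow
    } finally {
        schedule();  // e.g. () => setTimeout(processColoBackbones, 60000)
    }
}
```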
NetworkAccesses will be created only via the sync protocol.
This module is responsible for managing the deployment and deletion of colocated namespaces/sites, tying their lifecycle to the lifecycle of the parent backbone network. It currently runs on MC startup and during backbone network creation and deletion, and is scheduled to run every minute after startup to keep state reconciliation up to date. Also adds a column on backboneaccesspoint to store the routeraccess accesstype ('local' for a colocated site's 'manage' access point).
Fixes #91