Develop SP approval runbook #19

@BigLep

Description

Done Criteria

We have a runbook for SP approvers that covers:

  1. How often to check whether approved SPs are still meeting the approval criteria
  2. What to do when they aren't:
    • Which channel to message in, or where to keep an issue that tracks the action
    • How long to wait before unapproving them
    • Commands to run to unapprove them

Why Important

  • This increases the bus factor for this critical area, which @TippyFlitsUK has been handling by default.

User/Customer

Notes

  1. We can give a time window (e.g., 1–4 hours) to diagnose & see if they recover.
    • Reasons to give a window:
      • Avoid flapping the approved list on minor or transient issues.
      • Allow James to account for known short maintenance, etc.
    • But, from a risk standpoint:
      • If an SP is failing for longer than that window, it is better to unapprove it to protect users.
  2. We'll get automated alerting from dealbot#280 (Automated "alerting" if an SP should get approved or unapproved), but when the alarm goes off, we still need a runbook for how to handle it.
  3. Until automated alerting is in place, we expect to check manually twice per day.
  4. We need to get a couple of people onboarded onto this process (e.g., Beck, Orjan).
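
As a starting point for the runbook, the grace-window rule from note 1 could be sketched roughly as below. This is only an illustration, not an existing tool: `Action`, `decide_action`, and the 4-hour default (the upper end of the "1–4 hours" example) are hypothetical, and the actual check of the approval criteria and the unapprove commands are deliberately left out because they are not defined yet.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class Action(Enum):
    NONE = "none"            # SP is meeting the criteria; nothing to do
    WATCH = "watch"          # failing, but still inside the grace window
    UNAPPROVE = "unapprove"  # failing for longer than the grace window


def decide_action(
    failing_since: Optional[datetime],
    now: datetime,
    grace_window: timedelta = timedelta(hours=4),  # assumed default
) -> Action:
    """Hypothetical helper: pick an action for one approved SP.

    failing_since is when the SP was first seen failing the approval
    criteria, or None if it is currently meeting them.
    """
    if failing_since is None:
        return Action.NONE
    if now - failing_since > grace_window:
        # Past the window: unapprove to protect users.
        return Action.UNAPPROVE
    # Inside the window: wait, to avoid flapping on transient issues.
    return Action.WATCH


# Example: an SP that has been failing for one hour is only watched.
now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
print(decide_action(now - timedelta(hours=1), now))
```

Keeping "watch" as its own state gives approvers a place to record known short maintenance before anyone unapproves.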

Metadata

Labels

No labels

Type

No type

Projects

Status

⌨️ In Progress

Relationships

None yet

Development

No branches or pull requests
