Merged
10 changes: 10 additions & 0 deletions config/monitoring/prometheus-rule.yaml
@@ -80,3 +80,13 @@ spec:
       annotations:
         summary: "SeiNodeGroup reconcile substep {{ $labels.substep }} is slow"
         description: "p99 latency above 10s for 10 minutes."
+
+    - alert: ControllerMetricsDown
+      expr: up{job=~".*sei-k8s-controller.*"} == 0
+      for: 5m
+      labels:
+        severity: critical
+        team: platform
+      annotations:
+        summary: "sei-k8s-controller metrics endpoint is down"
+        description: "Prometheus has been unable to scrape the controller for 5 minutes. All other controller alerts are blind."
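One way to check the new alert before deploying it is a `promtool test rules` unit test. The sketch below is illustrative and not part of this change. It assumes the `groups:` list under `spec` has been copied into a plain rule file, here given the hypothetical name `controller-rules.yaml`, because promtool reads plain Prometheus rule files and not the PrometheusRule resource. The `instance` value is also made up.

```yaml
# controller-rules-test.yaml (hypothetical file name)
# Run with: promtool test rules controller-rules-test.yaml
rule_files:
  - controller-rules.yaml   # assumed: plain rule file holding the spec.groups content

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # The scrape target reports down (0) for every sample from 0m to 10m.
      - series: 'up{job="sei-k8s-controller", instance="controller:8080"}'
        values: '0+0x10'
    alert_rule_test:
      # Before the 5m "for" window has elapsed, nothing should be firing.
      - eval_time: 3m
        alertname: ControllerMetricsDown
      # After the window, the alert should fire with the rule's labels
      # plus the labels of the input series.
      - eval_time: 6m
        alertname: ControllerMetricsDown
        exp_alerts:
          - exp_labels:
              severity: critical
              team: platform
              job: sei-k8s-controller
              instance: controller:8080
            exp_annotations:
              summary: "sei-k8s-controller metrics endpoint is down"
              description: "Prometheus has been unable to scrape the controller for 5 minutes. All other controller alerts are blind."
```

This test only covers a target that is still discovered but failing its scrapes. If the target drops out of service discovery altogether, no `up` series exists and `up{...} == 0` returns nothing, so that case would need separate coverage.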
1 change: 1 addition & 0 deletions config/monitoring/service-monitor.yaml
@@ -8,6 +8,7 @@ metadata:
     app.kubernetes.io/name: sei-k8s-controller
     app.kubernetes.io/managed-by: kustomize
 spec:
+  jobLabel: app.kubernetes.io/name
   selector:
     matchLabels:
       control-plane: controller-manager
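`jobLabel` tells the Prometheus Operator to take the `job` label of the scraped series from the named label on the selected Service. The sketch below shows a Service that would produce `job="sei-k8s-controller"`, which is what the `ControllerMetricsDown` expression matches. The Service name, port name and port number are assumptions for illustration and are not taken from this change.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: controller-manager-metrics-service        # hypothetical name
  labels:
    control-plane: controller-manager             # matched by the ServiceMonitor selector
    app.kubernetes.io/name: sei-k8s-controller    # value copied into the job label
spec:
  selector:
    control-plane: controller-manager
  ports:
    - name: metrics       # hypothetical port name
      port: 8080          # hypothetical port number
      targetPort: 8080
```

If the Service does not carry the `app.kubernetes.io/name` label, the operator falls back to the Service name for `job`. The alert's regex would then match only if that name happens to contain `sei-k8s-controller`.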
4 changes: 4 additions & 0 deletions config/network-policy/allow-metrics-traffic.yaml
@@ -1,3 +1,7 @@
+# Allows Prometheus to scrape the controller metrics endpoint.
+# Prerequisite: the Prometheus namespace must carry the label
+# kubectl label namespace <prometheus-ns> metrics=enabled
+# Without this label, scrapes will be blocked by the namespaceSelector.
 apiVersion: networking.k8s.io/v1
 kind: NetworkPolicy
 metadata:
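The label required by the comment above can also be set declaratively rather than with `kubectl label`. This is a minimal sketch, assuming Prometheus runs in a namespace called `monitoring`. That name is hypothetical and stands in for `<prometheus-ns>`.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring        # hypothetical; use the namespace Prometheus actually runs in
  labels:
    metrics: enabled      # same key/value that `kubectl label ... metrics=enabled` sets
```

Keeping the label in a manifest means it survives a namespace being recreated, whereas a one-off `kubectl label` has to be reapplied by hand.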
3 changes: 0 additions & 3 deletions config/rbac/kustomization.yaml
@@ -6,6 +6,3 @@ resources:
 - role_binding.yaml
 - leader_election_role.yaml
 - leader_election_role_binding.yaml
-- metrics_auth_role.yaml
-- metrics_auth_role_binding.yaml
-- metrics_reader_role.yaml
17 changes: 0 additions & 17 deletions config/rbac/metrics_auth_role.yaml

This file was deleted.

12 changes: 0 additions & 12 deletions config/rbac/metrics_auth_role_binding.yaml

This file was deleted.

9 changes: 0 additions & 9 deletions config/rbac/metrics_reader_role.yaml

This file was deleted.
