Add node affinity and tolerations support #114

Open
groundsada wants to merge 1 commit into spegel-org:main from groundsada:main
@groundsada

Summary

Adds support for node affinity and tolerations to improve benchmark scheduling on large clusters with heterogeneous nodes.

Problem

The benchmark tool ran into scheduling problems on large clusters: pods would get stuck on incompatible nodes (control plane nodes, tainted nodes, etc.), leaving benchmark runs incomplete.

Solution

  • Added --affinity-file and --tolerations-file CLI flags
  • Added YAML parsing for node affinity and tolerations configuration
  • Applied affinity/tolerations to DaemonSet pod specs
  • Included example configuration files
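
The "apply to the pod spec" step can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the code in this PR: `Toleration`, `NodeSelectorRequirement`, `Affinity`, `PodSpec`, and `applyScheduling` are hypothetical stand-ins that mirror only a small subset of the corresponding Kubernetes API types, and the label and taint keys are made-up examples. The actual change presumably works with the `k8s.io/api/core/v1` types.

```go
package main

import "fmt"

// Toleration is a stand-in for a subset of the Kubernetes toleration fields.
type Toleration struct {
	Key      string
	Operator string
	Value    string
	Effect   string
}

// NodeSelectorRequirement is a stand-in for a single node match expression.
type NodeSelectorRequirement struct {
	Key      string
	Operator string
	Values   []string
}

// Affinity is a deliberately simplified stand-in: just a list of
// required node match expressions.
type Affinity struct {
	Required []NodeSelectorRequirement
}

// PodSpec is a stand-in holding only the two scheduling fields of interest.
type PodSpec struct {
	Affinity    *Affinity
	Tolerations []Toleration
}

// applyScheduling copies the optional settings onto the pod spec.
// A nil affinity or an empty tolerations slice leaves the spec untouched,
// which is what keeps the behavior unchanged when no files are passed.
func applyScheduling(spec *PodSpec, affinity *Affinity, tolerations []Toleration) {
	if affinity != nil {
		spec.Affinity = affinity
	}
	if len(tolerations) > 0 {
		// Append so that tolerations already on the spec are preserved.
		spec.Tolerations = append(spec.Tolerations, tolerations...)
	}
}

func main() {
	// Hypothetical values, for illustration only.
	affinity := &Affinity{Required: []NodeSelectorRequirement{
		{Key: "kubernetes.io/os", Operator: "In", Values: []string{"linux"}},
	}}
	tolerations := []Toleration{
		{Key: "example.com/dedicated", Operator: "Exists", Effect: "NoSchedule"},
	}

	var spec PodSpec
	applyScheduling(&spec, affinity, tolerations)
	fmt.Printf("affinity set: %v, tolerations: %d\n", spec.Affinity != nil, len(spec.Tolerations))
}
```

Treating both inputs as optional means a caller that passes neither flag ends up with the same pod spec as before.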

Changes

  • main.go: Added CLI flags and updated function signatures
  • internal/measure/measure.go: Added YAML parsing and pod spec configuration
  • affinity.yaml: Example node affinity configuration
  • tolerations.yaml: Example tolerations configuration

Usage

benchmark measure \
  --output-dir ./results \
  --kubeconfig ~/.kube/config \
  --namespace seam \
  --images ghcr.io/spegel-org/benchmark:v1-10MB-1 ghcr.io/spegel-org/benchmark:v2-10MB-1 \
  --affinity-file ./affinity.yaml \
  --tolerations-file ./tolerations.yaml

Testing

Successfully tested on a 256-node cluster, achieving 120/120 pods scheduled correctly with proper node selection.

Backward Compatibility

✅ Fully backward compatible: existing behavior is unchanged when the flags are not provided

Acknowledgement

  • LLM coding assistance provided by gemma3-27b with aider
  • Testing environment provided by the National Research Platform, on a heterogeneous dual-stack Kubernetes cluster

@groundsada
Author

Hi @phillebaba, could you please take a look at this and see whether the DaemonSet can be extended? We run a very heterogeneous cluster and would benefit from granular per-node benchmarks for Spegel, and we are happy to contribute.
