Slurm: "start request repeated too quickly"

A single compute node hit the following error when launching 32 compute nodes at once:

```
Sep 29 20:40:20 flight-149 systemd[1]: clusterware-slurm-slurmd.service: control process exited, code=exited status=1
Sep 29 20:40:20 flight-149 systemd[1]: Failed to start Alces Clusterware Slurm compute node daemon.
Sep 29 20:40:20 flight-149 systemd[1]: Unit clusterware-slurm-slurmd.service entered failed state.
Sep 29 20:40:20 flight-149 systemd[1]: clusterware-slurm-slurmd.service failed.
Sep 29 20:40:21 flight-149 systemd[1]: clusterware-slurm-slurmd.service holdoff time over, scheduling restart.
Sep 29 20:40:21 flight-149 systemd[1]: start request repeated too quickly for clusterware-slurm-slurmd.service
Sep 29 20:40:21 flight-149 systemd[1]: Failed to start Alces Clusterware Slurm compute node daemon.
```

Restarting the service on the affected node fixes it.
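For reference, a minimal sketch of that manual workaround, run as root on the affected node. The unit name is taken from the journal output above; the helper names `recovery_commands` and `recover` are hypothetical. It assumes that `systemctl reset-failed` clears the failed state (and, on recent systemd versions, the start rate limit counter) before the restart; a plain `systemctl restart` may be enough on its own.

```python
import subprocess

# Unit name as it appears in the journal output above.
UNIT = "clusterware-slurm-slurmd.service"


def recovery_commands(unit=UNIT):
    """Return the systemctl invocations to run, in order."""
    return [
        # Clear the failed state left behind by the rejected start request.
        ["systemctl", "reset-failed", unit],
        # Start the daemon again.
        ["systemctl", "restart", unit],
    ]


def recover(unit=UNIT, run=subprocess.run):
    """Run each recovery command, raising if one exits non-zero."""
    for cmd in recovery_commands(unit):
        run(cmd, check=True)
```

Calling `recover()` on the failed node performs the same restart described above; it does not address why `slurmd` exited with status 1 in the first place.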

Steps to reproduce:
- Start a cluster using the `2016.3rc6` template (professional edition)
- Select `slurm` scheduler type
- Launch 32 nodes
- Node(s) may appear in `sinfo -N` as `unknown` state
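To spot affected nodes after a launch without reading the full `sinfo -N` listing, something like the following could be run on the login node. This is only a sketch: the function names are hypothetical, and it assumes `sinfo` is on the `PATH` and that the `%N %T` format (node name and extended state) prints one node per line, possibly with a trailing flag such as `*` on the state.

```python
import subprocess


def unknown_nodes(sinfo_output):
    """Return node names whose state starts with 'unknown', in first-seen order."""
    nodes = []
    for line in sinfo_output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        name, state = fields
        # With -N a node is listed once per partition, so de-duplicate.
        if state.lower().startswith("unknown") and name not in nodes:
            nodes.append(name)
    return nodes


def query_sinfo():
    """Run sinfo in node-oriented mode with no header and return its stdout."""
    result = subprocess.run(
        ["sinfo", "-N", "-h", "-o", "%N %T"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

`unknown_nodes(query_sinfo())` then gives the list of nodes on which to check the `clusterware-slurm-slurmd.service` journal.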


Slurm: "start request repeated too quickly" #212
