
Autoscale VMs do not follow ScaleDown Rules after a Node Failure #9336

@btzq

Description

ISSUE TYPE
  • Bug Report
COMPONENT NAME
Autoscale
CLOUDSTACK VERSION
4.19.0
CONFIGURATION
OS / ENVIRONMENT
SUMMARY
STEPS TO REPRODUCE

We are actively using Autoscale Groups that have the following scale-down rule.

Autoscale Rule

  • Name: ScaleDownPolicy-0
  • Duration (in sec) = 60
  • Quiet Time (in sec) = 30

Conditions:

  • Counter: VM CPU - Average Percentage
  • Operator: Less Than
  • Threshold: 35
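To make the rule semantics concrete, the decision can be sketched as: scale down when every CPU sample inside the duration window is below the threshold, but never within the quiet time after the previous action. This is an illustrative re-implementation only; the function name and evaluation details are assumptions, not CloudStack's actual code.

```python
# Hypothetical sketch of the scale-down decision described above.
# CloudStack's real evaluation lives in the management server.
THRESHOLD = 35.0   # VM CPU - Average Percentage, operator "Less Than"
DURATION = 60      # seconds the condition must hold
QUIET_TIME = 30    # seconds to wait after the last scaling action

def should_scale_down(samples, now, last_action_at):
    """samples: list of (timestamp, avg_cpu_pct) tuples for the group."""
    if now - last_action_at < QUIET_TIME:
        return False  # still inside the quiet time after the last action
    window = [cpu for ts, cpu in samples if now - ts <= DURATION]
    # The condition must hold for the whole duration window.
    return bool(window) and all(cpu < THRESHOLD for cpu in window)

# Example: CPU stuck at 1% for over a minute -> scale-down should fire.
samples = [(t, 1.0) for t in range(0, 70, 10)]
print(should_scale_down(samples, now=70, last_action_at=0))  # True
```

With the 1%-CPU readings reported below, this check stays true indefinitely, which is why the group should have shrunk back to its minimum size.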

These rules work well under normal conditions. But today we had a node failure, and that node was hosting some autoscale VMs.

As a result, we experienced this issue, which has already been reported:

For the VMs that were not 'Orphaned' and managed to find their way back to the Autoscale Group (or simply weren't affected), I noticed that the scale-down rule no longer worked.

The autoscale group was configured with:

  • Min : 2 Members
  • Max : 6 Members
  • Available Instances: 6 <- this should have been 2 instead

Even after 5 minutes it still did not scale down; all VMs were still running.
I verified that the CPU utilisation of every VM was consistently only 1%.

To restore normal behaviour, I had to:

  • Disable the Autoscale Group
  • Delete the VMs
  • Re-enable the Autoscale Group
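The recovery sequence above can be scripted against the CloudStack API (`disableAutoScaleVmGroup`, `destroyVirtualMachine`, `enableAutoScaleVmGroup`). A minimal sketch of building the signed GET requests, assuming a hypothetical management-server endpoint and placeholder keys/IDs:

```python
import base64
import hashlib
import hmac
import urllib.parse

# Hypothetical values -- replace with your management server and credentials.
API_URL = "http://mgmt.example.com:8080/client/api"
API_KEY = "your-api-key"
SECRET_KEY = "your-secret-key"

def signed_request_url(command, **params):
    """Build a signed CloudStack API GET URL (HMAC-SHA1 request signing)."""
    params = {"command": command, "apikey": API_KEY, "response": "json", **params}
    # Sort parameters, URL-encode values, then lowercase the string for signing.
    query = "&".join(
        f"{k}={urllib.parse.quote(str(v), safe='')}" for k, v in sorted(params.items())
    )
    digest = hmac.new(SECRET_KEY.encode(), query.lower().encode(), hashlib.sha1).digest()
    signature = urllib.parse.quote(base64.b64encode(digest).decode(), safe="")
    return f"{API_URL}?{query}&signature={signature}"

# The manual recovery sequence, expressed as API calls (IDs are placeholders):
# 1. disable the group, 2. destroy each member VM, 3. re-enable the group.
for url in (
    signed_request_url("disableAutoScaleVmGroup", id="GROUP-ID"),
    signed_request_url("destroyVirtualMachine", id="VM-ID", expunge="true"),
    signed_request_url("enableAutoScaleVmGroup", id="GROUP-ID"),
):
    print(url)  # send with urllib.request.urlopen(url) once the IDs are real
```

CloudMonkey (`cmk`) issues the same three commands interactively if you prefer not to script the signing yourself.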

After that, the scale-down rule worked again, and the group returned to the expected state:

  • Min : 2 Members
  • Max : 6 Members
  • Available Instances: 2
EXPECTED RESULTS
The autoscale group should continue to follow the scale-down rule after a node failure, shrinking back to the minimum of 2 members once CPU utilisation stays below the threshold.
ACTUAL RESULTS
For the VMs that were not 'Orphaned' and managed to find their way back to the Autoscale Group (or just happened to be on a host that was not affected), I noticed that the scale-down rule no longer worked.
