The default TORQUE configuration we have does not seem to schedule jobs correctly; for example, in testing it is possible to:
- submit 16 x 2-core jobs that all end up running on a single 16-core compute node
The autoscaler also assumes that multiple jobs cannot run on a single compute host (usually the case for TORQUE), so it calculates the number of nodes required on the basis of one job per node, when in fact the scheduler is currently allowing multiple jobs to share a compute host.
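To make the mismatch concrete, here is a small illustrative sketch (function names are hypothetical, not taken from the autoscaler code) comparing the autoscaler's one-job-per-node estimate against what core-based packing would give for the test case above:

```python
import math

def nodes_assuming_one_job_per_node(job_core_requests):
    # Current autoscaler assumption: every job occupies a whole node.
    return len(job_core_requests)

def nodes_assuming_core_packing(job_core_requests, cores_per_node=16):
    # If the scheduler packed jobs by core count instead:
    # total requested cores divided by per-node capacity, rounded up.
    return math.ceil(sum(job_core_requests) / cores_per_node)

jobs = [2] * 16  # the 16 x 2-core test case above

print(nodes_assuming_one_job_per_node(jobs))  # 16 nodes requested by the autoscaler
print(nodes_assuming_core_packing(jobs))      # 2 nodes if packing respected cores
# In testing, all 16 jobs landed on ONE 16-core node (32 cores requested
# against 16 available), so the scheduler isn't even enforcing core limits.
```

So the autoscaler over-provisions (16 nodes) while the scheduler oversubscribes (1 node); neither matches the 2 nodes that correct core-aware packing would use.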
There is probably some combination of configuration settings that allocates jobs appropriately.
Possibly useful (thanks @mjtko) - http://www.supercluster.org/pipermail/torqueusers/2012-May/014636.html
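Based on that thread, one candidate direction is a node access policy restricting each node to a single job. This is a sketch only, assuming the Maui scheduler is in use alongside TORQUE (unverified for our setup; setting names from Maui's documentation, not tested here):

```shell
# Cluster-wide (in maui.cfg): only one job may run per node at a time.
#   NODEACCESSPOLICY SINGLEJOB

# Or per-job at submission time, requesting exclusive node access
# (assumes Maui/Moab honours the naccesspolicy resource):
qsub -l naccesspolicy=singlejob job.sh
```

If we go this route we would also need to update the autoscaler only once the scheduler's behaviour is confirmed, since SINGLEJOB would make its current one-job-per-node assumption accurate rather than fixing the packing.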