action #40811

Single Machine jobs starve clusters

Added by coolo over 1 year ago. Updated about 1 year ago.

Status: Resolved
Start date: 10/09/2018
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: Feature requests
Target version: Done
Difficulty: hard
Duration:

Description

Because the scheduler tries to fill every available slot, workers that can run
both multi-machine and single-machine jobs end up filled with single-machine
jobs, as the clusters don't fit into the few slots that are free at any one
time. The only way a cluster fits is if 4 jobs finish within one scheduling round.

This is a tricky problem in general, but it has been solved before :)

History

#1 Updated by coolo over 1 year ago

The general idea is: whenever a job would be scheduled according to priority but can't be scheduled because of its cluster dependency, we increase a counter (or decrease the priority).
Once that counter reaches a limit (or the priority reaches 0), we reserve a worker slot for the job and just won't allocate it to anything else until we have the full cluster.
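
A minimal sketch of the counter variant of this idea, in Python for illustration only. It is not the scheduler's actual code: `Job`, `Scheduler`, `run_round`, `simulate`, the limit of 3 and the convention that a lower priority value is scheduled first are all assumptions made for the example. `idle` stands for the number of free, unreserved worker slots at the start of a round.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    name: str
    priority: int                  # assumption: lower value is scheduled first
    cluster: Optional[str] = None  # jobs sharing a cluster id must start together


class Scheduler:
    def __init__(self, limit=3):
        self.limit = limit   # hypothetical number of missed rounds tolerated
        self.missed = {}     # cluster id -> rounds it was due but did not fit
        self.reserved = {}   # cluster id -> worker slots held back for it

    def run_round(self, queue, idle):
        """Start what fits into `idle` free slots; return the started job names."""
        started = []
        handled = set()
        for job in sorted(queue, key=lambda j: j.priority):
            if job.cluster is None:
                # Single-machine job: takes any slot that is still free.
                if idle > 0:
                    idle -= 1
                    started.append(job)
                continue
            if job.cluster in handled:
                continue  # the cluster was already decided via an earlier member
            handled.add(job.cluster)
            members = [j for j in queue if j.cluster == job.cluster]
            held = self.reserved.get(job.cluster, 0)
            if held + idle >= len(members):
                # Reserved plus free slots cover the whole cluster: start it.
                idle -= len(members) - held
                self.reserved.pop(job.cluster, None)
                self.missed.pop(job.cluster, None)
                started.extend(members)
            else:
                # Due by priority but blocked by the cluster dependency.
                self.missed[job.cluster] = self.missed.get(job.cluster, 0) + 1
                if self.missed[job.cluster] >= self.limit:
                    # Limit reached: hold the free slots back for this cluster.
                    self.reserved[job.cluster] = held + idle
                    idle = 0
        for job in started:
            queue.remove(job)
        return [job.name for job in started]


def simulate(limit, rounds=6):
    """One slot frees up per round; a 4-job cluster competes with single jobs."""
    queue = [Job(f"parallel-{i}", 40, cluster="mm") for i in range(4)]
    queue += [Job(f"single-{i}", 50) for i in range(10)]
    scheduler = Scheduler(limit=limit)
    return [scheduler.run_round(queue, idle=1) for _ in range(rounds)]
```

With `simulate(limit=3)` the first two rounds each start one single-machine job, rounds 3 to 5 start nothing because the freed slot is held back, and round 6 starts all four cluster jobs. With a limit that is never reached, e.g. `simulate(limit=1000)`, every round starts a single-machine job and the cluster never runs, which is the starvation described above. The price of the reservation is that held slots sit idle while the cluster is being assembled.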

#2 Updated by coolo over 1 year ago

  • Target version changed from Ready to Current Sprint

#3 Updated by coolo over 1 year ago

  • Status changed from New to Resolved

#4 Updated by szarate over 1 year ago

  • Target version changed from Current Sprint to Done

#5 Updated by coolo over 1 year ago

  • Target version changed from Done to Current Sprint

#6 Updated by coolo about 1 year ago

  • Target version changed from Current Sprint to Done
