action #32725

action #32851: [tools][EPIC] Scheduling redesign

[tools] Scheduler job_grab/filter_jobs refactoring

Added by EDiGiacinto almost 2 years ago. Updated over 1 year ago.

Status: Resolved
Start date: 05/05/2018
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: Feature requests
Target version: Done
Difficulty:
Duration:

Description

For now it would be sufficient to refactor job_grab so that it passes schema objects around instead of hashes and re-uses them in the filtering phase, while properly refactoring those areas (e.g. extracting them). Even better if we also optimize the queries.

This is a fairly hard task, since it could potentially make things 'shaky'.
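
To illustrate the idea of building objects once in the grab phase and re-using them in the filtering phase, here is a minimal sketch in Python. It is not openQA code: `JobRecord`, `grab_jobs`, `filter_jobs` and the `state` values are hypothetical stand-ins chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    # Hypothetical stand-in for a schema (ORM result) object.
    id: int
    state: str
    priority: int


def grab_jobs(rows):
    """Grab phase: build the objects once from the raw rows."""
    return [JobRecord(**row) for row in rows]


def filter_jobs(jobs):
    """Filter phase: work on the same objects, no re-fetch by id."""
    return [job for job in jobs if job.state == "scheduled"]


# Example rows are made up for illustration only.
rows = [
    {"id": 1, "state": "scheduled", "priority": 50},
    {"id": 2, "state": "running", "priority": 50},
]
grabbed = grab_jobs(rows)
candidates = filter_jobs(grabbed)
```

The point of the sketch is that `candidates` holds the very same objects that `grab_jobs` produced, so the filtering phase does not need to look anything up again.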


Related issues

Related to openQA Project - action #39560: Tests for blocked_by and loops inside of it Resolved 10/08/2018
Related to openQA Project - action #39629: openQA Scheduler refactor fallout Resolved 13/08/2018

History

#1 Updated by EDiGiacinto almost 2 years ago

  • Description updated (diff)

#2 Updated by EDiGiacinto almost 2 years ago

  • Description updated (diff)

#3 Updated by coolo almost 2 years ago

  • Target version set to Ready

Given that we usually have < 2000 jobs scheduled, I think it's worth an experiment to grab all open jobs and schedule in memory. There is only one scheduler process, so IMO this should simplify things a lot. Yes, it's a hard task, but it's worth the time spent to make the whole scheduling understandable.
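
A rough sketch of what "grab all open jobs and schedule in memory" could look like, written in Python for illustration only. All names here (`Job`, `Worker`, `schedule_in_memory`, the `worker_class` matching rule and the "lower priority value first" ordering) are assumptions made for the example, not the actual openQA implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    id: int
    priority: int  # assumption: a lower value is scheduled first
    worker_class: str  # assumption: jobs match workers by class name
    blocked_by: set = field(default_factory=set)  # ids of jobs to wait for


@dataclass
class Worker:
    id: int
    worker_class: str


def schedule_in_memory(open_jobs, free_workers):
    """Match already-loaded jobs to free workers without further queries."""
    open_ids = {job.id for job in open_jobs}
    available = list(free_workers)
    assignments = {}
    for job in sorted(open_jobs, key=lambda j: (j.priority, j.id)):
        # Skip jobs that still wait on another open job.
        if job.blocked_by & open_ids:
            continue
        for worker in available:
            if worker.worker_class == job.worker_class:
                assignments[job.id] = worker.id
                available.remove(worker)
                break
    return assignments


# Made-up data: job 2 is blocked by job 1 and must not be assigned yet.
jobs = [
    Job(1, 50, "qemu_x86_64"),
    Job(2, 40, "qemu_x86_64", blocked_by={1}),
    Job(3, 45, "qemu_aarch64"),
]
workers = [Worker(10, "qemu_x86_64"), Worker(11, "qemu_aarch64")]
result = schedule_in_memory(jobs, workers)
```

With only one scheduler process and a few thousand jobs, a single pass like this over an in-memory list avoids per-job queries and keeps the whole decision in one place.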

#4 Updated by EDiGiacinto almost 2 years ago

  • Description updated (diff)
  • Parent task set to #32851

#5 Updated by szarate almost 2 years ago

  • Start date changed from 02/03/2018 to 05/05/2018

due to changes in a related task

#6 Updated by szarate over 1 year ago

  • Target version changed from Ready to Current Sprint

We agreed during sprint planning to keep going with this: https://github.com/os-autoinst/openQA/pull/1718. Ettore will take over coolo's branch.

#7 Updated by EDiGiacinto over 1 year ago

PR: https://github.com/os-autoinst/openQA/pull/1729 (acceptance testing is still missing)

#8 Updated by szarate over 1 year ago

  • Assignee set to EDiGiacinto

I think there are still some rough edges here, right?

#9 Updated by EDiGiacinto over 1 year ago

  • Status changed from New to Feedback
  • Assignee deleted (EDiGiacinto)

Yes, but as I'm not the author of this set of changes, I can't help much more from here. I adapted the tests we already had for the scheduler and made some adjustments for the issues I could notice, but I think there are still bugs to catch.

The latest related change is: https://github.com/os-autoinst/openQA/pull/1743

#10 Updated by EDiGiacinto over 1 year ago

  • Related to action #39560: Tests for blocked_by and loops inside of it added

#11 Updated by EDiGiacinto over 1 year ago

  • Related to action #39629: openQA Scheduler refactor fallout added

#12 Updated by coolo over 1 year ago

  • Status changed from Feedback to Resolved

The scheduler was refactored. We have some new problems (like starving multimachine jobs), but they are design decisions and need future fine-tuning.

#13 Updated by coolo over 1 year ago

  • Target version changed from Current Sprint to Done
