action #23318

Limit gru tasks

Added by coolo over 2 years ago. Updated almost 2 years ago.

Status: Resolved
Start date: 11/08/2017
Priority: Normal
Due date:
Assignee: EDiGiacinto
% Done: 0%
Category: Feature requests
Target version: Done
Difficulty:
Duration:

Description

Right now we schedule one limit_assets and one limit_results job for every ISO posted. That was fine when all ISOs
posted were either TW or SLES builds - i.e. rarely.

But now we see dozens of incidents posted as ISOs per hour - and GRU is not able to run the jobs as quickly. The fact is,
the actual task is a global one, so there is no need to run it more often than once in a while. We can't really decouple
it from the actual load, but binding it to ISO posts sounds wrong too.

Just brainstorming possible solutions:
- keep binding it to ISO posts, but make sure there is only one such job around at a time
- only trigger the task on every Xth job created (where X could be 1000 or generally configurable)
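
To make the two options above concrete, here is a minimal sketch in Python. It is not openQA code: `TaskQueue`, `EveryNth`, and their methods are hypothetical names, and the in-memory list only stands in for the real GRU backlog.

```python
from dataclasses import dataclass, field


@dataclass
class TaskQueue:
    """Toy in-memory stand-in for the GRU task backlog (hypothetical)."""
    pending: list = field(default_factory=list)

    def enqueue_unique(self, task_name: str) -> bool:
        """Option 1: enqueue only if no task of that name is already pending."""
        if task_name in self.pending:
            return False
        self.pending.append(task_name)
        return True


@dataclass
class EveryNth:
    """Option 2: trigger only on every Nth event (N is configurable)."""
    n: int = 1000
    count: int = 0

    def should_trigger(self) -> bool:
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


queue = TaskQueue()
# Three ISO posts in a row still leave a single limit_assets task pending.
results = [queue.enqueue_unique("limit_assets") for _ in range(3)]

throttle = EveryNth(n=3)
# With N=3, only every third created job would trigger limit_results.
fired = [throttle.should_trigger() for _ in range(7)]
```

Option 1 keeps the trigger tied to ISO posts but caps the backlog at one task per type. Option 2 decouples the trigger from individual posts while still scaling with the load.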

This is mildly urgent, as I'm removing 1000 GRU tasks daily from the backlog.


Related issues

Related to openQA Project - action #18462: Move GRU tasks into Minion jobs Resolved 10/04/2017

History

#1 Updated by coolo over 2 years ago

One (actually independent) option is to check why limit_results is so slow :)

#2 Updated by coolo over 2 years ago

  • Target version set to Ready

For limit_results, triggering on every Xth job is actually the better option, but limit_assets needs to stay connected to the assets posted, though possibly only to every Xth asset.

#3 Updated by szarate almost 2 years ago

  • Related to action #18462: Move GRU tasks into Minion jobs added

#4 Updated by szarate almost 2 years ago

  • Target version changed from Ready to Current Sprint

#5 Updated by szarate almost 2 years ago

  • Assignee set to EDiGiacinto

#6 Updated by EDiGiacinto almost 2 years ago

  • Status changed from New to In Progress

#7 Updated by EDiGiacinto almost 2 years ago

  • Status changed from In Progress to Resolved
  • Target version changed from Current Sprint to Done

We now have a TTL mechanism ( https://github.com/os-autoinst/openQA/pull/1637 ) to make sure that very old tasks (2 days) are not executed. With the recent optimizations by coolo and the Minion migration (with error handling), this has become just a failsafe option, as we no longer have too many tasks in the backlog.
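
The idea behind such a TTL check can be sketched as follows. This is only an illustration in Python, not the implementation from the pull request: the function names, the task tuples, and the timestamps are all made up, and only the 2-day limit comes from the comment above.

```python
from datetime import datetime, timedelta, timezone

# 2 days, as stated in the comment; everything else here is an assumption.
TTL = timedelta(days=2)


def is_expired(created_at, now, ttl=TTL):
    """A task is expired once it has waited longer than the TTL."""
    return now - created_at > ttl


def run_pending(tasks, now):
    """Split (name, created_at) pairs into executed and skipped task names."""
    executed, skipped = [], []
    for name, created_at in tasks:
        if is_expired(created_at, now):
            skipped.append(name)   # too old: drop it instead of running it
        else:
            executed.append(name)  # still fresh: would be run here
    return executed, skipped


# Hypothetical timestamps, for illustration only.
now = datetime(2018, 5, 10, 12, 0, tzinfo=timezone.utc)
tasks = [
    ("limit_assets", now - timedelta(days=3)),
    ("limit_results", now - timedelta(hours=1)),
]
executed, skipped = run_pending(tasks, now)
```

Skipping stale tasks is safe here because the cleanup tasks are global: a newer task of the same type does the same work.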
