action #18462
closed
Move GRU tasks into Minion jobs
Added by coolo over 7 years ago.
Updated over 6 years ago.
Category:
Feature requests
Description
See https://github.com/os-autoinst/openQA/issues/519 for the features that come with Minion and that we don't want to reimplement in GRU. The main reason I introduced GRU was that Minion requires Mojo::Pg, and I did not want so many new dependencies for something as simple as background jobs.
This is Adam's issue, copied from 519, which I did not expect to turn into a 404 when disabling issues :(
So, yeah, I really hate how Gru is set up to work when a task fails.
It just leaves it in the queue and loops back around. So until a higher priority task appears, it just tries the failed task over and over again. If the failure isn't transient, it'll just keep failing over and over and over and over. It never goes to sleep. It never decides "this just isn't working out" and puts the task off to the side and warns the admin or anything. Nope. It just loops around eternally, failing again and again and again. When a higher priority task appears it'll do that, but then go right back to looping on the broken task. Lower priority tasks will never get run until the failing task is cleared out somehow.
I would like to fix this; I hope I'll get some spare time to work on it. Here is my initial idea: the Gru task schema should get a new column, 'failure_count' or somesuch. It's an integer. Every time Gru runs a task and it fails, Gru increments the integer. Gru's search for 'what task should I do next' should exclude tasks whose failure_count is higher than, say, 5. There would be a page or something in the admin interface which lists tasks in this state and lets you manually reset their failure count to get them run again (so you can figure out what is wrong with them). Maybe Gru would have a one-time code block which searches for all tasks with failure_count > 5 and logs their IDs on startup (as just another place where the admin could notice broken tasks).
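The idea above can be illustrated with a small, self-contained sketch. This is not Gru's or Minion's actual code: it is plain Python with an in-memory list standing in for the task table, and the names `Task`, `next_task`, `run_once`, `parked_tasks` and `MAX_FAILURES` are hypothetical. The threshold of 5 is the "say, 5" from the proposal.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Assumed threshold, taken from the "say, 5" in the proposal above.
MAX_FAILURES = 5


@dataclass
class Task:
    """Hypothetical stand-in for a row in the Gru task table."""
    id: int
    priority: int
    run: Callable[[], None]
    failure_count: int = 0  # the proposed new column


def next_task(queue: List[Task]) -> Optional[Task]:
    """Pick the highest-priority task whose failure_count is not above the limit."""
    eligible = [t for t in queue if t.failure_count <= MAX_FAILURES]
    return max(eligible, key=lambda t: t.priority, default=None)


def run_once(queue: List[Task]) -> bool:
    """Run one task; return False when nothing eligible is left."""
    task = next_task(queue)
    if task is None:
        return False
    try:
        task.run()
    except Exception:
        # A failed task stays queued, but its failure is counted.
        task.failure_count += 1
    else:
        queue.remove(task)
    return True


def parked_tasks(queue: List[Task]) -> List[int]:
    """IDs of tasks that exceeded the limit, e.g. to log on startup."""
    return [t.id for t in queue if t.failure_count > MAX_FAILURES]
```

With this selection rule, a permanently failing high-priority task is retried a bounded number of times and then skipped, so lower-priority tasks still get their turn. Resetting `failure_count` to 0 makes the parked task eligible again, which corresponds to the manual reset in the admin interface.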
- Target version set to Ready
- Has duplicate action #30877: If gru task cannot be completed, it will attempt it forever in a loop, never reaching other tasks added
- Status changed from New to In Progress
- Target version changed from Ready to Current Sprint
- Assignee set to EDiGiacinto
- % Done changed from 0 to 50
Changes are almost done: https://github.com/mudler/openQA/tree/minion
I am currently refining it, and it still needs staging tests.
Just a side note: since GRU is tied to openQA jobs, I modified GRU to be a soft wrapper over Minion.
Minion supports additional 'data' to be carried with individual jobs, but using that would require further queries to recover the relationships that we already have defined in GRU with our schema classes.
- Status changed from In Progress to Resolved
- Target version changed from Current Sprint to Done