action #120793

Improve restarting behavior when (dependent) jobs have already been restarted

Added by mkittler 2 months ago.

Status: New
Priority: Normal
Assignee: -
Category: Feature requests
Target version:
Start date: 2022-11-21
Due date:
% Done: 0%
Estimated time:
Difficulty:

Description

Observation

When a job is restarted, its dependencies are by default restarted as well, as needed. All restarted jobs show up as "clones" (e.g. "Cloned as …" is shown in the info box on the test details)¹. It can happen that a dependency tree is only partially restarted, so some of the jobs have a clone and some don't. When trying to restart a job that has no clone yet, openQA might try to restart a dependent job that already has a clone, leading to the error "Job … already has clone …".

Workaround

So far one can only work around this by explicitly skipping the restart of certain dependencies, e.g. by using the "Skip parents" restart option if a parent dependency is the problematic one. Of course this doesn't help if that dependency actually should be restarted once again.

Suggestion

  • A better error message suggesting the workaround would be useful. However, it is likely not easy to make the right suggestion (and not confuse users even further).
  • There might be a nice way of handling this: considering the clone instead. For instance, when a chained parent is being restarted as well because it hasn't passed/softfailed, the code would check whether it has already been restarted. If it has, the code would look at that clone. If the clone hasn't passed/softfailed, the code would attempt to restart the clone, but first check again whether the clone has itself already been restarted. If it has, the code would look at that clone in turn. This would continue recursively until the end of the clone chain is reached.
  • Alternatively we could just skip restarting those jobs with a warning, but then e.g. an incomplete parent job creating an asset would perhaps not be restarted when restarting the chained child requiring it (and thus the restarted child would end up incomplete).
  • Alternatively we could just break the rule of not restarting jobs that already have a clone, but then we wouldn't just have "clone chains" but "clone trees", and that would likely make things even more complicated. (So that's likely a very bad idea.)
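
The clone-chain idea from the second bullet could be sketched roughly as follows. This is only an illustration in Python, not openQA code: the `Job` class, its fields and the function `find_restart_target` are hypothetical names, and the sketch only distinguishes passed/softfailed from everything else (it does not consider, for example, clones that are still running).

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Results after which a dependency does not need another restart
# (taken from the "passed/softfailed" wording of the suggestion).
OK_RESULTS = {"passed", "softfailed"}


@dataclass
class Job:
    """Hypothetical, minimal stand-in for an openQA job."""
    id: int
    result: str
    clone_id: Optional[int] = None  # set once the job has been restarted


def find_restart_target(jobs: Dict[int, Job], job_id: int) -> Optional[int]:
    """Follow the clone chain starting at job_id.

    Returns the ID of the job that should actually be restarted (the end
    of the chain), or None if some job on the chain is already ok, so no
    restart is needed.
    """
    seen = set()
    current = jobs[job_id]
    while current.id not in seen:
        seen.add(current.id)
        if current.result in OK_RESULTS:
            return None  # this job (or clone) is fine, nothing to restart
        if current.clone_id is None:
            return current.id  # end of the clone chain: restart this one
        current = jobs[current.clone_id]  # already restarted: look at the clone
    raise ValueError(f"clone chain starting at job {job_id} contains a cycle")


# Job 1 was restarted as 2, which was restarted as 3; none of them is ok.
jobs = {
    1: Job(1, "failed", clone_id=2),
    2: Job(2, "incomplete", clone_id=3),
    3: Job(3, "failed"),
}
print(find_restart_target(jobs, 1))  # prints 3
```

With such a lookup the restart would be applied to the last clone in the chain rather than failing with "already has clone", and a parent whose latest clone is already ok would be left alone.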

¹ This has nothing to do with the openqa-clone-job script. This whole ticket is about openQA's restart API (which restarts jobs within an openQA instance).
