action #70720

closed

Unable to restart a child from START_DIRECTLY_AFTER_TEST chain if another child has been restarted already

Added by ggardet_arm over 4 years ago. Updated about 4 years ago.

Status: Resolved
Priority: High
Assignee:
Category: Regressions/Crashes
Target version:
Start date: 2020-08-31
Due date:
% Done: 0%
Estimated time:

Description

Observation

I had two jobs to restart, both in the skipped state. Both are children of RPi_flash_firmware@RPi3 and use START_DIRECTLY_AFTER_TEST.
I can restart the first job properly, but when I try to restart the other child, I get an error:

Errors occurred when restarting jobs:
  Job 1375370 has already been cloned as 1377993

You can try to restart https://openqa.opensuse.org/tests/1375372

Steps to reproduce

  • Find a cluster of directly-chained jobs with a parent, here called A, and at least two children, called B1 and B2
  • Restart one child job, B1
  • Observe that B1 is cloned to B1' and A is cloned to A'
  • Navigate to the second child job, B2
  • Observe that the retrigger button is available
  • Observe that clicking the retrigger button yields "Job A has already been cloned as A'" and that B2 is not retriggered

Effect: There is no way to restart B2 via the UI, and the cluster relationship is not obvious.
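The steps above can be reproduced with a toy model. This is only a sketch in Python, not openQA code: the `Job` class, the `duplicate` function and the `AlreadyCloned` exception are hypothetical names chosen for illustration, and the cloning rule (clone the directly-chained parent first, fail if it already has a clone) is an assumption inferred from the observed error message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    """Hypothetical stand-in for a job; not the openQA schema."""
    name: str
    parent: Optional["Job"] = None  # directly-chained parent, if any
    clone: Optional["Job"] = None   # set once the job has been restarted


class AlreadyCloned(Exception):
    pass


def duplicate(job: Job) -> Job:
    """Toy restart: clone the directly-chained parent first, then the job."""
    if job.clone is not None:
        raise AlreadyCloned(
            f"Job {job.name} has already been cloned as {job.clone.name}"
        )
    new_parent = duplicate(job.parent) if job.parent else None
    job.clone = Job(job.name + "'", parent=new_parent)
    return job.clone


# Cluster from "Steps to reproduce": parent A with children B1 and B2.
a = Job("A")
b1 = Job("B1", parent=a)
b2 = Job("B2", parent=a)

duplicate(b1)  # clones A to A' and B1 to B1'; B2 is not touched

try:
    duplicate(b2)  # fails while trying to clone A a second time
except AlreadyCloned as err:
    print(err)  # Job A has already been cloned as A'
```

In this model B2 ends up with no clone and no remaining way to get one, which matches the effect described above.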

Acceptance criteria

  • AC1: Every job without a clone can be retriggered directly or indirectly or is linked to what can be considered a clone

Problem

Regression introduced with #68956.

Suggestions

In the "Steps to reproduce" example above, the actual problem already starts with retriggering B1 to B1', which implicitly clones A to A' but does not touch the other siblings. Retriggering the parent A to A' would have correctly cloned all children. For now, the easiest option may be to prevent retriggering a child of a directly-chained dependency and only allow retriggering the parent. Based on this I propose:

  • Given the API endpoint behind the retrigger button, e.g. the "duplicate" method: if the job has a directly-chained parent and the job has no clone, then explain that the proper way is to retrigger the parent, link to the parent, and mention the API alternatives
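The proposed check could look roughly like the following. This is a minimal sketch in Python under stated assumptions, not the actual openQA implementation: the `Job` fields and the `check_retrigger` function are hypothetical, and the message wording is only an example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    """Hypothetical job record; field names are assumptions."""
    id: int
    directly_chained_parent_id: Optional[int] = None
    clone_id: Optional[int] = None


def check_retrigger(job: Job) -> Optional[str]:
    """Return None if the job may be retriggered, else an explanation."""
    if job.clone_id is not None:
        return f"Job {job.id} has already been cloned as {job.clone_id}"
    if job.directly_chained_parent_id is not None:
        # Proposed rule: refuse the child and point the user at the parent.
        return (
            f"Job {job.id} is a directly-chained child. "
            f"Retrigger its parent job {job.directly_chained_parent_id} "
            "instead, or restart via the API."
        )
    return None


# Job IDs taken from the observation above; the parent has no parent itself.
parent = Job(id=1375370)
child = Job(id=1375372, directly_chained_parent_id=1375370)

print(check_retrigger(parent))  # None
print(check_retrigger(child))
```

With such a check the child is never cloned on its own, so the parent cannot end up cloned without all of its siblings.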

Workaround

  • Instead of trying to restart failed children, restart the parent via the API and skip passed children, following https://open.qa/docs/#_further_notes_2
  • As an alternative to START_DIRECTLY_AFTER_TEST, one can define a specific "machine" with a worker class that is provided by only a single, unique worker instance

Related issues 2 (1 open, 1 closed)

Related to openQA Project (public) - action #70618: Automatically avoid restarting the directly chained parent if possible to save time (New, 2020-08-27)
Related to openQA Project (public) - action #69979: Advanced job restarting via the web UI (Resolved, okurz, 2020-08-13)
