action #112868: Helpful instructions to prevent incomplete cluster restarts - openQA Project (public) - openSUSE Project Management Tool

action #112868

open

coordination #112862: [saga][epic] Future ideas for easy multi-machine handling: MM-tests as first-class citizens

Helpful instructions to prevent incomplete cluster restarts

Added by okurz almost 3 years ago. Updated almost 3 years ago.

Status: New
Priority: Normal
Assignee:
Category: Feature requests
Target version: QA (public) - future
Start date: 2022-06-22
Due date:
% Done:
Estimated time:

Description

Motivation

In a case like
https://openqa.suse.de/tests/8966763#dependencies
a job did not pass, so users might want to restart it. Trying to retrigger it with the restart button in the webUI shows an error:

Errors occurred when restarting jobs:

    Job 8966755 already has clone 8998406

First, it is inconvenient that only the job IDs are shown and no links are rendered. Second, the user would still like to restart the job but cannot. In the above example, 8966755 is the serial parent "create_hdd_ha_textmode_maintenance", which already has a clone, 8998406. That clone was likely created when a job in another sub-cluster was retriggered.

Suggestions

In https://github.com/os-autoinst/openQA/blob/1fa560517e812a3886219eb3667e9fc05f9f873d/lib/OpenQA/Schema/Result/Jobs.pm#L625
Extend the die message, maybe with proper URLs for the job and its clone. Can we do that at this point in the code?
Add explanations to that die message describing what options the user has, e.g. include the text from the section "Workaround" further below.
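
As a rough illustration of what an extended message could contain, here is a minimal Python sketch. It is not the openQA implementation (which is Perl, in Jobs.pm); the function name format_clone_error and the exact wording are assumptions made for this example. Only the job IDs, the host and the /tests/<id> URL pattern are taken from this ticket.

```python
def format_clone_error(base_url: str, job_id: int, clone_id: int) -> str:
    """Build a restart error message with job URLs and a hint on how to proceed.

    Hypothetical helper for illustration only; not part of openQA.
    """
    job_url = f"{base_url}/tests/{job_id}"
    clone_url = f"{base_url}/tests/{clone_id}"
    return (
        f"Job {job_id} ({job_url}) already has clone {clone_id} ({clone_url}). "
        "To get consistent results, restart the serial parent of the complete "
        "cluster instead of a child job."
    )


# Values from the example in the Motivation section
message = format_clone_error("https://openqa.suse.de", 8966755, 8998406)
print(message)
```

Whether the webUI would render such URLs as clickable links depends on how the error text is displayed, which this sketch does not address.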

Further details

See https://suse.slack.com/archives/C02CANHLANP/p1655887247175179 for details

Workaround

To avoid this problem, retrigger the serial parent when it is shared by multiple sub-clusters, so that results stay consistent.
To fix the situation if an incomplete cluster was already created, delete the serial parent job that prevents cloning of the original failed job, then restart the serial parent of the complete cluster (instead of any child job).
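
The "restart the serial parent instead of a child job" rule can be sketched as a walk up the chain of serial (chained) dependencies. This is a minimal Python sketch under stated assumptions: the mapping chained_parent is hypothetical hand-written data rather than something fetched from openQA, each job is assumed to have at most one serial parent, and job 8966763 is assumed to be a child of 8966755 as the example in this ticket suggests.

```python
def topmost_serial_parent(job_id: int, chained_parent: dict[int, int]) -> int:
    """Follow serial-parent links upwards and return the job to restart.

    chained_parent maps a job ID to the ID of its serial parent
    (hypothetical structure, at most one parent per job).
    """
    seen = set()
    while job_id in chained_parent:
        if job_id in seen:
            raise ValueError(f"dependency cycle detected at job {job_id}")
        seen.add(job_id)
        job_id = chained_parent[job_id]
    return job_id


# Assumed dependency: failed child job -> serial parent
chained_parent = {8966763: 8966755}
restart_target = topmost_serial_parent(8966763, chained_parent)
print(restart_target)
```

A job that has no entry in the mapping is returned unchanged, so calling the function on the serial parent itself is harmless.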
