action #70618: Automatically avoid restarting the directly chained parent if possible to save time - openQA Project (public) - openSUSE Project Management Tool

action #70618

open

Automatically avoid restarting the directly chained parent if possible to save time

Added by mkittler almost 5 years ago. Updated over 4 years ago.

Status: New
Priority: Low
Assignee:
Category: Feature requests
Target version: QA (public) - future
Start date: 2020-08-27
Due date:
% Done:
Estimated time:

Description

Motivation

Considering #69979#note-6 and the subsequent comments, people want to avoid restarting the directly chained parent as much as possible. Since the parent is usually a long-running job, this makes sense.

Acceptance criteria

AC1: Directly chained parents are not restarted if the resulting worker instance would be the same and there was no job from any other cluster running on the worker instance since the original parent ran
AC2: A directly chained parent is still restarted if the worker ran a job from a different cluster in the meantime

Suggestions

This ticket is far from workable. I'm just creating it to save and share an implementation idea I've just had.

1. Actually restart the directly chained parent. If the parent has already been restarted, restart just the clone of the parent. (Yes, so far this does not sound like an improvement.)
2. When assigning the directly chained job cluster to a worker, prefer the previously used worker. We already track the cloning history and which job has been executed on which worker, so that part shouldn't be hard.
3. When sending the job to the worker:
   1. Send the original job IDs from the old directly chained cluster to the worker as well.
   2. Send a list of job IDs we would actually like to skip to the worker. That list would contain the IDs of the directly chained parents.
4. The worker checks whether it ran no jobs other than the jobs from 3.1. If it ran other jobs, it will just execute all jobs as usual. If it did not run other jobs, it will skip the jobs from 3.2 and effectively not run the restarted parent jobs again.
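To make the idea more concrete, here is a rough sketch of the worker preference and of the worker-side check from the steps above. It is written in Python purely for illustration and is not openQA code; `WorkerHistory`, `pick_worker` and `jobs_to_execute` are hypothetical names. It assumes the worker keeps an ordered list of the job IDs it has executed, and it reads "ran no other jobs" as "ran no other jobs since the original cluster started on this worker", in line with AC1.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerHistory:
    """Hypothetical record of the job IDs a worker instance ran, oldest first."""
    executed_job_ids: list = field(default_factory=list)


def pick_worker(free_worker_ids, previous_worker_id):
    """Prefer the worker that ran the original cluster, else take any free one."""
    if previous_worker_id in free_worker_ids:
        return previous_worker_id
    return free_worker_ids[0] if free_worker_ids else None


def jobs_to_execute(history, new_job_ids, original_cluster_ids, skippable_ids):
    """Return the job IDs the worker should actually run.

    original_cluster_ids: IDs of the old directly chained cluster (list 3.1).
    skippable_ids: IDs of the restarted parents we would like to skip (list 3.2).
    """
    original = set(original_cluster_ids)
    ran = history.executed_job_ids

    # Position of the first job of the original cluster on this worker, if any.
    start = next((i for i, job_id in enumerate(ran) if job_id in original), None)
    if start is None:
        # The original cluster never ran here, so nothing can be reused.
        return list(new_job_ids)

    # Any foreign job since then may have changed the worker's state.
    if any(job_id not in original for job_id in ran[start:]):
        return list(new_job_ids)

    # Worker state is still the one left by the original cluster: skip parents.
    skip = set(skippable_ids)
    return [job_id for job_id in new_job_ids if job_id not in skip]


if __name__ == "__main__":
    history = WorkerHistory(executed_job_ids=[5, 10, 11])
    print(pick_worker([3, 7], previous_worker_id=7))
    print(jobs_to_execute(history, [20, 21], [10, 11], [20]))
```

Returning the full job list in every doubtful case mirrors the "just in case" behaviour described below: the restarted parent is only skipped when the worker can prove that nothing else ran in between.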

This way we would not change a lot in openQA and I guess we would still achieve what the users are after. We would restart the parent "just in case" we really need to re-run it and otherwise just skip the restarted job. It would even work when a worker is working for different web UIs. What do you think? Did I forget something?

Workaround

As an alternative to START_DIRECTLY_AFTER_TEST, one can define a specific "machine" with a specific worker class that is only fulfilled by a single, unique worker instance. This can help to optimize test runtime.

Related issues 3 (0 open — 3 closed)

Updated by mkittler almost 5 years ago

Description updated (diff)

Updated by mkittler almost 5 years ago

Related to action #70720: Unable to restart a child from START_DIRECTLY_AFTER_TEST chain if another child has been restarted already added

Updated by mkittler almost 5 years ago

Adding #70720 as a related ticket, as it is basically step 1 of this ticket's suggestions.

Updated by ph03nix almost 5 years ago

Hi Marius,

this sounds like a very nice addition to openQA. We want to have this :-)

Especially point 4 makes sense and seems to be a very good compromise between guaranteeing that a child job has the prerequisite steps done (e.g. bigger installations) on its worker and avoiding unnecessary re-runs of those installations when restarting a child.
Preferring the previous worker is very nice and shows that the concept has been thought through.

Updated by okurz over 4 years ago

Related to action #69979: Advanced job restarting via the web UI added

Updated by okurz over 4 years ago

Subject changed from Avoid restarting the directly chained parent if possible to Automatically avoid restarting the directly chained parent if possible to save time
Description updated (diff)
Priority changed from Normal to Low

I updated the ticket with acceptance criteria and also added a workaround that should help in the meantime. I consider this low priority and keep it in the "future" target version, i.e. the SUSE QA tools team does not plan to implement this anytime soon (not within the next months or years). We are always happy to receive pull requests from anyone :)

Updated by okurz over 4 years ago

Related to action #68956: Restart the parent and child jobs of a test in a START_DIRECTLY_AFTER_TEST test chain added
