action #104241: Retrigger the original/initial job chain after parts have been retriggered - openQA Project (public) - openSUSE Project Management Tool

Retrigger the original/initial job chain after parts have been retriggered

Added by Julie_CAO almost 3 years ago. Updated almost 3 years ago.

Status: New
Priority: Low
Assignee:
Category: Feature requests
Target version: QA (public) - future
Start date: 2021-12-22
Due date:
% Done:
Estimated time:

Description

Here is a common scenario in our daily testing:

1. A test in a job chain fails (the red one in the middle of graph A).

2. We take a quick look at the failing test and it does not look like a product issue (but rather a pxe/ipmi/needle_mismatch/network_glitch/openqa or other environment issue). We rerun this test. The original graph disappears and a new dependency graph is created.

3. Another graph can be found from the tests that remained untouched. We can see that Graph A = Graph B + Graph C: the original job chain is split into B and C.

4. After the rerun the job still fails, so we look into this test again and make some automation enhancements. Finally, we'd like to rerun the entire job chain to test the product again, to check our automation, or for whatever other reason. But there is no way to do this via the web UI. The only feasible way I know so far is with the command line (/usr/share/openqa/script/client isos post ...), but it needs extra permission (admin permission?) and is unsafe (I once accidentally kicked off all the tests in OSD just by missing a space).

It would be very helpful if the openQA web UI or an easy command line could fulfill my requirement in step 4.

Files

initial_job_chain.png (100 KB), Julie_CAO, 2021-12-22 05:27
recreated_job_chain.png (47.7 KB), Julie_CAO, 2021-12-22 05:39
Truncated_job_chain.png (55.4 KB), Julie_CAO, 2021-12-22 05:45

Related issues 1 (0 open — 1 closed)

Updated by Julie_CAO almost 3 years ago

Related to action #69979: Advanced job restarting via the web UI added

Updated by Julie_CAO almost 3 years ago

For step 4: not only do I have no way to restart the entire job chain, I am also unable to retrigger any tests in Graph C, because its parent job (install_15sp4_xen_host) was already rerun in step 2.

Updated by okurz almost 3 years ago

Category changed from Feature requests to Support
Status changed from New to Feedback
Assignee set to okurz
Target version set to Ready

Hi, could you please share the URLs to the actual jobs in a reachable openQA instance for reference? I assume this is coming from openqa.suse.de . This could help to understand properly what is going on.

Julie_CAO wrote:

Given a common scenario in our daily testing, a test (the red one in the middle of graph A) in a job chain fails.

We take a quick look at the failing test and it does not look like a product issue (but rather a pxe/ipmi/needle_mismatch/network_glitch/openqa or other environment issue).

In the above example the job shows up as "incomplete" but that shouldn't matter for the rest of the observation you describe.

The only feasible way I know so far is with the command line (/usr/share/openqa/script/client isos post ...), but it needs extra permission (admin permission?)

The restarting/retriggering described further down is likely the better way, but if you do use the openQA client, a better start is openqa-cli api -X post … instead of the full path /usr/share/openqa/…

[…]

Finally, we'd like to rerun the entire job chain to test the product again […] But there is no way to do this via the web UI

You should be able to just retrigger the original parent, "install_15sp4_xen_host@64…" in your example. If you only want to retrigger failed jobs in the dependency tree then you can do that with the skip_ok_result_children=1 API parameter as described in http://open.qa/docs/#_handling_of_related_jobs_on_failure_cancellation_restart . So you could use e.g. openqa-cli api --osd -X post jobs/<job_id>/restart skip_ok_result_children=1. With https://github.com/os-autoinst/openQA/pull/4417 this should also be possible over the webUI
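As a rough illustration of the restart call above, here is a minimal Python sketch that assembles the openqa-cli invocation as an argument list, so that a missing space or a malformed job ID cannot turn into a different request. The helper name restart_command and the job ID 1234 are made up for this example; only the openqa-cli arguments themselves are taken from the command quoted above. The sketch only prints the command for review and does not run it.

```python
import shlex


def restart_command(job_id, host_flag="--osd", skip_ok_children=True):
    """Build the argv list for restarting one openQA job via openqa-cli.

    Hypothetical helper: it only assembles the arguments and never
    contacts an openQA instance.
    """
    job_id = int(job_id)  # rejects input such as "12 34" or "all"
    if job_id <= 0:
        raise ValueError("job_id must be a positive integer")
    argv = ["openqa-cli", "api", host_flag, "-X", "post",
            f"jobs/{job_id}/restart"]
    if skip_ok_children:
        # Restart only the children that did not pass
        argv.append("skip_ok_result_children=1")
    return argv


if __name__ == "__main__":
    # 1234 is a placeholder job ID, not a real job
    print(shlex.join(restart_command(1234)))
```

Passing the resulting list to something like subprocess.run without a shell would keep each argument separate, which is the point of building a list instead of a single string.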

Updated by Julie_CAO almost 3 years ago

okurz wrote:

Hi, could you please share the URLs to the actual jobs in a reachable openQA instance for reference? I assume this is coming from openqa.suse.de . This could help to understand properly what is going on.

Here is one of the jobs, but its dependency graph changed after some retriggers. http://10.67.129.51/tests/2734#dependencies

And a correction to the description: graph A != graph B + graph C. I am confused about how the graphs are created. Moreover, the graph seems to change differently depending on whether the restart is done via the web UI or via openqa-cli api. So after a few retriggers, I can no longer make sense of the graph.

Finally, we'd like to rerun the entire job chain to test the product again […] But there is no way to do this via the web UI

You should be able to just retrigger the original parent, "install_15sp4_xen_host@64…" in your example. If you only want to retrigger failed jobs in the dependency tree then you can do that with the skip_ok_result_children=1 API parameter as described in http://open.qa/docs/#_handling_of_related_jobs_on_failure_cancellation_restart . So you could use e.g. openqa-cli api --osd -X post jobs/<job_id>/restart skip_ok_result_children=1. With https://github.com/os-autoinst/openQA/pull/4417 this should also be possible over the webUI

I can handle step 2 to rerun only the failing jobs, but my problem is step 4: after some of the tests in the chain have been retriggered, the original ones (those that have not been retriggered) can no longer be restarted, because their parent job has already been cloned.

Yes, the PR to provide restart options in the web UI is helpful, thanks, Oliver. I just want more.

Updated by okurz almost 3 years ago

Subject changed from retrigger the orginal/initial job chain to Retrigger the original/initial job chain after parts have been retriggered
Category changed from Support to Feature requests
Status changed from Feedback to New
Assignee deleted (~~okurz~~)
Priority changed from Normal to Low
Target version changed from Ready to future

Julie_CAO wrote:

Finally, we'd like to rerun the entire job chain to test the product again […] But there is no way to do this via the web UI

You should be able to just retrigger the original parent, "install_15sp4_xen_host@64…" in your example. If you only want to retrigger failed jobs in the dependency tree then you can do that with the skip_ok_result_children=1 API parameter as described in http://open.qa/docs/#_handling_of_related_jobs_on_failure_cancellation_restart . So you could use e.g. openqa-cli api --osd -X post jobs/<job_id>/restart skip_ok_result_children=1. With https://github.com/os-autoinst/openQA/pull/4417 this should also be possible over the webUI

I can handle step 2 to rerun only the failing jobs, but my problem is step 4: after some of the tests in the chain have been retriggered, the original ones (those that have not been retriggered) can no longer be restarted, because their parent job has already been cloned.

So do I understand correctly that you would like to retrigger the complete chain after parts of it have been initially retriggered? In any case, I guess this would then actually be a new feature, so I am putting it to "future" for now.
