action #104241

open

Retrigger the original/initial job chain after parts have been retriggered

Added by Julie_CAO almost 3 years ago. Updated almost 3 years ago.

Status: New
Priority: Low
Assignee: -
Category: Feature requests
Target version:
Start date: 2021-12-22
Due date:
% Done: 0%
Estimated time:

Description

In a common scenario in our daily testing, a test in a job chain fails (the red one in the middle of Graph A).
Graph A

  1. We take a quick look at the failing test and it does not look like a product issue (but rather a pxe/ipmi/needle_mismatch/network_glitch/openqa or other environment issue).

  2. We rerun this test. The original graph disappears and a new dependency graph is created.
    Graph B
    You can find another graph via the tests that remained untouched.
    Graph C

We can see that Graph A = Graph B + Graph C: the original job chain is split into B and C.

  3. After the rerun the job still fails, so we look into the test again and make some automation enhancements.

  4. Finally, we'd like to rerun the entire job chain to test the product again, to check our automation, or for whatever other reason. But there is no way to do this via the web UI. The only feasible way I know of so far is the command line (/usr/share/openqa/script/client isos post ...), but it needs extra permission (admin permission?) and is unsafe (I once accidentally kicked off all the tests in OSD just by missing a space).

It would be very helpful if the openQA web UI or an easy command line could fulfill my requirement in step 4.


Files

initial_job_chain.png (100 KB) Julie_CAO, 2021-12-22 05:27
recreated_job_chain.png (47.7 KB) Julie_CAO, 2021-12-22 05:39
Truncated_job_chain.png (55.4 KB) Julie_CAO, 2021-12-22 05:45

Related issues 1 (0 open, 1 closed)

Related to openQA Project (public) - action #69979: Advanced job restarting via the web UI (Resolved, okurz, 2020-08-13)

Actions #1

Updated by Julie_CAO almost 3 years ago

  • Related to action #69979: Advanced job restarting via the web UI added
Actions #2

Updated by Julie_CAO almost 3 years ago

For step 4: not only do I have no way to restart the entire job chain, but I am also unable to retrigger any tests in Graph C, because its parent job (install_15sp4_xen_host) has already been rerun in step 2.

Actions #3

Updated by okurz almost 3 years ago

  • Category changed from Feature requests to Support
  • Status changed from New to Feedback
  • Assignee set to okurz
  • Target version set to Ready

Hi, could you please share the URLs to the actual jobs in a reachable openQA instance for reference? I assume this is coming from openqa.suse.de. This could help us properly understand what is going on.

Julie_CAO wrote:

In a common scenario in our daily testing, a test in a job chain fails (the red one in the middle of Graph A).
Graph A

  1. We take a quick look at the failing test and it does not look like a product issue (but rather a pxe/ipmi/needle_mismatch/network_glitch/openqa or other environment issue).

In the above example the job shows up as "incomplete" but that shouldn't matter for the rest of the observation you describe.

The only feasible way I know of so far is the command line (/usr/share/openqa/script/client isos post ...), but it needs extra permission (admin permission?)

The restarting/retriggering described further down is likely the better way, but if you do use the openQA client, a better start is openqa-cli api -X post … instead of the full path /usr/share/openqa/…

[…]

  4. Finally, we'd like to rerun the entire job chain to test the product again […] But there is no way to do this via the web UI

You should be able to just retrigger the original parent, "install_15sp4_xen_host@64…" in your example. If you only want to retrigger failed jobs in the dependency tree then you can do that with the skip_ok_result_children=1 API parameter as described in http://open.qa/docs/#_handling_of_related_jobs_on_failure_cancellation_restart . So you could use e.g. openqa-cli api --osd -X post jobs/<job_id>/restart skip_ok_result_children=1. With https://github.com/os-autoinst/openQA/pull/4417 this should also be possible over the webUI
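As a minimal sketch of the restart call above, the following wrapper builds the same openqa-cli invocation, rejects anything that is not a plain numeric job ID, and by default only prints the command instead of running it. The function name restart_failed_children and the DRY_RUN variable are hypothetical names chosen for this illustration; only the openqa-cli arguments are taken from the command quoted above.

```shell
#!/bin/sh
# Sketch only: restart_failed_children and DRY_RUN are illustrative names,
# not part of openQA. The openqa-cli arguments mirror the command above.

restart_failed_children() {
    job_id=$1
    # Accept only a non-empty string of digits, so that a stray space or
    # typo cannot turn into a different request.
    case $job_id in
        ''|*[!0-9]*)
            echo "error: job id must be numeric, got '$job_id'" >&2
            return 1
            ;;
    esac
    # Reuse the positional parameters to hold the command word by word.
    set -- openqa-cli api --osd -X post "jobs/$job_id/restart" skip_ok_result_children=1
    if [ "${DRY_RUN:-1}" = 1 ]; then
        # Default: print the command for review without contacting the server.
        echo "$*"
    else
        "$@"
    fi
}
```

Calling restart_failed_children 12345 (a made-up job ID) prints the command for review; setting DRY_RUN=0 in the environment would actually send the request, which requires valid API credentials for the instance.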

Actions #4

Updated by Julie_CAO almost 3 years ago

okurz wrote:

Hi, could you please share the URLs to the actual jobs in a reachable openQA instance for reference? I assume this is coming from openqa.suse.de. This could help us properly understand what is going on.

Here is one of the jobs, but its dependency graph changed after some retriggers: http://10.67.129.51/tests/2734#dependencies

And a correction to the description: Graph A != Graph B + Graph C. I am confused about how the graphs are created. Moreover, the graph seems to change differently depending on whether the restart is done via the web UI or via openqa-cli api. So after some retriggers, the graph has become completely incomprehensible to me.

  4. Finally, we'd like to rerun the entire job chain to test the product again […] But there is no way to do this via the web UI

You should be able to just retrigger the original parent, "install_15sp4_xen_host@64…" in your example. If you only want to retrigger failed jobs in the dependency tree then you can do that with the skip_ok_result_children=1 API parameter as described in http://open.qa/docs/#_handling_of_related_jobs_on_failure_cancellation_restart . So you could use e.g. openqa-cli api --osd -X post jobs/<job_id>/restart skip_ok_result_children=1. With https://github.com/os-autoinst/openQA/pull/4417 this should also be possible over the webUI

I can handle step 2 to rerun only failing jobs, but my problem is step 4: after some of the tests in the chain have been retriggered, the original ones (those that have not been retriggered) can no longer be restarted, because their parent job has already been cloned.

Yes, the PR to provide restart options in the web UI is helpful, thanks, Oliver. I just want more.

Actions #5

Updated by okurz almost 3 years ago

  • Subject changed from retrigger the orginal/initial job chain to Retrigger the original/initial job chain after parts have been retriggered
  • Category changed from Support to Feature requests
  • Status changed from Feedback to New
  • Assignee deleted (okurz)
  • Priority changed from Normal to Low
  • Target version changed from Ready to future

Julie_CAO wrote:

  4. Finally, we'd like to rerun the entire job chain to test the product again […] But there is no way to do this via the web UI

You should be able to just retrigger the original parent, "install_15sp4_xen_host@64…" in your example. If you only want to retrigger failed jobs in the dependency tree then you can do that with the skip_ok_result_children=1 API parameter as described in http://open.qa/docs/#_handling_of_related_jobs_on_failure_cancellation_restart . So you could use e.g. openqa-cli api --osd -X post jobs/<job_id>/restart skip_ok_result_children=1. With https://github.com/os-autoinst/openQA/pull/4417 this should also be possible over the webUI

I can handle step 2 to rerun only failing jobs, but my problem is step 4: after some of the tests in the chain have been retriggered, the original ones (those that have not been retriggered) can no longer be restarted, because their parent job has already been cloned.

So do I understand correctly that you would like to retrigger the complete chain after parts of it have already been retriggered? In any case, I guess this would actually be a new feature, so I am putting it to "future" for now.
