coordination #58166: EPIC: Continue tests after failures on !qemu

Added by xlai over 5 years ago. Updated over 4 years ago.

Status: Resolved
Priority: Normal
Assignee: mkittler
Category: Feature requests
Target version: Done
Start date: 2019-10-15
Due date:
% Done:
Estimated time:

Description

Our jobs run on IPMI workers. When many tests are chained, we need later tests to continue after earlier tests fail so that test runs stay efficient.

It was suggested that we set the fatal flag to 0 for these tests. However, in the example we tried, it did not work.

Failure job link:
http://10.67.18.220/tests/38#.

Can an expert on this help confirm whether we are using it correctly?

Job details:

Test order:
login_console -> fail_moduleA -> fail_moduleB

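fail_moduleA main code:

 # Type a marker command, then fail deliberately.
 sub run {
     type_string("echo start fail_moduleA.pm\n");
     die "die on purpose to check if test continue to next module";
 }

 # Runs after run() has died.
 sub post_fail_hook {
     #force_soft_failure("let test continue...");
     type_string("post_fail_hook DONE");
     save_screenshot;
 }

 # Mark this module as non-fatal so that later modules are expected to run.
 sub test_flags {
     return {fatal => 0};
 }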

But fail_moduleB was not started after fail_moduleA failed.

Updated by xlai over 5 years ago

Subject changed from [OpenQA tool][ipmi backend] Test can not continue when fatal flag is 0 to [OpenQA tool][ipmi backend] Test can not continue when test with fatal flag 0 fail

Updated by coolo over 5 years ago

Subject changed from [OpenQA tool][ipmi backend] Test can not continue when test with fatal flag 0 fail to EPIC: Continue tests after failures on !qemu
Category set to Feature requests
Target version set to Ready

Everything is fatal on !qemu backends (and fatal => 0 is the default for tests). I'm open to changing the behavior for !qemu, but it will bring new problems. With failures like https://openqa.suse.de/tests/3475676#step/yast2_apparmor/9 you need more code than blindly continuing.

Updated by xlai over 5 years ago

coolo wrote:

Everything is fatal on !qemu backends (and fatal => 0 is the default for tests). I'm open to changing the behavior for !qemu, but it will bring new problems. With failures like https://openqa.suse.de/tests/3475676#step/yast2_apparmor/9 you need more code than blindly continuing.

I anticipated that. IMO, it is reasonable to place that requirement on the test code when it is given the ability to turn on this "continue" feature. So it is not wise to use it for necessary preparation steps, but it is suitable for optional feature tests.

Thanks for accepting this feature. This is important for us to run chained tests efficiently.

Updated by mkittler over 5 years ago

Description updated (diff)

Updated by mkittler over 5 years ago

Maybe make fatal => 1 the default for backends other than QEMU but don't override fatal when it has been set explicitly? Then it wouldn't instantly affect all tests and is in accordance with how @xlai intuitively thought it would work.

But as @coolo says, this can generally require some cleanup code to run, and that cleanup code will be hard to write since the test system is likely in an unknown state.
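
The default proposed in this comment can be modeled with a small shell function. This is only a sketch of the idea as worded here, not the os-autoinst implementation: the helper name `effective_fatal` and its two arguments are hypothetical, and the backend is identified by a plain name string for illustration.

```shell
#!/bin/sh
# Model of the proposal above; NOT the actual os-autoinst code.
# effective_fatal is a hypothetical helper.
#   $1: the module's explicit fatal flag ("1" or "0"), or "" if the module did not set it
#   $2: backend name, e.g. "qemu" or "ipmi"
effective_fatal() {
    explicit=$1
    backend=$2
    if [ -n "$explicit" ]; then
        # An explicitly set flag is never overridden.
        echo "$explicit"
    elif [ "$backend" = "qemu" ]; then
        # QEMU keeps the existing default of fatal => 0.
        echo 0
    else
        # Other backends default to fatal => 1.
        echo 1
    fi
}

effective_fatal ""  qemu   # flag unset on qemu: prints 0
effective_fatal ""  ipmi   # flag unset on ipmi: prints 1
effective_fatal "0" ipmi   # fatal => 0 set explicitly on ipmi: prints 0
```

Under this model, existing tests on !qemu backends that never set the flag keep stopping on the first failure, while a module that explicitly returns fatal => 0 lets the next module start.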

Updated by mkittler over 5 years ago

Assignee set to mkittler
Target version changed from Ready to Current Sprint

Updated by mkittler over 5 years ago

Status changed from New to In Progress

The change needed to make the workflow from the ticket description work is actually small: https://github.com/os-autoinst/os-autoinst/pull/1270

@xlai Would that tiny adjustment be sufficient? As explained in the PR message it would not mess with the default behavior.

Updated by mkittler over 5 years ago

Status changed from In Progress to Feedback

Updated by xlai over 5 years ago

@mkittler, thanks for implementing this feature. I will try it later, because I am currently busy with alpha6 and a P1 gmc2 product bug.

#10

Updated by mkittler over 5 years ago

@xlai Ok, the PR has also already been merged. So with an updated os-autoinst the workflow should be possible now.

#11

Updated by xlai over 5 years ago

mkittler wrote:

@xlai Ok, the PR has also already been merged. So with an updated os-autoinst the workflow should be possible now.

@mkittler Sorry for the late feedback. I tried the example from the description again on the ipmi backend. However, the second test module (B) is not triggered at all, same as before.

Failure job: http://10.67.19.98/tests/42/file/autoinst-log.txt

Pkg versions:

linux-gepp:/var/lib/openqa/tests/sle-12-SP5/tests/virt_autotest # rpm -qa |grep -i openqa
openQA-client-4.6.1574313539.7b1e3a33c-2029.1.noarch
openQA-4.6.1574313539.7b1e3a33c-2029.1.noarch
openQA-local-db-4.6.1574313539.7b1e3a33c-2029.1.noarch
openQA-common-4.6.1574313539.7b1e3a33c-2029.1.noarch
openQA-worker-4.6.1574313539.7b1e3a33c-2029.1.noarch
linux-gepp:/var/lib/openqa/tests/sle-12-SP5/tests/virt_autotest # rpm -qa | grep os-autoinst
os-autoinst-4.6.1574082336.cfde39a0-245.1.x86_64

#12

Updated by mkittler over 5 years ago

I just ran git log cfde39a0 and my commit "Allow unsetting 'fatal' test flag without snapshot support" is not part of it. So your os-autoinst version is too old.

#13

Updated by xlai over 5 years ago

mkittler wrote:

I just ran git log cfde39a0 and my commit "Allow unsetting 'fatal' test flag without snapshot support" is not part of it. So your os-autoinst version is too old.

Oh, really sorry about that. I did not check the exact code content. But what I installed was the latest as of Dec 4, the day before yesterday. I thought a new version of these tools would be built once a new PR is merged. Would you mind sharing which version this patch is in?

#14

Updated by okurz over 5 years ago

Version 4.6.1574429927.5158b63b and newer have this feature. The package from the OBS repo devel:openQA as well as the package in Tumbleweed has it.
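
A quick way to compare an installed version against this minimum is sketched below. It assumes that versions have the form 4.6.<number>.<commit>, as in the rpm output in this thread, and that the third field grows with every newer build; that reading of the version scheme is an assumption, and `has_fatal_fix` is a hypothetical helper name.

```shell
#!/bin/sh
# Assumption: the third dot-separated field of the version increases with
# every newer build, so it can be compared numerically.
MIN_STAMP=1574429927   # third field of 4.6.1574429927.5158b63b named above

# has_fatal_fix VERSION (hypothetical helper):
# succeeds if VERSION is at least as new as the minimum version.
has_fatal_fix() {
    stamp=$(printf '%s\n' "$1" | cut -d. -f3)
    [ "$stamp" -ge "$MIN_STAMP" ]
}

# The two versions mentioned in this thread.
for v in 4.6.1574082336.cfde39a0 4.6.1574429927.5158b63b; do
    if has_fatal_fix "$v"; then
        echo "$v: new enough"
    else
        echo "$v: too old"
    fi
done
```

On an installed system the version string could be obtained with `rpm -q --qf '%{VERSION}\n' os-autoinst` and passed to the function.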

#15

Updated by xlai over 5 years ago

okurz wrote:

Version 4.6.1574429927.5158b63b and newer have this feature. The package from the OBS repo devel:openQA as well as the package in Tumbleweed has it.

Thank you, will update to that and reverify. Will keep you updated!

#16

Updated by xlai over 5 years ago

xlai wrote:

okurz wrote:

Version 4.6.1574429927.5158b63b and newer have this feature. The package from the OBS repo devel:openQA as well as the package in Tumbleweed has it.

@okurz, the build flag of os-autoinst is disabled for opensuse 42.3, as shown in https://build.opensuse.org/repositories/devel:openQA/os-autoinst. Is it on purpose? If not, can you help enable it? Installing tumbleweed version on 42.3 is troublesome.

#17

Updated by okurz over 5 years ago

xlai wrote:

xlai wrote:

okurz wrote:

Version 4.6.1574429927.5158b63b and newer have this feature. The package from the OBS repo devel:openQA as well as the package in Tumbleweed has it.

@okurz, the build flag of os-autoinst is disabled for opensuse 42.3, as shown in https://build.opensuse.org/repositories/devel:openQA/os-autoinst. Is it on purpose? If not, can you help enable it?

Yes, it is disabled on purpose, as openSUSE Leap 42.3 has been EOL since July 2019 and is no longer supported. It is disabled rather than removed so that the existing binaries are kept for the time being, but new versions will not be built, are untested, and may even fail to build.

Installing tumbleweed version on 42.3 is troublesome.

More like: Don't do it :) It might work for some packages with no further strict dependencies but it will not work for os-autoinst or openQA for sure.

The best-tested platform for a current os-autoinst and openQA is openSUSE Leap 15.1, and Tumbleweed should be second best. Other versions might work depending on the package state in https://build.opensuse.org/project/show/devel:openQA, but Leap 42.3 will not work.

#18