action #52559: [network] test fails in t01_basic to ping the other node - openQA Tests (public) - openSUSE Project Management Tool

action #52559

closed

[network] test fails in t01_basic to ping the other node

Added by okurz almost 6 years ago. Updated almost 6 years ago.

Status: Resolved
Priority: High
Assignee: okurz
Category: Bugs in existing tests
Start date: 2019-06-04

Description

Observation

openQA test in scenario opensuse-Tumbleweed-DVD-x86_64-wicked_basic_sut@64bit fails in t01_basic

Test suite description

Include basic sanity checks of wicked network framework
Maintainer: asmorodskyi@suse.de

Reproducible

Fails since (at least) Build 20190529, but probably a recent regression as stated by asmorodskyi in #opensuse-factory chat

Expected result

Last good: 20190527 (or more recent)

Further details

Always latest result in this scenario: latest

Related issues 2 (0 open — 2 closed)

Updated by okurz almost 6 years ago

Status changed from New to Workable
Assignee set to asmorodskyi

@asmorodskyi you wanted to look into this, right?

Updated by okurz almost 6 years ago

Blocks action #52499: [aarch64] Proper multi-machine test setup and wicked_basic successfully tested (was: wicked tests always in schedule state - tap worker required) added

Updated by asmorodskyi almost 6 years ago

Assignee deleted (~~asmorodskyi~~)

Updated by okurz almost 6 years ago

Status changed from Workable to In Progress
Assignee set to okurz

Fine, I will try again. On openqaworker1 I ran

echo 'zypper -n in libcap-progs && for i in /usr/bin/qemu-system-*; do setcap CAP_NET_ADMIN=ep $i ; done' | transactional-update shell && reboot

and retriggered wicked_basic, see https://openqa.opensuse.org/tests/958541# . Do we expect that to work now? What next?
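To confirm that the capability survived the reboot, something like the following could be run on the worker. This is only a sketch: it assumes `getcap` (shipped with libcap-progs) is available, and the helper names `has_net_admin` and `check_qemu_caps` are made up for illustration.

```shell
# Hypothetical helper: succeeds if the getcap output on stdin mentions cap_net_admin.
has_net_admin() {
    grep -qi 'cap_net_admin'
}

# Hypothetical helper: report the capability state of every qemu-system binary.
check_qemu_caps() {
    for i in /usr/bin/qemu-system-*; do
        # Skip the unexpanded glob when no qemu binary is installed.
        [ -e "$i" ] || continue
        if getcap "$i" | has_net_admin; then
            echo "$i: CAP_NET_ADMIN set"
        else
            echo "$i: CAP_NET_ADMIN missing"
        fi
    done
}

check_qemu_caps
```

A binary reported as missing would suggest the setcap change did not end up in the booted snapshot.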

Updated by okurz almost 6 years ago

Moved the most recent jobs into the development job group so that they do not block any Tumbleweed release:

for i in 953624 953742 958540 958541; do openqa-client --host https://openqa.opensuse.org jobs/$i put --json-data '{"group_id": 38}'; done

{ job_id => 953624 }
{ job_id => 953742 }
{ job_id => 958540 }
{ job_id => 958541 }
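Before changing job groups on a production instance it can help to preview the calls first. The sketch below only prints the same `openqa-client` commands instead of running them; the function name `move_jobs_dry_run` is hypothetical, and the host and group id are taken from the command above.

```shell
# Hypothetical helper: print (do not execute) one openqa-client call per job id.
# Usage: move_jobs_dry_run GROUP_ID JOB_ID...
move_jobs_dry_run() {
    group_id=$1
    shift
    for job in "$@"; do
        printf '%s\n' "openqa-client --host https://openqa.opensuse.org jobs/$job put --json-data '{\"group_id\": $group_id}'"
    done
}

move_jobs_dry_run 38 953624 953742 958540 958541
```

Once the printed lines look right, they can be copied into a shell and run as they are.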

https://openqa.opensuse.org/tests/958701 retriggered.

Moved both "wicked_basic_ref" and "wicked_basic_sut" from the validation job group Tumbleweed to Development Tumbleweed. Also added for aarch64 in dev group.

Updated by okurz almost 6 years ago

https://openqa.opensuse.org/tests/958701#step/before_test/80 fails in what looks like the same way. @asmorodskyi triggered some jobs with "isos post" to try things out or debug, and then hit the problem that the triggered jobs lost their relation to the parent and failed to load the HDD image even though it is referenced in the settings. He wants to continue looking into the issue later.

Updated by okurz almost 6 years ago

Priority changed from Urgent to High

We looked into this together. The problem lies in the former test relying on the qemu user network, plus incorrect DNS information.

Created https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/7710 to address this. It seemingly fixes the problem on x86_64, see https://openqa.opensuse.org/tests/962324 , but not yet on aarch64.

Updated by asmorodskyi almost 6 years ago

IMO the problem described in this ticket is already solved. For x86_64 there are no issues at all, everything is working. For aarch64 the problem is different from what this ticket states, and for aarch64 we also have another ticket (linked to this one), so there is no point in keeping two tickets that track the same issue.

Updated by okurz almost 6 years ago

Yes, agreed. I will close this ticket as soon as we have the original scenario validated in the product validation job group of Tumbleweed.

#10

Updated by okurz almost 6 years ago

Assignee changed from okurz to asmorodskyi

@asmorodskyi also, the scenario is still in the development job group, so it is not solved.

Now we have a failure in https://openqa.opensuse.org/tests/963181#step/t08_setup_second_card/175 .

In https://openqa.opensuse.org/tests/963181/file/serial_terminal.txt I can see a lot of errors, none of which seem to stop the test execution.

Could you take a look please?

#11

Updated by okurz almost 6 years ago

Blocks action #51635: [network] test fails in t08_setup_second_card added

#12

Updated by asmorodskyi almost 6 years ago

Assignee deleted (~~asmorodskyi~~)

Ticket #51635 covers t08 exclusively. I don't want to track two tickets for the same issue.

#13

Updated by okurz almost 6 years ago

Status changed from In Progress to Feedback
Assignee set to okurz

@asmorodskyi please reference other tickets by #<id> and not the full URL to make use of the title and status preview. Also, "In Progress" without an assignee does not make much sense. Whichever ticket you prefer is fine with me. I will set this one to "Feedback" then and wait for your results in the other ticket to declare the scenario as stable before we can move it to group 1.

#14

Updated by ggardet_arm almost 6 years ago

It seems to be fixed, or am I missing something?

#15

Updated by okurz almost 6 years ago

ggardet_arm wrote:

It seems to be fixed, or am I missing something?

Well, I guess you have read the comment just above your question? Yes, the original problem is fixed. However, we moved the test scenario into the "development" job group until it is stable again, and currently the scenario reproducibly fails on #51635.

So my suggested chain of work (serialized) is: fix #51635 -> prove that the whole scenario is stable (green/soft-fail) -> move from the "dev" group to the product validation group -> prove that the scenario on aarch64 is stable as well -> move to the aarch64 product validation job group.

My ETA: 4 weeks to 4 months, depending on when asmorodskyi plans to investigate the detailed problem and/or enjoys some summer break holiday ;)

To speed things up you can of course adjust the scenario to skip the test module "t08_setup_second_card" and only execute the others, e.g. add EXCLUDE_MODULES=t08_setup_second_card to the test suite along with a comment in the test suite description pointing to #51635, add another test suite that does not exclude the failing module to the dev group, and reference that in #51635.
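To illustrate what the exclusion would mean for the schedule, here is a small simulation. It is not the scheduler code of the test distribution; the function `filter_modules` is hypothetical, the module list is just the three module names mentioned in this ticket, and EXCLUDE_MODULES is assumed to be a comma-separated list of module names.

```shell
# Setting as it would be added to the test suite.
EXCLUDE_MODULES=t08_setup_second_card

# Hypothetical helper: print every module that is not listed in EXCLUDE_MODULES.
filter_modules() {
    for module in "$@"; do
        case ",$EXCLUDE_MODULES," in
            *",$module,"*) ;;                    # excluded, print nothing
            *) printf '%s\n' "$module" ;;        # still scheduled
        esac
    done
}

filter_modules before_test t01_basic t08_setup_second_card
```

With the setting above only before_test and t01_basic remain, while the second test suite in the dev group would keep t08_setup_second_card for tracking #51635.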

#16

Updated by okurz almost 6 years ago

Status changed from Feedback to Blocked

Let's make it clearer that there is currently no work within this ticket by setting it to "Blocked". Blocked by #51635.

#17

Updated by okurz almost 6 years ago

Status changed from Blocked to Resolved

Resolved as per #51635#note-9; the jobs are in the product validation job group of openSUSE Tumbleweed again.
