action #169843

closed

MultiMachine: test fails in rsync_server

Added by dimstar about 1 month ago. Updated about 1 month ago.

Status: Resolved
Priority: Immediate
Assignee:
Category: Bugs in existing tests
Target version: -
Start date: 2024-11-13
Due date:
% Done: 0%
Estimated time:
Difficulty:

Description

Observation

A good number of multimachine tests are failing - communication seems severely impacted.

According to the openQA investigation jobs, neither rerunning the previous build nor rerunning the previous test code changes the result.

This points to an infrastructure change or issue (apparently OW 27 was set up in the pool again).

openQA test in scenario opensuse-Tumbleweed-DVD-x86_64-rsync-server@64bit fails in rsync_server

Test suite description

Maintainer: zluo@suse.de
install and test rsync server

Reproducible

Fails since (at least) Build 20241112

Expected result

Last good: 20241111 (or more recent)

Further details

Always latest result in this scenario: latest

Actions #1

Updated by gpathak about 1 month ago

Related to #168916

Actions #2

Updated by gpathak about 1 month ago

  • Status changed from New to In Progress
  • Assignee set to gpathak

Actions #3

Updated by gpathak about 1 month ago · Edited

  • Removed the openqaworker27 GRE entries from all other workers using: ovs-vsctl del-port br1 gre7
  • Restarted the OVS services on all workers: systemctl stop ovsdb-server.service ovs-delete-transient-ports.service ovs-vswitchd.service os-autoinst-openvswitch.service; sleep 2; systemctl start ovsdb-server.service ovs-delete-transient-ports.service ovs-vswitchd.service os-autoinst-openvswitch.service
  • Ran wicked ifup all on all workers
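The three steps above could be wrapped in a small script so they run identically on every worker. This is only a sketch: the helper names run and reset_ovs are hypothetical, the bridge, port and service names are copied from the list above, and the script defaults to a dry run that only prints the commands. The --if-exists option is an addition so that workers where the port is already gone do not fail.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set explicitly.
# Service names are the ones listed in the comment above.
OVS_SERVICES="ovsdb-server.service ovs-delete-transient-ports.service ovs-vswitchd.service os-autoinst-openvswitch.service"

# Hypothetical helper: echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: drop one GRE port from a bridge, then restart OVS
# and bring the interfaces back up.
reset_ovs() {
    bridge=$1
    port=$2
    # --if-exists avoids an error where the port was already removed.
    run ovs-vsctl --if-exists del-port "$bridge" "$port"
    # $OVS_SERVICES is left unquoted on purpose so it splits into unit names.
    run systemctl stop $OVS_SERVICES
    run sleep 2
    run systemctl start $OVS_SERVICES
    run wicked ifup all
}
```

Calling reset_ovs br1 gre7 prints the five commands for review; running it as root with DRY_RUN=0 would execute them.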

Found that openqaworker20 is not providing internet access to guests via the tap interface. To verify this, I performed the steps below on w20, w23 and w28:

  • Downloaded qcow image from https://openqa.opensuse.org/tests/4643044/#downloads, changed root password using virt-customize -a opensuse-Tumbleweed-x86_64-20241112-gnome-wayland@64bit_virtio.qcow2 --root-password password:opensuse
  • Booted guest using qemu-system-x86_64 -m 2048 -enable-kvm -snapshot -netdev tap,id=qanet0,ifname=tap40,script=no,downscript=no -device virtio-net,netdev=qanet0,mac=c0:0c:b0:0c:c0:0c opensuse-Tumbleweed-x86_64-20241112-gnome-wayland@64bit_virtio.qcow2 -nographic
  • Set up the network as described in https://open.qa/docs/#_within_the_vm_configure_the_network_like
  • Ran ping 8.8.8.8 from within the guest: it succeeded on w23 and w28 but failed on w20
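To repeat this check on several workers, the boot command and the ping test can be parameterised. The sketch below is an assumption-laden illustration: qemu_tap_cmd and check_reachable are hypothetical helper names, the QEMU options are copied from the steps above, and the ping options (-c, -W) assume the Linux iputils ping.

```shell
#!/bin/sh
# Hypothetical helper: print the QEMU command line for a tap device and image.
# Options are the ones used in the steps above; only tap and image vary.
qemu_tap_cmd() {
    tap=$1
    image=$2
    printf 'qemu-system-x86_64 -m 2048 -enable-kvm -snapshot -netdev tap,id=qanet0,ifname=%s,script=no,downscript=no -device virtio-net,netdev=qanet0,mac=c0:0c:b0:0c:c0:0c %s -nographic\n' "$tap" "$image"
}

# Hypothetical helper, meant to run inside the guest: report whether a
# target answers ping (3 probes, 2 second timeout each).
check_reachable() {
    target=${1:-8.8.8.8}
    if ping -c 3 -W 2 "$target" >/dev/null 2>&1; then
        echo "reachable: $target"
    else
        echo "UNREACHABLE: $target"
    fi
}
```

For example, qemu_tap_cmd tap40 opensuse-Tumbleweed-x86_64-20241112-gnome-wayland@64bit_virtio.qcow2 prints the command used above, and check_reachable 8.8.8.8 inside the guest gives one line per worker that is easy to compare.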

rsync server and client tests passed: https://openqa.opensuse.org/tests/overview?result=passed&arch=&flavor=&machine=&test=rsync-server&test=rsync-client&modules=&module_re=&group_glob=&not_group_glob=&comment=#

yast2_nfs tests are failing: https://openqa.opensuse.org/tests/overview?result=failed&result=incomplete&arch=&flavor=&machine=&test=yast2_nfs_v3_server&modules=&module_re=&modules_result=passed&group_glob=&not_group_glob=&comment=#

Actions #4

Updated by slo-gin about 1 month ago

This ticket was set to Immediate priority but was not updated within the SLO period. Please consider picking up this ticket or just set the ticket to the next lower priority.

Actions #5

Updated by gpathak about 1 month ago

  • Status changed from In Progress to Feedback

Actions #6

Updated by gpathak about 1 month ago

@dimstar the mentioned rsync tests are passing now.

Actions #7

Updated by dimstar about 1 month ago

  • Status changed from Feedback to Workable

Actions #8

Updated by gpathak about 1 month ago · Edited

@dimstar
I observed something odd: the passed tests have more steps than the failed ones.
In the failed tests, the call to nmcli connection modify 'ens4' ipv4.dns '10.150.1.11'; echo sClQx-$?- is immediately followed by step bsc#1083486.
The failed tests do not execute the following commands/steps:

  • nmcli networking off
  • nmcli networking on
  • until nmcli networking connectivity check | tee /dev/stderr | grep 'full'; do sleep 10; done;
  • ip address show; echo; ip route show; echo; grep -v "^#" /etc/resolv.conf
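A quick way to confirm which of these steps a given job skipped is to search its command log for them. The sketch below assumes the commands typed by a job have been saved to a plain-text file, one per line; how that file is obtained is not covered here, and missing_steps is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: list the expected network setup commands that do not
# occur anywhere in the given log file.
missing_steps() {
    log=$1
    while IFS= read -r step; do
        # -F matches the step as a fixed string, -q suppresses output.
        grep -qF -- "$step" "$log" || printf 'missing: %s\n' "$step"
    done <<'EOF'
nmcli networking off
nmcli networking on
nmcli networking connectivity check
ip address show
EOF
}
```

Running missing_steps on the log of a passed job should print nothing, while a failed job of the kind described above would list the skipped commands.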
Actions #9

Updated by gpathak about 1 month ago

Following up on #note-8: the steps skipped in the failed tests (nmcli networking off, nmcli networking on, the connectivity check loop, and the ip address / ip route / resolv.conf dump) are the reason the guest is unable to install the yast2 nfs server package: https://openqa.opensuse.org/tests/4645253#step/yast2_nfs_server/60

Actions #10

Updated by gpathak about 1 month ago

  • Assignee changed from gpathak to nicksinger

Assigning it to @nicksinger as he recently worked on those test cases.

Actions #11

Updated by nicksinger about 1 month ago

  • Status changed from Workable to Resolved

Actions #12

Updated by nicksinger about 1 month ago

I missed some nfs4 tests, which I have now restarted as well as part of https://progress.opensuse.org/issues/169945#note-3. I should now really have covered all tests; please reopen if you find more.

Actions #13

Updated by gpathak about 1 month ago

Thanks a lot @nicksinger !!
