action #169843
closed
MultiMachine: test fails in rsync_server
0%
Description
Observation
A good number of multi-machine tests are failing - communication seems severely impacted.
According to the openQA investigation jobs, neither rerunning the previous build nor using the previous test code changes the outcome.
This implies an infrastructure change or issue (apparently OW 27 was set up in the pool again).
openQA test in scenario opensuse-Tumbleweed-DVD-x86_64-rsync-server@64bit fails in rsync_server
Test suite description
Maintainer: zluo@suse.de
install and test rsync server
Reproducible
Fails since (at least) Build 20241112
Expected result
Last good: 20241111 (or more recent)
Further details
Always latest result in this scenario: latest
Updated by gpathak about 1 month ago
- Status changed from New to In Progress
- Assignee set to gpathak
Updated by gpathak about 1 month ago · Edited
- Removed openqaworker27 gre entries from all other workers using
ovs-vsctl del-port br1 gre7
- Restarted ovs services on all workers
systemctl stop ovsdb-server.service ovs-delete-transient-ports.service ovs-vswitchd.service os-autoinst-openvswitch.service; sleep 2; systemctl start ovsdb-server.service ovs-delete-transient-ports.service ovs-vswitchd.service os-autoinst-openvswitch.service
- Performed wicked ifup all on all workers
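The cleanup steps above can be sketched as a small dry-run helper. This is only an illustration: the WORKERS host names, the DRY_RUN switch and the function names are hypothetical, and the port name gre7 is taken from this comment and may differ per worker. By default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Dry-run sketch of the cleanup above. WORKERS and DRY_RUN are assumptions
# for illustration; the ovs-vsctl, systemctl and wicked commands are the
# ones quoted in this comment.
WORKERS="${WORKERS:-worker-a worker-b}"   # hypothetical host names
DRY_RUN="${DRY_RUN:-1}"                   # 1 = only print the commands

OVS_UNITS="ovsdb-server.service ovs-delete-transient-ports.service ovs-vswitchd.service os-autoinst-openvswitch.service"

# Print the command in dry-run mode, otherwise run it on the host via ssh.
run() {
    host=$1
    shift
    if [ "$DRY_RUN" = "1" ]; then
        echo "[$host] $*"
    else
        ssh "$host" "$*"
    fi
}

# Remove the stale gre port, restart the ovs services, bring interfaces up.
cleanup_worker() {
    host=$1
    run "$host" "ovs-vsctl del-port br1 gre7"
    run "$host" "systemctl stop $OVS_UNITS; sleep 2; systemctl start $OVS_UNITS"
    run "$host" "wicked ifup all"
}

cleanup_all() {
    for host in $WORKERS; do
        cleanup_worker "$host"
    done
}
```

Calling cleanup_all with DRY_RUN=1 prints three commands per worker for review; setting DRY_RUN=0 would run them over ssh, which assumes root ssh access to each worker.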
Found out that openqaworker20 isn't providing internet access to guests via the tap interface. To verify this, I performed the steps below on w20, w23 and w28:
- Downloaded qcow image from https://openqa.opensuse.org/tests/4643044/#downloads, changed root password using
virt-customize -a opensuse-Tumbleweed-x86_64-20241112-gnome-wayland@64bit_virtio.qcow2 --root-password password:opensuse
- Booted guest using
qemu-system-x86_64 -m 2048 -enable-kvm -snapshot -netdev tap,id=qanet0,ifname=tap40,script=no,downscript=no -device virtio-net,netdev=qanet0,mac=c0:0c:b0:0c:c0:0c opensuse-Tumbleweed-x86_64-20241112-gnome-wayland@64bit_virtio.qcow2 -nographic
- Setup network, https://open.qa/docs/#_within_the_vm_configure_the_network_like
- Running ping 8.8.8.8 from within the guest succeeded on w23 and w28 but failed on w20
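To repeat the guest boot on several workers with different tap devices, the QEMU invocation above could be wrapped in a helper like the following. This is a minimal sketch: the function name and its parameters are hypothetical, while the QEMU options are the ones quoted in this comment. It only prints the command line so it can be reviewed before running.

```shell
#!/bin/sh
# Sketch: build the QEMU command line used above for a given tap device and
# image. build_qemu_cmd and its arguments are hypothetical names; the QEMU
# options are copied from this comment.
build_qemu_cmd() {
    tap=$1                          # e.g. tap40
    image=$2                        # path to the qcow2 image
    mac=${3:-c0:0c:b0:0c:c0:0c}     # MAC address used in this comment
    printf '%s ' qemu-system-x86_64 -m 2048 -enable-kvm -snapshot \
        -netdev "tap,id=qanet0,ifname=${tap},script=no,downscript=no" \
        -device "virtio-net,netdev=qanet0,mac=${mac}" \
        "$image" -nographic
    printf '\n'
}
```

For example, build_qemu_cmd tap40 opensuse-Tumbleweed-x86_64-20241112-gnome-wayland@64bit_virtio.qcow2 prints the same command as above, and only the tap name needs to change per worker.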
rsync server and client test passed: https://openqa.opensuse.org/tests/overview?result=passed&arch=&flavor=&machine=&test=rsync-server&test=rsync-client&modules=&module_re=&group_glob=&not_group_glob=&comment=#
yast2_nfs tests are failing: https://openqa.opensuse.org/tests/overview?result=failed&result=incomplete&arch=&flavor=&machine=&test=yast2_nfs_v3_server&modules=&module_re=&modules_result=passed&group_glob=&not_group_glob=&comment=#
- xrdp has open issue: #138743
Updated by slo-gin about 1 month ago
This ticket was set to Immediate priority but was not updated within the SLO period. Please consider picking up this ticket or just set the ticket to the next lower priority.
Updated by gpathak about 1 month ago
- Status changed from In Progress to Feedback
Updated by gpathak about 1 month ago
@dimstar the mentioned rsync tests are passing now.
Updated by dimstar about 1 month ago
- Status changed from Feedback to Workable
There still seem to be a bunch of multi-machine tests failing - even on snapshot 1113.
I tagged the relevant ones with this ticket ID, which allows searching for them and tracking them:
https://openqa.opensuse.org/tests/overview?result=failed&result=incomplete&result=timeout_exceeded&arch=&flavor=&machine=&test=&modules=&module_re=&group_glob=&not_group_glob=&comment=poo%23169843&distri=microos&distri=opensuse&version=Tumbleweed&build=20241113&groupid=1#
Updated by gpathak about 1 month ago · Edited
@dimstar
I observed something odd: the passed tests have more steps than the failed ones.
After calling nmcli connection modify 'ens4' ipv4.dns '10.150.1.11'; echo sClQx-$?-
the failed tests immediately proceed to step bsc#1083486.
The failed tests do not execute the following commands/steps:
nmcli networking off
nmcli networking on
until nmcli networking connectivity check | tee /dev/stderr | grep 'full'; do sleep 10; done;
ip address show; echo; ip route show; echo; grep -v "^#" /etc/resolv.conf
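For reference, the connectivity wait in the skipped steps loops until nmcli reports full connectivity. A bounded variant of that loop might look like the sketch below. The function name, the CHECK_CMD variable, the attempt count and the delay are assumptions for illustration and are not part of the test code.

```shell
#!/bin/sh
# Sketch: bounded variant of the "until ... grep 'full'" loop above.
# CHECK_CMD, the default attempt count and the delay are hypothetical.
CHECK_CMD="${CHECK_CMD:-nmcli networking connectivity check}"

wait_for_full_connectivity() {
    max_attempts=${1:-30}
    delay=${2:-10}
    attempt=1
    while :; do
        # Same condition as the original loop: succeed once "full" is reported.
        if $CHECK_CMD 2>/dev/null | grep -q 'full'; then
            return 0
        fi
        # Give up after the last attempt instead of looping forever.
        if [ "$attempt" -ge "$max_attempts" ]; then
            break
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    echo "connectivity did not reach 'full' after $max_attempts attempts" >&2
    return 1
}
```

With the defaults, wait_for_full_connectivity gives up after 30 checks spaced 10 seconds apart and returns a non-zero status, so a guest without working networking fails at this step rather than at a later package installation.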
Updated by gpathak about 1 month ago
gpathak wrote in #note-8:
@dimstar
I observed something odd: the passed tests have more steps than the failed ones.
After calling nmcli connection modify 'ens4' ipv4.dns '10.150.1.11'; echo sClQx-$?-
the failed tests immediately proceed to step bsc#1083486.
The failed tests do not execute the following commands/steps:
nmcli networking off
nmcli networking on
until nmcli networking connectivity check | tee /dev/stderr | grep 'full'; do sleep 10; done;
ip address show; echo; ip route show; echo; grep -v "^#" /etc/resolv.conf
This is the reason the guest isn't able to install the yast2 nfs server package: https://openqa.opensuse.org/tests/4645253#step/yast2_nfs_server/60
Updated by gpathak about 1 month ago
- Assignee changed from gpathak to nicksinger
Assigning it to @nicksinger as he recently worked on those test cases.
Updated by nicksinger about 1 month ago
- Status changed from Workable to Resolved
I have reverted https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/20638, which explains the missing steps @gpathak described. I restarted the mentioned jobs but found some more, which I restarted as well:
- https://openqa.opensuse.org/tests/4646275#live
- https://openqa.opensuse.org/tests/4646273#live
- https://openqa.opensuse.org/tests/4646271#live
- https://openqa.opensuse.org/tests/4646269#live
I will take this module into consideration when making another change to that code (I touched it because of https://progress.opensuse.org/issues/169531).
Updated by nicksinger about 1 month ago
I missed some nfs4 tests which I restarted as part of https://progress.opensuse.org/issues/169945#note-3 as well - I should now really have all tests, please reopen if you find some more.