action #116257
open[virtualization][svirt] Some workers in openqaworker2 time out while copying the assets in bootloader_svirt module
Added by jlausuch over 2 years ago. Updated 8 months ago.
0%
Description
Observation
openQA test in scenario sle-12-SP5-JeOS-for-kvm-and-xen-Updates-x86_64-jeos-extratest@svirt-xen-hvm fails in bootloader_svirt.
It hits the MAX_JOB_TIMEOUT while trying to copy the image.
The affected workers are:
openqaworker2:9
openqaworker2:10
openqaworker2:16
Most jobs using these workers time out during this step. Other examples:
https://openqa.suse.de/tests/9459036
https://openqa.suse.de/tests/9459031
https://openqa.suse.de/tests/9459037
https://openqa.suse.de/tests/9459064
https://openqa.suse.de/tests/9459069
Reproducible
Fails since (at least) Build 20220905-1 (current job)
Expected result
Last good: 20220903-1 (or more recent)
Further details
Always latest result in this scenario: latest
Updated by jlausuch over 2 years ago
I have run the following on openqaworker2 to stop and mask the affected worker instances:
systemctl mask --now openqa-worker-auto-restart@{9,10,16}.service
systemctl mask --now openqa-reload-worker-auto-restart@{9,10,16}.service
systemctl mask --now openqa-reload-worker-auto-restart@{9,10,16}.path
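Once the root cause is fixed, the masking has to be reverted before the workers can pick up jobs again. A minimal sketch, assuming the same three instance numbers and unit names as in the commands above; `worker_units` is a hypothetical helper, and the loop only prints the `systemctl unmask` commands instead of running them.

```shell
# Hypothetical helper: print the unit names masked above, one per line,
# for the given worker instance numbers.
worker_units() {
    for n in "$@"; do
        echo "openqa-worker-auto-restart@${n}.service"
        echo "openqa-reload-worker-auto-restart@${n}.service"
        echo "openqa-reload-worker-auto-restart@${n}.path"
    done
}

# Dry run: show the unmask commands rather than executing them.
for unit in $(worker_units 9 10 16); do
    echo "systemctl unmask $unit"
done
```

Unmasking alone does not start the units again; they still need to be enabled or started afterwards.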
Updated by jlausuch over 2 years ago
I tried to bring the workers up again, but the job https://openqa.suse.de/tests/9459087 still fails, so this needs more investigation.
Updated by jlausuch over 2 years ago
I have rebooted the openqaw5-xen machine and enabled libvirtd with systemctl enable libvirtd. I have also restarted the 3 workers on the openqaworker2 host.
Now at least the bootloader_svirt module passes that step, but it fails in another place:
https://openqa.suse.de/tests/9459108#step/bootloader_svirt/31
https://openqa.suse.de/tests/9459106#step/bootloader_svirt/33
https://openqa.suse.de/tests/9459107#step/bootloader_svirt/31
virsh start failed at /usr/lib/os-autoinst/consoles/sshVirtsh.pm line 546.
at /usr/lib/os-autoinst/backend/console_proxy.pm line 46.
backend::console_proxy::__ANON__(undef) called at sle/tests/installation/bootloader_svirt.pm line 282
bootloader_svirt::run(bootloader_svirt=HASH(0x55b912784ae8)) called at /usr/lib/os-autoinst/basetest.pm line 328
eval {...} called at /usr/lib/os-autoinst/basetest.pm line 322
basetest::runtest(bootloader_svirt=HASH(0x55b912784ae8)) called at /usr/lib/os-autoinst/autotest.pm line 367
eval {...} called at /usr/lib/os-autoinst/autotest.pm line 367
autotest::runalltests() called at /usr/lib/os-autoinst/autotest.pm line 243
eval {...} called at /usr/lib/os-autoinst/autotest.pm line 243
autotest::run_all() called at /usr/lib/os-autoinst/autotest.pm line 294
autotest::__ANON__(Mojo::IOLoop::ReadWriteProcess=HASH(0x55b914915538)) called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 326
eval {...} called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 326
Mojo::IOLoop::ReadWriteProcess::_fork(Mojo::IOLoop::ReadWriteProcess=HASH(0x55b914915538), CODE(0x55b914a8e628)) called at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop/ReadWriteProcess.pm line 488
Mojo::IOLoop::ReadWriteProcess::start(Mojo::IOLoop::ReadWriteProcess=HASH(0x55b914915538)) called at /usr/lib/os-autoinst/autotest.pm line 296
autotest::start_process() called at /usr/bin/isotovideo line 273
Updated by jlausuch over 2 years ago
Difference in the interface definition between a passed job:
<interface type="network">
<mac address="00:16:3e:68:97:43"/>
<model type="netfront"/>
<source network="br0"/>
<virtualport type="openvswitch"/>
</interface>
and a failed job:
<interface type="bridge">
<mac address="00:16:3e:79:99:8b"/>
<virtualport type="openvswitch"/>
<source bridge="ovs-system"/>
<model type="netfront"/>
</interface>
For some reason, jobs are taking <source bridge="ovs-system"/> instead of <source network="br0"/>.
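To spot this difference quickly in other jobs, the `<source>` elements can be pulled out of a domain definition. A minimal sketch, assuming the XML arrives on stdin (for example from `virsh dumpxml`); `iface_source` is a hypothetical helper name.

```shell
# Hypothetical helper: print every <source .../> element found in the
# XML read from stdin, one per line.
iface_source() {
    grep -o '<source [^>]*>'
}

# Example with the interface definition from the failed job above.
iface_source <<'EOF'
<interface type="bridge">
<mac address="00:16:3e:79:99:8b"/>
<virtualport type="openvswitch"/>
<source bridge="ovs-system"/>
<model type="netfront"/>
</interface>
EOF
```

On a full domain definition this also prints the `<source>` elements of disks, so the interface entry has to be picked out of the result.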
I have checked openvswitch.service and it is active; br0 is there and active too.
Not sure what's going on.
Updated by jlausuch over 2 years ago
Trying to run the steps manually:
openqaw5-xen:~ # virsh define /var/lib/libvirt/images/openQA-SUT-2.xml
Domain openQA-SUT-2 defined from /var/lib/libvirt/images/openQA-SUT-2.xml
openqaw5-xen:~ # virsh start openQA-SUT-2
error: Failed to start domain openQA-SUT-2
error: internal error: libxenlight failed to create new domain 'openQA-SUT-2'
Maybe this host doesn't have VT-x enabled after reboot?
openqaw5-xen:~ # virt-host-validate
QEMU: Checking for hardware virtualization : FAIL (Only emulated CPUs are available, performance will be significantly limited)
QEMU: Checking if device /dev/vhost-net exists : PASS
QEMU: Checking if device /dev/net/tun exists : PASS
QEMU: Checking for cgroup 'cpu' controller support : PASS
QEMU: Checking for cgroup 'cpuacct' controller support : PASS
QEMU: Checking for cgroup 'cpuset' controller support : PASS
QEMU: Checking for cgroup 'memory' controller support : PASS
QEMU: Checking for cgroup 'devices' controller support : PASS
QEMU: Checking for cgroup 'blkio' controller support : PASS
QEMU: Checking for device assignment IOMMU support : WARN (Unknown if this platform has IOMMU support)
QEMU: Checking for secure guest support : WARN (Unknown if this platform has Secure Guest support)
LXC: Checking for Linux >= 2.6.26 : PASS
LXC: Checking for namespace ipc : PASS
LXC: Checking for namespace mnt : PASS
LXC: Checking for namespace pid : PASS
LXC: Checking for namespace uts : PASS
LXC: Checking for namespace net : PASS
LXC: Checking for namespace user : PASS
LXC: Checking for cgroup 'cpu' controller support : PASS
LXC: Checking for cgroup 'cpuacct' controller support : PASS
LXC: Checking for cgroup 'cpuset' controller support : PASS
LXC: Checking for cgroup 'memory' controller support : PASS
LXC: Checking for cgroup 'devices' controller support : PASS
LXC: Checking for cgroup 'freezer' controller support : PASS
LXC: Checking for cgroup 'blkio' controller support : PASS
LXC: Checking if device /sys/fs/fuse/connections exists : FAIL (Load the 'fuse' module to enable /proc/ overrides)
But if it was working before, why wouldn't it survive a simple reboot?
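The FAIL for hardware virtualization comes from the QEMU/KVM check, which may not reflect what Xen itself can do: a Xen dom0 typically does not see the CPU's VT-x/AMD-V flags even when the hypervisor uses them. A possibly more telling check is the `virt_caps` line of `xl info`. A minimal sketch, assuming `xl` is installed on the host; `xen_has_hvm` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the `xl info` output read from stdin
# lists "hvm" as a whole word in its virt_caps line.
xen_has_hvm() {
    grep '^virt_caps' | grep -qw 'hvm'
}

# Usage on the Xen host (not executed here):
#   xl info | xen_has_hvm && echo "Xen reports HVM support"
```

If `hvm` is missing from `virt_caps`, the BIOS/firmware virtualization setting would be the next thing to check.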
Updated by jlausuch over 2 years ago
ovs-vsctl show
bc0b58fc-7882-4a8c-ba39-69ffc77671b0
Manager "--help"
Bridge br0
Port vif2.0
Interface vif2.0
Port vif2.0-emu
Interface vif2.0-emu
error: "could not open network device vif2.0-emu (No such device)"
Port br0
Interface br0
type: internal
Port eth0
Interface eth0
ovs_version: "2.13.2"
openqaw5-xen:~ # virsh iface-list --all
Name State MAC Address
------------------------------------
ovs-system active
openqaw5-xen:~ # virsh iface-dumpxml ovs-system
error: An error occurred, but the cause is unknown
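The `ovs-vsctl show` output above flags one interface with an error. A small filter can list such interfaces on a host with many ports. A minimal sketch, assuming the usual layout where an `error:` line follows the `Interface` line it belongs to; `ovs_error_ifaces` is a hypothetical helper name.

```shell
# Hypothetical helper: read `ovs-vsctl show` output from stdin and print
# the name of every interface that is followed by an "error:" line.
ovs_error_ifaces() {
    awk '$1 == "Interface" { iface = $2; gsub(/"/, "", iface) }
         $1 == "error:"    { print iface }'
}

# Usage on the Xen host (not executed here):
#   ovs-vsctl show | ovs_error_ifaces
```

If a reported port turns out to be a stale leftover from before the reboot, it could be removed with `ovs-vsctl --if-exists del-port br0 vif2.0-emu`; whether that is the case here is not confirmed in this ticket.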
Updated by rfan1 over 2 years ago
- Related to action #116644: [qe-core][functional][sle15sp5]test fails in bootloader_svirt, the test is using different network bridge 'ovs-system' rather than 'br0' added
Updated by openqa_review over 2 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: jeos-extratest@svirt-xen-hvm
https://openqa.suse.de/tests/9469862#step/bootloader_svirt/1
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234
Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.
Updated by slo-gin about 2 years ago
This ticket was set to High priority but was not updated within the SLO period. Please consider picking up this ticket or just set the ticket to the next lower priority.
Updated by okurz about 2 years ago
- Subject changed from [svirt] Some workers in openqaworker2 time out while copying the assets in bootloader_svirt module to [virtualization][svirt] Some workers in openqaworker2 time out while copying the assets in bootloader_svirt module
Updated by slo-gin almost 2 years ago
This ticket was set to High priority but was not updated within the SLO period. Please consider picking up this ticket or just set the ticket to the next lower priority.
Updated by slo-gin 10 months ago
This ticket was set to High priority but was not updated within the SLO period. Please consider picking up this ticket or just set the ticket to the next lower priority.