action #126647
closed
[qe-core] test fails in bootloader_start - we should use br0 not ovs-system
Description
Observation
While fixing an issue on sle-15-SP5-Online-x86_64-guided_ext4@svirt-xen-pv, the test fails in bootloader_start with the error message shown further down.
Test suite description
Guided Partitioning installation with ext4 filesystem.
Error message:
# Test died: {
"cmd" => "backend_proxy_console_call",
"json_cmd_token" => "lZptbtVI",
"wantarray" => undef,
"console" => "svirt",
"function" => "define_and_start",
"args" => []
}
virsh start failed: 1
virsh domain XML:
<domain type='xen'>
<name>openQA-SUT-8</name>
<uuid>6a5cd9cb-74d7-4780-b4b9-93734a361337</uuid>
<description>openQA WebUI: openqa.suse.de (8): 10796948-sle-15-SP5-Online-x86_64-Buildhjluo_os-autoinst-distri-opensuse_stop_time_xen-guided_ext4@hjluo_os-autoinst-distri-opensuse_stop_time_xen@svirt-xen-pv</description>
<memory unit='KiB'>1048576</memory>
<currentMemory unit='KiB'>1048576</currentMemory>
<vcpu placement='static'>1</vcpu>
<os>
<type arch='x86_64'>linux</type>
<kernel>/usr/lib/grub2/x86_64-xen/grub.xen</kernel>
</os>
<features>
<acpi/>
<apic/>
<pae/>
</features>
<clock offset='utc'/>
<on_poweroff>destroy</on_poweroff>
<on_reboot>destroy</on_reboot>
<on_crash>destroy</on_crash>
<devices>
<disk type='file' device='cdrom'>
<driver name='qemu' type='raw'/>
<source file='/var/lib/libvirt/images/SLE-15-SP5-Online-x86_64-Build81.1-Media1.iso'/>
<target dev='sda' bus='scsi'/>
<readonly/>
<address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
<disk type='file' device='disk'>
<driver name='qemu' type='qcow2' cache='unsafe'/>
<source file='/var/lib/libvirt/images/openQA-SUT-8b.img'/>
<target dev='xvdb' bus='xen'/>
</disk>
<controller type='xenbus' index='0'/>
<controller type='scsi' index='0'/>
<interface type='bridge'>
<mac address='00:16:3e:7d:4b:b7'/>
<source bridge='ovs-system'/>
<virtualport type='openvswitch'> <======
Reproducible
Fails since (at least) Build hjluo/os-autoinst-distri-opensuse#stop_time_xen (current job)
Expected result
Last good: (unknown) (or more recent)
Further details
Always latest result in this scenario: latest
We had a similar ticket before:
https://progress.opensuse.org/issues/116644
Updated by szarate over 1 year ago
- Related to action #125783: [jeos] Test fails in kdump_and_crash on SLE 12sp5 and 15sp4 XEN after worker migration from SLES to Leap 15.4 added
Updated by rfan1 over 1 year ago
- Status changed from New to Feedback
- Assignee set to rfan1
I just destroyed the iface as a temporary workaround:
# virsh iface-destroy ovs-system
Interface ovs-system destroyed
Updated by okurz over 1 year ago
- Subject changed from test fails in bootloader_start - we should use br0 not ovs-system to [qe-core] test fails in bootloader_start - we should use br0 not ovs-system
Updated by rfan1 over 1 year ago
I added a systemd service that stops the ovs-system iface if it is active:
# cat /etc/systemd/system/stop-iface-ovs-system.service
[Unit]
Description=stop iface 'ovs-system'
Requires=libvirtd.service
After=libvirtd.service
[Service]
ExecStart=/bin/bash /usr/sbin/stop-iface-ovs-system.sh
[Install]
WantedBy=multi-user.target
# cat /usr/sbin/stop-iface-ovs-system.sh
#!/bin/bash
set -x
virsh iface-list --all | grep -w active | awk '{ print $1 }' | grep ovs-system
if [ $? -eq 0 ]; then
virsh iface-destroy ovs-system && echo "stopped ovs-system" > /tmp/destroy_ovs-system
else
echo "ovs-system is not active on this host" > /tmp/destroy_ovs-system
fi
Updated by rfan1 over 1 year ago
- Related to action #127097: [alert] Failed systemd services alert added
Updated by rfan1 over 1 year ago
A coding style issue in the script caused a non-zero return code even when the iface was destroyed.
I have fixed it:
# cat /usr/sbin/stop-iface-ovs-system.sh
#!/bin/bash
# poo#126647
set -x
virsh iface-list --all | grep -w active | awk '{ print $1 }' | grep ovs-system
if [ $? -eq 0 ]; then
virsh iface-destroy ovs-system
echo "stopped ovs-system" > /tmp/destroy_ovs-system
else
echo "ovs-system is not active on this host" > /tmp/destroy_ovs-system
fi
# /usr/sbin/stop-iface-ovs-system.sh
+ virsh iface-list --all
+ grep -w active
+ grep ovs-system
+ awk '{ print $1 }'
+ '[' 1 -eq 0 ']'
+ echo 'ovs-system is not active on this host'
# echo $?
0
# virsh iface-start ovs-system
Interface ovs-system started
# /usr/sbin/stop-iface-ovs-system.sh
+ virsh iface-list --all
+ grep -w active
+ grep ovs-system
+ awk '{ print $1 }'
ovs-system
+ '[' 0 -eq 0 ']'
+ virsh iface-destroy ovs-system
Interface ovs-system destroyed
+ echo 'stopped ovs-system'
# echo $?
0
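A side note on the check itself: the `grep -w active | awk | grep ovs-system` pipeline matches substrings, so an interface whose name merely contains `ovs-system` would also pass. A minimal sketch of a stricter check follows, assuming the usual `Name State MAC Address` column layout of `virsh iface-list --all`. The helper name `iface_is_active` is hypothetical and is not part of the script deployed on the worker.

```shell
#!/bin/bash
# Hypothetical helper (not the deployed script): succeeds only if the interface
# named in $1 appears with state "active" in `virsh iface-list --all` style
# output read from stdin. Column 1 is assumed to be the name, column 2 the state.
iface_is_active() {
    awk -v name="$1" '$1 == name && $2 == "active" { found = 1 } END { exit !found }'
}

# Usage on a real host (assumes virsh is installed and libvirtd is running):
#   if virsh iface-list --all | iface_is_active ovs-system; then
#       virsh iface-destroy ovs-system
#   fi
```

Comparing whole fields in awk avoids partial name matches and folds the three-stage pipeline into one command whose exit status can be tested directly.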
Updated by rfan1 about 1 year ago
- Status changed from Resolved to Feedback
It seems the issue is occurring again.
Updated by nicksinger about 1 year ago
- Status changed from Feedback to New
As discussed in Slack (https://suse.slack.com/archives/C02CANHLANP/p1687167780103479?thread_ts=1687166022.818889&cid=C02CANHLANP), this systemd unit is just a workaround to reduce priority. Please ensure that the correct interface is used in the first place.
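One way to verify which bridge a domain is actually attached to is to inspect its XML. The sketch below is only an illustration, assuming the single-quoted attribute style seen in the domain XML in the description. The function names `domain_bridges` and `check_domain_bridge` are hypothetical, and `br0` as the expected bridge is taken from the ticket subject.

```shell
#!/bin/bash
# Hypothetical helper: print the bridge name of every <source bridge='...'/>
# element found in libvirt domain XML read from stdin.
domain_bridges() {
    sed -n "s/.*<source bridge='\([^']*\)'.*/\1/p"
}

# Hypothetical check: return non-zero if any interface in the XML on stdin is
# attached to a bridge other than the expected one (default: br0).
check_domain_bridge() {
    local expected="${1:-br0}" bridges bridge status=0
    bridges=$(domain_bridges)
    for bridge in $bridges; do
        if [ "$bridge" != "$expected" ]; then
            echo "unexpected bridge: $bridge (wanted $expected)" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage on a real host (assumes virsh is available and the domain exists):
#   virsh dumpxml openQA-SUT-8 | check_domain_bridge br0
```

For the domain XML quoted in the description, such a check would report `ovs-system` as an unexpected bridge.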
Updated by rfan1 about 1 year ago
- Status changed from In Progress to Feedback
The PR is merged.
I will wait for the next Xen server reboot to check that it works.
Then I will delete the systemd service I added before.
Updated by rfan1 about 1 year ago
- Status changed from Feedback to Resolved
Removed the old service:
# rm /etc/systemd/system/stop-iface-ovs-system.service
# rm /usr/sbin/stop-iface-ovs-system.sh
# systemctl daemon-reload