action #178906
closed coordination #112862: [saga][epic] Future ideas for easy multi-machine handling: MM-tests as first-class citizens
coordination #111929: [epic] Stable multi-machine tests covering multiple physical workers
Support with broken MultiMachine setup size:S
Description
On a fresh Leap 15.6 install I am trying to set up a multi-machine cluster using this script, as mentioned in the documentation (the machine setup itself is simple: https://en.opensuse.org/openSUSE:OpenQA:Setup):
instances=2 ethernet=eth1 bash -x $(which os-autoinst-setup-multi-machine)
(machine has two network interfaces)
Unfortunately after this the tests fail with:
qemu-system-x86_64: -netdev tap,id=qanet0,ifname=tap21,script=no,downscript=no: could not configure /dev/net/tun (tap21): Operation not permitted
Also the firewall seems borked:
Error: RUNNING_BUT_FAILED: Changing permanent configuration is not allowed while firewalld is in FAILED state. The permanent configuration must be fixed and then firewalld restarted. Try `firewall-offline-cmd --check-config`.
Calling the command suggested above:
Configuration error: INVALID_INTERFACE: Zone 'public': interface 'eth1' already bound to zone 'trusted'
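The check-config message says that eth1 is claimed by two zones in the permanent configuration. A minimal sketch for listing which zone files mention an interface, assuming the permanent zones are stored as XML files under /etc/firewalld/zones and that a binding appears as an <interface name="..."/> element. The function name zones_claiming_interface is made up for illustration, and a zone assigned through an ifcfg file may not show up here.

```shell
# Print the name of every zone file in a directory that binds the given interface.
# Usage: zones_claiming_interface IFACE [ZONEDIR]
# ZONEDIR defaults to /etc/firewalld/zones (assumption: default firewalld layout).
zones_claiming_interface() (
    iface=$1
    zonedir=${2:-/etc/firewalld/zones}
    for f in "$zonedir"/*.xml; do
        # Skip the unexpanded glob when the directory has no zone files.
        [ -e "$f" ] || continue
        # Match the full attribute value so eth1 does not also match eth10.
        if grep -q "<interface name=\"$iface\"" "$f"; then
            basename "$f" .xml
        fi
    done
)
```

Running zones_claiming_interface eth1 as root and getting more than one zone back would be consistent with the INVALID_INTERFACE error above.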
Suggestions
- Redo the above steps to reproduce the issue or work with szarate to further investigate
- Crosscheck with how we test os-autoinst-setup-multi-machine, e.g. in openQA-in-openQA tests https://openqa.opensuse.org/group_overview/24
Updated by szarate 17 days ago
- Description updated (diff)
A failing job: http://quake2.qe.nue2.suse.org/tests/5. Moreover, the worker instances that I restarted now show as unavailable: http://quake2.qe.nue2.suse.org/admin/workers/1
Note that I can reinstall if needed
Updated by gpathak 14 days ago
- Related to action #159414: Ensure that os-autoinst-setup-multi-machine reliably sets firewall zones not interfering with /etc/sysconfig/network/ifcfg-* size:S added
Updated by gpathak 14 days ago
szarate wrote:
On a fresh leap 15.6 install I am trying to setup a multi-machine cluster using this script as mentioned in the documentation: (machine setup is simple https://en.opensuse.org/openSUSE:OpenQA:Setup)
instances=2 ethernet=eth1 bash -x $(which os-autoinst-setup-multi-machine)
(machine has two network interfaces) unfortunately after this the tests fail with
qemu-system-x86_64: -netdev tap,id=qanet0,ifname=tap21,script=no,downscript=no: could not configure /dev/net/tun (tap21): Operation not permitted
I found this exact error easily reproducible when NetworkManager is in use.
I didn't get the firewall error, though. Also, it seems to me that the openQA worker and the multi-machine setup were installed from the official repositories.
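When QEMU reports "Operation not permitted" while attaching to an existing tap device, one thing worth checking is who owns that device. A small sketch, assuming the kernel exposes the owner UID of a tap device in /sys/class/net/<dev>/owner; the helper name tap_owner_uid and the "no-such-tap" marker are invented for this example.

```shell
# Print the owner UID recorded for a tap device, or "no-such-tap" if the
# device (or its owner attribute) is not present.
# Usage: tap_owner_uid DEVICE [SYSROOT]
# SYSROOT defaults to /sys/class/net; it is a parameter only to allow testing.
tap_owner_uid() (
    dev=$1
    sysroot=${2:-/sys/class/net}
    owner_file="$sysroot/$dev/owner"
    if [ ! -r "$owner_file" ]; then
        echo "no-such-tap"
        exit 1
    fi
    cat "$owner_file"
)
```

The result of tap_owner_uid tap21 could then be compared with id -u _openqa-worker, assuming the worker runs as the _openqa-worker user on this installation.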
Updated by gpathak 7 days ago · Edited
- File clipboard-202503241747-rom28.png clipboard-202503241747-rom28.png added
- File clipboard-202503241748-h3xgr.png clipboard-202503241748-h3xgr.png added
Unfortunately, I cannot reproduce this on a local setup that I just created on a fresh Leap 15.6 installation, following the steps from https://en.opensuse.org/openSUSE:OpenQA:Setup.
The web UI and the workers are running on the same machine in my case.
rsync-server:
rsync-client:
Updated by openqa_review 6 days ago
- Due date set to 2025-04-08
Setting due date based on mean cycle time of SUSE QE Tools
Updated by gpathak 6 days ago · Edited
I tried this on quake2.qe.nue2.suse.org, and the execution was successful: http://10.168.195.106/tests/13#dependencies
I didn't perform a fresh Leap 15.6 installation on quake2; instead I removed the packages and reinstalled them. Below are the commands I executed (after some trial and error) to remove, reinstall and configure openQA and the multi-machine setup:
zypper rm openQA* os-autoinst* firewalld* openvswitch* apache2*
rm -rvf /etc/openqa/
rm -rvf /var/lib/openqa/
rm -rf /etc/firewalld
rm -rf /etc/openvswitch/
find / -name "*openqa*" | xargs rm -rvf
find / -name "*apache2*" | xargs rm -rvf
find / -name "*openqa*" | xargs rm -rvf
reboot
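The find | xargs rm pipelines above match any file or directory with the given string in its name anywhere under /, and plain xargs splits on whitespace. A more cautious sketch, which is not what was run here: list the matches first and only delete after reviewing them. The function name preview_matches is hypothetical.

```shell
# List files and directories whose name matches a pattern, without deleting
# anything. Matched directories are pruned, so their contents are not listed
# separately. -xdev keeps find on one filesystem.
# Usage: preview_matches ROOT PATTERN
preview_matches() (
    root=$1
    pattern=$2
    find "$root" -xdev -name "$pattern" -prune -print | sort
)
```

For example, preview_matches / '*openqa*' prints the candidates; the reviewed list can then be removed explicitly, path by path.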
After the machine booted, I logged in via ssh and followed https://en.opensuse.org/openSUSE:OpenQA:Setup:
zypper ref
zypper in openQA-bootstrap firewalld firewalld-bash-completion firewalld-lang firewalld-test
systemctl enable --now firewalld.service
skip_suse_specifics=1 skip_suse_tests=1 /usr/share/openqa/script/openqa-bootstrap
firewall-cmd --zone=public --add-service=http --permanent
firewall-cmd --add-port=5991/tcp --permanent
firewall-cmd --add-port=5992/tcp --permanent
I don't know why the above command firewall-cmd --zone=public --add-service=http --permanent didn't work; I had to explicitly run:
firewall-cmd --permanent --add-port=80/tcp --zone=public
firewall-cmd --permanent --add-port=80/udp --zone=public
firewall-cmd --reload
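One possible explanation, not confirmed in this ticket: --permanent only changes the stored configuration, and the running firewall does not pick it up until firewall-cmd --reload. A sketch of the add, reload, verify sequence, with a dry-run wrapper so the commands can be inspected first. The helper names run and open_http_public and the DRY_RUN variable are invented for this example.

```shell
# Execute a command, or just print it when DRY_RUN is set to a non-empty value.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Add the http service to the public zone permanently, reload so the running
# firewall picks up the change, then list the active services in that zone.
open_http_public() {
    run firewall-cmd --permanent --zone=public --add-service=http
    run firewall-cmd --reload
    run firewall-cmd --zone=public --list-services
}
```

With DRY_RUN=1 set, open_http_public only prints the three commands; without it, the last command should list http if the service was applied.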
Then I added the tap worker class in workers.ini and ran:
systemctl enable --now openqa-worker-plain@{1..2}.service
systemctl restart openqa-worker-plain@{1..2}.service
instances=2 ethernet=eth0 bash -x $(which os-autoinst-setup-multi-machine)
systemctl restart os-autoinst-openvswitch.service
wicked ifup all
openqa-clone-job --skip-chained-deps --show-progress https://openqa.opensuse.org/tests/4942922
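Before cloning a multi-machine job it can help to confirm that the tap worker class really ended up in the worker configuration. A rough sketch, assuming the classes are set on a WORKER_CLASS = ... line as a comma-separated list and that the file lives at /etc/openqa/workers.ini; the function name has_tap_class is hypothetical, and the check ignores which section the line is in.

```shell
# Succeed if any WORKER_CLASS line in the ini file lists "tap" as a class.
# Usage: has_tap_class [INI_FILE]
has_tap_class() (
    ini=${1:-/etc/openqa/workers.ini}
    # Strip the key, split the value on commas, trim spaces, look for "tap".
    sed -n 's/^[[:space:]]*WORKER_CLASS[[:space:]]*=[[:space:]]*//p' "$ini" |
        tr ',' '\n' |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
        grep -qx 'tap'
)
```

A call such as has_tap_class || echo "tap class missing" gives a quick warning before the workers are restarted.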
The multi-machine test didn't pass on the first attempt.
I had to run systemctl restart os-autoinst-openvswitch.service openvswitch.service and then wicked ifup all to get rid of this error (http://10.168.195.106/tests/10#line-68):
Open vSwitch command 'set_vlan' with arguments 'tap1 1' failed: org.freedesktop.DBus.Error.ServiceUnknown: The name org.opensuse.os_autoinst.switch was not provided by any .service files
Then I had to install ffmpeg-4 to fix this error (http://10.168.195.106/tests/9#line-46):
Can't exec "ffmpeg": No such file or directory at /usr/lib/os-autoinst/backend/baseclass.pm line 348, <$fh> line 16.
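The ServiceUnknown error means that nothing had claimed the D-Bus name org.opensuse.os_autoinst.switch when the job ran. A small sketch for checking this before starting a test, assuming the output of busctl --system list --no-legend has the bus name in its first column; the function name bus_name_listed is made up for illustration.

```shell
# Read a bus name listing on stdin and succeed if the given name appears in
# the first column.
# Usage: busctl --system list --no-legend | bus_name_listed NAME
bus_name_listed() (
    name=$1
    awk -v n="$name" '$1 == n { found = 1 } END { exit !found }'
)
```

If busctl --system list --no-legend | bus_name_listed org.opensuse.os_autoinst.switch fails, restarting os-autoinst-openvswitch.service as described above is the step that helped in this case.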
But I didn't get any of these errors:
qemu-system-x86_64: -netdev tap,id=qanet0,ifname=tap21,script=no,downscript=no: could not configure /dev/net/tun (tap21): Operation not permitted
Error: RUNNING_BUT_FAILED: Changing permanent configuration is not allowed while firewalld is in FAILED state. The permanent configuration must be fixed and then firewalld restarted. Try firewall-offline-cmd --check-config.
Configuration error: INVALID_INTERFACE: Zone 'public': interface 'eth1' already bound to zone 'trusted'
Updated by gpathak 6 days ago
gpathak wrote in #note-18:
Please let me know how I can perform a fresh Leap 15.6 installation on quake2, so that I can then perform the steps mentioned in the wiki on quake2 from scratch.
Did a fresh Leap 15.6 installation on quake2 (with wicked), will be performing these setup steps: https://en.opensuse.org/openSUSE:OpenQA:Setup
Updated by szarate 6 days ago
gpathak wrote in #note-19:
Did a fresh Leap 15.6 installation on quake2 (with wicked), will be performing these setup steps: https://en.opensuse.org/openSUSE:OpenQA:Setup
Cool, thanks @gpathak lmk how it goes
Updated by gpathak 6 days ago
szarate wrote in #note-20:
Cool, thanks @gpathak lmk how it goes
Indeed, I got this error:
Error: RUNNING_BUT_FAILED: Changing permanent configuration is not allowed while firewalld is in FAILED state. The permanent configuration must be fixed and then firewalld restarted. Try `firewall-offline-cmd --check-config`.
Updated by gpathak 6 days ago
Created a MR: https://github.com/os-autoinst/os-autoinst/pull/2685
Updated by gpathak 6 days ago
Tests passed: http://quake2.qe.nue2.suse.org/tests/5#dependencies
Updated by gpathak 5 days ago · Edited
Verified with the recent changes made in the MR to address the review comments.
MM tests no longer complain about "could not configure /dev/net/tun (tap21): Operation not permitted", and os-autoinst-setup-multi-machine also takes care of putting the interface into the correct firewalld zone.
Test execution result: http://quake2.qe.nue2.suse.org/tests/2