action #162284
closed
Prevent multi-machine tests from being picked up if the os-autoinst-openvswitch service does not work size:M
Start date:
2024-06-14
Due date:
% Done:
0%
Estimated time:
Tags:
Description
Observation
During the unintended upgrade of worker31 and others to Leap 15.6, the network only came up after 20 minutes(!) for as yet unknown reasons, see #157975. Additional problems followed because os-autoinst-openvswitch eventually timed out, yet openQA workers still happily picked up jobs and destroyed them with:
backend died: Open vSwitch command 'set_vlan' with arguments 'tap37 121' failed: org.freedesktop.DBus.Error.ServiceUnknown: The name org.opensuse.os_autoinst.switch was not provided by any .service files
Fabian Vogt suggests: "Instead of relying on the systemd .service to be enabled and running you could add a dbus .service file with a SystemdService= key to make use of dbus autolaunch"
Acceptance criteria
- AC1: openQA workers do not pick up multi-machine tests if the os-autoinst-openvswitch service is not available over D-Bus
- AC2: openQA workers clearly communicate this problematic situation
Suggestions
- If the worker has a multi-machine worker class, make it check whether the required D-Bus service is available, and otherwise show up as "broken", a state we already use in other cases
- As an alternative, look into the suggestion by fvogt, but that would still rely on systemd
- As an alternative to declaring the worker "broken", remove the "tap" class dynamically or change it to "tap_broken_$reason"
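The first and third suggestions could be sketched roughly as follows. This is only an illustration in Python, not the actual openQA worker code: the function names, the status strings, the default reason and the example class `qemu_x86_64` are assumptions made for the sketch. The bus name is taken from the error message above, and the availability check uses the standard `NameHasOwner` method of the bus daemon via `dbus-send`.

```python
import shutil
import subprocess

# Bus name taken from the error message in the observation.
SWITCH_BUS_NAME = "org.opensuse.os_autoinst.switch"
# Multi-machine worker class named in the suggestions.
MM_CLASS = "tap"


def switch_service_available(bus_name=SWITCH_BUS_NAME, timeout=5.0):
    """Ask the system bus daemon whether bus_name currently has an owner.

    Returns False if dbus-send is missing, fails, or times out.
    """
    if shutil.which("dbus-send") is None:
        return False
    cmd = [
        "dbus-send", "--system", "--print-reply",
        "--dest=org.freedesktop.DBus", "/org/freedesktop/DBus",
        "org.freedesktop.DBus.NameHasOwner", f"string:{bus_name}",
    ]
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "boolean true" in result.stdout


def effective_worker_state(worker_classes, service_available):
    """Hypothetical helper: return (status, message) for a worker.

    Only workers carrying the multi-machine class are affected (AC1);
    the message is what the worker would report (AC2).
    """
    if MM_CLASS not in worker_classes or service_available:
        return "ok", ""
    return "broken", f"{SWITCH_BUS_NAME} is not available over D-Bus"


def relabel_tap_class(worker_classes, reason):
    """Hypothetical helper for the alternative: rename "tap" instead of
    marking the whole worker as broken."""
    return [
        f"tap_broken_{reason}" if cls == MM_CLASS else cls
        for cls in worker_classes
    ]


if __name__ == "__main__":
    classes = ["qemu_x86_64", MM_CLASS]  # example class list, assumed
    available = switch_service_available()
    print(effective_worker_state(classes, available))
    if not available:
        print(relabel_tap_class(classes, "no_dbus_service"))
```

Keeping the decision logic separate from the D-Bus query makes it possible to exercise the "broken" and relabeling paths without a running bus. Note that `NameHasOwner` does not trigger service activation, so it reports the state as it is at the time of the check.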