action #162284

closed

Prevent multi-machine tests from being picked up if the os-autoinst-openvswitch service does not work size:M

Added by okurz 5 months ago. Updated 2 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Feature requests
Target version:
Start date: 2024-06-14
Due date:
% Done: 0%
Estimated time:
Description

Observation

During the unintended upgrade of worker31 and others to Leap 15.6, the network only came up after 20 minutes(!) for as yet unknown reasons, see #157975. Additional problems arose because os-autoinst-openvswitch eventually timed out, but openQA workers then happily picked up jobs and destroyed them with

backend died: Open vSwitch command 'set_vlan' with arguments 'tap37 121' failed: org.freedesktop.DBus.Error.ServiceUnknown: The name org.opensuse.os_autoinst.switch was not provided by any .service files

Fabian Vogt suggests: "Instead of relying on the systemd .service to be enabled and running you could add a dbus .service file with a SystemdService= key to make use of dbus autolaunch"

Acceptance criteria

  • AC1: openQA workers do not pick up multi-machine tests if the os-autoinst openvswitch service is not available over DBUS
  • AC2: openQA workers clearly communicate this problematic situation

Suggestions

  • Make the worker check for the availability of the required DBUS service (or a comparable indicator) if it has a multi-machine worker class, and make the worker show up as "broken", a state we already use in other cases
  • As an alternative, look into the suggestion by fvogt, but that would still rely on systemd
  • As an alternative to declaring the worker "broken", remove the "tap" class dynamically or change it to "tap_broken_$reason"
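
The first and third suggestions could be sketched roughly as follows. This is only an illustration in Python, not the openQA worker implementation: the function names, the "dbus" reason string and the injectable `run` parameter are hypothetical. The bus name is taken from the error message above, and the check uses the standard org.freedesktop.DBus.NameHasOwner method via dbus-send. Note that this only tells whether the name currently has an owner, so a service that is merely activatable would still be reported as unavailable.

```python
import subprocess
from typing import Callable, List, Optional

# Bus name as it appears in the error message of this ticket.
BUS_NAME = "org.opensuse.os_autoinst.switch"


def _run(cmd: List[str]) -> str:
    """Run a command and return its stdout; raises on failure or timeout."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True, timeout=10)
    return result.stdout


def name_has_owner(bus_name: str, run: Callable[[List[str]], str] = _run) -> bool:
    """Ask the system bus whether bus_name currently has an owner."""
    cmd = [
        "dbus-send", "--system", "--print-reply",
        "--dest=org.freedesktop.DBus", "/org/freedesktop/DBus",
        "org.freedesktop.DBus.NameHasOwner", f"string:{bus_name}",
    ]
    try:
        output = run(cmd)
    except (OSError, subprocess.SubprocessError):
        # dbus-send missing, bus unreachable or call timed out: treat as unavailable.
        return False
    return "boolean true" in output


def broken_reason(worker_classes: List[str], service_available: bool) -> Optional[str]:
    """Return a reason to report the worker as "broken", or None if it is fine."""
    if "tap" in worker_classes and not service_available:
        return f"DBUS service {BUS_NAME} is not available"
    return None


def effective_worker_classes(worker_classes: List[str], service_available: bool,
                             reason: str = "dbus") -> List[str]:
    """Alternative: rename "tap" to "tap_broken_<reason>" instead of going "broken"."""
    if service_available:
        return list(worker_classes)
    return [f"tap_broken_{reason}" if c == "tap" else c for c in worker_classes]
```

A worker with the classes "qemu_x86_64,tap" would then either report the string from `broken_reason(...)` or advertise `["qemu_x86_64", "tap_broken_dbus"]`, so that the scheduler no longer assigns it multi-machine jobs. Workers without the "tap" class are unaffected in both variants.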

Related issues 3 (1 open, 2 closed)

Related to openQA Tests - action #165132: test fails in openqa_worker with 'No such timeout policy "ovs_test_tp"' and other problems regarding move to scripts/ (Resolved, okurz, 2024-08-12)

Related to openQA Infrastructure - action #165860: wicked_basic_ref fails on ppc64le: Open vSwitch command 'set_vlan' with arguments 'tap3 8' failed (Resolved, mkittler, 2024-08-27)

Copied from openQA Infrastructure - action #157975: Upgrade osd workers to openSUSE Leap 15.6 (Blocked, okurz)