action #66376

closed

MM tests fail in obscure way when tap device is not present

Added by okurz over 4 years ago. Updated over 4 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Feature requests
Target version:
Start date: 2020-05-04
Due date:
% Done: 0%
Estimated time:

Description

Observation

MM (multi-machine) tests fail to access other machines in the network if the tap device is not present. A diag message is reported, but the test does not fail at that point.

For example, https://openqa.suse.de/tests/4189385#step/setup/42 shows "Failed to connect to 10.0.2.2 port 20243: No route to host" and fails. The actual cause is hidden in autoinst-log.txt:

[2020-05-04T08:30:16.586 CEST] [debug] Failed to run dbus command 'set_vlan' with arguments 'tap23 13' : 'tap23' is not connected to bridge 'br1'

However, the test does not fail on this message. The worker instance is openqaworker-arm-3:24, which should not have been running at all: https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls#L570 shows that only 20 worker instances are configured. The service "openqa-worker@24" is "enabled-runtime" rather than "enabled", so it was maybe pulled in manually by "openqa-worker.target".
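The mismatch described above (instance 24 running while only 20 are configured) could be detected with a check along these lines. This is only a sketch in Python: the function name `unexpected_instances` and the idea of passing in a plain list of unit names are assumptions for illustration, not part of openQA or the salt states.

```python
import re

# Matches unit names such as "openqa-worker@24" or "openqa-worker@24.service".
UNIT_RE = re.compile(r"^openqa-worker@(\d+)(?:\.service)?$")


def unexpected_instances(unit_names, numofworkers):
    """Return the sorted instance numbers that exceed the configured count.

    unit_names   -- iterable of systemd unit names (hypothetical input, e.g.
                    collected from the worker host by some other means)
    numofworkers -- number of worker instances configured for the host
    """
    found = set()
    for name in unit_names:
        match = UNIT_RE.match(name.strip())
        if match and int(match.group(1)) > numofworkers:
            found.add(int(match.group(1)))
    return sorted(found)


if __name__ == "__main__":
    units = ["openqa-worker@20", "openqa-worker@24.service"]
    print(unexpected_instances(units, 20))
```

With the values from this ticket, instance 24 would be reported, since it lies above the 20 configured instances.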

Acceptance criteria

  • AC1: multi-machine tests relying on a tap device should fail hard if the tap device is not present or cannot be configured
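One way to picture AC1 is a check that turns the 'set_vlan' debug message into a hard error instead of letting the test continue. The following Python sketch only illustrates the idea by scanning log text for the message format quoted in the observation; `TapSetupError` and `check_tap_setup` are hypothetical names, and this is not how the backend itself is implemented.

```python
import re

# Pattern derived from the debug line quoted in the observation above.
SET_VLAN_FAILURE = re.compile(
    r"Failed to run dbus command 'set_vlan' with arguments "
    r"'(?P<tap>\S+) (?P<vlan>\d+)' : (?P<reason>.+)"
)


class TapSetupError(RuntimeError):
    """Raised when the tap device could not be configured (hypothetical)."""


def check_tap_setup(log_text):
    """Raise TapSetupError on the first failed 'set_vlan' call in the log."""
    for line in log_text.splitlines():
        match = SET_VLAN_FAILURE.search(line)
        if match:
            raise TapSetupError(
                "tap device {tap} (vlan {vlan}) not usable: {reason}".format(
                    **match.groupdict()
                )
            )


if __name__ == "__main__":
    log = (
        "[2020-05-04T08:30:16.586 CEST] [debug] Failed to run dbus command "
        "'set_vlan' with arguments 'tap23 13' : 'tap23' is not connected "
        "to bridge 'br1'"
    )
    try:
        check_tap_setup(log)
    except TapSetupError as err:
        print("failing hard:", err)
```

Raising an exception at the point where the tap configuration fails would make the job stop with a message naming the tap device, rather than with a later "No route to host" error.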

Suggestions


Related issues 2 (0 open, 2 closed)

Related to openQA Infrastructure (public) - action #63874: ensure openqa worker instances are disabled and stopped when "numofworkers" is reduced in salt pillars, e.g. causing non-obvious multi-machine failures | Resolved | mkittler | 2020-02-26

Related to openQA Infrastructure (public) - action #69694: openqa-worker systemd services running in osd which should not be enabled at all and have no tap-device configured auto_review:"backend died:.*tap.*is not connected to bridge.*br1":retry | Resolved | okurz | 2020-08-07 | 2020-09-01
