action #69694

openqa-worker systemd services running in osd which should not be enabled at all and have no tap-device configured auto_review:"backend died:.*tap.*is not connected to bridge.*br1":retry

Added by okurz 6 months ago. Updated 4 months ago.

Status: Resolved
Priority: Normal
Assignee:
Target version:
Start date: 2020-08-07
Due date: 2020-09-01
% Done: 0%
Estimated time:
Description

Observation

https://openqa.suse.de/tests/4535923 is incomplete with the proper reason "backend died: 'tap14' is not connected to bridge 'br1' at /usr/lib/os-autoinst/backend/qemu.pm line 149.", as introduced in #66376. However, the worker instance openqaworker3:15 should not be running at all.
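
To sanity-check that the auto_review pattern from the ticket title matches the reason quoted above, something like the following could be used. This is only a sketch: it assumes the pattern is applied as an unanchored extended regular expression, and the helper name matches_auto_review is made up for illustration.

```shell
#!/bin/sh
# Pattern copied from the auto_review tag in the ticket title.
pattern='backend died:.*tap.*is not connected to bridge.*br1'

# Hypothetical helper: succeeds if the given job reason matches the pattern.
# Assumption: the pattern is treated as an unanchored extended regex.
matches_auto_review() {
    printf '%s\n' "$1" | grep -qE "$pattern"
}

# Reason text as reported by the incomplete job.
reason="backend died: 'tap14' is not connected to bridge 'br1' at /usr/lib/os-autoinst/backend/qemu.pm line 149."

if matches_auto_review "$reason"; then
    echo "match"
else
    echo "no match"
fi
```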

Workaround

Restart the job until it ends up on a properly configured worker instance.


Related issues

Related to openQA Project - action #66376: MM tests fail in obscure way when tap device is not present | Resolved | 2020-05-04

Related to openQA Project - coordination #65118: [epic] multimachine test fails with symptoms "websocket refusing connection" and other unclear reasons | Resolved | 2020-04-01 | 2020-09-30

History

#1 Updated by okurz 6 months ago

  • Project changed from openQA Project to openQA Infrastructure
  • Status changed from New to In Progress
  • Assignee set to okurz
  • Target version set to Ready

#2 Updated by okurz 6 months ago

  • Related to action #66376: MM tests fail in obscure way when tap device is not present added

#3 Updated by okurz 6 months ago

Following up on #65118#note-13, running sudo salt -l error --state-output=changes -C 'G@roles:worker' cmd.run 'curl -s https://w3.suse.de/~okurz/check_num_openqa_workers | sh -' reports that openqaworker3 again has superfluous worker instances.

> for i in {1..50}; do echo -e "$i:\t$(sudo systemctl is-enabled openqa-worker@$i)"; done
1:  enabled
2:  enabled
3:  enabled
4:  enabled
5:  enabled
6:  enabled
7:  enabled
8:  enabled
9:  enabled
10: enabled
11: enabled
12: enabled
13: enabled
14: disabled
15: enabled
16: enabled
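
The listing above can be filtered automatically for instances that are enabled beyond the intended count. The following is a minimal sketch, not the content of check_num_openqa_workers (which is not shown here). The helper name superfluous_instances and the count of 13 intended instances are assumptions for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: read "<instance>:<whitespace><state>" lines on stdin
# (the format printed by the loop above) and print every instance number
# greater than the first argument whose state starts with "enabled".
# This deliberately also catches the "enabled-runtime" state.
superfluous_instances() {
    awk -v max="$1" '{
        n = $1
        sub(/:$/, "", n)          # strip the trailing colon from "15:"
        if (n + 0 > max + 0 && $2 ~ /^enabled/) print n
    }'
}

# Example with states as observed on openqaworker3.
# Assumption: 13 instances are intended on this host.
printf '13:\tenabled\n14:\tdisabled\n15:\tenabled\n16:\tenabled\n' | superfluous_instances 13
```

On a worker host the input could come directly from the is-enabled loop, piped into the helper with the intended instance count as argument.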

I did sudo systemctl disable --now openqa-worker@{15,16} and now:

> for i in {1..50}; do echo -e "$i:\t$(sudo systemctl is-enabled openqa-worker@$i)"; done
1:  enabled
2:  enabled
3:  enabled
4:  enabled
5:  enabled
6:  enabled
7:  enabled
8:  enabled
9:  enabled
10: enabled
11: enabled
12: enabled
13: enabled
14: disabled
15: enabled-runtime
16: enabled-runtime

Triggered a reboot.
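
For context: according to the systemctl documentation, enabled-runtime means the unit is enabled through symlinks below /run/systemd/system, which do not survive a reboot. That is presumably why the reboot helps here, although the ticket does not state where these symlinks came from. A sketch to list worker instances enabled that way, with the helper name runtime_enabled_workers being hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the instance numbers of all openqa-worker@N
# units that have an entry below the runtime unit directory.
# Assumption: runtime enablement shows up as files or symlinks named
# openqa-worker@N.service somewhere below /run/systemd/system.
runtime_enabled_workers() {
    dir="${1:-/run/systemd/system}"
    find "$dir" -name 'openqa-worker@*.service' \( -type l -o -type f \) 2>/dev/null |
        sed 's/.*openqa-worker@\([0-9]*\)\.service$/\1/' |
        sort -n -u
}

# Prints nothing if no worker instance is enabled at runtime.
runtime_enabled_workers
```

As an alternative to rebooting, systemctl disable --runtime should remove such runtime symlinks, but that was not tried in this ticket.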

#4 Updated by okurz 6 months ago

  • Related to coordination #65118: [epic] multimachine test fails with symptoms "websocket refusing connection" and other unclear reasons added

#5 Updated by okurz 5 months ago

  • Due date set to 2020-09-01
  • Status changed from In Progress to Feedback

The same checks no longer report any workers with this problem. I can check again after my vacation.

#6 Updated by okurz 4 months ago

  • Status changed from Feedback to Resolved

The same check as in #69694#note-3 reported no problem, so this is considered fixed.
