action #177138
[tools] test fails in s390x zkvm bootloader_start tests; defining a new VM fails due to "Cannot set interface flags on 'macvtapxx': Address already in use"

Added by rfan1 18 days ago. Updated 14 days ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
Bugs in existing tests
Start date:
2025-02-13
Due date:
% Done:

0%

Estimated time:
Difficulty:

Description

The issue has been seen since ~1 hour ago: when openQA uses the virsh command to define and start a VM, it reports the error messages below:

[2025-02-13T10:18:38.242390Z] [debug] [pid:98929] [run_ssh_cmd(virsh  start openQA-SUT-2 2> >(tee /tmp/os-autoinst-openQA-SUT-2-stderr.log >&2))] stderr:
  error: Failed to start domain 'openQA-SUT-2'
  error: Cannot set interface flags on 'macvtap148': Address already in use

I did some investigation on s390zl12 and found some stale macvtap NICs even though no VMs were running.

rfan@s390zl12:~> ip a|grep macvtap
28471: macvtap57@vlan2114: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 500
28472: macvtap58@vlan2114: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 500
28473: macvtap59@vlan2114: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 500
28474: macvtap60@vlan2114: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 500
28475: macvtap61@vlan2114: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 500
28476: macvtap62@vlan2114: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 500
28477: macvtap63@vlan2114: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 500
28478: macvtap64@vlan2114: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 500
28479: macvtap65@vlan2114: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 500
28480: macvtap66@vlan2114: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 500
28481: macvtap67@vlan2114: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 500
28482: macvtap68@vlan2114: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 500
28483: macvtap69@vlan2114: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 500
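Stale macvtap devices like these are ordinary netdevs and can be removed with `ip link delete` once it is confirmed that no VM uses them (normally libvirt creates and removes them itself). A minimal sketch for pulling the bare interface names out of `ip a` output; the helper function name and the commented cleanup loop are illustrative, not part of the openQA or libvirt tooling:

```shell
#!/bin/sh
# Illustrative helper (not part of the worker tooling): extract bare macvtap
# interface names from "ip a"-style output, e.g.
#   "28471: macvtap57@vlan2114: <BROADCAST,...>"  ->  "macvtap57"
list_stale_macvtaps() {
    # Split on spaces, colons, and '@'; the name is then the second field.
    awk -F'[ :@]+' '/: macvtap[0-9]+@/ { print $2 }'
}

# Possible cleanup on the affected worker, kept commented out here because it
# is destructive; run only after verifying no running VM owns the devices:
#   ip a | list_stale_macvtaps | while read -r dev; do
#       sudo ip link delete "$dev"
#   done
```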

Observation

openQA test in scenario sle-micro-6.2-Base-qcow-s390x-ssh_mm_whitelist@s390x-kvm fails in
bootloader_start

Test suite description

SLE Micro image boot with ignition disk and default tests. Default tests are transactional-update, rebootmgr, health_check, cockpit service and some other checks specific to SLE Micro.

Reproducible

Fails since (at least) Build rfan0214

Expected result

Last good: (unknown) (or more recent)

Further details

Always latest result in this scenario: latest

Actions #1

Updated by okurz 18 days ago

  • Tags set to infra, reactive work, s390x
  • Target version set to Ready
Actions #2

Updated by mkittler 18 days ago

Also see https://suse.slack.com/archives/C02CANHLANP/p1739441650120979 and another thread linked from there for additional context.

Actions #3

Updated by mkittler 18 days ago

  • Assignee set to mkittler

I restarted one of the affected tests; it had failed 8 hours ago. The restarted test is now past the critical point, see https://openqa.suse.de/tests/16759998#details.

I have also seen other tests successfully running on previously affected worker slots. So perhaps this issue has already resolved itself.

Actions #4

Updated by okurz 17 days ago

  • Priority changed from Normal to High
Actions #5

Updated by mkittler 17 days ago · Edited

  • Status changed from New to In Progress

The restarted test worked. So I restarted other jobs that failed due to this and I'm currently monitoring them. (Unfortunately they are not picked up right now because of the max. running job limit.)

Actions #6

Updated by mkittler 17 days ago

  • Status changed from In Progress to Feedback

The jobs I restarted were picked up and all have now passed the critical part. So I don't know what else to do here; we could just consider this resolved as well.

Actions #7

Updated by rfan1 14 days ago

Thanks all for your help, the issue is gone now.

Actions #8

Updated by mkittler 14 days ago

  • Status changed from Feedback to Resolved