action #88217

[qe-core] test fails in bootloader_svirt - libxenlight failed to create new domain: leftover qemu process

Added by SLindoMansilla about 1 year ago. Updated 11 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Bugs in existing tests
Target version: -
Start date: 2021-01-26
Due date:
% Done: 0%
Estimated time:
Difficulty:

Description

Observation

error: Failed to start domain openQA-SUT-1
error: internal error: libxenlight failed to create new domain 'openQA-SUT-1'

Reproducible

  • In all xen machines
  • Fails since Build 132.4
  • Current occurrence for scenario sle-15-SP3-Online-x86_64-Build132.4-memtest@svirt-xen-hvm (bootloader_svirt)
  • Last good: 130.3
  • latest

Related issues

Related to openQA Tests - action #54863: [functional][u] test fails in bootloader_svirt - Missing domains in libvirt but still runnning in XEN. (Resolved, 2019-07-30)

Related to openQA Infrastructure - action #88299: [virtualization] Worker openqaw5-xen-1.qa.suse.de is not reachable (xen-hvm/xen-pv failing) (Resolved, 2021-01-28)

Related to openQA Tests - action #97532: [qe-core][sporadic] s390x jobs are failing to boot auto_review:"error: Cannot set interface flags on 'macvtap.*': Address already in use":retry (Resolved)

History

#1 Updated by SLindoMansilla about 1 year ago

  • Subject changed from [qe-core] test fails in bootloader_svirt to [qe-core] test fails in bootloader_svirt - libxenlight failed to create new domain

#2 Updated by szarate about 1 year ago

  • Related to action #54863: [functional][u] test fails in bootloader_svirt - Missing domains in libvirt but still runnning in XEN. added

#3 Updated by szarate about 1 year ago

  • Status changed from New to In Progress
  • Assignee set to szarate

Smells like: poo#54863

#4 Updated by szarate about 1 year ago

  • Subject changed from [qe-core] test fails in bootloader_svirt - libxenlight failed to create new domain to [qe-core] test fails in bootloader_svirt - libxenlight failed to create new domain: leftover qemu process
  • Status changed from In Progress to Resolved

I modified the script added before for a similar problem.

What happened here is:

Possibly due to a bug in libvirt, destroying a domain can leave leftovers in xen (as previously mentioned). Sometimes the leftovers end up in a worse state, such as a qemu process that keeps running after the domain is gone. pgrep -f qemu.*openQA-SUT- comes in particularly handy here.

I remember something similar happening in the past, so for the time being I updated the cleanup script to also kill the qemu process (https://github.com/foursixnine/stunning-octo-chainsaw/commit/be43227d76bfd55bfd99e2311e61d8faa8c8ed36)
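For illustration only, here is a minimal Python sketch of that kind of cleanup. It is not the script from the linked commit. It assumes a Linux /proc filesystem, reuses the qemu.*openQA-SUT- pattern from the pgrep call above, and the function names (list_processes, find_leftovers, kill_leftovers) are hypothetical. It defaults to a dry run so nothing is killed by accident.

```python
import os
import re
import signal

# Same pattern as the pgrep -f invocation mentioned in this comment.
LEFTOVER_PATTERN = re.compile(r"qemu.*openQA-SUT-")


def list_processes(proc_root="/proc"):
    """Yield (pid, cmdline) for every process readable under proc_root."""
    if not os.path.isdir(proc_root):
        return
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as handle:
                raw = handle.read()
        except OSError:
            continue  # process already exited or is not readable
        # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
        yield int(entry), raw.replace(b"\0", b" ").decode(errors="replace").strip()


def find_leftovers(processes, pattern=LEFTOVER_PATTERN):
    """Return the PIDs whose full command line matches, like pgrep -f."""
    return [pid for pid, cmdline in processes if pattern.search(cmdline)]


def kill_leftovers(pids, dry_run=True):
    """Send SIGTERM to each PID, or only report it when dry_run is set."""
    for pid in pids:
        if dry_run:
            print(f"would kill leftover qemu process {pid}")
        else:
            os.kill(pid, signal.SIGTERM)


if __name__ == "__main__":
    kill_leftovers(find_leftovers(list_processes()))
```

Note that this matches any qemu process for an openQA-SUT domain, including ones that belong to a running job, so a real cleanup would first need to check which domains are still expected to be alive.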

For now, https://openqa.suse.de/tests/05347291#live is running on the virsh instance that was previously failing.

Also, the cron job has been changed to run only once a day, but I'll be getting emails if failures occur (later to be moved to one of the monitoring mailing lists?)

#5 Updated by szarate 12 months ago

  • Related to action #88299: [virtualization] Worker openqaw5-xen-1.qa.suse.de is not reachable (xen-hvm/xen-pv failing) added

#6 Updated by okurz 11 months ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: extra_tests_gnome@svirt-xen-pv
https://openqa.suse.de/tests/5478993

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released"
  3. The label in the openQA scenario is removed

#7 Updated by okurz 11 months ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: extra_tests_gnome@svirt-xen-pv
https://openqa.suse.de/tests/5585765

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released"
  3. The label in the openQA scenario is removed

#8 Updated by okurz 11 months ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: extra_tests_gnome@svirt-xen-pv
https://openqa.suse.de/tests/5585765

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released"
  3. The label in the openQA scenario is removed

#9 Updated by szarate 5 months ago

  • Related to action #97532: [qe-core][sporadic] s390x jobs are failing to boot auto_review:"error: Cannot set interface flags on 'macvtap.*': Address already in use":retry added
