action #164817
Closed
[qe-core] test on worker sapworker1:7 is mostly failing with error: Domain already exists, editing existing domains is not supported yet
100%
Description
Observation
I checked the failure manually and there is no instance running.
vmware-jump:~ # virsh -c esx://root@unreal7.qe.nue2.suse.org/?no_verify=1\&authfile=/tmp/libvirtauth-RoG0 list
2024-08-01 08:19:54.627+0000: 21586: info : libvirt version: 9.0.0
2024-08-01 08:19:54.627+0000: 21586: info : hostname: vmware-jump.qe.nue2.suse.org
2024-08-01 08:19:54.627+0000: 21586: warning : esxUtil_ParseUri:144 : Ignoring unexpected query parameter 'authfile'
2024-08-01 08:19:54.696+0000: 21586: warning : esxVI_GetVirtualMachineIdentity:2484 : Cannot access UUID, because 'configStatus' property indicates a config problem
Id Name State
--------------------
vmware-jump:~ # virsh -c esx://root@unreal7.qe.nue2.suse.org/?no_verify=1\&authfile=/tmp/libvirtauth-RoG0 define /var/lib/libvirt/images/openQA-SUT-37.xml
2024-08-01 08:20:06.090+0000: 21594: info : libvirt version: 9.0.0
2024-08-01 08:20:06.090+0000: 21594: info : hostname: vmware-jump.qe.nue2.suse.org
2024-08-01 08:20:06.090+0000: 21594: warning : esxUtil_ParseUri:144 : Ignoring unexpected query parameter 'authfile'
error: Failed to define domain from /var/lib/libvirt/images/openQA-SUT-37.xml
error: internal error: Domain already exists, editing existing domains is not supported yet
vmware-jump:~ #
For some reason openQA-SUT-37 can't be defined on unreal7.qe.nue2.suse.org.
One day ago this job passed. https://openqa.suse.de/tests/15037728
[2024-07-31T10:40:57.298834+02:00] [debug] [pid:64102] [run_ssh_cmd(virsh -c esx://root@unreal7.qe.nue2.suse.org/?no_verify=1\&authfile=/tmp/libvirtauth-E0tY define /var/lib/libvirt/images/openQA-SUT-35.xml)] stderr:
2024-07-31 08:40:57.217+0000: 516: info : libvirt version: 9.0.0
2024-07-31 08:40:57.217+0000: 516: info : hostname: vmware-jump.qe.nue2.suse.org
2024-07-31 08:40:57.217+0000: 516: warning : esxUtil_ParseUri:144 : Ignoring unexpected query parameter 'authfile'
error: Failed to define domain from /var/lib/libvirt/images/openQA-SUT-35.xml
error: internal error: Domain already exists, editing existing domains is not supported yet
openQA test in scenario sle-micro-6.0-Base-VMware-Updates-x86_64-slem_networking@svirt-vmware70 fails in bootloader_svirt
Reproducible
Fails since (at least) Build 21.3
Expected result
Last good: (unknown) (or more recent)
Further details
Always latest result in this scenario: latest
Updated by dzedro 3 months ago
Same issue on esxi7? https://openqa.suse.de/tests/15133846#step/bootloader_svirt/28
[2024-08-09T10:56:42.262044+02:00] [debug] [pid:32684] [run_ssh_cmd(virsh -c esx://root@esxi7.qa.suse.cz/?no_verify=1\&authfile=/tmp/libvirtauth-Ln28 define /var/lib/libvirt/images/openQA-SUT-8.xml)] stderr:
2024-08-09 08:56:42.069+0000: 26392: info : libvirt version: 9.0.0
2024-08-09 08:56:42.069+0000: 26392: info : hostname: vmware-jump.qe.nue2.suse.org
2024-08-09 08:56:42.069+0000: 26392: warning : esxUtil_ParseUri:144 : Ignoring unexpected query parameter 'authfile'
error: Failed to define domain from /var/lib/libvirt/images/openQA-SUT-8.xml
error: internal error: Domain already exists, editing existing domains is not supported yet
Updated by dzedro 3 months ago
- Tags set to bugbusters
- Status changed from New to In Progress
- Assignee set to dzedro
I deleted the VM openQA-SUT-37
vmware-jump:~ # virsh -c esx://root@unreal7.qe.nue2.suse.org/?no_verify=1\&authfile=/tmp/libvirtauth-RoG0 list --all
2024-08-12 08:40:14.444+0000: 16312: info : libvirt version: 9.0.0
2024-08-12 08:40:14.444+0000: 16312: info : hostname: vmware-jump.qe.nue2.suse.org
2024-08-12 08:40:14.444+0000: 16312: warning : esxUtil_ParseUri:144 : Ignoring unexpected query parameter 'authfile'
Id Name State
---------------------------------
321 openQA-SUT-35 running
- openQA-SUT-37 shut off
vmware-jump:~ # virsh -c esx://root@unreal7.qe.nue2.suse.org/?no_verify=1\&authfile=/tmp/libvirtauth-RoG0 undefine --snapshots-metadata openQA-SUT-37
2024-08-12 08:40:38.671+0000: 16327: info : libvirt version: 9.0.0
2024-08-12 08:40:38.671+0000: 16327: info : hostname: vmware-jump.qe.nue2.suse.org
2024-08-12 08:40:38.671+0000: 16327: warning : esxUtil_ParseUri:144 : Ignoring unexpected query parameter 'authfile'
Domain 'openQA-SUT-37' has been undefined
vmware-jump:~ # virsh -c esx://root@unreal7.qe.nue2.suse.org/?no_verify=1\&authfile=/tmp/libvirtauth-RoG0 list --all
2024-08-12 08:40:41.354+0000: 16334: info : libvirt version: 9.0.0
2024-08-12 08:40:41.354+0000: 16334: info : hostname: vmware-jump.qe.nue2.suse.org
2024-08-12 08:40:41.354+0000: 16334: warning : esxUtil_ParseUri:144 : Ignoring unexpected query parameter 'authfile'
Id Name State
--------------------------------
321 openQA-SUT-35 running
vmware-jump:~ #
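The session above found the leftover VM only with `list --all`, because plain `virsh list` shows running domains only. A minimal sketch of how such leftovers could be spotted automatically, assuming the table layout shown above; the function name `inactive_domains` is hypothetical and not part of any existing tooling:

```shell
#!/bin/sh
# Hypothetical helper: reads the table printed by `virsh list --all` on stdin
# and prints the name of every domain that is defined but shut off.
# Inactive domains have "-" in the Id column and "shut off" as their state.
inactive_domains() {
    awk '$1 == "-" && $3 == "shut" && $4 == "off" { print $2 }'
}
```

It could be used as `virsh -c "$uri" list --all | inactive_domains`, where `$uri` stands for the connection URI. The libvirt info and warning lines in the sessions above do not appear to travel through the pipe (see the `grep` example in a later comment), so only the table should reach the filter.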
Updated by szarate 3 months ago
- Related to action #164814: [qe-core] esxi7.qa.suse.cz is missing SL-Micro.x86_64-6.0-Default-VMware-GM.vmdk added
Updated by rfan1 3 months ago
Here comes my update:
I checked with Roy Cai: he accessed esxi7 on 08/05 and copied some images there, but didn't touch the test code.
By now, 'openQA-SUT-8' is no longer listed.
vmware-jump:~ # virsh -c esx://root@esxi7.qa.suse.cz/?no_verify=1\&authfile=/tmp/libvirtauth-6YPm list --all |grep openQA-SUT
2024-08-13 12:53:45.732+0000: 5101: info : libvirt version: 9.0.0
2024-08-13 12:53:45.732+0000: 5101: info : hostname: vmware-jump.qe.nue2.suse.org
2024-08-13 12:53:45.732+0000: 5101: warning : esxUtil_ParseUri:144 : Ignoring unexpected query parameter 'authfile'
417 openQA-SUT-5 running
- openQA-SUT-3 shut off
Updated by rcai 3 months ago · Edited
The latest job has passed the worker-35 test.
I am investigating why the worker-35 resources were occupied.
Looking at the code: at the end of the job, the code undefines the existing worker-35 VM, so it should be gone before the next job creates it again.
[2024-08-13T14:46:02.761164+02:00] [debug] [pid:41467] [run_ssh_cmd(virsh -c esx://root@unreal7.qe.nue2.suse.org/?no_verify=1&authfile=/tmp/libvirtauth-crvK undefine --snapshots-metadata openQA-SUT-35)] stdout:
The domain 'openQA-SUT-35' has been undefined.
So this situation is quite peculiar. Let's observe it further; no one has touched the machine during testing.
Updated by rcai 3 months ago
Some additional points:
I reviewed the code from bootloader_svirt.pm. If a VM with the same name already exists and is not explicitly destroyed or undefined before creating a new one, it could lead to conflicts or unexpected behavior. Although the VM is supposed to be destroyed or undefined after the module's execution, if the job is canceled or interrupted prematurely, a resource conflict may occur when the next job tries to start a VM with the same name.
I suggested creating a ticket to investigate the issue. The fix would involve adding steps to the bootloader_svirt.pm module that destroy and undefine any existing VM with the same name before a new one is created.
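The proposed order of operations can be sketched in shell with the same virsh subcommands used in this ticket. This is only an illustration under assumptions, not the actual bootloader_svirt.pm change: the function name `ensure_clean_define` and the `VIRSH_URI` variable are hypothetical, and whether `dominfo` behaves this way with the ESX driver would need to be verified.

```shell
#!/bin/sh
# Hypothetical sketch: remove a stale domain of the same name before defining.
# VIRSH_URI is an assumed variable holding the connection URI (esx://...).
ensure_clean_define() {
    name=$1   # domain name, e.g. openQA-SUT-37
    xml=$2    # path to the domain XML file

    # dominfo succeeds only if a domain with this name is already known
    if virsh -c "$VIRSH_URI" dominfo "$name" >/dev/null 2>&1; then
        # destroy fails when the domain is not running; that is fine here
        virsh -c "$VIRSH_URI" destroy "$name" >/dev/null 2>&1 || true
        # give up if the stale definition cannot be removed
        virsh -c "$VIRSH_URI" undefine --snapshots-metadata "$name" || return 1
    fi

    virsh -c "$VIRSH_URI" define "$xml"
}
```

Called as `ensure_clean_define openQA-SUT-37 /var/lib/libvirt/images/openQA-SUT-37.xml`, this would cover the case where a canceled or interrupted job left the old definition behind.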