[sle][functional][u][s390x[kvm] test fails in bootloader_zkvm - "Cannot allocate memory" when instantiating the virtual machine
openQA test in scenario sle-15-SP1-Installer-DVD-s390x-qa_userspace_apache2_mod_perl@s390x-kvm-sle12 fails in
Fails since (at least) Build 128.1 (current job)
Last good: 126.1 (or more recent)
Always latest result in this scenario: latest
#1 Updated by oorlov almost 3 years ago
Just for the statistics: there is more than one such failure.
Please follow this link to see all 'bootloader_zkvm' modules that failed due to the same issue in build 128.1: https://openqa.suse.de/tests/overview?arch=&failed_modules=bootloader_zkvm&distri=sle&version=15-SP1&build=128.1&groupid=110#
#4 Updated by okurz over 2 years ago
- Subject changed from [functional][u][s390x[kvm] test fails in bootloader_zkvm - "Cannot allocate memory" when instantiating the virtual machine to [sle][functional][u][s390x[kvm] test fails in bootloader_zkvm - "Cannot allocate memory" when instantiating the virtual machine
- Status changed from New to Feedback
- Assignee set to mgriessmeier
- Priority changed from Normal to Low
- Target version changed from Milestone 25+ to Milestone 23
From duplicate ticket #46337 from mgriessmeier:
The actual issue is here (reading logs helps ;) ):
[2019-01-17T13:10:13.451 CET] [debug] Command executed: virsh define /var/lib/libvirt/images/openQA-SUT-3.xml
[2019-01-17T13:10:13.588 CET] [debug] Command's stderr:
error: Failed to define domain from /var/lib/libvirt/images/openQA-SUT-3.xml
error: cannot fork child process: Cannot allocate memory
Nick and I already took a look: the libvirtd process was consuming a lot of memory after running for 130 days. We should track this somehow.
For now, memory usage returned to normal after restarting libvirtd.
Keeping this on Feedback with low priority, for tracking.
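One way to track the libvirtd memory growth described above is to sample the resident set size of the process from /proc. The following is only a minimal sketch: the function names and the limit are hypothetical, and it assumes a Linux host where /proc/<pid>/status exposes a VmRSS line.

```python
import re


def parse_vm_rss_kib(status_text):
    """Return VmRSS in KiB from the text of /proc/<pid>/status, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_vm_rss_kib(pid):
    """Read the current resident set size of a process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_vm_rss_kib(handle.read())


def exceeds_limit(rss_kib, limit_kib):
    """True if a known RSS value is above the (hypothetical) limit."""
    return rss_kib is not None and rss_kib > limit_kib
```

Logging the value returned by read_vm_rss_kib at regular intervals would show whether libvirtd grows steadily over its uptime.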
#7 Updated by mgriessmeier over 2 years ago
- Status changed from Resolved to In Progress
- Assignee deleted (
- Priority changed from Low to Normal
happening now on our sle15 kvm machines aka s390p7 hypervisor
#10 Updated by mgriessmeier over 2 years ago
Please keep in mind that restarting libvirtd during tests kills running machines =)
So maybe a cronjob on the host itself?
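A cronjob along these lines could refuse to restart libvirtd while any machine is still running. The sketch below assumes the job can call `virsh list --name`, which prints one active domain name per line; the helper names are hypothetical and the restart itself is left out.

```python
import subprocess


def running_domains(virsh_list_output):
    """Parse the output of `virsh list --name` into a list of domain names."""
    return [line.strip() for line in virsh_list_output.splitlines() if line.strip()]


def safe_to_restart(virsh_list_output):
    """Only consider a restart safe when no domain is currently active."""
    return not running_domains(virsh_list_output)


def query_running_domains():
    """Ask libvirt for the active domains (requires virsh on the host)."""
    result = subprocess.run(
        ["virsh", "list", "--name"], capture_output=True, text=True, check=True
    )
    return running_domains(result.stdout)
```

Note that if libvirtd is already unable to fork, the virsh call itself may fail, so a real job would need to handle that case separately.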
which might be also helpful is:
#13 Updated by SLindoMansilla over 2 years ago
- Status changed from In Progress to Feedback
#15 Updated by coolgw over 2 years ago
If you check the following group, you will see a lot of failures in bootloader_zkvm.
I just selected some failed cases:
#16 Updated by SLindoMansilla over 2 years ago
There are different causes for all of those failures.
Some of them fail for the cause targeted in this ticket, "cannot allocate memory", which I can see happening on openqaworker5. However, I cannot find any more recent failures, so maybe someone already executed
systemctl restart libvirtd?
If "cannot allocate memory" happens again, someone with access to that worker has to restart the service.
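To tell the failures targeted by this ticket apart from the other causes, the job log can be searched for the specific error text. This is a small sketch with hypothetical names; it only assumes the log is available as plain text.

```python
ALLOC_ERROR = "Cannot allocate memory"


def find_allocation_failures(log_text):
    """Return the log lines that contain the allocation error message."""
    return [line for line in log_text.splitlines() if ALLOC_ERROR in line]


def needs_libvirtd_restart(log_text):
    """Heuristic: any allocation error in the log suggests a restart is due."""
    return bool(find_allocation_failures(log_text))
```

A match only indicates that the service should be looked at; the restart still has to be done by someone with access to the worker.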
#17 Updated by coolgw over 2 years ago
The issue was triggered again, and I found it happening on openqaworker5 again. Could we just add a script on this machine that reboots it once the memory issue happens? That way we can work around this.
#19 Updated by SLindoMansilla over 2 years ago
PR to generate a file with stderr output of virsh command: https://github.com/os-autoinst/os-autoinst/pull/1173
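The general idea of saving a command's stderr to a file can be sketched as follows. This is not the code from the PR (os-autoinst is written in Perl); it is a standalone illustration with hypothetical names, using only the Python standard library.

```python
import subprocess
from pathlib import Path


def run_and_save_stderr(command, stderr_path):
    """Run a command, write whatever it printed on stderr to a file,
    and return the exit code so the caller can still react to failures."""
    result = subprocess.run(command, capture_output=True, text=True)
    Path(stderr_path).write_text(result.stderr)
    return result.returncode
```

For example, `run_and_save_stderr(["virsh", "define", xml_path], "virsh_stderr.txt")` would keep the error message next to the other job artifacts, where `xml_path` and the output file name are placeholders.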
#22 Updated by coolgw over 2 years ago
#25 Updated by SLindoMansilla over 2 years ago
#30 Updated by SLindoMansilla about 2 years ago
- Status changed from In Progress to Workable
@SLindoMansilla @szarate okurz I saw that the PR is already merged. Can the automatic restart work now, or does some part still need to be done?
For the automatic restart, we need:
- An agreement to apply the same approach to all affected machines
- Salt states to automatically deploy that task (bash script, cronjob, systemd timer, etc.)
This will take a long time. foursixnine is working on something for the meantime: it will automatically collect worker data to show statistics and/or send notifications (including for insufficient memory), so that a human can react before a QA reviewer takes a look at the job.
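A notification for low memory on a worker could, for instance, be derived from /proc/meminfo. The sketch below is not foursixnine's tool; the function names, the threshold, and the message format are assumptions for illustration only.

```python
def parse_meminfo(meminfo_text):
    """Parse the text of /proc/meminfo into a dict of numeric values
    (KiB for the entries that carry a kB unit)."""
    values = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def low_memory_warning(meminfo_text, min_available_kib):
    """Return a warning string if MemAvailable is below the threshold, else None."""
    available = parse_meminfo(meminfo_text).get("MemAvailable")
    if available is not None and available < min_available_kib:
        return f"low memory: {available} KiB available, below {min_available_kib} KiB"
    return None
```

The returned string could be sent through whatever notification channel the team agrees on, so that someone can restart the service before jobs start failing.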
#33 Updated by SLindoMansilla about 2 years ago
#34 Updated by SLindoMansilla about 2 years ago
Preparing the package: https://build.opensuse.org/package/show/devel:openSUSE:QA:QSF/auto-restart-libvirtd
#35 Updated by SLindoMansilla about 2 years ago
Preparing the salt state: https://gitlab.suse.de/openqa/salt-states-openqa/merge_requests/157
#38 Updated by SLindoMansilla about 2 years ago
Salt state was merged: https://gitlab.suse.de/openqa/salt-states-openqa/merge_requests/157
Updated readme was merged: https://github.com/openSUSE/auto-restart-libvirtd/pull/1/files
#43 Updated by okurz over 1 year ago
Please see https://gitlab.suse.de/openqa/salt-states-openqa/merge_requests/264 where I suggest removing the auto-restart-libvirtd part again, as I am convinced it is completely ineffective in its current form for multiple reasons. Unless the package is used elsewhere, I also suggest removing https://build.opensuse.org/project/show/devel:openSUSE:QA:QSF again to save resources.
#44 Updated by SLindoMansilla over 1 year ago
Package deleted from OBS: https://build.opensuse.org/project/show/devel:openSUSE:QA:QSF