tickets #25170
closed
openQA ppc64le workers bad kvm setup
Description
Since last Friday the openQA ppc64le workers have been failing with a KVM access error, as shown in (1).
I assume there is a setup problem on the related Power8 host.
(1) https://openqa.opensuse.org/tests/482640/file/autoinst-log.txt
08:16:54.8389 65429 Error connecting to host :
IO::Socket::INET: connect: Connection refused
08:16:55.8393 65429 qemu didn't start
08:16:55.8393 65429 QEMU: Could not access KVM kernel module: Permission denied
08:16:55.8393 65429 QEMU: failed to initialize KVM: Permission denied
08:16:55.8395 65429 awaiting death of commands process
--
Michel Normand
Updated by okurz over 7 years ago
I checked the machine "power8" and it seems the device /dev/kvm has lost the permission that allows access for the group "kvm", in which the user "_openqa-worker" runs. I don't know what changed that. For now I have manually given the group "kvm" r/w access to /dev/kvm, which should work as a workaround, but I assume some udev rule should ensure this?
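The manual workaround described above can be sketched as a small check-and-repair helper. This is only an illustration: the function names are hypothetical, the device path is passed in as an argument, and `stat -c` assumes GNU coreutils.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; the path to check is an argument.

# Succeeds if the group permission bits of the given path include read and write.
group_can_rw() {
    # stat -c %A (GNU coreutils) prints e.g. "crw-rw----";
    # characters 5 and 6 are the group read and write bits.
    case "$(stat -c '%A' "$1")" in
        ????rw*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Adds group read/write to the given path (needs sufficient privileges).
grant_group_rw() {
    chmod g+rw "$1"
}

# Checks the path and repairs it only if needed, reporting what was done.
ensure_group_rw() {
    if group_can_rw "$1"; then
        echo "ok: $1"
    else
        grant_group_rw "$1" && echo "fixed: $1"
    fi
}
```

On the worker this would presumably be run as root as `ensure_group_rw /dev/kvm`. It only touches the mode bits, so the node must also still be owned by the group "kvm" (`ls -l /dev/kvm` shows this).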
Updated by okurz over 7 years ago
retriggered all recent openSUSE Tumbleweed power jobs
Updated by michel_mno over 7 years ago
okurz, thank you for the bypass. I assume somebody restarted the machine last Friday without checking the related /dev/kvm permissions.
Updated by michel_mno about 7 years ago
With the new 20171211 trial of PowerPC (restarted after 1 month) it seems we again have the same missing setup of /dev/kvm as previously reported.
I assume the Power8 workers were restarted and /dev/kvm is still accessible only by root (the default).
The last autoinst-log.txt for the 20171211 snapshot is https://openqa.opensuse.org/tests/557984/file/autoinst-log.txt
Updated by dimstar about 7 years ago
michel_mno wrote:
With the new 20171211 trial of PowerPC (restarted after 1 month) it seems we again have the same missing setup of /dev/kvm as previously reported.
I assume the Power8 workers were restarted and /dev/kvm is still accessible only by root (the default).
The last reboot was 55 days ago.
As for udev, there is ./rules.d/80-kvm.rules:KERNEL=="kvm", MODE="0666", GROUP="kvm" which should very well cover the case here (and which matches the permissions my local x86_64 machine has on /dev/kvm).
In any case I reset the permissions manually for today, but this does need more investigation.
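To repeat the check for the udev rule on another host, something like the following could be used. It is a minimal sketch: `find_kvm_rules` is a hypothetical helper, not part of udev, and the directories to search are passed in by the caller.

```shell
#!/bin/sh
# Sketch only: find_kvm_rules is a hypothetical helper, not part of udev.

# Prints every rule line matching KERNEL=="kvm" from the *.rules files
# in the given directories, prefixed with the file name.
find_kvm_rules() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # -H always prints the file name, -s silences missing-file errors.
        grep -Hs 'KERNEL=="kvm"' "$dir"/*.rules || true
    done
    return 0
}
```

A typical call might be `find_kvm_rules /etc/udev/rules.d /usr/lib/udev/rules.d /run/udev/rules.d`. If a rule is present but the mode on /dev/kvm is still wrong, re-running the rules for that device (for instance with `udevadm trigger --sysname-match=kvm`, untested on this host) would show whether udev is the component that is failing.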
Updated by tampakrap about 7 years ago
- Category set to Servers hosted in NBG
- Assignee set to dimstar
Updated by michel_mno over 6 years ago
New occurrence, tracked by https://progress.opensuse.org/issues/36559
Updated by okurz over 6 years ago
- Related to tickets #36559: openQA ppc64le workers bad kvm setup added
Updated by dimstar over 6 years ago
tampakrap wrote:
anything left here or can we close it?
It would be good to finally understand why this keeps falling over. The issue does show up every now and then.
Updated by michel_mno over 6 years ago
Updated by okurz over 6 years ago
I think what happens is that any time a package upgrade also touches e.g. the qemu package, as happened yesterday, the permissions of /dev/kvm are reset to 600.
Updated by michel_mno about 6 years ago
New occurrence for TW snapshot 20181118, as per
https://openqa.opensuse.org/tests/800586/file/autoinst-log.txt
So we need the same bypass as before for /dev/kvm access.
[2018-11-19T16:38:55.158 UTC] [debug] Backend process died, backend errors are reported below in the following lines
can't open qmp at /usr/lib/os-autoinst/OpenQA/Qemu/Proc.pm line 402.
[2018-11-19T16:38:55.159 UTC] [info] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json
last frame
[2018-11-19T16:38:55.160 UTC] [info] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json
[2018-11-19T16:38:55.161 UTC] [debug] QEMU: QEMU emulator version 2.9.1(openSUSE Leap 42.3)
[2018-11-19T16:38:55.161 UTC] [debug] QEMU: Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers
[2018-11-19T16:38:55.161 UTC] [debug] QEMU: Could not access KVM kernel module: Permission denied
[2018-11-19T16:38:55.161 UTC] [debug] QEMU: failed to initialize KVM: Permission denied
Updated by okurz about 6 years ago
- Related to action #32563: [functional][u] fix salt for power added
Updated by okurz about 6 years ago
- Related to action #44432: [sle][functional][u][ppc64le] test fails in bootloader - smt is still on after reboot on qa-power8-5 added
Updated by okurz about 6 years ago
#44432 might help us to understand what we would need to do for o3
Updated by szarate about 6 years ago
@okurz, I guess we could script the setup for these machines up there. But I wonder why and how the udev rule failed to be loaded...
Updated by michel_mno almost 6 years ago
Same issue again on 20190130:
https://openqa.opensuse.org/tests/842728/file/autoinst-log.txt
[2019-01-30T12:47:19.837 UTC] [debug] QEMU: Could not access KVM kernel module: Permission denied
[2019-01-30T12:47:19.837 UTC] [debug] QEMU: failed to initialize KVM: Permission denied
Updated by okurz almost 6 years ago
Fixed manually with chmod g+rwX /dev/kvm on the machine. The machine did not reboot. This seems to be caused by a kernel upgrade triggered at 2019-01-30 12:01:22 (from /var/log/zypp/history). But who triggered the kernel upgrade?
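The lookup in the zypp history can be sketched as below. Note the assumptions: the helper name is hypothetical, and the layout of /var/log/zypp/history is taken to be pipe-separated with the timestamp, the action, the package name and the version as the first four fields, which should be verified against the actual file.

```shell
#!/bin/sh
# Sketch only: kernel_installs_on is a hypothetical helper and the
# field layout of the history file is an assumption.

# Usage: kernel_installs_on DATE [HISTORY_FILE]
# Prints timestamp, package and version of kernel* installs on DATE.
kernel_installs_on() {
    awk -F'|' -v day="$1" '
        /^#/ { next }    # skip comment lines
        index($1, day) == 1 && $2 == "install" && $3 ~ /^kernel/ {
            print $1, $3, $4
        }
    ' "${2:-/var/log/zypp/history}"
}
```

For the incident above the call would be `kernel_installs_on 2019-01-30`, which should list the upgrade seen at 12:01:22 if the assumed format holds.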
Updated by okurz almost 6 years ago
- Status changed from New to Feedback
- Assignee changed from dimstar to okurz
OK, that's the daily automatic update, see /etc/cron.daily/suse.de-abuild-online-update . I was not even aware we had that enabled :D
I patched the file /etc/cron.daily/suse.de-abuild-online-update with
function cleanup_and_exit() {
+ # 2019-01-30: okurz: https://progress.opensuse.org/issues/25170
+ # workaround for missing permissions on /dev/kvm
+ chmod g+rwX /dev/kvm
Updated by okurz almost 6 years ago
- Priority changed from Normal to Low
Well, it seems we have had no more problems since then, at least. Keeping the ticket open to remind us that we only made local changes that are not versioned anywhere else, which we should do eventually, e.g. with salt recipes.
Updated by okurz over 5 years ago
- Status changed from Feedback to Resolved
remaining tasks covered in #43934