tickets #25170

closed

openQA ppc64le workers bad kvm setup

Added by michel_mno over 6 years ago. Updated over 4 years ago.

Status: Resolved
Priority: Low
Assignee:
Category: Servers hosted in NBG
Target version: -
Start date:
Due date:
% Done: 0%
Estimated time:

Description

Since last Friday the openQA ppc64le workers have been failing with a KVM
access error, see (1).

I assume there is a setup problem on the related Power8 host.

(1) https://openqa.opensuse.org/tests/482640/file/autoinst-log.txt

08:16:54.8389 65429 Error connecting to host : IO::Socket::INET: connect: Connection refused
08:16:55.8393 65429 qemu didn't start
08:16:55.8393 65429 QEMU: Could not access KVM kernel module: Permission denied
08:16:55.8393 65429 QEMU: failed to initialize KVM: Permission denied
08:16:55.8395 65429 awaiting death of commands process

--
Michel Normand


Related issues 3 (0 open, 3 closed)

Related to openSUSE admin - tickets #36559: openQA ppc64le workers bad kvm setup (Resolved, okurz, 2018-05-27)

Related to openQA Project - action #32563: [functional][u] fix salt for power (Resolved, szarate, 2018-02-28)

Related to openQA Infrastructure - action #44432: [sle][functional][u][ppc64le] test fails in bootloader - smt is still on after reboot on qa-power8-5 (Resolved, szarate, 2018-11-28)

Actions #1

Updated by dimstar over 6 years ago

  • Private changed from Yes to No
Actions #2

Updated by okurz over 6 years ago

I checked the machine "power8". The device /dev/kvm seems to have lost the permission that gives access to the group "kvm", which the user "_openqa-worker" belongs to. I don't know what changed that. I have now manually given the group "kvm" read/write access to /dev/kvm, which should work as a workaround for now, but I assume some udev rule should ensure this?
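
For reference, the kind of check described above can be sketched in shell. This is only a minimal illustration, not something that was run on the worker: `check_dev_access` is a hypothetical helper name, and it assumes GNU `stat` (for `-c '%a'` and `-c '%G'`).

```shell
# Sketch: report whether a device node grants read/write access to the
# expected group. check_dev_access is a hypothetical helper, not part of openQA.
check_dev_access() {
    dev=$1
    want_group=$2
    # GNU stat: %a prints the octal mode, %G the owning group name.
    mode=$(stat -c '%a' "$dev") || return 2
    group=$(stat -c '%G' "$dev") || return 2
    if [ "$group" != "$want_group" ]; then
        echo "$dev: group is $group, expected $want_group"
        return 1
    fi
    # Shift the octal mode right by 3 bits to reach the group bits,
    # then require both read (4) and write (2).
    if [ $(( (0$mode >> 3) & 6 )) -ne 6 ]; then
        echo "$dev: mode $mode lacks group read/write"
        return 1
    fi
    echo "$dev: ok (mode $mode, group $group)"
}
```

On the worker this would be called as `check_dev_access /dev/kvm kvm`. A non-zero return would correspond to the "Permission denied" situation reported in the description.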

Actions #3

Updated by okurz over 6 years ago

retriggered all recent openSUSE Tumbleweed power jobs

Actions #4

Updated by michel_mno over 6 years ago

okurz, thank you for the workaround. I assume somebody restarted the machine last Friday without checking the related /dev/kvm permissions.

Actions #5

Updated by michel_mno over 6 years ago

With the new 20171211 trial of PowerPC (restarted after 1 month) it seems we have again the same missing setup of /dev/kvm as previously reported.

I assume the Power8 workers were restarted and we still do have /dev/kvm accessed only by root (default)

The latest autoinst-log.txt for the 20171211 snapshot is at https://openqa.opensuse.org/tests/557984/file/autoinst-log.txt

Actions #6

Updated by dimstar over 6 years ago

michel_mno wrote:

With the new 20171211 trial of PowerPC (restarted after 1 month) it seems we have again the same missing setup of /dev/kvm as previously reported.

I assume the Power8 workers were restarted and we still do have /dev/kvm accessed only by root (default)

Last reboot was 55 days ago.
As for udev, there is ./rules.d/80-kvm.rules:KERNEL=="kvm", MODE="0666", GROUP="kvm", which should very well cover the case here (and which matches the permissions my local x86_64 machine has on /dev/kvm).

In any case I reset the permissions manually for today, but this does need more investigation
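
To see which rules files actually mention the kvm device, a small search helper along these lines could be used. It is a sketch only: `find_kvm_rules` is a hypothetical name, and the directories to search are passed in by the caller rather than assumed.

```shell
# Sketch: list udev rules lines that match the kvm device node.
# find_kvm_rules is a hypothetical helper; pass the rules directories to search.
find_kvm_rules() {
    # -r recurse, -H print the file name, -s stay quiet about unreadable paths,
    # -F treat the pattern as a fixed string.
    grep -rHsF 'KERNEL=="kvm"' "$@"
}
```

A typical call would be `find_kvm_rules /usr/lib/udev/rules.d /etc/udev/rules.d`. The helper returns non-zero when no rule matches. The usual way to make udev re-run its rules is `udevadm control --reload` followed by `udevadm trigger --sysname-match=kvm`, though whether that restores the mode on this machine has not been tested here.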

Actions #7

Updated by tampakrap over 6 years ago

  • Category set to Servers hosted in NBG
  • Assignee set to dimstar
Actions #8

Updated by michel_mno almost 6 years ago

Actions #9

Updated by okurz almost 6 years ago

  • Related to tickets #36559: openQA ppc64le workers bad kvm setup added
Actions #10

Updated by tampakrap almost 6 years ago

anything left here or can we close it?

Actions #11

Updated by dimstar almost 6 years ago

tampakrap wrote:

anything left here or can we close it?

Would be good to finally understand how this keeps on falling over. The issue does show up every now and then

Actions #12

Updated by michel_mno over 5 years ago

This is a recurrent problem:

#25170
#36559
#40031
#40919

Actions #13

Updated by okurz over 5 years ago

I think what happens is that any time a package upgrade also touches e.g. the qemu package – as happened yesterday – the permissions of /dev/kvm reset to 600

Actions #14

Updated by michel_mno over 5 years ago

New occurrence for TW snapshot 20181118, see
https://openqa.opensuse.org/tests/800586/file/autoinst-log.txt
So we need the same workaround as before for /dev/kvm access.

[2018-11-19T16:38:55.158 UTC] [debug] Backend process died, backend errors are reported below in the following lines can't open qmp at /usr/lib/os-autoinst/OpenQA/Qemu/Proc.pm line 402.
[2018-11-19T16:38:55.159 UTC] [info] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json
last frame
[2018-11-19T16:38:55.160 UTC] [info] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json
[2018-11-19T16:38:55.161 UTC] [debug] QEMU: QEMU emulator version 2.9.1(openSUSE Leap 42.3)
[2018-11-19T16:38:55.161 UTC] [debug] QEMU: Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers
[2018-11-19T16:38:55.161 UTC] [debug] QEMU: Could not access KVM kernel module: Permission denied
[2018-11-19T16:38:55.161 UTC] [debug] QEMU: failed to initialize KVM: Permission denied
Actions #15

Updated by michel_mno over 5 years ago

manual action by DimStar => workers OK

Actions #16

Updated by okurz over 5 years ago

  • Related to action #32563: [functional][u] fix salt for power added
Actions #17

Updated by okurz over 5 years ago

  • Related to action #44432: [sle][functional][u][ppc64le] test fails in bootloader - smt is still on after reboot on qa-power8-5 added
Actions #18

Updated by okurz over 5 years ago

#44432 might help us to understand what we would need to do for o3

Actions #19

Updated by szarate over 5 years ago

@okurz, I guess we could script the setup for these machines up there. But I wonder why and how the udev rule failed to be loaded...

Actions #20

Updated by michel_mno about 5 years ago

again same issue on 20190130:
https://openqa.opensuse.org/tests/842728/file/autoinst-log.txt

[2019-01-30T12:47:19.837 UTC] [debug] QEMU: Could not access KVM kernel module: Permission denied
[2019-01-30T12:47:19.837 UTC] [debug] QEMU: failed to initialize KVM: Permission denied

Actions #21

Updated by okurz about 5 years ago

Fixed manually with chmod g+rwX /dev/kvm on the machine. The machine did not reboot. This seems to be caused by a kernel upgrade triggered at 2019-01-30 12:01:22 (from /var/log/zypp/history). But who triggered the kernel upgrade?
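
A lookup like the one above can be sketched as a small filter over the history file. This assumes the pipe-separated layout `timestamp|action|name|version|arch|...` for install entries; `installs_on_day` is a hypothetical helper name.

```shell
# Sketch: list package installs recorded in a zypp history file for one day.
# Assumes pipe-separated fields: timestamp|action|name|version|arch|...
# installs_on_day is a hypothetical helper name.
installs_on_day() {
    history_file=$1
    day=$2   # e.g. 2019-01-30
    # Skip comment lines and keep "install" entries whose timestamp
    # starts with the requested day; print timestamp, package and version.
    awk -F'|' -v day="$day" \
        '!/^#/ && $2 == "install" && index($1, day) == 1 { print $1, $3, $4 }' \
        "$history_file"
}
```

For the case above the call would be `installs_on_day /var/log/zypp/history 2019-01-30`, which should make it easy to compare install timestamps with the time of the first failing job.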

Actions #22

Updated by okurz about 5 years ago

  • Status changed from New to Feedback
  • Assignee changed from dimstar to okurz

OK, that's the daily automatic update, see /etc/cron.daily/suse.de-abuild-online-update. I was not even aware we had that enabled :D

I patched the file /etc/cron.daily/suse.de-abuild-online-update with

function cleanup_and_exit() {
+    # 2019-01-30: okurz: https://progress.opensuse.org/issues/25170
+    # workaround for missing permissions on /dev/kvm
+    chmod g+rwX /dev/kvm
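
A slightly more defensive variant of the same workaround could look like the following. It is a sketch, not what is installed on the machine: `restore_group_access` is a hypothetical helper, it assumes GNU `stat`, and it logs only when the mode actually changed, which might help to spot what keeps resetting the permissions.

```shell
# Sketch: restore group access on a device node and log when a change was needed.
# restore_group_access is a hypothetical helper, not the patch applied above.
restore_group_access() {
    dev=$1
    # Nothing to do if the node does not exist (e.g. kvm module not loaded).
    [ -e "$dev" ] || return 0
    before=$(stat -c '%a' "$dev")
    chmod g+rwX "$dev" || return 1
    after=$(stat -c '%a' "$dev")
    # Only report when the permissions had actually been reset.
    if [ "$before" != "$after" ]; then
        echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) $dev: mode $before -> $after"
    fi
}
```

Called as `restore_group_access /dev/kvm` from the cleanup function, it would stay silent on days where the update did not touch the permissions.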
Actions #23

Updated by okurz about 5 years ago

  • Priority changed from Normal to Low

Well, it seems we have not had more problems since then, at least. Keeping the ticket open as a reminder that we only made local changes which are not versioned anywhere else; we should eventually do that, e.g. with salt recipes.

Actions #24

Updated by okurz over 4 years ago

  • Status changed from Feedback to Resolved

remaining tasks covered in #43934
