action #89920

coordination #80142: [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containers on kubernetes

coordination #80150: [epic] Scale out openQA: Easier openQA setup

Extend existing openQA-in-openQA tests as a learning exercise to know where our instructions or beginner situation can be improved

Added by okurz 9 months ago. Updated 3 months ago.

Status: New
Priority: Low
Assignee: -
Category: Feature requests
Target version:
Start date: 2021-03-11
Due date:
% Done: 0%
Estimated time:
Difficulty:

Description

Motivation

See #89620, #12134 and related tickets. Some team members lack hands-on experience with actually using openQA, e.g. writing tests. A good opportunity to learn could be to extend the openQA-in-openQA tests to cover apparmor. This could be as simple as making sure apparmor is enabled and, at best, reproducing in the old version the issue that https://github.com/os-autoinst/openQA/pull/3780 tries to fix (related to #89620).

This is where we would be able to learn where our software and docs are still insufficient.

Acceptance criteria

  • AC1: Docs or software are improved for the case of "I just want to extend existing tests"
  • AC2: Existing openQA-in-openQA tests have been extended correspondingly

Suggestions

  • Try to follow open.qa/docs in the role of an "inexperienced user" and record where you are lost :)

History

#1 Updated by okurz 9 months ago

  • Parent task set to #80150

#2 Updated by kraih 9 months ago

  • Assignee set to kraih

I'm one of those team members that has never extended an existing test. This seems like an interesting exercise. I'll take a look and make notes along the way.

#3 Updated by openqa_review 9 months ago

  • Due date set to 2021-03-26

Setting due date based on mean cycle time of SUSE QE Tools

#4 Updated by cdywan 8 months ago

  • Due date deleted (2021-03-26)

Resetting the due date due to hackweek, and I think it's not in progress yet anyway

#5 Updated by openqa_review 8 months ago

  • Due date set to 2021-04-10

Setting due date based on mean cycle time of SUSE QE Tools

#6 Updated by cdywan 8 months ago

  • Due date deleted (2021-04-10)

Not yet

#7 Updated by openqa_review 8 months ago

  • Due date set to 2021-04-27

Setting due date based on mean cycle time of SUSE QE Tools

#8 Updated by okurz 7 months ago

  • Due date deleted (2021-04-27)

for now no due-date on "Workable", see https://github.com/os-autoinst/scripts/pull/71

#9 Updated by kraih 6 months ago

I started setting up a new VM for this only to discover that openQA in openQA tests don't appear to work in a VM (various kvm errors). :)

#10 Updated by kraih 5 months ago

I do have a first observation though about our documentation. It has gotten very large, and there is a lot of detailed information that someone just starting out doesn't really need right away. I think we would benefit from a second, more condensed version of the documentation, just for getting started with test development. Like a playbook that only contains the essential steps for installing openQA, writing your very first test case, and running it locally. The key point would be to make this playbook as short as possible, and keep all the detailed context information in the main documentation.

#11 Updated by okurz 5 months ago

kraih wrote:

I think we would benefit from a second, more condensed version of the documentation, just for getting started with test development.

That is supposed to be https://github.com/os-autoinst/openQA/blob/master/docs/GettingStarted.asciidoc . This document is included in the complete one-document format as well. Maybe we can rework the titles a bit and move content around to make that more obvious? We can also explicitly mention just that document on http://open.qa/documentation/

#12 Updated by okurz 5 months ago

  • Status changed from Workable to New

Moving all tickets without size confirmation by the team back to "New". The team should move the tickets back after estimating and agreeing on a consistent size.

#13 Updated by kraih 4 months ago

  • Subject changed from Extend existing openQA-in-openQA tests as a learning exercise to know where our instructions or beginner situation can be improved to Extend existing openQA-in-openQA tests as a learning exercise to know where our instructions or beginner situation can be improved size:L

#14 Updated by kraih 4 months ago

  • Status changed from New to In Progress

#15 Updated by openqa_review 4 months ago

  • Due date set to 2021-08-13

Setting due date based on mean cycle time of SUSE QE Tools

#16 Updated by kraih 4 months ago

  • Status changed from In Progress to Blocked

This one is currently a little blocked since I first need to set up a new development machine. The openQA in openQA tests require some CPU features that are not available in my current VM-based setup.

#17 Updated by kraih 4 months ago

  • Status changed from Blocked to Workable

#18 Updated by kraih 4 months ago

I previously forgot to keep a record of the exact error I ran into, here it is:

[2021-08-16T17:10:35.703 CEST] [debug] QEMU: Please use wait=off instead
[2021-08-16T17:10:35.703 CEST] [warn] !!! : qemu-system-x86_64: error: failed to set MSR 0x48d to 0x5600000016
[2021-08-16T17:10:35.703 CEST] [warn] !!! : qemu-system-x86_64: ../target/i386/kvm/kvm.c:2753: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
[2021-08-16T17:10:35.704 CEST] [debug] sending magic and exit
[2021-08-16T17:10:35.704 CEST] [debug] received magic close
[2021-08-16T17:10:35.705 CEST] [debug] backend process exited: 0
failed to start VM at /usr/lib/os-autoinst/backend/driver.pm line 126.

It seems to be related to nested virtualization. I've run many other tests (successfully) in the same VM, but never seen this before.
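
A quick way to see what the machine running QEMU exposes for KVM could look like the sketch below. It assumes a Linux system with the usual sysfs layout, where the kvm_intel and kvm_amd modules publish a "nested" parameter. The function names are made up for illustration, and a positive result does not guarantee that the MSR assertion above goes away.

```python
from pathlib import Path

# KVM vendor modules that expose a "nested" parameter in sysfs.
KVM_MODULES = ("kvm_intel", "kvm_amd")


def nested_kvm_status(sysfs_root="/sys/module"):
    """Return {module_name: bool} for every KVM vendor module that is loaded."""
    status = {}
    for module in KVM_MODULES:
        param = Path(sysfs_root) / module / "parameters" / "nested"
        if param.is_file():
            # The kernel reports "Y" or "1" when nested virtualization is on.
            status[module] = param.read_text().strip() in ("Y", "1")
    return status


def kvm_device_present(dev_path="/dev/kvm"):
    """Return True if the KVM device node exists at all."""
    return Path(dev_path).exists()


if __name__ == "__main__":
    print("/dev/kvm present:", kvm_device_present())
    for module, enabled in nested_kvm_status().items():
        print(f"{module}: nested={'on' if enabled else 'off'}")
```

An empty result from nested_kvm_status() simply means neither vendor module is loaded in that system.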

#19 Updated by okurz 4 months ago

Wow, never seen that before. I wonder what the "Please use wait=off instead" means.

About the assertion, I agree that it seems to be related to nested virtualization. There is https://bugs.launchpad.net/qemu/+bug/1661386 . I assume you run an openSUSE VM within vmware. A comment in the bug suggests supplying "-cpu host,pmu=off". Try to trigger the openQA test with QEMUCPU=host,pmu=off.
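
One way to retrigger a job with that setting is to clone it with a settings override. The sketch below only assembles the command line. It assumes openqa-clone-job is installed and relies on its KEY=VALUE override arguments; the host URL, the job id and the helper name are placeholders, not values from this ticket.

```python
import shlex


def clone_job_argv(source_host, job_id, overrides, target_host="localhost"):
    """Build an openqa-clone-job command line with KEY=VALUE setting overrides."""
    argv = ["openqa-clone-job", "--from", source_host, "--host", target_host, str(job_id)]
    # Each override is appended as a plain KEY=VALUE argument after the job id.
    argv += [f"{key}={value}" for key, value in overrides.items()]
    return argv


if __name__ == "__main__":
    # Placeholder host and job id, for illustration only.
    argv = clone_job_argv("https://openqa.example.com", 1234, {"QEMUCPU": "host,pmu=off"})
    print(shlex.join(argv))
```

The resulting list can be handed to subprocess.run() as is, which avoids shell quoting issues with the comma in the QEMUCPU value.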

#20 Updated by kraih 4 months ago

Running the same job with QEMU_NO_KVM=1 results in:

[2021-08-16T17:30:39.585 CEST] [debug] QEMU: Please use wait=off instead
[2021-08-16T17:30:39.585 CEST] [warn] !!! : qemu-system-x86_64: CPU model 'host' requires KVM
[2021-08-16T17:30:39.585 CEST] [debug] sending magic and exit
[2021-08-16T17:30:39.586 CEST] [debug] received magic close
[2021-08-16T17:30:39.588 CEST] [debug] backend process exited: 0
failed to start VM at /usr/lib/os-autoinst/backend/driver.pm line 126.

#21 Updated by kraih 4 months ago

okurz wrote:

About the assertion, I agree that it seems to be related to nested virtualization. There is https://bugs.launchpad.net/qemu/+bug/1661386 . I assume you run an openSUSE VM within vmware. A comment in the bug suggests supplying "-cpu host,pmu=off". Try to trigger the openQA test with QEMUCPU=host,pmu=off.

Ok, I do see -cpu host,pmu=off in the command line, so it was passed. But it does not appear to make any difference in the result.

[2021-08-16T17:34:11.103 CEST] [debug] starting: /usr/bin/qemu-system-x86_64 -only-migratable -chardev ringbuf,id=serial0,logfile=serial0,logappend=on -serial chardev:serial0 -audiodev none,id=snd0 -device intel-hda -device hda-output,audiodev=snd0 -global isa-fdc.fdtypeA=none -m 2048 -cpu host,pmu=off -netdev user,id=qanet0 -device virtio-net,netdev=qanet0,mac=52:54:00:12:34:56 -boot once=d -device usb-ehci -device usb-tablet -smp 1 -enable-kvm -no-shutdown -vnc :91,share=force-shared -device virtio-serial -chardev pipe,id=virtio_console,path=virtio_console,logfile=virtio_console.log,logappend=on -device virtconsole,chardev=virtio_console,name=org.openqa.console.virtio_console -chardev socket,path=qmp_socket,server,nowait,id=qmp_socket,logfile=qmp_socket.log,logappend=on -qmp chardev:qmp_socket -S -device virtio-scsi-pci,id=scsi0 -blockdev driver=file,node-name=hd0-overlay0-file,filename=/home/sri/work/openQA/openqa/pool/1/raid/hd0-overlay0,cache.no-flush=on -blockdev driver=qcow2,node-name=hd0-overlay0,file=hd0-overlay0-file,cache.no-flush=on -device virtio-blk,id=hd0-device,drive=hd0-overlay0,serial=hd0
[2021-08-16T17:34:11.105 CEST] [debug] Waiting for 0 attempts
[2021-08-16T17:34:11.179 CEST] [debug] Waiting for 1 attempts
[2021-08-16T17:34:11.179 CEST] [info] ::: backend::baseclass::die_handler: Backend process died, backend errors are reported below in the following lines:
  QEMU terminated before QMP connection could be established. Check for errors below
[2021-08-16T17:34:11.179 CEST] [info] ::: OpenQA::Qemu::Proc::save_state: Saving QEMU state to qemu_state.json
[2021-08-16T17:34:11.180 CEST] [debug] Passing remaining frames to the video encoder
[2021-08-16T17:34:11.216 CEST] [debug] Waiting for video encoder to finalize the video
[2021-08-16T17:34:11.216 CEST] [debug] The built-in video encoder (pid 104862) terminated
[2021-08-16T17:34:11.216 CEST] [debug] QEMU: QEMU emulator version 6.0.0 (openSUSE Tumbleweed)
[2021-08-16T17:34:11.216 CEST] [debug] QEMU: Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers
[2021-08-16T17:34:11.217 CEST] [warn] !!! : qemu-system-x86_64: -chardev socket,path=qmp_socket,server,nowait,id=qmp_socket,logfile=qmp_socket.log,logappend=on: warning: short-form boolean option 'server' deprecated
[2021-08-16T17:34:11.217 CEST] [debug] QEMU: Please use server=on instead
[2021-08-16T17:34:11.217 CEST] [warn] !!! : qemu-system-x86_64: -chardev socket,path=qmp_socket,server,nowait,id=qmp_socket,logfile=qmp_socket.log,logappend=on: warning: short-form boolean option 'nowait' deprecated
[2021-08-16T17:34:11.217 CEST] [debug] QEMU: Please use wait=off instead
[2021-08-16T17:34:11.217 CEST] [warn] !!! : qemu-system-x86_64: error: failed to set MSR 0x48d to 0x5600000016
[2021-08-16T17:34:11.217 CEST] [warn] !!! : qemu-system-x86_64: ../target/i386/kvm/kvm.c:2753: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
[2021-08-16T17:34:11.217 CEST] [debug] sending magic and exit
[2021-08-16T17:34:11.220 CEST] [debug] backend process exited: 0
[2021-08-16T17:34:11.220 CEST] [debug] received magic close
failed to start VM at /usr/lib/os-autoinst/backend/driver.pm line 126.

This is the latest Tumbleweed running in vmware fusion 12.

Edit: Updated with the complete QEMU output for completeness' sake.

#22 Updated by kraih 4 months ago

okurz wrote:

I wonder what the "Please use wait=off instead" means.

Just an unrelated deprecation warning.
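
To separate such deprecation noise from the actual failure, the log can be filtered. The sketch below assumes the "[timestamp] [level] message" line layout seen in the output above and treats any warn-level line containing "deprecated" as noise; the function name and the filter rule are illustrative, not part of os-autoinst.

```python
import re

# Matches log lines of the form "[timestamp] [level] message".
LOG_LINE = re.compile(r"^\[(?P<time>[^\]]+)\] \[(?P<level>\w+)\] (?P<message>.*)$")
WARN_PREFIX = "!!! : "

# Four lines copied from the QEMU output earlier in this ticket.
SAMPLE = """\
[2021-08-16T17:34:11.217 CEST] [warn] !!! : qemu-system-x86_64: -chardev socket,path=qmp_socket,server,nowait,id=qmp_socket,logfile=qmp_socket.log,logappend=on: warning: short-form boolean option 'nowait' deprecated
[2021-08-16T17:34:11.217 CEST] [debug] QEMU: Please use wait=off instead
[2021-08-16T17:34:11.217 CEST] [warn] !!! : qemu-system-x86_64: error: failed to set MSR 0x48d to 0x5600000016
[2021-08-16T17:34:11.217 CEST] [warn] !!! : qemu-system-x86_64: ../target/i386/kvm/kvm.c:2753: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
"""


def qemu_errors(log_text):
    """Return warn-level messages from the log, skipping deprecation notices."""
    errors = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if not match or match["level"] != "warn":
            continue
        message = match["message"]
        if message.startswith(WARN_PREFIX):
            message = message[len(WARN_PREFIX):]
        if "deprecated" in message:
            continue
        errors.append(message)
    return errors


if __name__ == "__main__":
    for message in qemu_errors(SAMPLE):
        print(message)
```

On the sample lines this leaves only the MSR error and the kvm_buf_set_msrs assertion.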

#23 Updated by kraih 4 months ago

  • Due date changed from 2021-08-13 to 2021-09-30

#24 Updated by kraih 4 months ago

  • Due date deleted (2021-09-30)

#25 Updated by okurz 3 months ago

  • Subject changed from Extend existing openQA-in-openQA tests as a learning exercise to know where our instructions or beginner situation can be improved size:L to Extend existing openQA-in-openQA tests as a learning exercise to know where our instructions or beginner situation can be improved
  • Status changed from Workable to New
  • Assignee deleted (kraih)
  • Priority changed from Normal to Low
  • Target version changed from Ready to future

So it seems this ticket was actually never estimated. And with kraih hitting the mentioned problems we should re-evaluate.
