action #134357

closed

[tools] kvm test run fails on OW21 and OW22 - but works on OW4

Added by dimstar over 1 year ago. Updated 7 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Bugs in existing tests
Start date:
2023-08-17
Due date:
% Done:

0%

Estimated time:
Difficulty:
Tags:

Description

Observation

openQA test in scenario microos-Tumbleweed-MicroOS-Image-x86_64-microos_qemu@64bit fails in kvm

I did not spot what is going on, but whenever this runs on OW4 it passes, while it fails on OW21/OW22.

Test suite description

Reproducible

Fails since (at least) Build 20230816

Expected result

Last good: 20230815 (or more recent)

Further details

Always latest result in this scenario: latest


Related issues 1 (0 open, 1 closed)

Related to openQA Tests (public) - action #134285: [tools] Openqa bootstrap: test fails in test_results size:M auto_review:"Test died: no candidate needle with tag.+openqa-testresult" (Resolved, livdywan, 2023-08-15)

Actions #1

Updated by dimstar over 1 year ago

  • Related to action #134285: [tools] Openqa bootstrap: test fails in test_results size:M auto_review:"Test died: no candidate needle with tag.+openqa-testresult" added
Actions #2

Updated by favogt over 1 year ago

With the qemu monitor and debug logs it's visible that KVM never makes any progress. It tries to run the vcpu but it never runs the first instruction. IP is stuck at FFF0.

I suspect a kernel bug, probably on the (openQA) VM host side. Using the latest TW kernel on the worker didn't help though.
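
For context on the FFF0 value: on x86 the first instruction after reset is fetched from the reset vector, which is formed from the architectural reset values CS base 0xFFFF0000 and IP 0xFFF0. An IP stuck at FFF0 therefore means the vCPU never got past its very first fetch. A small sketch of the arithmetic (the function name `reset_vector` is illustrative and not from the ticket):

```shell
#!/bin/sh
# Illustrative helper (not from the ticket): compute the x86 reset vector
# from the architectural reset values CS.base=0xFFFF0000 and IP=0xFFF0.
reset_vector() {
    printf '0x%X\n' $((0xFFFF0000 + 0xFFF0))
}

reset_vector
```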

Actions #3

Updated by dimstar over 1 year ago

Some experiments hint at a kernel bug on the host: https://bugzilla.opensuse.org/show_bug.cgi?id=1214537

Actions #4

Updated by favogt over 1 year ago

  • Status changed from New to Rejected

Confirmed as product bug meanwhile.

I used ow19 for performing some tests and building test kernels (the workers are nice kernel build hosts!).
We could identify that the latest upstream kernel 6.5-rc7 works, while -rc6 is broken. I'll keep ow19 reserved a bit longer to test a kernel with backported fixes.

Actions #5

Updated by okurz over 1 year ago

  • Status changed from Rejected to Blocked
  • Assignee set to okurz
  • Target version set to Ready

Then let's keep this ticket open as a tracker until we can again run the latest stable distribution kernel. I will track the bug. Should we downgrade and lock the older kernel version on more hosts? What if we upgrade more workers to Leap 15.5?

Actions #6

Updated by favogt over 1 year ago

okurz wrote in #note-5:

Then let's keep this ticket open as a tracker until we can again run the latest stable distribution kernel. I will track the bug. Should we downgrade and lock the older kernel version on more hosts? What if we upgrade more workers to Leap 15.5?

15.4 and 15.5 are equally affected. As a workaround, kernel 6.5-rc7+ can be installed or spec_rstack_overflow=off added to the cmdline.
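
A minimal sketch for checking whether the command-line workaround is active on a host. It only looks for the `spec_rstack_overflow=off` parameter named above; the function name `has_srso_off` and the `CMDLINE_FILE` override are illustrative assumptions, not part of any existing tooling:

```shell
#!/bin/sh
# has_srso_off CMDLINE: return 0 if the given kernel command line
# contains the parameter spec_rstack_overflow=off as a separate word.
has_srso_off() {
    for arg in $1; do
        [ "$arg" = "spec_rstack_overflow=off" ] && return 0
    done
    return 1
}

# Usage: inspect the running kernel's command line.
# CMDLINE_FILE is a hypothetical override, useful for testing.
cmdline_file=${CMDLINE_FILE:-/proc/cmdline}
if [ -r "$cmdline_file" ] && has_srso_off "$(cat "$cmdline_file")"; then
    echo "workaround active"
else
    echo "workaround not active"
fi
```

Matching whole words avoids false positives from parameters that merely contain the string, such as a different value for the same option.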

Actions #7

Updated by openqa_review over 1 year ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: extra_tests_vagrant
https://openqa.opensuse.org/tests/3558849#step/sshfs/1

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.

Actions #8

Updated by maritawerner over 1 year ago

  • Subject changed from kvm test run fails on OW21 and OW22 - but works on OW4 to [tools] kvm test run fails on OW21 and OW22 - but works on OW4
Actions #9

Updated by okurz about 1 year ago

  • Target version changed from Ready to Tools - Next
Actions #10

Updated by dheidler about 1 year ago

FTR: Workarounds are currently applied on openqaworker21-26

Working kernel versions for 15.4 and 15.5:

kernel-default-5.14.21-150400.24.74.1.x86_64
kernel-default-5.14.21-150500.55.12.1.x86_64

zypper in --oldpackage kernel-default{,-extra,-optional}-5.14.21-150500.55.12.1.x86_64
zypper al -m bsc#1214537 'kernel*'
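
To verify that a worker actually ended up on one of the kernels listed above, something along these lines could be used. This is a sketch only: the helper name `is_known_good` is made up here, and it assumes `rpm -q kernel-default` prints one full package name per line:

```shell
#!/bin/sh
# Known-good kernel-default packages quoted in the comment above.
KNOWN_GOOD="kernel-default-5.14.21-150400.24.74.1.x86_64
kernel-default-5.14.21-150500.55.12.1.x86_64"

# is_known_good PKG: return 0 if PKG exactly matches a known-good package.
is_known_good() {
    printf '%s\n' "$KNOWN_GOOD" | grep -qxF -- "$1"
}

# Usage: report on every installed kernel-default package.
# Skipped silently if rpm is missing or the package is not installed.
if pkgs=$(rpm -q kernel-default 2>/dev/null); then
    printf '%s\n' "$pkgs" | while read -r pkg; do
        if is_known_good "$pkg"; then
            echo "known good: $pkg"
        else
            echo "not in known-good list: $pkg"
        fi
    done
fi
```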
Actions #11

Updated by okurz 7 months ago

  • Status changed from Blocked to New
  • Target version changed from Tools - Next to Ready

https://bugzilla.opensuse.org/show_bug.cgi?id=1214537 is resolved and fvogt stated that he already removed the zypper locks. I will check the state on the workers.

Actions #12

Updated by okurz 7 months ago

  • Status changed from New to Resolved

No package locks are left on the o3 x86_64 machines and a recent kernel is installed. https://openqa.opensuse.org/tests/latest?arch=x86_64&distri=microos&flavor=MicroOS-Image&machine=64bit&test=microos_qemu&version=Tumbleweed is all green. All good.
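
A check like the one described could be scripted roughly as follows. This is a sketch under assumptions: it assumes zypper keeps its locks in `/etc/zypp/locks` as stanzas containing a `solvable_name:` line, and the names `count_kernel_locks` and `LOCKS_FILE` are invented for illustration. Running `zypper ll` on the host remains the authoritative way to list locks.

```shell
#!/bin/sh
# count_kernel_locks FILE: print the number of lock entries whose
# solvable_name starts with "kernel". A missing or unreadable file
# counts as zero locks.
count_kernel_locks() {
    [ -r "$1" ] || { echo 0; return 0; }
    # grep -c prints 0 but exits non-zero when nothing matches.
    grep -c '^solvable_name:[[:space:]]*kernel' "$1" || true
}

# Usage: LOCKS_FILE is a hypothetical override for testing.
locks_file=${LOCKS_FILE:-/etc/zypp/locks}
echo "kernel locks: $(count_kernel_locks "$locks_file")"
echo "running kernel: $(uname -r)"
```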
