action #134357
closed: [tools] kvm test run fails on OW21 and OW22 - but works on OW4
Description
Observation
openQA test in scenario microos-Tumbleweed-MicroOS-Image-x86_64-microos_qemu@64bit fails in kvm.
I could not spot what is going on, but whenever this runs on OW4 it passes, while it fails on OW21/OW22.
Test suite description
Reproducible
Fails since (at least) Build 20230816
Expected result
Last good: 20230815 (or more recent)
Further details
Always latest result in this scenario: latest
Updated by dimstar over 1 year ago
- Related to action #134285: [tools] Openqa bootstrap: test fails in test_results size:M auto_review:"Test died: no candidate needle with tag.+openqa-testresult" added
Updated by favogt over 1 year ago
With the qemu monitor and debug logs it's visible that KVM never makes any progress. It tries to run the vcpu but it never runs the first instruction. IP is stuck at FFF0.
I suspect a kernel bug, probably on the (openQA) VM host side. Using the latest TW kernel on the worker didn't help though.
Updated by dimstar over 1 year ago
Some experiments hint at a kernel bug on the host: https://bugzilla.opensuse.org/show_bug.cgi?id=1214537
Updated by favogt over 1 year ago
- Status changed from New to Rejected
Confirmed as product bug meanwhile.
I used ow19 for performing some tests (and building test kernels. The workers are nice kernel build hosts!).
We could identify that the latest upstream kernel 6.5-rc7 works (while -rc6 is broken). I'll keep ow19 reserved for a bit still to test a kernel with backported fixes.
Updated by okurz over 1 year ago
- Status changed from Rejected to Blocked
- Assignee set to okurz
- Target version set to Ready
Then let's keep this ticket open as tracker until we can again run the latest stable distribution kernel. I will track the bug. Should we downgrade and lock the older kernel version on more hosts? What if we upgrade more workers to Leap 15.5?
Updated by favogt over 1 year ago
okurz wrote in #note-5:
Then let's keep this ticket open as tracker until we can again run the latest stable distribution kernel. I will track the bug. Should we downgrade and lock the older kernel version on more hosts? What if we upgrade more workers to Leap 15.5?
15.4 and 15.5 are equally affected. As a workaround, kernel 6.5-rc7+ can be installed or spec_rstack_overflow=off added to the kernel cmdline.
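To see whether the cmdline workaround is in effect on a worker, something like the following could be used. This is only a sketch: srso_mitigation_off is a hypothetical helper name, and it only looks for the exact spec_rstack_overflow=off token mentioned above, not for any other setting that might also disable the mitigation.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the given kernel command line contains
# spec_rstack_overflow=off. Without an argument it checks the running kernel.
srso_mitigation_off() {
    cmdline="${1-$(cat /proc/cmdline 2>/dev/null || true)}"
    # Pad with spaces so only a complete, space-separated token matches.
    case " $cmdline " in
        *" spec_rstack_overflow=off "*) return 0 ;;
        *) return 1 ;;
    esac
}

if srso_mitigation_off; then
    echo "spec_rstack_overflow=off is set on the running kernel"
else
    echo "spec_rstack_overflow=off is not set on the running kernel"
fi
```

On kernels that know about this mitigation, /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow should also report the current state.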
Updated by openqa_review over 1 year ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: extra_tests_vagrant
https://openqa.opensuse.org/tests/3558849#step/sshfs/1
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g.
label:wontfix:boo1234
Expect the next reminder at the earliest in 28 days if nothing changes in this ticket.
Updated by maritawerner over 1 year ago
- Subject changed from kvm test run fails on OW21 and OW22 - but works on OW4 to [tools] kvm test run fails on OW21 and OW22 - but works on OW4
Updated by okurz about 1 year ago
- Target version changed from Ready to Tools - Next
Still blocked on https://bugzilla.opensuse.org/show_bug.cgi?id=1214537
Updated by dheidler about 1 year ago
FTR: Workarounds are currently applied on openqaworker21-26
Working kernel versions for 15.4 and 15.5:
kernel-default-5.14.21-150400.24.74.1.x86_64
kernel-default-5.14.21-150500.55.12.1.x86_64
zypper in --oldpackage kernel-default{,-extra,-optional}-5.14.21-150500.55.12.1.x86_64
zypper al -m bsc#1214537 'kernel*'
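For the 15.4 workers the same two commands apply with the other version string. A small sketch that only prints the commands for a given version (it does not run zypper); pin_kernel_cmds is a hypothetical helper, and the x86_64 default is an assumption taken from the package names above:

```shell
#!/bin/sh
# Hypothetical helper: print (do not execute) the downgrade and lock commands
# for one kernel-default version. The brace expansion
# kernel-default{,-extra,-optional} is spelled out as three package names.
pin_kernel_cmds() {
    version="$1"
    arch="${2:-x86_64}"
    pkgs=""
    for suffix in "" -extra -optional; do
        pkgs="$pkgs kernel-default${suffix}-${version}.${arch}"
    done
    echo "zypper in --oldpackage${pkgs}"
    echo "zypper al -m bsc#1214537 'kernel*'"
}

# Version taken from the list of working kernels for 15.4 above.
pin_kernel_cmds 5.14.21-150400.24.74.1
```

Printing instead of executing keeps the sketch safe to try on any machine; the output can be reviewed before being pasted into a root shell.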
Updated by okurz 7 months ago
- Status changed from Blocked to New
- Target version changed from Tools - Next to Ready
https://bugzilla.opensuse.org/show_bug.cgi?id=1214537 is resolved and fvogt stated that he already removed zypper locks. Will check state on workers
Updated by okurz 7 months ago
- Status changed from New to Resolved
No package locks left on o3 x86_64 machines, recent kernel installed. https://openqa.opensuse.org/tests/latest?arch=x86_64&distri=microos&flavor=MicroOS-Image&machine=64bit&test=microos_qemu&version=Tumbleweed is all green. All good.