action #68510
test fails in logpackages on some aarch64 workers
Description
Observation
openQA test in scenario opensuse-15.2-DVD-aarch64-create_hdd_gnome-x11@aarch64 fails in logpackages
For the past few days, installation tests from ISO have been failing in logpackages on a few aarch64 workers (AWS, based on SLE15SP1, and HoneyComb LX2K, based on Tumbleweed) for both Leap 15.2 and Tumbleweed.
Test suite description
Reproducible
Fails since (at least) Build 208.3 (current job)
Expected result
Last good: 208.3 (or more recent)
Further details
Always latest result in this scenario: latest
Updated by ggardet_arm over 4 years ago
- Subject changed from test fails in logpackages on AWS worker only to test fails in logpackages on some aarch64 workers
- Description updated (diff)
It now occurs on my honeycomb lx2k machine, based on Tumbleweed: https://openqa.opensuse.org/tests/1320555#step/logpackages/4
Updated by okurz over 4 years ago
Regarding https://progress.opensuse.org/issues/68510: what the test module should do is send "ctrl-alt-f", after which the screen should show a console. https://openqa.opensuse.org/tests/1320555/file/autoinst-log.txt shows that select_console is indeed called:
[2020-06-30T08:44:06.213 UTC] [debug] tests/installation/logpackages.pm:31 called testapi::select_console
[2020-06-30T08:44:06.214 UTC] [debug] <<< testapi::select_console(testapi_console="install-shell")
Instead, as seen in
https://openqa.opensuse.org/tests/1320555#step/logpackages/2
there is no apparent screen change. Another tty key press should be sent in the post_fail_hook, but it has no effect there either:
https://openqa.opensuse.org/tests/1320555#step/logpackages/7
https://openqa.opensuse.org/tests/1320555#investigation shows that there are really no relevant test changes between "last good" and "first bad", but that of course compares worker "hctw01:10" against "openqa-aarch16:15". You could check whether anything relevant changed on the worker hosts where you encounter these problems, e.g. installed package updates. Maybe a kernel or qemu upgrade? You can also trigger a job explicitly on one of the affected worker hosts, use the developer mode to pause at the start of "logpackages", and check interactively whether the tty switch takes effect.
Updated by ggardet_arm over 4 years ago
Unfortunately, developer mode does not work because these are remote workers that do not permit incoming connections.
AWS worker (broken):
- OS: SLE15SP1
- os-autoinst: 4.6.1592908950.5038d8c2-458.1 (was working with 4.6.1592836702.7b31205c, only diff is https://github.com/os-autoinst/os-autoinst/pull/1449 so unlikely the cause)
- openQA-worker: 4.6.1593437825.61ee7d716-2920.1
- qemu: 3.1.1.1-9.21.4
- qemu-uefi-aarch64: 202002-3.2
HoneyComb LX2K worker (broken):
- OS: Tumbleweed
- os-autoinst: 4.6.1592908950.5038d8c2-1.1
- openQA-worker: 4.6.1592865598.88eb8f02c-1.1
- qemu: 5.0.0-3.1
- qemu-uefi-aarch64: 202005-1.1
D05 (working):
- OS: Leap 15.1
- os-autoinst: 4.6.1592908950.5038d8c2-1.1
- openQA-worker: ?
- qemu: 3.1.1.1
- qemu-uefi-aarch64: ?
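One way to act on the suggestion to compare installed packages is to diff the three lists above mechanically. The following is only a sketch: the version strings are copied from the lists above, while the names `WORKERS`, `upstream_version` and `differing_components` are hypothetical helpers, not part of openQA or os-autoinst. It assumes that everything after the last "-" in a version string is the package release and can be ignored, and it skips values reported as "?".

```python
# Versions copied from the worker lists above; "?" means "not reported".
WORKERS = {
    "AWS (broken)": {
        "OS": "SLE15SP1",
        "os-autoinst": "4.6.1592908950.5038d8c2-458.1",
        "openQA-worker": "4.6.1593437825.61ee7d716-2920.1",
        "qemu": "3.1.1.1-9.21.4",
        "qemu-uefi-aarch64": "202002-3.2",
    },
    "HoneyComb LX2K (broken)": {
        "OS": "Tumbleweed",
        "os-autoinst": "4.6.1592908950.5038d8c2-1.1",
        "openQA-worker": "4.6.1592865598.88eb8f02c-1.1",
        "qemu": "5.0.0-3.1",
        "qemu-uefi-aarch64": "202005-1.1",
    },
    "D05 (working)": {
        "OS": "Leap 15.1",
        "os-autoinst": "4.6.1592908950.5038d8c2-1.1",
        "openQA-worker": "?",
        "qemu": "3.1.1.1",
        "qemu-uefi-aarch64": "?",
    },
}


def upstream_version(version):
    """Drop a trailing '-<release>' suffix (assumed to be the package release)."""
    return version.rsplit("-", 1)[0]


def differing_components(workers):
    """Map each component whose known upstream versions differ to those versions."""
    components = sorted({c for versions in workers.values() for c in versions})
    result = {}
    for component in components:
        known = {
            name: upstream_version(versions[component])
            for name, versions in workers.items()
            if versions.get(component, "?") != "?"  # skip unreported values
        }
        if len(set(known.values())) > 1:
            result[component] = known
    return result


if __name__ == "__main__":
    for component, versions in differing_components(WORKERS).items():
        print(component, versions)
```

With the data above, this reports OS, openQA-worker, qemu and qemu-uefi-aarch64 as differing, while os-autoinst has the same upstream version on all three hosts. Note also that the qemu upstream version 3.1.1.1 appears on both the broken AWS worker and the working D05, although the package releases may still differ.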
Updated by ggardet_arm over 4 years ago
I managed to connect to qemu via VNC (on the HoneyComb LX2K, as it is a local machine for me) and the tty switch actually happens, but openQA still sees the old screen!
So we apparently have a screen refresh problem!
Updated by mkittler over 4 years ago
A screen refreshing problem has actually recently been reported. There's also a workaround: https://github.com/os-autoinst/os-autoinst/pull/1451
However, it seemed specific to SLE and not clearly related.
Updated by ggardet_arm over 4 years ago
- Related to action #68474: similarity calculation can return NaN added
Updated by ggardet_arm over 4 years ago
- Status changed from New to Resolved
This PR fixes the problem for both the SLE15SP1-based AWS system and the Tumbleweed-based machine. In both cases the log now shows: WARNING: cv::norm() returned NaN (poo#68474)
- SLE15SP1: https://openqa.opensuse.org/tests/1320671
- Tumbleweed: https://openqa.opensuse.org/tests/1320672
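The NaN warning hints at why openQA could keep showing the old screen although the tty switch happened in qemu. The sketch below is an illustration only and is not os-autoinst's actual code: the function names and the threshold value are hypothetical. It shows the general pitfall that every ordered comparison against NaN is false, so a change detector that asks "is the similarity below the threshold?" never reports a change once the similarity is NaN, unless NaN is handled explicitly.

```python
import math

# Hypothetical threshold, chosen for illustration only.
SIMILARITY_THRESHOLD = 50.0


def screen_changed_naive(similarity):
    """Report a change when similarity drops below the threshold.

    If similarity is NaN the comparison is always False, so the
    screen is wrongly treated as unchanged.
    """
    return similarity < SIMILARITY_THRESHOLD


def screen_changed_guarded(similarity):
    """Same check, but treat an undefined (NaN) similarity as a change."""
    if math.isnan(similarity):
        return True
    return similarity < SIMILARITY_THRESHOLD


if __name__ == "__main__":
    nan = float("nan")
    print(screen_changed_naive(nan))    # False: the change goes unnoticed
    print(screen_changed_guarded(nan))  # True: the frame is refreshed
```

Treating NaN as "changed" is one possible design choice for such a guard; it errs on the side of refreshing the screen too often rather than not at all.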