action #64030

closed

[kernel][ltp][spvm] test fails in boot_ltp - Not finished boot

Added by pcervinka about 4 years ago. Updated over 3 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Bugs in existing tests
Target version:
QE Kernel - QE Kernel Done
Start date:
2020-03-02
Due date:
% Done:
0%
Estimated time:
Difficulty:

Description

Observation

openQA test in scenario sle-15-SP2-Online-ppc64le-ltp_openposix_spvm@ppc64le-spvm fails in boot_ltp

Test suite description

Reproducible

Fails since (at least) Build 148.1 (current job)

Expected result

Last good: 146.1 (or more recent)

Further details

Always latest result in this scenario: latest

Actions #1

Updated by pcervinka about 4 years ago

[2020-02-27T02:03:51.220 CET] [debug] tests/kernel/boot_ltp.pm:36 called testapi::record_info
[2020-02-27T02:03:51.221 CET] [debug] <<< testapi::record_info(title="INFO", output="normal boot or boot with params", result="ok")
[2020-02-27T02:03:51.222 CET] [debug] tests/kernel/boot_ltp.pm:38 called opensusebasetest::wait_boot -> lib/opensusebasetest.pm:980 called opensusebasetest::handle_grub -> lib/opensusebasetest.pm:784 called opensusebasetest::wait_grub -> lib/opensusebasetest.pm:603 called testapi::assert_screen
.
.
.
.
[2020-02-29T03:42:33.285 CET] [debug] no match: 3.6s, best candidate: linux-login-bsc1055103-20170822 (0.00)
[2020-02-29T03:42:34.287 CET] [debug] no match: 2.6s, best candidate: linux-login-bsc1055103-20170822 (0.00)
[2020-02-29T03:42:35.283 CET] [debug] no match: 1.6s, best candidate: linux-login-bsc1055103-20170822 (0.00)
[2020-02-29T03:42:38.289 CET] [debug] WARNING: check_asserted_screen took 2.12 seconds for 30 candidate needles - make your needles more specific
[2020-02-29T03:42:38.289 CET] [debug] no match: 0.6s, best candidate: linux-login-bsc1055103-20170822 (0.00)
[2020-02-29T03:42:40.419 CET] [debug] WARNING: check_asserted_screen took 2.12 seconds for 30 candidate needles - make your needles more specific
[2020-02-29T03:42:40.419 CET] [debug] no match: -1.5s, best candidate: linux-login-bsc1055103-20170822 (0.00)
[2020-02-29T03:42:41.197 CET] [debug] >>> testapi::_check_backend_response: match=emergency-mode,emergency-shell,linux-login timed out after 500 (assert_screen)
[2020-02-29T03:42:41.468 CET] [debug] no candidate needle with tag(s) 'linux-login, emergency-shell, emergency-mode' matched
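
The key line in this excerpt is the `_check_backend_response` one: it names the needle tags that were awaited and how long assert_screen waited. A minimal sketch for pulling those values out of such a line follows. The regular expression is an assumption derived from this single log line, not from a documented os-autoinst log format, and `parse_timeout` is a hypothetical helper name.

```python
import re

# Line copied verbatim from the log excerpt above.
LINE = (
    "[2020-02-29T03:42:41.197 CET] [debug] >>> testapi::_check_backend_response: "
    "match=emergency-mode,emergency-shell,linux-login timed out after 500 (assert_screen)"
)

# Assumption: pattern inferred from the one line above, not from a format spec.
TIMEOUT_RE = re.compile(
    r"match=(?P<tags>\S+) timed out after (?P<seconds>\d+) \((?P<call>\w+)\)"
)


def parse_timeout(line):
    """Return (tags, seconds, call) for a timeout line, or None if it does not match."""
    m = TIMEOUT_RE.search(line)
    if m is None:
        return None
    return m.group("tags").split(","), int(m.group("seconds")), m.group("call")


tags, seconds, call = parse_timeout(LINE)
print(f"{call} waited {seconds}s for any of: {', '.join(tags)}")
```

None of the three tags matched within the timeout, which is why the boot is reported as not finished.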

Actions #2

Updated by pcervinka about 4 years ago

  • Status changed from New to In Progress
  • Priority changed from Normal to High

Actions #3

Updated by pcervinka about 4 years ago

The failures started appearing randomly with build 143.1. Around that time, nfs-client was added to the recommends in the ltp spec file.

Actions #4

Updated by pcervinka about 4 years ago

I couldn't reproduce it locally. It could be related to the recently added nfs-client or to something else. The first troubleshooting step is to increase the time allowed for boot:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/9674

Actions #5

Updated by pcervinka about 4 years ago

PR merged; I will restart the spvm jobs.

Actions #6

Updated by pcervinka about 4 years ago

Added TIMEOUT_SCALE=3 to ltp*spvm test suites.

Actions #7

Updated by pcervinka about 4 years ago

The timeout didn't help at all: https://openqa.suse.de/tests/3951013#
We will have to add systemctl list-jobs to the problem detection so that we can see which jobs are still pending during boot.

Actions #8

Updated by pcervinka about 4 years ago

  • Subject changed from [kernel][ltp][spvm] test fails in boot_ltp - NFSD: starting 90-seconds grace period timeout to [kernel][ltp][spvm] test fails in boot_ltp - Not finished boot

Actions #9

Updated by pcervinka about 4 years ago

$ systemctl list-jobs
JOB UNIT                                 TYPE  STATE
250 getty.target                         start waiting
109 multi-user.target                    start waiting
265 systemd-update-utmp-runlevel.service start waiting
108 graphical.target                     start waiting
256 getty@tty1.service                   start waiting
260 plymouth-quit-wait.service           start running
251 serial-getty@hvc0.service            start waiting
259 after-local.service                  start waiting

8 jobs listed.
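
In this listing only one job is in state running, while the other seven are waiting. A small sketch of how such output could be filtered automatically is shown below. The helper names `parse_jobs` and `running_units` are hypothetical, and the parsing assumes the four-column layout seen above. It only reports what is running; it does not prove that the running unit is the root cause.

```python
# Output copied from the comment above (whitespace-separated columns).
LISTING = """\
JOB UNIT TYPE STATE
250 getty.target start waiting
109 multi-user.target start waiting
265 systemd-update-utmp-runlevel.service start waiting
108 graphical.target start waiting
256 getty@tty1.service start waiting
260 plymouth-quit-wait.service start running
251 serial-getty@hvc0.service start waiting
259 after-local.service start waiting

8 jobs listed.
"""


def parse_jobs(text):
    """Parse `systemctl list-jobs` output into a list of dicts.

    Assumption: every job row has exactly four whitespace-separated
    fields and starts with a numeric job ID, as in the listing above.
    """
    jobs = []
    for line in text.splitlines():
        fields = line.split()
        # Skips the header, blank lines and the "N jobs listed." summary.
        if len(fields) == 4 and fields[0].isdigit():
            job_id, unit, job_type, state = fields
            jobs.append(
                {"id": int(job_id), "unit": unit, "type": job_type, "state": state}
            )
    return jobs


def running_units(jobs):
    """Return the units whose job is currently running (not merely waiting)."""
    return [job["unit"] for job in jobs if job["state"] == "running"]


jobs = parse_jobs(LISTING)
print(running_units(jobs))
```

For the listing above this prints a single unit, plymouth-quit-wait.service.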

Actions #10

Updated by pcervinka about 4 years ago

Removed TIMEOUT_SCALE=3 from ltp*spvm test suites, as it doesn't help and makes no sense.

Actions #11

Updated by pcervinka about 4 years ago

  • Priority changed from High to Normal

Very strange: I couldn't reproduce it in the lab again, and it also didn't happen in the latest run on osd (151.2).

Actions #12

Updated by pcervinka about 4 years ago

  • Status changed from In Progress to Feedback

While checking the logs, I noticed that we install X with GNOME, which is not needed. As a result, I updated the default_kernel_spvm test suite on osd with `PATTERNS=base,minimal`.
Verification run: http://10.100.12.105/tests/4126#step/installation_overview/25

Let's wait for the next build on osd.

Actions #13

Updated by jlausuch about 4 years ago

Looks like your quick fix worked. The latest builds on OSD are OK, starting from 151.2.

Actions #14

Updated by pcervinka about 4 years ago

  • Status changed from Feedback to Resolved

Looks good so far. If it happens again, we will start a new investigation.

Actions #15

Updated by metan about 4 years ago

  • Target version changed from 445 to 457

Actions #16

Updated by pcervinka over 3 years ago

  • Target version changed from 457 to QE Kernel Done