action #122143

closed

[qe-core][functional] test fails in bootloader because grub rescue mode entered due to network issue

Added by zluo almost 2 years ago. Updated over 1 year ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Bugs in existing tests
Target version:
Start date: 2022-12-19
Due date:
% Done: 0%
Estimated time:
Difficulty:

Description

Observation

We have had this issue for quite a long time, and there is no workaround or fix for it yet.

openQA test in scenario sle-15-SP5-Online-ppc64le-textmode+role_textmode@ppc64le-hmc fails in bootloader

Test suite description

Maintainers: QE Core, mgriessmeier

Like default but explicitly select the system role "textmode".

Reproducible

Fails since (at least) Build 61.1 (current job)

Expected result

Last good: 58.1 (or more recent)

Further details

Always latest result in this scenario: latest


Related issues 1 (0 open, 1 closed)

Related to openQA Project - action #120570: [qe-core][functional][tools] test fails in bootloader because root device is not ready and it leads to kernel panic size:M (Resolved, okurz)

Actions #1

Updated by zluo almost 2 years ago

  • Related to action #120570: [qe-core][functional][tools] test fails in bootloader because root device is not ready and it leads to kernel panic size:M added
Actions #2

Updated by zluo almost 2 years ago

The grenache PVM had trouble. After a reboot it looks normal again, but there is now a new issue with reading a file under /boot:

https://openqa.suse.de/tests/10214478#step/bootloader/19

Actions #3

Updated by zluo almost 2 years ago

  • Subject changed from [qe-core][functional] test fails in bootloader because shutdown of lpar takes too long to [qe-core][functional] test fails in bootloader because grub rescue mode entered due to network issue

Changed the title, since the work now focuses on a workaround for the grub rescue mode issue.

Actions #4

Updated by zluo almost 2 years ago

Catching the grub rescue mode case is not a problem:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/16120

However, retrying the LPAR netboot reset still does not really work, so I think the root issue is in the setup/allocation of the PVM.
When I triggered 10 jobs, all of them worked fine without any issue, but when I triggered more than 20 jobs, the sporadic issue occurred.
If an error such as a failure to load the initrd occurs, it cannot be resolved by retrying.
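
The retry reasoning above can be sketched roughly as follows. This is only an illustration in Python, not the code from the pull request (the test distribution itself is written in Perl). The marker strings, the function names `classify_boot_output` and `boot_with_retry`, and the `reset_and_netboot` callback are all hypothetical.

```python
from typing import Callable

# Hypothetical console markers; the strings seen on the real console may differ.
GRUB_RESCUE_MARKER = "grub rescue>"
INITRD_ERROR_MARKER = "error loading initrd"  # assumed wording


def classify_boot_output(output: str) -> str:
    """Classify console output as 'initrd_error', 'grub_rescue' or 'ok'."""
    text = output.lower()
    if INITRD_ERROR_MARKER in text:
        return "initrd_error"
    if GRUB_RESCUE_MARKER in text:
        return "grub_rescue"
    return "ok"


def boot_with_retry(reset_and_netboot: Callable[[], str], max_attempts: int = 3) -> bool:
    """Retry the netboot only while the failure is the grub rescue prompt."""
    for _ in range(max_attempts):
        state = classify_boot_output(reset_and_netboot())
        if state == "ok":
            return True
        if state == "initrd_error":
            # Per the observation above, retrying does not help in this case.
            return False
        # state == "grub_rescue": fall through and try the reset again.
    return False
```

Here `reset_and_netboot` stands in for whatever resets the LPAR and returns the captured console text. The sketch gives up immediately on an initrd error and only spends further attempts on the grub rescue case.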

Actions #5

Updated by rfan1 almost 2 years ago

zluo wrote:

Catching the grub rescue mode case is not a problem:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/16120

However, retrying the LPAR netboot reset still does not really work, so I think the root issue is in the setup/allocation of the PVM.
When I triggered 10 jobs, all of them worked fine without any issue, but when I triggered more than 20 jobs, the sporadic issue occurred.
If an error such as a failure to load the initrd occurs, it cannot be resolved by retrying.

One minor comment: could we consider renewing the NBP file on the DHCP/install server? It was built in September 2020:
lrwxrwxrwx 1 root root 30 Sep 21 2020 grub2 -> boot/powerpc-ieee1275/core.elf

qanet:/srv/tftp/ppc64le/boot/grub2-ieee1275/powerpc-ieee1275 has been updated now; thanks for the report.
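
A check for an outdated network boot image like the one discussed above could look roughly like this in Python. It is a minimal sketch: the function names `file_age_days` and `is_stale` and the 365-day threshold are assumptions, not something used on the install server. Note that the listing above shows the timestamp of the symlink itself, while this sketch follows the link by default and inspects the target file.

```python
import os
import time


def file_age_days(path: str, follow_symlinks: bool = True) -> float:
    """Return the age of a file in days, based on its modification time."""
    st = os.stat(path, follow_symlinks=follow_symlinks)
    return (time.time() - st.st_mtime) / 86400.0


def is_stale(path: str, max_age_days: float = 365.0) -> bool:
    """Report whether the file is older than the (assumed) threshold."""
    return file_age_days(path) > max_age_days
```

Calling `is_stale` on the path of the boot image would flag a file built in 2020 as stale under the assumed one-year threshold.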

Actions #6

Updated by zluo over 1 year ago

  • Status changed from In Progress to Resolved