action #122143

[qe-core][functional] test fails in bootloader because grub rescue mode entered due to network issue

Added by zluo 3 months ago. Updated 30 days ago.

Status: In Progress
Priority: Normal
Assignee:
Category: Bugs in existing tests
Target version:
Start date: 2022-12-19
Due date:
% Done: 0%
Estimated time:
Difficulty:

Description

Observation

We have had this issue for quite a long time, and there is no workaround or fix for it yet.

openQA test in scenario sle-15-SP5-Online-ppc64le-textmode+role_textmode@ppc64le-hmc fails in bootloader

Test suite description

Maintainers: QE Core, mgriessmeier

Like default but explicitly select the system role "textmode".

Reproducible

Fails since (at least) Build 61.1 (current job)

Expected result

Last good: 58.1 (or more recent)

Further details

Always latest result in this scenario: latest


Related issues

Related to openQA Tests - action #120570: [qe-core][functional] test fails in bootloader because root device is not ready and it leads to kernel panic

History

#1 Updated by zluo 3 months ago

  • Related to action #120570: [qe-core][functional] test fails in bootloader because root device is not ready and it leads to kernel panic added

#2 Updated by zluo 3 months ago

The grenache PVM had trouble; after a reboot it looks normal again, but there is now a new issue with reading a file under /boot:

https://openqa.suse.de/tests/10214478#step/bootloader/19

#3 Updated by zluo 3 months ago

  • Subject changed from [qe-core][functional] test fails in bootloader because shutdown of lpar takes too long to [qe-core][functional] test fails in bootloader because grub rescue mode entered due to network issue

Changed the title, since I am now working on a workaround for the grub rescue mode issue.

#4 Updated by zluo 3 months ago

Detecting the grub rescue mode case is not a problem:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/16120

However, retrying the LPAR reset and netboot still does not really work, so I think the root cause lies in the setup/allocation of the PVM.
When I triggered 10 jobs, all of them passed without any issue, but when I triggered more than 20 jobs, the sporadic issue appeared.
If an error occurs at a stage such as loading the initrd, it cannot be resolved by retrying.
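
To illustrate the distinction between a retryable grub rescue prompt and a non-retryable initrd error, here is a minimal Python sketch. It is not the code from the pull request above. The console patterns, the function names, and the `boot_once` callback (assumed to reset the LPAR, start netboot, and return the captured console text) are all hypothetical.

```python
import re

# Hypothetical patterns; the real console messages may differ.
GRUB_RESCUE = re.compile(r"grub rescue>")
INITRD_ERROR = re.compile(r"error: .*initrd", re.IGNORECASE)


def classify_boot_output(console_text):
    """Return 'rescue', 'initrd_error', or 'ok' for captured console text."""
    if GRUB_RESCUE.search(console_text):
        return "rescue"
    if INITRD_ERROR.search(console_text):
        return "initrd_error"
    return "ok"


def netboot_with_retry(boot_once, max_attempts=3):
    """Call boot_once() until the output looks clean or attempts run out.

    boot_once is a caller-supplied (hypothetical) function that resets the
    LPAR, starts netboot, and returns the console text it captured.
    Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        state = classify_boot_output(boot_once())
        if state == "ok":
            return attempt
        if state == "initrd_error":
            # Assumption taken from the comment above: retrying does not help.
            raise RuntimeError("initrd load error; retrying will not help")
        # state == "rescue": fall through and try again
    raise RuntimeError(f"still in grub rescue after {max_attempts} attempts")
```

Bounding the number of attempts keeps a job from looping forever when the underlying PVM allocation problem persists.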

#5 Updated by rfan1 30 days ago

zluo wrote:

Detecting the grub rescue mode case is not a problem:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/16120

However, retrying the LPAR reset and netboot still does not really work, so I think the root cause lies in the setup/allocation of the PVM.
When I triggered 10 jobs, all of them passed without any issue, but when I triggered more than 20 jobs, the sporadic issue appeared.
If an error occurs at a stage such as loading the initrd, it cannot be resolved by retrying.

One minor comment: could we consider renewing the NBP file on the DHCP/install server? It appears to date from September 2020:
lrwxrwxrwx 1 root root 30 Sep 21 2020 grub2 -> boot/powerpc-ieee1275/core.elf
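
Note that the date in the `ls` output is that of the symlink itself, which may differ from the date of the file it points to. A small Python sketch for checking the age of the resolved target follows. The function name is hypothetical, and the path passed to it would have to be the actual location on the install server, which is not given here.

```python
import os
import time


def boot_image_age_days(link_path, now=None):
    """Return the age in days of the file that link_path resolves to."""
    target = os.path.realpath(link_path)  # follow the symlink to the real file
    mtime = os.stat(target).st_mtime      # modification time of the target
    now = time.time() if now is None else now
    return (now - mtime) / 86400.0
```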
