action #174673

open

[qa-tools][powerVM] netboot failed at loading NBP file sporadically

Added by rfan1 4 days ago. Updated 4 days ago.

Status:
Feedback
Priority:
Normal
Assignee:
Category:
Support
Start date:
2024-12-23
Due date:
2025-01-06 (Due in 10 days)
% Done:

0%

Estimated time:

Description

I have been seeing this issue since last Friday. I am not sure whether it is caused by a network or file transfer problem between the install server and the SUT.

However, based on my openQA and manual tests, it occurs frequently when PowerVM machines start the netboot, even though it is sporadic.

May I ask for your kind help in checking this issue?

  1. Is the NBP file up to date?
  2. Or is there a network issue between the SUT and the install server?
  3. Or is there a performance issue?

Observation

openQA test in scenario sle-16.0-agama-installer-ppc64le-agama-powervm@ppc64le-hmc fails in
boot_agama

Test suite description

The base test suite is used for job templates defined in YAML documents. It has no settings of its own.

Reproducible

Fails since (at least) Build rfan1220

Expected result

Last good: (unknown) (or more recent)

Further details

Always latest result in this scenario: latest


Files

Actions #1

Updated by rfan1 4 days ago

Please refer to the attached screenshot. It seems the NBP file download completes, but the GRUB menu then fails to start. The issue is sporadic, but I can hit it in roughly 30% of runs.

Actions #2

Updated by rfan1 4 days ago

  • Subject changed from [qa-tools][powerVM] netboot failed at loading nbp file to [qa-tools][powerVM] netboot failed at loading NBP file sporadically
  • Description updated (diff)

Actions #3

Updated by dawei_pang 4 days ago

This has been a random PowerPC issue for a long time; it is probably related to the PowerPC firmware.

Use the following OFW commands manually to work around this issue:

0 > SET_NVRAM_DEFAULTS 
 SMS Macro Operation Succeeded.
 ok
0 > RESET_PARTITION 
Rebooting...

After that, nessberry LPAR2 is able to PXE boot into GRUB successfully.

Actions #4

Updated by okurz 4 days ago

  • Due date set to 2025-01-06
  • Category set to Support
  • Status changed from New to Feedback
  • Assignee set to okurz
  • Target version set to Ready

@rfan1 can it be that the agama test scenario is missing some special test code to call SET_NVRAM_DEFAULTS which other test code already has?

Actions #5

Updated by rfan1 4 days ago

okurz wrote in #note-4:

@rfan1 can it be that the agama test scenario is missing some special test code to call SET_NVRAM_DEFAULTS which other test code already has?

The answer is no: the agama tests use the same netboot process as other SLE products (NBP file, grub.cfg). The only difference is which kernel/initrd files are loaded to start the installation.

However, I can see the code below:

    # Restore LPAR's NVRAM defaults if SET_NVRAM_DEFAULTS setting is present
    if (get_var('SET_NVRAM_DEFAULTS')) {
        # Boot into open firmware (of) first to issue a SET_NVRAM_DEFAULTS command
        enter_cmd("chsysstate -r lpar -m $hmc_machine_name -o on -b of --id $lpar_id ");
        enter_cmd("mkvterm -m $hmc_machine_name --id $lpar_id");
        assert_screen 'openfirmware-prompt', 60;
        enter_cmd('SET_NVRAM_DEFAULTS');
        assert_screen 'openfirmware-prompt';
        # Exit from LPAR's console, shutdown LPAR and continue as usual
        enter_cmd('~~.');
        assert_screen 'terminate-openfirmware-session';
        send_key 'y';
        assert_screen 'powerhmc-ssh', 60;
        enter_cmd("chsysstate -r lpar -m $hmc_machine_name -o shutdown --immed --id $lpar_id ");
        check_lpar_is_down($hmc_machine_name, $lpar_id);
    }

Let me check whether it helps.

Actions #6

Updated by rfan1 4 days ago

It seems the issue is gone, both with and without the SET_NVRAM_DEFAULTS parameter:
http://openqa.suse.de/tests/overview?distri=sle&version=16.0&build=rfan1223_2

Let me monitor it over the next few days :)
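
As a rough guide for how many clean runs are worth waiting for: if the bug were still present at the ~30% rate mentioned in #note-1, a short streak of passes could still happen by chance. The sketch below estimates that chance. It assumes each run fails independently with the same probability, which may not hold here (failures could be correlated by LPAR, firmware state, or time of day), and the function names are illustrative only, not part of any openQA tooling.

```python
import math


def prob_all_pass(failure_rate: float, runs: int) -> float:
    """Chance that `runs` independent jobs all pass while the bug is still present."""
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return (1.0 - failure_rate) ** runs


def runs_needed(failure_rate: float, max_chance: float = 0.05) -> int:
    """Smallest pass streak whose chance under the old failure rate is <= max_chance."""
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < max_chance < 1.0:
        raise ValueError("max_chance must be strictly between 0 and 1")
    return math.ceil(math.log(max_chance) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    rate = 0.30  # approximate failure ratio reported in #note-1
    for n in (3, 5, 10):
        print(f"{n} passes in a row by chance: {prob_all_pass(rate, n):.3f}")
    print(f"passes needed for <=5% chance: {runs_needed(rate)}")
```

Under these assumptions, about nine consecutive passing jobs would be needed before an all-green streak becomes unlikely (below 5%) to be luck alone, so a handful of green runs in one build is not yet strong evidence that the issue is gone.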
