action #65663
closed[sles][functional][u][sporadic] test fails in bootloader - lpar is not in "Not activate state"
0%
Description
Observation
openQA test in scenario sle-15-SP2-Online-ppc64le-textmode@ppc64le-hmc fails in
bootloader
Reproducible
Fails since (at least) Build 178.1 (current job)
Expected result
Last good: 176.1 (or more recent)
Further details
Always latest result in this scenario: latest
Suggestions
Investigate why this happens sporadically at such an early stage. Could it be a timeout issue for pvm-bootmenu?
Updated by SLindoMansilla about 4 years ago
- Description updated (diff)
- Status changed from New to Workable
- Target version set to Milestone 30
- Estimated time set to 42.00 h
Updated by asmorodskyi about 4 years ago
isn't this https://bugzilla.suse.com/show_bug.cgi?id=1169840 ?
Updated by zluo about 4 years ago
No, this is different, as no qcow2 image is used at all.
I have checked the issue in the bug report. It was actually an issue with the setup on openQA, not a product bug. See my comment there.
Updated by zluo about 4 years ago
- Status changed from Workable to In Progress
- Assignee set to zluo
checking
Updated by zluo about 4 years ago
check later:
Updated by asmorodskyi about 4 years ago
zluo wrote:
No, this is different, as no qcow2 image is used at all.
I have checked the issue in the bug report. It was actually an issue with the setup on openQA, not a product bug. See my comment there.
So, according to the replies your comment got in the bug, I would say that this is the bug I mentioned.
Updated by zluo about 4 years ago
asmorodskyi wrote:
zluo wrote:
No, this is different, as no qcow2 image is used at all.
I have checked the issue in the bug report. It was actually an issue with the setup on openQA, not a product bug. See my comment there.
So, according to the replies your comment got in the bug, I would say that this is the bug I mentioned.
No. As you can see, this installation uses an ISO, while the installation mentioned in the bug report uses a qcow2 image and tries to boot it up.
The issue in this ticket is sporadic: there are 2 failures out of 51 test runs on osd (see the link above).
Updated by zluo about 4 years ago
https://openqa.suse.de/tests/4147606#next_previous clearly shows the sporadic performance issue on the workers:
grenache-1:22 and grenache-1:26 still fail at bootloader
grenache-1:21 fails at scc_registration
grenache-1:25 fails at welcome
Only a couple of test runs failed, so this is a configuration issue on grenache. The better way to solve this sporadic issue is to reduce the number of workers. Increasing the timeout won't help in this case: SMS does not show up, and the test then fails later in other test modules.
I will open another ticket and assign it to the tools team.
Updated by zluo about 4 years ago
- Related to action #65963: [sle][functional][u] performance issue of ppc64le workers on grenache added
Updated by zluo about 4 years ago
- Subject changed from [sles][functional][u]test fails in bootloader - bootmenu doesn't show up to [sles][functional][u](sporadic)test fails in bootloader - bootmenu doesn't show up
Updated by zluo about 4 years ago
- Subject changed from [sles][functional][u](sporadic)test fails in bootloader - bootmenu doesn't show up to [sles][functional][u][sporadic] test fails in bootloader - bootmenu doesn't show up
Updated by zluo almost 4 years ago
Since this is a sporadic issue that seems to be related to the workers on grenache (poo#65963), I am keeping it open for further observation.
Updated by openqa_review almost 4 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: textmode@ppc64le-hmc
https://openqa.suse.de/tests/4247103
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released"
- The label in the openQA scenario is removed
Updated by zluo almost 4 years ago
The real issue:
The shutdown of the lpar is not successful, and this is not detected by the needle match. Activating the lpar then cannot work because the shutdown is still in progress. The check for lpar activation is not correct. I found the message about this failure: partition is not in "Not activate state"
Updated by zluo almost 4 years ago
https://openqa.suse.de/tests/4283842#step/bootloader/8:
The issue can be detected, so the question remains how we should proceed.
If the lpar cannot be activated, then this is a setup issue on pvm. I am not sure why this happens sporadically.
Could it be that there are fewer lpars than workers?
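One possible way to proceed, as a minimal sketch only: poll the lpar state until the shutdown has really finished before trying to activate it. The helper name wait_for_lpar_state, its parameters, and the use of `lssyscfg ... -F state` as the state source are assumptions for illustration and are not taken from the test code in this ticket. The state query is passed in as a code reference so the logic can be tried without an HMC.

```perl
use strict;
use warnings;

# Hypothetical helper: poll an lpar state until it equals $wanted or the
# timeout expires. $query is a code reference returning the current state
# string. In a real test it might wrap an HMC call such as
#   lssyscfg -r lpar -m $hmc_machine_name -F state --filter lpar_ids=$lpar_id
# (assumption: this ticket does not show how the state is queried).
sub wait_for_lpar_state {
    my (%args) = @_;
    my $query    = $args{query};
    my $wanted   = $args{wanted}   // 'Not Activated';
    my $timeout  = $args{timeout}  // 360;    # seconds
    my $interval = $args{interval} // 10;     # seconds between polls
    my $sleep    = $args{sleep}    // sub { sleep $_[0] };

    my $waited = 0;
    while (1) {
        my $state = $query->();
        return 1 if defined $state && $state eq $wanted;
        return 0 if $waited >= $timeout;
        $sleep->($interval);
        $waited += $interval;
    }
}

# Usage with a simulated lpar that reaches the wanted state on the third poll.
my @states = ('Running', 'Shutting Down', 'Not Activated');
my $polls  = 0;
my $ok     = wait_for_lpar_state(
    query => sub { $polls++; return @states > 1 ? shift @states : $states[0] },
    sleep => sub { },    # no real sleeping in the demo
);
print $ok ? "lpar is down after $polls polls\n" : "lpar is still not down\n";
```

With such a check the test could fail early with a clear message when the lpar never reaches the expected state, instead of failing later on a missing boot menu.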
Updated by zluo almost 4 years ago
- Subject changed from [sles][functional][u][sporadic] test fails in bootloader - bootmenu doesn't show up to [sles][functional][u][sporadic] test fails in bootloader - partition is not in "Not activate state"
Updated by zluo almost 4 years ago
- Subject changed from [sles][functional][u][sporadic] test fails in bootloader - partition is not in "Not activate state" to [sles][functional][u][sporadic] test fails in bootloader - lpar is not in "Not activate state"
Updated by zluo almost 4 years ago
https://openqa.suse.de/tests/4288970#step/bootloader/8
I tried the following changes, which add a wait time, but this is still not working:
sub boot_hmc_pvm {
    my $hmc_machine_name = get_required_var('HMC_MACHINE_NAME');
    my $lpar_id          = get_required_var('LPAR_ID');
    my $hmc              = select_console 'powerhmc-ssh';
    my $max_wait_time    = 6;

    # detach possibly attached terminals - might be left over
    type_string "rmvterm -m $hmc_machine_name --id $lpar_id && echo 'DONE'\n";
    assert_screen 'pvm-vterm-closed';

    # power off the machine if it's still running - and don't give it a 2nd chance
    type_string "chlparstate -m $hmc_machine_name -o shutdown --id $lpar_id -w $max_wait_time && echo 'LPAR SUCCESSFULLY SHUT DOWN'\n";
    assert_screen [qw(pvm-poweroff-successful pvm-poweroff-not-running)], 180;

    # proceed with normal boot if the system is already installed, use sms boot for installation
    my $bootmode = get_var('BOOT_HDD_IMAGE') ? "norm" : "sms";
    type_string "chsysstate -r lpar -m $hmc_machine_name -o on -b ${bootmode} --id $lpar_id && echo 'LPAR SUCCESSFULLY BOOTED'\n";
    assert_screen [qw(pvm-poweron-successful lpar-still-activated)], 90;
    die "lpar $lpar_id cannot be activated in $max_wait_time minutes. Please try to restart the test as workaround" if match_has_tag('lpar-still-activated');

    # don't wait for it, otherwise we miss the menu
    type_string "mkvterm -m $hmc_machine_name --id $lpar_id\n";

    # skip further preparations if the system is already installed
    return if get_var('BOOT_HDD_IMAGE');
    get_into_net_boot;
    prepare_pvm_installation;
}
Updated by zluo almost 4 years ago
https://openqa.suse.de/tests/4291915#step/bootloader/5: created a needle to check for lpar-is-running.
Updated by zluo almost 4 years ago
https://openqa.suse.de/tests/4291920#step/bootloader/8: created the needle lpar-is-down.
Updated by zluo almost 4 years ago
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/10401 ready for review.
Updated by zluo almost 4 years ago
- Status changed from In Progress to Resolved
https://openqa.suse.de/tests/4322944#step/bootloader/6 resolved now.