action #80452

[qe-core][qem] Problems with aarch64 RAID 15SP1/SP2 QU tests - **Suggested Backport**

Added by tjyrinki_suse about 2 years ago. Updated almost 2 years ago.

Bugs in existing tests

Passed for 15SP1 16 days ago:

Failed in a later build:

(same build, aarch64 specific for playground)

But the failure is associated with a change from last Thursday, like:
which led to a setup_libyui error:
and, with a manual schedule omitting setup_libyui, to a raid_gpt error:

Then Rodion mentioned that the YAML schedule should not be used and moved back to the non-YAML one (even though the passing tests 16 days earlier were using the YAML schedule):

But that led to different errors, which in turn were partially fixed by:

Before George started looking into it, at least RAID 0, 5 and 10 were shown to have passed at least once for 15SP1 Build 47.3, while RAID 1 and 6 remained problematic.

For 15SP2, everything passed 5 days ago, but there are similar failures with the latest build (a rerun can get further, though):

SP2 is still using YAML.


#1 Updated by tjyrinki_suse about 2 years ago

  • Description updated (diff)

#2 Updated by tjyrinki_suse about 2 years ago

  • Description updated (diff)

#3 Updated by tjyrinki_suse about 2 years ago

  • Priority changed from Normal to High

#4 Updated by tjyrinki_suse about 2 years ago

  • Category set to Bugs in existing tests

#5 Updated by geor almost 2 years ago

  • Subject changed from [qe-core][qem] Problems with aarch64 RAID 15SP1/SP2 QU tests to [qe-core][qem] Problems with aarch64 RAID 15SP1/SP2 QU tests - **Suggested Backport**
  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

Some reneedling was required; the aarch64 RAID jobs of the 15SP1 and 15SP2 QUs should be passing consistently now.

Concerning the irregular reconnect_mgmt_console failure, this issue actually originates in await_install.
The reboot message at the end of the installation has a default timeout of 10 seconds.
On some architectures, like aarch64 and s390, the await_install module's needle check sometimes does not match within those 10 seconds, so the reboot is not cancelled.
As a result, the machine reboots when it should not, and the next module, reconnect_mgmt_console, fails.

To fix this issue, PR_1 and PR2 were introduced.
These allow for the following usage, as seen in lib/
push @params, 'reboot_timeout=' . get_var('REBOOT_TIMEOUT', 0) unless (is_leap('<15.2') || is_sle('<15-SP2'));

By default, the above line adds reboot_timeout=0 to the list of boot parameters. On 15-SP2, which contains the two aforementioned PR changes, this removes the timeout on the reboot message, so openQA has time to catch up.
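
To illustrate how the quoted line behaves, here is a minimal standalone sketch. It is not the actual library code: get_var is stubbed with a plain hash, and the is_leap/is_sle checks are replaced by a hypothetical helper, supports_reboot_timeout, that only understands version strings such as 15-SP1 and 15-SP2. The sub name boot_params is also made up for this example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for openQA job settings; the real get_var comes
# from the openQA test API and reads the job's variables instead.
my %vars;

sub get_var {
    my ($name, $default) = @_;
    return $vars{$name} // $default;
}

# Assumption for this sketch: reboot_timeout is honoured from 15-SP2 on.
# This replaces the is_leap('<15.2') || is_sle('<15-SP2') check.
sub supports_reboot_timeout {
    my ($version) = @_;
    my ($major, $sp) = $version =~ /^(\d+)(?:-SP(\d+))?$/ or return 0;
    $sp //= 0;
    return ($major > 15 || ($major == 15 && $sp >= 2)) ? 1 : 0;
}

# Build the list of boot parameters for a given product version.
sub boot_params {
    my ($version) = @_;
    my @params;
    push @params, 'reboot_timeout=' . get_var('REBOOT_TIMEOUT', 0)
      if supports_reboot_timeout($version);
    return @params;
}

# Default: the timeout is disabled on 15-SP2.
print join(' ', boot_params('15-SP2')), "\n";

# A job can override the value through REBOOT_TIMEOUT.
%vars = (REBOOT_TIMEOUT => 30);
print join(' ', boot_params('15-SP2')), "\n";

# On 15-SP1 nothing is added, so the default 10 second timeout stays.
print scalar(my @sp1 = boot_params('15-SP1')), "\n";
```

With no REBOOT_TIMEOUT set, the 15-SP2 case yields reboot_timeout=0, and the 15-SP1 case yields an empty parameter list, which matches the situation described in this ticket.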

However, in 15-SP1 this boot parameter is not checked, so there is no straightforward way of changing or disabling the timeout.
The suggested approach here is to request a backport of these changes to yast in SLE 15-SP1.

Since it is not likely that there will ever be a new 15-SP1 QU release, the backport approach remains a suggestion for now.
