action #115190

closed

coordination #130072: [epic] Handle openQA adaptions in Yam squad - SLE 15 SP6

[Research 24h] boot process stuck sporadically after install in mru-install-multipath-remote

Added by JRivrain over 1 year ago. Updated 5 months ago.

Status:
Resolved
Priority:
Normal
Target version:
Start date:
2022-08-10
Due date:
% Done:

0%

Estimated time:

Description

Observation

The system fails to boot after installation. It was hard to tell whether this was an infrastructure issue, so we ran the test in a different environment to see whether it would happen again: it did, but then the next run passed.

But then it failed again. In one of the runs, boot was blocked at "a start job is starting for dev/mapp..." (the message is truncated there) and I couldn't ssh into the machine. See attachment.

We need to figure out which update triggers it. These are the updates I suspect most, in order of suspicion:

libmount and others: https://download.suse.de/ibs/SUSE:/Maintenance:/25242/SUSE_Updates_SLE-SERVER_12-SP3-LTSS-TERADATA_x86_64/x86_64/
systemd: https://download.suse.de/ibs/SUSE:/Maintenance:/24472/SUSE_Updates_SLE-SERVER_12-SP3-LTSS-TERADATA_x86_64/
kernel-default: https://download.suse.de/ibs/SUSE:/Maintenance:/25294/SUSE_Updates_SLE-SERVER_12-SP3-LTSS-TERADATA_x86_64/x86_64/
openiscsi: https://download.suse.de/ibs/SUSE:/Maintenance:/25336/SUSE_Updates_SLE-SERVER_12-SP3-LTSS-TERADATA_x86_64/x86_64/

The other updates are very unlikely to provoke this kind of problem.

Because the failure is sporadic, we need to run the jobs several times with each of those updates excluded in turn, then report a bug once the faulty update is identified.
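The exclusion approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not how the jobs were actually triggered: `run_job` is a hypothetical callable that takes the list of installed updates and returns True when the job passes, and the failure rate passed to `runs_needed` is a guess, since the ticket does not state one.

```python
import math
from typing import Callable, Dict, List

# Labels copied from the suspect list in this ticket, in order of suspicion.
SUSPECTS = ["libmount", "systemd", "kernel-default", "openiscsi"]


def runs_needed(failure_rate: float, confidence: float = 0.95) -> int:
    """Consecutive passes needed before a sporadic failure with the given
    (assumed) per-run failure rate would have shown up with this confidence."""
    if not 0 < failure_rate < 1 or not 0 < confidence < 1:
        raise ValueError("failure_rate and confidence must be between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - failure_rate))


def tally_failures(
    suspects: List[str],
    run_job: Callable[[List[str]], bool],  # hypothetical: True means the job passed
    runs: int,
) -> Dict[str, int]:
    """For each suspect, exclude only that update and count failed runs."""
    failures = {}
    for excluded in suspects:
        installed = [name for name in suspects if name != excluded]
        failures[excluded] = sum(1 for _ in range(runs) if not run_job(installed))
    return failures


def likely_culprits(failures: Dict[str, int]) -> List[str]:
    """Updates whose exclusion made every run pass."""
    return [name for name, count in failures.items() if count == 0]
```

An update whose exclusion yields zero failures over enough runs is the candidate to report. With an assumed failure rate of 50%, `runs_needed(0.5)` asks for 5 clean runs per exclusion; a rarer failure needs proportionally more.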

openQA test in scenario sle-12-SP3-Server-DVD-Updates-x86_64-mru-install-multipath-remote@64bit fails in
boot_to_desktop

Test suite description

Testsuite maintained at https://gitlab.suse.de/qa-maintenance/qam-openqa-yml. NETWORK_INIT_PARAM=ifcfg=eth0=dhcp4 is set to skip the DHCP question, which is problematic on 12sp1.

Reproducible

Fails since (at least) Build 20220809-1

Expected result

Last good: 20220808-1 (or more recent)

Further details

Always latest result in this scenario: latest


Files

dev_mapper.png (17.3 KB) blocked state JRivrain, 2022-08-10 16:15
Actions #1

Updated by JRivrain over 1 year ago

  • Project changed from openQA Tests to qe-yam
  • Category deleted (Bugs in existing tests)
  • Priority changed from Normal to Urgent
Actions #2

Updated by JERiveraMoya over 1 year ago

  • Tags set to YaST
  • Subject changed from [QAM-Yast] boot process stuck sporadically after install in mru-install-multipath-remote to boot process stuck sporadically after install in mru-install-multipath-remote
  • Priority changed from Urgent to Low
  • Target version set to Current

All fine, thanks for the investigation. I think this test is not suitable for running in maintenance updates. The more this sporadic issue bothers us, the more reason to move it to the development group and investigate why the test suite was created, which feature required it, etc.

Actions #3

Updated by amanzini over 1 year ago

  • Description updated (diff)
Actions #4

Updated by amanzini over 1 year ago

Seems to impact only 12SP3. Moved the test to the MU development group in order to investigate.

Actions #5

Updated by coolgw over 1 year ago

Actions #6

Updated by JERiveraMoya over 1 year ago

  • Target version changed from Current to future
Actions #7

Updated by JERiveraMoya over 1 year ago

  • Target version deleted (future)
Actions #8

Updated by JERiveraMoya 9 months ago

  • Tags deleted (YaST)
  • Subject changed from boot process stuck sporadically after install in mru-install-multipath-remote to [Research 24h] boot process stuck sporadically after install in mru-install-multipath-remote
  • Status changed from New to Workable
  • Priority changed from Low to Normal

We could revisit this and investigate it, but this scenario is not suitable for running daily in maintenance, so it was deactivated a long time ago. Let's refresh what we know about the failure.
https://openqa.suse.de/tests/overview?arch=&flavor=&machine=&test=mru-install-multipath-remote_dev&modules=&module_re=&distri=sle&version=&build=20230720-1&groupid=446#

Actions #9

Updated by JERiveraMoya 9 months ago

  • Target version set to Current
  • Parent task set to #130072
Actions #10

Updated by syrianidou_sofia 5 months ago

  • Status changed from Workable to In Progress
  • Assignee set to syrianidou_sofia
Actions #11

Updated by syrianidou_sofia 5 months ago · Edited

  • Status changed from In Progress to Resolved

Both SLES versions mentioned, where the issue was found, are currently out of support. I checked the tests for SLES 12 SP5, but this particular test is in the development YaST maintenance updates job group and is not showing any relevant issue. The last builds failed due to an SCC timeout. Previous builds were passing green: latest build
