action #94264

[migration] test fails in boot_to_desktop in mru multipath tests sometimes

Added by mgrifalconi almost 2 years ago. Updated about 1 year ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
Bugs in existing tests
Target version:
Start date:
Due date:
% Done:

100%

Estimated time:
Difficulty:

Description

Observation

openQA test in scenario sle-12-SP3-Server-DVD-Updates-x86_64-mru-install-multipath-remote@64bit fails in
boot_to_desktop

Test suite description

NETWORK_INIT_PARAM=ifcfg=eth0=dhcp4 to skip the dhcp-question, which is problematic on 12sp1

Reproducible

Fails since (at least) Build 20210618-2 (current job)

Expected result

Last good: 20210617-1 (or more recent)

Further details

Always latest result in this scenario: latest

History

#1 Updated by openqa_review almost 2 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: mru-install-multipath-remote
https://openqa.suse.de/tests/6392904

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The label in the openQA scenario is removed

#2 Updated by tjyrinki_suse almost 2 years ago

  • Status changed from New to Workable
  • Start date deleted (2021-06-18)

#3 Updated by tjyrinki_suse almost 2 years ago

This is "Workable" only in the sense that what is causing these problems needs to be studied; there is no single clear reason for the failures.

#4 Updated by openqa_review almost 2 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: mru-install-multipath-remote
https://openqa.suse.de/tests/7020193

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The label in the openQA scenario is removed

#5 Updated by szarate almost 2 years ago

I think it's a bug. See https://openqa.suse.de/tests/7020193#step/boot_to_desktop/7: 7 minutes is a long, long time.

#6 Updated by openqa_review over 1 year ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: mru-install-multipath-remote
https://openqa.suse.de/tests/7174233

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

#7 Updated by tjyrinki_suse over 1 year ago

  • Project changed from openQA Tests to qe-yam
  • Subject changed from [qe-core] test fails in boot_to_desktop around 50% of times to [qe-yast] test fails in boot_to_desktop around 50% of times
  • Category deleted (Bugs in existing tests)
  • Status changed from Workable to New

Moving to QE YaST, as it has now been seen to happen in multipath cases every time:

https://openqa.suse.de/tests/7174233#next_previous

Setting to New, as it should be triaged to Workable by the PO.

#8 Updated by oorlov over 1 year ago

Timo, why do you think it is related to YaST? Could you please share a link to the continuous failure, as your last link leads to a 404.

#9 Updated by mgrifalconi over 1 year ago

oorlov, here is one example of a sporadic failure that is still happening: https://openqa.suse.de/tests/7307038#next_previous

#10 Updated by openqa_review over 1 year ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: mru-install-multipath-remote
https://openqa.suse.de/tests/7624827

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

#11 Updated by oorlov over 1 year ago

  • Project changed from qe-yam to openQA Tests
  • Subject changed from [qe-yast] test fails in boot_to_desktop around 50% of times to test fails in boot_to_desktop around 50% of times

IMHO, it is not related to YaST. It happens on boot.

If any assistance from QE YaST team is required, we are happy to help, but we did not create this test suite and don't have expertise in this area.

#12 Updated by okurz over 1 year ago

  • Category set to Bugs in existing tests

#13 Updated by maritawerner over 1 year ago

I have no idea to whom this ticket should be assigned. It seems that it is moving back and forth between qe-core and qe-yast?

#14 Updated by tjyrinki_suse over 1 year ago

  • Subject changed from test fails in boot_to_desktop around 50% of times to [migration] test fails in boot_to_desktop in mru multipath tests around 50% of times

The multipath test modules and testing are maintained by QE YaST as such, but these mru*multipath test suites are among those developed by the former maintenance department.

The failing module boot_to_desktop, however, lists Maintainer: yuwang@suse.com, who is in QE Migration; maybe she can help to at least understand the topic? It doesn't really matter whose ticket this is as such, but we should find people who understand what is going on with it and how it differs from, e.g., other tests using multipath that do not have the same failure.

#15 Updated by mgrifalconi over 1 year ago

  • Priority changed from Normal to High

Increasing prio since it is happening very often and blocking updates

#16 Updated by coolo over 1 year ago

  • Subject changed from [migration] test fails in boot_to_desktop in mru multipath tests around 50% of times to [migration] test fails in boot_to_desktop in mru multipath tests all the times

Fixing the subject

#18 Updated by tinawang123 over 1 year ago

boot_to_desktop just waits for the system to be ready and then matches the expected needles. It has one step, wait_boot. Here the system has not started: after 500s it still cannot match the expected needle, so it returns a failure. The real reason the needle is not matched is that the system cannot boot successfully, or cannot boot within 500s. My branch changed the wait time to 1000s, and the system seems to stop while starting the I/O driver.
So I think it may be a bug. The system cannot reboot.
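
To illustrate the reasoning above, here is a minimal Python sketch of a "poll until a check passes or a timeout expires" loop. It is not the actual wait_boot code from os-autoinst-distri-opensuse; the function name `wait_for_match` and its parameters are hypothetical and exist only for this example. It shows why raising the timeout from 500s to 1000s only helps if the system eventually reaches the expected state.

```python
import time
from typing import Callable


def wait_for_match(
    check: Callable[[], bool],
    timeout: float,
    interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `check` until it returns True or `timeout` seconds have passed.

    Hypothetical helper for illustration only; `check` stands in for
    "the expected needle matches the current screen".
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True   # expected state reached in time
        if clock() >= deadline:
            return False  # timed out: the caller reports a failure
        sleep(interval)
```

If the machine hangs during boot, `check` never becomes true and the loop returns False for any timeout, which matches the observation that the longer wait did not make the test pass.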

#19 Updated by rfan1 over 1 year ago

tinawang123
I can see error messages related to "dracut initqueue iscsiadm cannot connect to iSCSI daemon". IMO, this is most likely a bug.

#20 Updated by tjyrinki_suse over 1 year ago

I studied it further and filed a bug at: https://bugzilla.suse.com/show_bug.cgi?id=1193512

Note that this ticket still includes references to many other problems that initially looked similar; however, all of the others seem sporadic, i.e. they do not happen every time. Even if the filed bug turns out to be valid, this ticket remains valid regarding the test issues behind the sporadic failures. They may still be about multipath, so potentially for QE YaST, even though they manifest in boot_to_desktop after the installation. This is because booting only tends to fail after these multipath testing jobs, not others, so maybe the multipath setup could somehow be made more stable.

#22 Updated by openqa_review over 1 year ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: mru-install-multipath-remote
https://openqa.suse.de/tests/7901251

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
  3. The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234

#23 Updated by dzedro over 1 year ago

  • Status changed from New to In Progress
  • Assignee set to dzedro

#24 Updated by tjyrinki_suse over 1 year ago

  • Subject changed from [migration] test fails in boot_to_desktop in mru multipath tests all the times to [migration] test fails in boot_to_desktop in mru multipath tests sometimes

Mention that this is about sporadic failures; permanent failures would be due to other problems.

#25 Updated by szarate about 1 year ago

With the test still failing so often, let's unschedule it. I'll take a look and see later during the sprint what we want to do with it (i.e. fixing it). The fact that it is a multi-machine test that's failing makes me think that it's the same as, or similar to, the issue that haunts the multi-machine SAP tests: https://progress.opensuse.org/issues/95824

See also: https://suse.slack.com/archives/C02CANHLANP/p1649658797981269

#26 Updated by dzedro about 1 year ago

I created https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/14743
The failure with the emergency shell is happening only on 12-SP3. [1][2]
I asked Thomas about releasing the fix on SLE12 as well.
But 12 SP3 will soon be EOS, so I guess it's not worth the work to add a workaround ...
I hope it will not happen so often and RETRY will be a good enough workaround!

[1] https://bugzilla.suse.com/show_bug.cgi?id=1193512
[2] https://openqa.suse.de/tests/8578189#step/boot_to_desktop/23
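
As a rough illustration of why a retry can mask a sporadic failure, here is a minimal Python sketch. It does not show how openQA implements its RETRY setting; `run_with_retry` and `run_job` are hypothetical names, and `run_job` simply stands in for "run the test job once and report whether it passed".

```python
from typing import Callable, Tuple


def run_with_retry(run_job: Callable[[], bool], retries: int) -> Tuple[bool, int]:
    """Run `run_job` once, then up to `retries` more times until it passes.

    Returns (passed, attempts). Hypothetical helper for illustration only.
    """
    attempts = 0
    for _ in range(retries + 1):
        attempts += 1
        if run_job():
            return True, attempts
    return False, attempts
```

A retry like this helps only with failures that happen sometimes; a job that fails on every run still ends up failed after all attempts are used.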

#27 Updated by szarate about 1 year ago

  • Sprint set to QE-Core: May Sprint (May 11 - Jun 08)
  • Tags set to bugbusters
  • Target version set to QE-Core: Ready

#28 Updated by dzedro about 1 year ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

There was no failure, could be just luck, but I'm confident the main issues should be fixed.

#29 Updated by szarate about 1 year ago

  • Sprint changed from QE-Core: May Sprint (May 11 - Jun 08) to QE-Core: April Sprint (Apr 13 - May 11)

#31 Updated by dzedro about 1 year ago

On 12-SP3 it is happening "always"; it will soon be EOS, so it will be gone.
