action #94264
[migration] test fails in boot_to_desktop in mru multipath tests sometimes
Description
Observation
openQA test in scenario sle-12-SP3-Server-DVD-Updates-x86_64-mru-install-multipath-remote@64bit fails in boot_to_desktop
Test suite description
NETWORK_INIT_PARAM=ifcfg=eth0=dhcp4 to skip the dhcp question, which is problematic on 12sp1
Reproducible
Fails since (at least) Build 20210618-2 (current job)
Expected result
Last good: 20210617-1 (or more recent)
Further details
Always latest result in this scenario: latest
History
#1
Updated by openqa_review 11 months ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: mru-install-multipath-remote
https://openqa.suse.de/tests/6392904
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The label in the openQA scenario is removed
#2
Updated by tjyrinki_suse 9 months ago
- Status changed from New to Workable
- Start date deleted (2021-06-18)
#3
Updated by tjyrinki_suse 9 months ago
This is "Workable" only in the sense that someone needs to study what is causing these problems; there is no single clear reason for the failures.
#4
Updated by openqa_review 9 months ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: mru-install-multipath-remote
https://openqa.suse.de/tests/7020193
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The label in the openQA scenario is removed
#5
Updated by szarate 9 months ago
I think it's a bug (https://openqa.suse.de/tests/7020193#step/boot_to_desktop/7): 7 minutes is a long, long time.
#6
Updated by openqa_review 8 months ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: mru-install-multipath-remote
https://openqa.suse.de/tests/7174233
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234
#7
Updated by tjyrinki_suse 8 months ago
- Project changed from openQA Tests to qe-yast
- Subject changed from [qe-core] test fails in boot_to_desktop around 50% of times to [qe-yast] test fails in boot_to_desktop around 50% of times
- Category deleted (Bugs in existing tests)
- Status changed from Workable to New
Moving to QE Yast as now it has been seen it happens in multipath cases every time:
https://openqa.suse.de/tests/7174233#next_previous
Setting to New, as it should be triaged to Workable by the PO.
#9
Updated by mgrifalconi 8 months ago
oorlov here is one example of sporadic failure still happening https://openqa.suse.de/tests/7307038#next_previous
#10
Updated by openqa_review 7 months ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: mru-install-multipath-remote
https://openqa.suse.de/tests/7624827
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234
#11
Updated by oorlov 6 months ago
- Project changed from qe-yast to openQA Tests
- Subject changed from [qe-yast] test fails in boot_to_desktop around 50% of times to test fails in boot_to_desktop around 50% of times
IMHO, it is not related to YaST. It happens on boot.
If any assistance from QE YaST team is required, we are happy to help, but we did not create this test suite and don't have expertise in this area.
#13
Updated by maritawerner 6 months ago
I have no idea to whom to assign this ticket. It seems to be moving back and forth between qe-core and qe-yast?
#14
Updated by tjyrinki_suse 6 months ago
- Subject changed from test fails in boot_to_desktop around 50% of times to [migration] test fails in boot_to_desktop in mru multipath tests around 50% of times
The multipath test modules and testing are maintained by QE Yast as such, but these mru*multipath test suites are among the test suites developed by the former maintenance department.
The failing module, boot_to_desktop, however has Maintainer: yuwang@suse.com, who is in QE Migration; maybe she can help to at least understand the topic? I mean, it doesn't really matter whose ticket this should be as such, but we should find people who understand what is going on with it and how it differs, e.g., from other tests that use multipath and do not have the same failure.
#15
Updated by mgrifalconi 6 months ago
- Priority changed from Normal to High
Increasing prio since it is happening very often and blocking updates
#17
Updated by tjyrinki_suse 6 months ago
I messaged yuwang. More recent examples https://openqa.suse.de/tests/7739697 https://openqa.suse.de/tests/7765246
#18
Updated by tinawang123 6 months ago
boot_to_desktop just waits for the system to be ready and then matches the expected needles. It has one step, which calls wait_boot. Here, the system has not started: after 500s the expected needle still cannot be matched, so the module fails. The real reason the needle did not match is that the system cannot boot successfully, or cannot boot within 500s. My branch changed the wait time to 1000s, and the system seems to have stopped at start I/O driver.
So I think it may be a bug. The system cannot reboot.
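The timeout behaviour described above can be illustrated with a minimal Python sketch. This is not the os-autoinst implementation of wait_boot; the names wait_for_match, FakeClock and simulate are hypothetical, and the 700s boot time is an invented number used only for illustration. The point is that raising the timeout from 500s to 1000s helps a slow boot but cannot help a boot that never completes, which matches the observation that the system still hangs with the longer wait.

```python
import time
from typing import Callable, Optional


def wait_for_match(check: Callable[[], bool], timeout_s: float,
                   interval_s: float = 1.0,
                   clock: Callable[[], float] = time.monotonic,
                   sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll check() until it returns True or timeout_s has elapsed."""
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


class FakeClock:
    """Simulated clock so the example runs instantly (hypothetical helper)."""

    def __init__(self) -> None:
        self.now = 0.0

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds


def simulate(ready_at_s: Optional[float], timeout_s: float) -> bool:
    """Model a system whose desktop appears at ready_at_s (None = never boots)."""
    clock = FakeClock()

    def desktop_visible() -> bool:
        return ready_at_s is not None and clock.time() >= ready_at_s

    return wait_for_match(desktop_visible, timeout_s, 1.0, clock.time, clock.sleep)


if __name__ == "__main__":
    print("slow boot, 500s timeout:", simulate(700, 500))
    print("slow boot, 1000s timeout:", simulate(700, 1000))
    print("hung boot, 1000s timeout:", simulate(None, 1000))
```

If a longer timeout does not turn the failure into a pass, as reported for the 1000s branch, the sketch suggests the boot is hanging rather than merely slow.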
#19
Updated by rfan1 6 months ago
tinawang123
I can see error messages related to "dracut initqueue iscsiadm cannot connect to iSCSI daemon". IMO, this is most likely a bug.
#20
Updated by tjyrinki_suse 6 months ago
I studied it further and filed a bug at: https://bugzilla.suse.com/show_bug.cgi?id=1193512
Note that this ticket still includes references to many other problems that initially looked similar; however, all of the others seem sporadic, i.e. they do not happen every time. Even if the filed bug turns out to be valid, this ticket remains valid for the sporadic test failures. They may still be about multipath, and so potentially for QE Yast, even though they manifest in boot_to_desktop after the installation: booting only tends to fail after these multipath testing jobs, not others, so maybe the multipath setup could be made more stable.
#22
Updated by openqa_review 5 months ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: mru-install-multipath-remote
https://openqa.suse.de/tests/7901251
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released" or "EOL" (End-of-Life)
- The bugref in the openQA scenario is removed or replaced, e.g. label:wontfix:boo1234
#24
Updated by tjyrinki_suse 3 months ago
- Subject changed from [migration] test fails in boot_to_desktop in mru multipath tests all the times to [migration] test fails in boot_to_desktop in mru multipath tests sometimes
Clarified that this ticket is about sporadic failures; permanent failures would be due to other problems.
#25
Updated by szarate about 1 month ago
With the test still failing so often, let's unschedule it. I'll take a look later during the sprint and see what we want to do with it (i.e. fixing it). The fact that it is a multi-machine test that's failing makes me think it's the same as, or similar to, the issue that haunts the multimachine SAP tests - https://progress.opensuse.org/issues/95824
See also: https://suse.slack.com/archives/C02CANHLANP/p1649658797981269
#26
Updated by dzedro about 1 month ago
I created https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/14743
Failure with emergency shell is happening only on 12-SP3. [1][2]
I asked Thomas about releasing the fix also on SLE12.
But 12 SP3 will soon be EOS, so I guess it's not worth the work to add a workaround ...
I hope it will not happen so often and RETRY will be a good enough workaround!
[1] https://bugzilla.suse.com/show_bug.cgi?id=1193512
[2] https://openqa.suse.de/tests/8578189#step/boot_to_desktop/23
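As a rough illustration of why a retry can mask a sporadic failure but not a permanent one, here is a minimal Python sketch. It does not reproduce how openQA handles the RETRY setting; run_with_retry and residual_failure_rate are hypothetical names, and the arithmetic assumes that attempts fail independently with the same probability, which may not hold for these jobs.

```python
from typing import Callable, Tuple


def run_with_retry(attempt: Callable[[], bool], retries: int) -> Tuple[bool, int]:
    """Run attempt() up to 1 + retries times; return (passed, attempts_used)."""
    for used in range(1, retries + 2):
        if attempt():
            return True, used
    return False, retries + 1


def residual_failure_rate(failure_rate: float, retries: int) -> float:
    """Chance that every attempt fails, assuming independent attempts."""
    return failure_rate ** (retries + 1)


if __name__ == "__main__":
    # A sporadic failure: first attempt fails, second passes.
    outcomes = iter([False, True])
    print(run_with_retry(lambda: next(outcomes), retries=1))

    # With a 50% failure rate (the figure from the earlier ticket subject),
    # one retry leaves 25% of runs failing; a permanent failure stays at 100%.
    print(residual_failure_rate(0.5, 1))
    print(residual_failure_rate(1.0, 3))
```

Under these assumptions a retry lowers the rate of sporadic failures but leaves a failure that happens every time, such as the 12-SP3 emergency shell case if it is deterministic, unaffected.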
#27
Updated by szarate about 1 month ago
- Sprint set to QE-Core: May Sprint (May 11 - Jun 08)
- Tags set to bugbusters
- Target version set to QE-Core: Ready
#28
Updated by dzedro about 1 month ago
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
There was no failure, could be just luck, but I'm confident the main issues should be fixed.
#30
Updated by mgrifalconi 11 days ago
Happening again: https://openqa.suse.de/tests/8749580#