action #33202
closed
[sle][functional][s390x][zkvm][u][hard] test fails in boot_to_desktop - still insufficient error reporting, black screen with mouse cursor - we all hate it (was: I hate it)
0%
Description
Observation
openQA test in scenario sle-12-SP4-Server-DVD-s390x-sched_stress@zkvm fails in boot_to_desktop
Acceptance criteria
- AC1: No black screen without a pop-up hinting at what went wrong, what is running and what we actually see
Suggestions
Show a wallpaper or dialog with the hint mentioned above
Reproducible
Fails since (at least) Build 0234 (current job)
Expected result
Last good: 0164 (or more recent)
Further details
Always latest result in this scenario: latest
Updated by okurz over 6 years ago
- Due date changed from 2018-04-24 to 2018-04-10
Seems we have a bit more capacity in the current sprint S13 as well as the upcoming one. Let's see how you are going to handle that! ;)
Updated by riafarov over 6 years ago
- Subject changed from [sle][functional][s390x][zkvm][u]test fails in boot_to_desktop - still insufficient error reporting, black screen with mouse cursor - I hate it to [sle][functional][s390x][zkvm][u]test fails in boot_to_desktop - still insufficient error reporting, black screen with mouse cursor - we all hate it (was: I hate it)
- Description updated (diff)
- Status changed from New to Workable
Updated by riafarov over 6 years ago
- Subject changed from [sle][functional][s390x][zkvm][u]test fails in boot_to_desktop - still insufficient error reporting, black screen with mouse cursor - we all hate it (was: I hate it) to [sle][functional][s390x][zkvm][u][hard] test fails in boot_to_desktop - still insufficient error reporting, black screen with mouse cursor - we all hate it (was: I hate it)
Updated by okurz over 6 years ago
- Related to action #34003: [tools] Better logging and error handling in case of remote console connections in consoles or backends, e.g. ssh added
Updated by okurz over 6 years ago
- Related to action #33199: [sle][functional][s390x][zkvm][u][hard] test fails in kdump_and_crash - system does not shutdown or reboot? what is happening? better output needed? added
Updated by okurz over 6 years ago
- Related to action #34609: [sle][functional][u][medium] Improve Implementation of workaround for bsc#1083646 and debug output in reconnect_s390 on S390-KVM added
Updated by mgriessmeier over 6 years ago
- Due date changed from 2018-04-10 to 2018-04-24
Updated by okurz over 6 years ago
- Due date changed from 2018-04-24 to 2018-05-08
- Target version changed from Milestone 15 to Milestone 16
We hate it and we will continue to hate it, gosh it's hard
Updated by okurz over 6 years ago
- Blocks action #32683: [sle][functional][u][medium] Implement proper post_fail_hook for boot_to_desktop added
Updated by okurz over 6 years ago
- Related to deleted (action #34003: [tools] Better logging and error handling in case of remote console connections in consoles or backends, e.g. ssh)
Updated by okurz over 6 years ago
- Blocked by action #34003: [tools] Better logging and error handling in case of remote console connections in consoles or backends, e.g. ssh added
Updated by okurz over 6 years ago
- Due date changed from 2018-05-08 to 2018-06-05
- Status changed from Workable to Blocked
- Assignee set to okurz
- Target version changed from Milestone 16 to Milestone 17
blocked by #34003, which we would like to make the tools team aware of
Updated by mgriessmeier over 6 years ago
- Status changed from Blocked to In Progress
- Assignee changed from okurz to mgriessmeier
not blocked anymore, since we now have the debug output and a hint about what's going on
submitted https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/5061 to check if the SUT is able to ping the worker before selecting the first console
Updated by okurz over 6 years ago
… what if ping is not installed? -> https://openqa.suse.de/tests/1707606#step/boot_to_desktop/20 , read: all userspace tests now fail to pass "boot_to_desktop"
Updated by pvorel over 6 years ago
okurz wrote:
… what if ping is not installed? -> https://openqa.suse.de/tests/1707606#step/boot_to_desktop/20 , read: all userspace tests now fail to pass "boot_to_desktop"
I reported this particular problem to #36745
Updated by pvorel over 6 years ago
- Related to action #36745: [openqa][sle][functional][u][s390x][zkvm][kernel] Broken boot due "Test died: no candidate needle with tag(s) 'password-prompt' matched" added
Updated by pvorel over 6 years ago
pvorel wrote:
okurz wrote:
… what if ping is not installed? -> https://openqa.suse.de/tests/1707606#step/boot_to_desktop/20 , read: all userspace tests now fail to pass "boot_to_desktop"
I reported this particular problem to #36745
Actually, this is caused by something other than the ping problem, so poo#36745 might or might not be related to this.
Updated by mgriessmeier over 6 years ago
- Due date changed from 2018-06-05 to 2018-06-19
working on this in upcoming sprint
Updated by mgriessmeier over 6 years ago
- Status changed from In Progress to Blocked
resolve #26044 first... (even if it's the same solution)
Updated by mgriessmeier over 6 years ago
- Status changed from Blocked to Feedback
https://progress.opensuse.org/issues/26044 was resolved
It also included a potential fix for the boot_to_desktop failure, which hopefully will not appear again
so setting to feedback now for tracking some future jobs before resolving
Updated by okurz over 6 years ago
that's good. But keep in mind that this ticket is about "error reporting" not about fixing the underlying issue, e.g. simulate the old problem again and make it super-obvious to test reviewers what the problem is, not fix it :)
Updated by mgriessmeier over 6 years ago
okurz wrote:
that's good. But keep in mind that this ticket is about "error reporting" not about fixing the underlying issue, e.g. simulate the old problem again and make it super-obvious to test reviewers what the problem is, not fix it :)
http://opeth.suse.de/tests/2412#step/boot_to_desktop/15 that's enough error reporting imho... don't know if it makes sense to add a "Worker cannot connect to SUT" message...
Updated by okurz over 6 years ago
See for example https://openqa.suse.de/tests/1752808#step/boot_to_desktop/24 which shows a message "ssh: connect to host 10.161.145.16 port 22: No route to host". This is already helpful but the next thumbnail says "No candidate with tag 'password-prompt' matched" and I think we can still enhance the debugging here a tiny bit, e.g. a post_fail_hook that can provide more hints on what might have gone wrong.
Updated by okurz over 6 years ago
- Blocks action #33865: [sles][functional][s390x][easy][y] Enable yast2_ncurses testsuite for s390x added
Updated by okurz over 6 years ago
I blocked #33865 by this now. See https://openqa.suse.de/tests/1759484#step/boot_to_desktop/17 as an example. We see an error message about "No route to host" but not much more.
Updated by SLindoMansilla over 6 years ago
- Blocks action #36754: [qe-core][functional][systemd][medium] test fails in systemd_testsuite - needs further investigation added
Updated by mgriessmeier over 6 years ago
- Status changed from Feedback to In Progress
apparently still some corner cases around
Updated by okurz over 6 years ago
- Target version changed from Milestone 17 to Milestone 17
Updated by mgriessmeier over 6 years ago
so, there are still some occurrences of this issue around, mainly in userspace regression tests, where the corresponding qcow images seem to take longer until one is able to connect to them, e.g. https://openqa.suse.de/tests/1764688
next suggestions are:
- implement the retry loop in Perl to get better feedback
- increase the number of retries to 10
will keep working on this in the next sprint to hopefully get it solved for good
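The two suggestions above (a retry loop in test code and up to 10 retries) could look roughly like the sketch below. It is written in Python purely for illustration, not in the Perl of the actual test distribution. `retry_with_feedback`, `probe` and the message format are hypothetical names and are not part of os-autoinst.

```python
import time
from typing import Callable, List, Tuple


def retry_with_feedback(
    probe: Callable[[], bool],
    retries: int = 10,
    delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, List[str]]:
    """Call probe() up to `retries` times and record the outcome of each attempt.

    `probe` is a hypothetical callable that returns True once the SUT is
    reachable. The returned log is what a reviewer would get to see.
    """
    log: List[str] = []
    for attempt in range(1, retries + 1):
        try:
            ok = probe()
            detail = "reachable" if ok else "not reachable yet"
        except OSError as err:
            # Network errors are recorded instead of aborting the loop.
            ok = False
            detail = f"error: {err}"
        log.append(f"attempt {attempt}/{retries}: {detail}")
        if ok:
            return True, log
        if attempt < retries:
            sleep(delay)
    log.append(f"giving up after {retries} attempts")
    return False, log
```

Keeping the per-attempt log is the point of doing the loop in test code: on failure the reviewer sees how often and how long the test tried, not only the final error.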
Updated by mgriessmeier over 6 years ago
- Due date changed from 2018-06-19 to 2018-07-03
Updated by nicksinger over 6 years ago
Is this failure related in any way? System seems to boot fine and X started. However, there is only a black screen visible without any hint of an error.
Updated by okurz over 6 years ago
Not exactly the same but related. It does not show a "black screen with mouse cursor" but on top a box with a text message about "Failed" and "Error connecting to host : IO::Socket::INET: connect: Connection timed out at /usr/lib/os-autoinst/testapi.pm line 1385." That's already a bit better than the original issue, but it is still far from easy to understand what the test is trying to achieve, what was expected, what is seen instead and what the potential error sources could be.
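As a rough sketch of the kind of message asked for here (what the test is trying to do, what was expected, what was seen, and possible causes), something like the following could be shown to the reviewer. It is illustrative Python, not code from the test distribution; `format_failure_hint` is a hypothetical helper and the listed causes are placeholders, not a diagnosis of this ticket.

```python
from typing import Sequence


def format_failure_hint(
    goal: str,
    expected: str,
    observed: str,
    possible_causes: Sequence[str],
) -> str:
    """Build a multi-line hint that states goal, expectation, observation and causes."""
    lines = [
        f"Test step: {goal}",
        f"Expected:  {expected}",
        f"Observed:  {observed}",
        "Possible causes:",
    ]
    # One bullet per cause so the list stays readable in a screenshot or log.
    lines += [f"  - {cause}" for cause in possible_causes]
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_failure_hint(
        goal="connect to the SUT before selecting the first console",
        expected="a successful connection",
        observed="IO::Socket::INET: connect: Connection timed out",
        possible_causes=[
            "SUT has not finished booting (placeholder)",
            "network in the SUT is not up (placeholder)",
        ],
    ))
```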
Updated by mgriessmeier over 6 years ago
provided another PR which is a more robust way of checking if the ssh-server in the SUT is (already) available.
it also provides better feedback to the reviewer what is actually going wrong.
Updated by mgriessmeier over 6 years ago
- Due date changed from 2018-07-03 to 2018-07-17
PR still in discussion - will track in next sprint and solve it there
Updated by mgriessmeier over 6 years ago
- Due date changed from 2018-07-17 to 2018-07-31
moved due to hackweek
Updated by okurz over 6 years ago
- Target version changed from Milestone 17 to Milestone 18
Updated by mgriessmeier over 6 years ago
- Due date changed from 2018-07-31 to 2018-08-14
I've rewritten my PR following Santi's proposal of using the already fulfilled dependency on IO::Socket::INET. Unfortunately I have not been able to verify it so far due to some issues with os-autoinst which ettore and Santi are working on right now.
So hopefully I will solve this today
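The idea of probing the ssh port with a plain TCP connect (what IO::Socket::INET does in Perl) can be sketched as below. This is a Python analogue for illustration only, not the code from the PR; `ssh_port_open` and its defaults are hypothetical.

```python
import socket


def ssh_port_open(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and connects, honouring the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, "No route to host", timeouts and
        # name resolution failures.
        return False
```

A check like this needs no extra tools such as ping inside the SUT, because the connection attempt is made from the worker side.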
Updated by mgriessmeier over 6 years ago
- Status changed from In Progress to Feedback
PR is merged - let's see how much it breaks =)
Updated by mgriessmeier over 6 years ago
- Status changed from Feedback to Resolved
nothing broke, PR is in place and working as expected
Updated by okurz over 6 years ago
- Copied to action #39809: [functional][u][s390x] ssh connection check shows red border misleading that something is wrong when there is not -> should be no red border added
Updated by okurz almost 6 years ago
- Related to action #48260: [sle][functional][u][s390x][kvm] test fails in reboot_after_installation - "The console isn't responding correctly. Maybe half-open socket?" added