action #16520
closed: [qam][opensuse][sle][functional] enhance logging and debugging in case of failed shutdown, e.g. press 'esc' on plymouth splash screen
Description
Observation
openQA test in scenario opensuse-Tumbleweed-DVD-x86_64-gnome-image@64bit fails in shutdown
Reproducible
Fails since (at least) Build 20170204 (current job)
Expected result
Last good: 20170203 (or more recent)
Further details
Always latest result in this scenario: latest
The test fails 'randomly': once in a while the shutdown takes slightly longer than the configured timeout. We would expect this not to happen, and it is likely a product bug. But openQA should help us debug it, at least by pressing 'ESC' in the post-fail-hook once the timeout is reached. That should disable plymouth and give at least some pointers to where we are hanging or waiting (don't forget one more screenshot after pressing ESC).
Updated by okurz about 8 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: qam-minimal-full
http://openqa.suse.de/tests/786614
Updated by okurz almost 8 years ago
- Subject changed from test fails in shutdown to [qam][opensuse] test fails in shutdown
Updated by atighineanu almost 8 years ago
Updated by pgeorgiadis almost 8 years ago
Updated by okurz almost 8 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: qam-minimal-full
https://openqa.suse.de/tests/874792
Updated by vpelcak almost 8 years ago
Test died: console sut is not activated. at /var/lib/openqa/cache/openqa.suse.de/tests/sle/tests/x11/shutdown.pm line 131.
It seems that the click on the action was simply not performed.
Updated by okurz almost 8 years ago
- Subject changed from [qam][opensuse] test fails in shutdown to [qam][opensuse] enhance logging and debugging in case of failed shutdown, e.g. press 'esc' on plymouth splash screen
- Category changed from Bugs in existing tests to Enhancement to existing tests
The last message is actually about #17658, so a different issue. See what happens when people open tickets whose subjects are way too generic ;-)
I updated the ticket to match the original description.
Updated by okurz almost 8 years ago
- Related to action #14068: [tools] Gather more system information and logs in case of boot/reboot times out added
Updated by okurz over 7 years ago
- Subject changed from [qam][opensuse] enhance logging and debugging in case of failed shutdown, e.g. press 'esc' on plymouth splash screen to [qam][opensuse][sle][functional] enhance logging and debugging in case of failed shutdown, e.g. press 'esc' on plymouth splash screen
- Priority changed from Normal to High
https://freedesktop.org/wiki/Software/systemd/Debugging/ should be a helpful reference as well.
https://bugzilla.suse.com/show_bug.cgi?id=1055462 was a bug I created recently for a SLE15 failure which I subsequently closed with WORKSFORME because the problem of not shutting down in time was a sporadic one and I could not provide useful debugging information. Thinking about better post fail hook handling here should be helpful.
Updated by okurz over 7 years ago
Latest example of the "openQA-in-openQA" test scenario failing: https://openqa.opensuse.org/tests/494083#step/shutdown/1 The test states that the system does not shut down in time, but gives no further helpful information. That is very hard to debug, as in: no one will care unless we improve the debugging output in the test code.
Updated by okurz over 7 years ago
Updated by sebchlad over 7 years ago
@okurz: as you have added this to QA SLE Functional scrum team 19.09 - is this important for QA SLE: SLE15 testing?
I do not question that this is useful, especially for openSUSE. However, I would like to check that this is indeed within QA SLE Functional scope.
Updated by okurz over 7 years ago
Did you read #16520#note-10 ? The answer for SLE is there, it's "yes".
Updated by nicksinger over 7 years ago
Unfortunately many of the posted links are already archived and cannot be viewed anymore. But from what I've seen we have x11/shutdown and shutdown/shutdown. x11/shutdown already contains the improved post_fail_hook mentioned by @dimstar - see: https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/master/tests/x11/shutdown.pm#L40
From the name of the test ("x86_64-gnome-image@64bit") I assume it should make use of x11/shutdown, not shutdown/shutdown. The same goes for the openQA-in-openQA test. However, I'll try to unify this to avoid duplicated code.
Updated by JERiveraMoya over 7 years ago
I found this case: https://openqa.opensuse.org/tests/519991#step/shutdown/4 where 'esc' will never be pressed because $self->{await_shutdown} is still 0 and the function power_action is failing internally on assert_shutdown. Does it make sense to change the post_fail_hook to send_key('esc') unless $self->{await_shutdown}; ?
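A minimal sketch of what that proposal could look like, not the actual code in shutdown.pm. To keep it runnable outside openQA, send_key and save_screenshot are local stand-ins for the testapi functions of the same names and only record what would be sent to the SUT. The @actions list and the extra screenshot (taken from the ticket description) are assumptions made for illustration.

```perl
use strict;
use warnings;

# Stand-ins (assumption) for the os-autoinst testapi functions of the same
# names, so the sketch runs outside openQA. They only record the calls.
my @actions;
sub send_key        { push @actions, "send_key:$_[0]" }
sub save_screenshot { push @actions, 'save_screenshot' }

# Sketch of the proposed hook: press 'esc' unless await_shutdown is set,
# then take one more screenshot to capture what is behind plymouth.
sub post_fail_hook {
    my ($self) = @_;
    send_key('esc') unless $self->{await_shutdown};
    save_screenshot();
}

# Case from the comment above: await_shutdown is still 0 when the hook runs.
my $self = {await_shutdown => 0};
post_fail_hook($self);
print join(', ', @actions), "\n";
```

With await_shutdown still 0, as in the failing job above, the hook sends 'esc' before taking the screenshot, so the console output hidden by plymouth ends up in the test results.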
Updated by nicksinger over 7 years ago
- Status changed from New to In Progress
- Assignee set to nicksinger
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/3858 just copies over the post-fail-hook of x11/shutdown.pm into shutdown/shutdown.pm. For now this should be sufficient to see at least behind plymouth. I'll create a new subticket for the "debug shutdown" epic to merge both files into one but want to talk to @okurz first about that idea.
Updated by nicksinger over 7 years ago
- Status changed from In Progress to Feedback
Merged. Let's see how this turns out in production :)
Updated by nicksinger over 7 years ago
- Status changed from Feedback to Resolved
No fallout on OSD so hopefully it helps us more next time it fails.