action #16520

closed

[qam][opensuse][sle][functional] enhance logging and debugging in case of failed shutdown, e.g. press 'esc' on plymouth splash screen

Added by dimstar about 7 years ago. Updated over 6 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
Enhancement to existing tests
Start date:
2017-02-06
Due date:
2017-11-08
% Done:

0%

Estimated time:
Difficulty:

Description

Observation

openQA test in scenario opensuse-Tumbleweed-DVD-x86_64-gnome-image@64bit fails in
shutdown

Reproducible

Fails since (at least) Build 20170204 (current job)

Expected result

Last good: 20170203 (or more recent)

Further details

Always latest result in this scenario: latest

The test fails 'randomly': once in a while the shutdown simply takes a bit longer than the configured timeout. Obviously, we'd expect this not to happen (and this is likely a product bug). But openQA should help us with the debugging here, at least by pressing 'ESC' when the timeout is reached (post-fail-hook). That should disable plymouth and give at least some pointers to where we're hanging or waiting (don't forget one more screenshot after pressing ESC).
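A minimal sketch of the requested hook order (press ESC first, then take one more screenshot). It is written in Python only to model the logic; the real tests are Perl, and FakeSUT with its send_key and save_screenshot methods is a hypothetical stand-in that merely records calls, not the os-autoinst API.

```python
class FakeSUT:
    """Hypothetical stand-in for the system under test; it only records actions."""

    def __init__(self):
        self.actions = []

    def send_key(self, key):
        self.actions.append(("send_key", key))

    def save_screenshot(self):
        self.actions.append(("save_screenshot",))


def post_fail_hook(sut):
    # Press ESC so the plymouth splash is dismissed and the console
    # messages behind it become visible.
    sut.send_key("esc")
    # Take one more screenshot *after* ESC, otherwise the last recorded
    # image would still show only the splash screen.
    sut.save_screenshot()


sut = FakeSUT()
post_fail_hook(sut)
print(sut.actions)
```

The ordering is the point here: a screenshot taken before ESC would not show where the shutdown is hanging.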


Related issues 1 (0 open, 1 closed)

Related to openQA Tests - action #14068: [tools] Gather more system information and logs in case of boot/reboot times out (Resolved, okurz, 2016-09-23)

Actions #1

Updated by okurz about 7 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: qam-minimal-full
http://openqa.suse.de/tests/786614

Actions #2

Updated by okurz about 7 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: qam-minimal-full
http://openqa.suse.de/tests/786614

Actions #3

Updated by okurz about 7 years ago

  • Subject changed from test fails in shutdown to [qam][opensuse] test fails in shutdown
Actions #6

Updated by okurz about 7 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: qam-minimal-full
https://openqa.suse.de/tests/874792

Actions #7

Updated by vpelcak almost 7 years ago

Test died: console sut is not activated. at /var/lib/openqa/cache/openqa.suse.de/tests/sle/tests/x11/shutdown.pm line 131.

It seems that the click action simply was not performed.

Actions #8

Updated by okurz almost 7 years ago

  • Subject changed from [qam][opensuse] test fails in shutdown to [qam][opensuse] enhance logging and debugging in case of failed shutdown, e.g. press 'esc' on plymouth splash screen
  • Category changed from Bugs in existing tests to Enhancement to existing tests

The last message is actually about #17658, so a different issue. See what happens when people open tickets which are way too generic in their subject ;-)

I updated the ticket according to the original description.

Actions #9

Updated by okurz almost 7 years ago

  • Related to action #14068: [tools] Gather more system information and logs in case of boot/reboot times out added
Actions #10

Updated by okurz over 6 years ago

  • Subject changed from [qam][opensuse] enhance logging and debugging in case of failed shutdown, e.g. press 'esc' on plymouth splash screen to [qam][opensuse][sle][functional] enhance logging and debugging in case of failed shutdown, e.g. press 'esc' on plymouth splash screen
  • Priority changed from Normal to High

https://freedesktop.org/wiki/Software/systemd/Debugging/ should be a helpful reference as well.

https://bugzilla.suse.com/show_bug.cgi?id=1055462 was a bug I created recently for a SLE15 failure which I subsequently closed with WORKSFORME because the problem of not shutting down in time was a sporadic one and I could not provide useful debugging information. Thinking about better post fail hook handling here should be helpful.

Actions #11

Updated by okurz over 6 years ago

  • Target version set to Milestone 11
Actions #12

Updated by okurz over 6 years ago

Latest example of the "openQA-in-openQA" test scenario failing: https://openqa.opensuse.org/tests/494083#step/shutdown/1 The test states that the system does not shut down in time but gives no further helpful information. That's very hard to debug, as in: no one will care unless we improve the debugging output in the test code.

Actions #14

Updated by sebchlad over 6 years ago

@okurz: as you have added this to QA SLE Functional scrum team 19.09 - is this important for QA SLE: SLE15 testing?
I do not question that this is useful, especially for openSUSE. However, I would like to check that this is indeed in QA SLE Functional scope.

Actions #15

Updated by okurz over 6 years ago

Did you read #16520#note-10 ? The answer for SLE is there, it's "yes".

Actions #16

Updated by okurz over 6 years ago

  • Due date set to 2017-11-08
Actions #17

Updated by nicksinger over 6 years ago

Unfortunately, many of the posted links are already archived and cannot be viewed anymore. But from what I've seen we have x11/shutdown and shutdown/shutdown. x11/shutdown already contains the improved post_fail_hook mentioned by @dimstar - see: https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/master/tests/x11/shutdown.pm#L40

From the name of the test ("x86_64-gnome-image@64bit") I assume it should make use of x11/shutdown, not shutdown/shutdown. The same goes for the openQA-in-openQA test. However, I'll try to unify this to avoid duplicated code.

Actions #18

Updated by JERiveraMoya over 6 years ago

I found this case: https://openqa.opensuse.org/tests/519991#step/shutdown/4 where it will never press 'esc' because $self->{await_shutdown} is still 0 and the function power_action is failing internally on assert_shutdown. Would it make sense to change the post_fail_hook to send_key('esc') unless $self->{await_shutdown}; ?
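The case above can be modelled roughly as follows, again in Python as a stand-in for the Perl test code. Two things here are assumptions inferred from the comment, not taken from the actual module: that the flag is only set after power_action returns, and that the existing hook presses 'esc' only when the flag is set. ShutdownTest, ShutdownTimeout and simulate are hypothetical names.

```python
class ShutdownTimeout(Exception):
    """Models the failure raised when the machine does not power off in time."""


class ShutdownTest:
    def __init__(self, guard):
        self.await_shutdown = False
        # "if" models the inferred existing check, "unless" the proposed one.
        self.guard = guard
        self.keys = []

    def power_action(self):
        # Assumption: the shutdown assertion fails inside power_action itself.
        raise ShutdownTimeout("machine did not shut down in time")

    def run(self):
        self.power_action()
        # Assumption: the flag is set only after power_action returns,
        # so this line is never reached when power_action fails.
        self.await_shutdown = True

    def post_fail_hook(self):
        if self.guard == "if":
            press = self.await_shutdown
        else:
            press = not self.await_shutdown
        if press:
            self.keys.append("esc")


def simulate(guard):
    """Run the test, call the hook on failure, return the keys that were sent."""
    test = ShutdownTest(guard)
    try:
        test.run()
    except ShutdownTimeout:
        test.post_fail_hook()
    return test.keys


print(simulate("if"))
print(simulate("unless"))
```

Under these assumptions the "if" variant sends no key at all in this failure path, while the "unless" variant sends 'esc'.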

Actions #19

Updated by nicksinger over 6 years ago

  • Status changed from New to In Progress
  • Assignee set to nicksinger

https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/3858 just copies over the post-fail-hook of x11/shutdown.pm into shutdown/shutdown.pm. For now this should be sufficient to see at least behind plymouth. I'll create a new subticket for the "debug shutdown" epic to merge both files into one but want to talk to @okurz first about that idea.

Actions #20

Updated by nicksinger over 6 years ago

  • Status changed from In Progress to Feedback

Merged. Let's see how this turns out in production :)

Actions #21

Updated by nicksinger over 6 years ago

  • Status changed from Feedback to Resolved

No fallout on OSD so hopefully it helps us more next time it fails.
