action #46076

[sle][functional][u][medium] test fails in shutdown on minimalx

Added by jorauch over 2 years ago. Updated about 2 years ago.

Status: Resolved
Priority: Low
Assignee:
Category: New test
Target version: SUSE QA - Milestone 27
Start date: 2019-01-14
Due date:
% Done: 0%
Estimated time:
Difficulty: medium

Description

Observation

openQA test in scenario sle-15-SP1-Installer-DVD-x86_64-create_minimalX@64bit fails in shutdown.

Every time I try to run the shutdown module on minimalx it gets stuck. The linked test is a create_hdd_gnome with DESKTOP=minimalx.
This is NOT a simple needle issue: I already tried to create some needles, and simply setting SHUTDOWN_NEEDS_AUTH=1 does not work either.
The adaptation needs to be made in power_utils.

We should also discuss why this is not being tested on openQA.

Reproducible

Basically every time we execute the shutdown module on minimalx

Fails since (at least) Build 136.2

Expected result

Obviously the shutdown module should work on all desktop managers

Suggestions

  • Add shutdown in main.pm for minimalx (0.1h - 0.5h)
  • Try clicking through the shutdown process (0.5h - 1.0h)
  • If necessary, work around bugs (0.0h - 2.0h)

Related issues

Related to openQA Tests - action #48227: [sle[[functional][u] test fails in shutdown - need to investigate (Rejected, 2019-02-21)

Related to openQA Tests - action #43880: [functional][u][s390x][sporadic] test fails in shutdown on s390x (Resolved, 2018-11-16)

Has duplicate openQA Tests - action #49067: [functional][u] test fails in shutdown: user is not even logged out of X11 WM session (Rejected, 2019-03-12)

Blocks openQA Tests - action #32074: [sle][functional][icewm][raspi][easy][u] Check basic desktop behavior (Rejected, 2018-02-21)

Blocks openQA Tests - action #34774: [sle][functional][u][sporadic] test fails in reboot_gnome - Slower archs (aarch64 and ppc64le), missing ctrl-alt-del, logout-dialog not shown (Rejected, 2018-04-12)

History

#1 Updated by jorauch over 2 years ago

  • Blocks action #32074: [sle][functional][icewm][raspi][easy][u] Check basic desktop behavior added

#2 Updated by okurz over 2 years ago

Thanks for the report. However, if we never had a test executing "shutdown" on icewm the ticket is rather a request for new test coverage, not a regression, right?

I checked "minimalx" on openSUSE Tumbleweed, https://openqa.opensuse.org/tests/829126, and there, too, I cannot see "shutdown" being scheduled.

#3 Updated by jorauch over 2 years ago

In theory we have the code, but it is never used and it does not work (on SLE15), so I would rather vote for 'enhancement'.
My primary intention is to investigate and fix the shutdown module; scheduling it is a side bonus to avoid running into something like this again.

#4 Updated by mgriessmeier over 2 years ago

  • Category changed from Bugs in existing tests to New test

if shutdown is not tested yet on minimalX - I would also go with new test =)

#5 Updated by jorauch over 2 years ago

  • Subject changed from [sle][functional][u][easy] test fails in shutdown on minimalx to [sle][functional][u][medium] test fails in shutdown on minimalx
  • Status changed from Workable to In Progress

#6 Updated by jorauch over 2 years ago

  • Description updated (diff)

#7 Updated by jorauch over 2 years ago

Actually is relevant.
There are at least two password dialogs. I will implement a workaround with a softfail.

#8 Updated by jorauch over 2 years ago

We have at least three root password prompts so far: one for a wall message, one for shutdown while other users are logged in, and one for setting the power target.
I suppose we should simply shut it down via the console until the bug is fixed.

#9 Updated by jorauch over 2 years ago

First part, which fixes the code:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/6551

Still needs to be scheduled in main.pm

#10 Updated by mgriessmeier over 2 years ago

  • Blocks action #34774: [sle][functional][u][sporadic] test fails in reboot_gnome - Slower archs (aarch64 and ppc64le), missing ctrl-alt-del, logout-dialog not shown added

#11 Updated by jorauch over 2 years ago

This PR schedules the shutdown tests on minimalx:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/6570

Now waiting for merges and verification in production to close this ticket.
Setting to feedback meanwhile

#12 Updated by jorauch over 2 years ago

  • Status changed from In Progress to Resolved

#13 Updated by okurz over 2 years ago

  • Status changed from Resolved to Feedback

Please stick to our https://progress.opensuse.org/projects/openqatests/wiki#Definition-of-DONE and keep the ticket open until the corresponding PR is merged and deployed, and we have verification runs on production.

https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/6551

#14 Updated by okurz over 2 years ago

  • Target version changed from Milestone 22 to Milestone 23

#15 Updated by jorauch over 2 years ago

PR merged, waiting for verification in production

#17 Updated by jorauch over 2 years ago

Looks like the test failed to select the console properly:
https://openqa.suse.de/tests/2479298#step/shutdown/26

#18 Updated by jorauch over 2 years ago

  • Related to action #48227: [sle[[functional][u] test fails in shutdown - need to investigate added

#19 Updated by jorauch over 2 years ago

  • Status changed from Feedback to Workable
  • Priority changed from Normal to High

#20 Updated by lnussel over 2 years ago

  • Due date set to 2019-02-25
  • Priority changed from High to Immediate

what's going on here? the fixes related to shutdown look massively broken. stagings are red

#21 Updated by okurz over 2 years ago

I see shutdown on minimalx previously working on
https://openqa.opensuse.org/tests/858731#step/shutdown/7
and now failing in
https://openqa.opensuse.org/tests/861496#step/shutdown/10
As there currently seem to be too many failures related to shutdown on minimalx, I removed it from the schedule again. We had already discussed this on Friday but then did not revert:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/6881
already merged, will retrigger the failed staging tests.

#22 Updated by okurz over 2 years ago

I retriggered about 100 jobs, mainly "cryptlvm" and "minimalx", not only Leap 15.1 staging but also Factory staging, aarch64 build validation jobs as well as maintenance tests. The impact was wider than I thought.

jorauch, based on this we can now test the different scenarios more carefully when you want to bring back the test module. Let's also talk next week about how to continue.

#23 Updated by jorauch over 2 years ago

  • Priority changed from Immediate to High

As you reverted the changes, can we set back the priority?
Will take a look at this today anyway, the behaviour seems very inconsistent

#24 Updated by okurz over 2 years ago

  • Due date deleted (2019-02-25)
  • Priority changed from High to Normal
  • Target version changed from Milestone 23 to Milestone 24

Yes, thank you for that. Actually with another fix, by DimStar, in https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/6887 we can set it to "Normal" and remove the Due-Date. So actually you can relax ;)

#25 Updated by jorauch over 2 years ago

Could you please clarify at what point we are?

  • Is the shutdown module now scheduled?
  • What needs to be done here at all?

#26 Updated by okurz over 2 years ago

jorauch wrote:

Could you please clarify at what point we are?

  • Is the shutdown module now scheduled?

yes, in case of INSTALLONLY

  • What needs to be done here at all?

Add it back to the cases of !INSTALLONLY and make sure it works in a stable manner in the scenarios where it was reported as failing. Also, work around bugs with proper record_soft_failure.

#27 Updated by jorauch over 2 years ago

  • Status changed from Workable to In Progress

Will investigate the shutdown and why it breaks sometimes

#28 Updated by jorauch over 2 years ago

We have three interesting variables here:

  • cryptlvm
  • INSTALLONLY
  • DISTRI
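A minimal sketch of how the combinations of these three variables could be enumerated for verification runs. Only the variable names come from this ticket; the value sets and the helper name `scenario_matrix` are assumptions for illustration, and the sketch is written in Python rather than in the Perl used by the test distribution.

```python
from itertools import product

# Hypothetical value sets; only the three variable names come from the ticket.
VARIABLES = {
    "cryptlvm": [False, True],
    "INSTALLONLY": [False, True],
    "DISTRI": ["sle", "opensuse"],
}


def scenario_matrix(variables):
    """Return one dict per combination of the given variable values."""
    names = list(variables)
    return [dict(zip(names, combo)) for combo in product(*variables.values())]


scenarios = scenario_matrix(VARIABLES)
for scenario in scenarios:
    print(scenario)
```

Each resulting dict describes one scenario that would need its own verification job before the module is rescheduled.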

#29 Updated by jorauch over 2 years ago

Tumbleweed staging:

With cryptlvm:
http://pinky.arch.suse.de/tests/107#

Without cryptlvm:
http://pinky.arch.suse.de/tests/106#

Seems like the shutdown works as it should?

#30 Updated by jorauch over 2 years ago

SLE cryptlvm:
http://pinky.arch.suse.de/tests/108#

Seeing this, I would assume that it is related to the difference between SLE and openSUSE.

#31 Updated by jorauch over 2 years ago

  • Status changed from In Progress to Feedback

To me it looks like the fix is working as intended.

http://pinky.arch.suse.de/tests/111#

http://pinky.arch.suse.de/tests/110

Both with the workaround enabled, I think we can safely reschedule the module.
Maybe we had performance issues so the 2 minutes were not enough?

#32 Updated by okurz over 2 years ago

Looks promising so far. Do you have a PR where you referenced these tests?

I guess if you can make http://pinky.arch.suse.de/tests/108#step/shutdown/23 not fail you are good to go :)

#33 Updated by jorauch over 2 years ago

Test #23 is with the workaround disabled; the last two are basically the upstream code.

#35 Updated by jorauch over 2 years ago

PPC + Cryptlvm + timeout 360 sec
https://openqa.suse.de/tests/2509212#

#37 Updated by jorauch over 2 years ago

With a 3 minute timeout it seems to work as intended. I am not yet sure what the consequences of this are.

#38 Updated by jorauch over 2 years ago

  • Status changed from Feedback to In Progress

As discussed with okurz we should put the workaround after shutdown_x11

Or should we?

#39 Updated by jorauch over 2 years ago

Here we have another failure (360sec timeout):
https://openqa.suse.de/tests/2515499#step/shutdown/10

#40 Updated by jorauch over 2 years ago

After sleeping on the results for a night, I have come to the conclusion that we went in the wrong direction here. We should go back to the initial approach of detecting unwanted popups instead of trying to detect a failed shutdown. This avoids a lot of trouble with corner cases.

#41 Updated by jorauch over 2 years ago

As discussed in review:

Steps to shut down:

initiate shutdown
press shutdown
press ok on confirmation
check if we get unwanted popups:
    if no popups:
        continue
    if popups:
        change to console
        poweroff
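The steps above can be sketched as a small decision flow. This is only an illustration in Python, not the Perl code from the PR: the class `ShutdownFlow` and the injected `popup_visible` callable are hypothetical stand-ins for the real needle checks and console handling.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ShutdownFlow:
    """Models the agreed shutdown steps with an injected, hypothetical popup check."""

    popup_visible: Callable[[], bool]
    log: List[str] = field(default_factory=list)

    def run(self) -> List[str]:
        # The three UI steps always happen.
        self.log.append("initiate shutdown")
        self.log.append("press shutdown")
        self.log.append("press ok on confirmation")
        # Only an unwanted popup triggers the console fallback.
        if self.popup_visible():
            self.log.append("change to console")
            self.log.append("poweroff")
        return self.log


clean_run = ShutdownFlow(popup_visible=lambda: False).run()
popup_run = ShutdownFlow(popup_visible=lambda: True).run()
```

The point of the design is that the fallback is keyed on seeing a popup, not on waiting for the shutdown to time out.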

#42 Updated by jorauch over 2 years ago

  • Status changed from In Progress to Feedback

#43 Updated by okurz over 2 years ago

  • Related to action #43880: [functional][u][s390x][sporadic] test fails in shutdown on s390x added

#44 Updated by okurz over 2 years ago

merged, please carefully track in production as well as update the blocked tickets.

#45 Updated by jorauch over 2 years ago

  • Status changed from Feedback to Resolved

Fixed the only fail in other_de:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/6967

I guess we can consider this resolved?

#46 Updated by jorauch over 2 years ago

  • Status changed from Resolved to Workable

Somehow still broken:
https://openqa.suse.de/tests/2525175#step/shutdown/13

Behavior seems weirder than expected

#47 Updated by jorauch over 2 years ago

Looks like we get a popup but the system is shutting down anyway

#48 Updated by jorauch over 2 years ago

  • Status changed from Workable to Feedback

The shutdown-auth needle matches on a normal root password popup, so I will delete the needle. We should keep this in mind for the next reviews.

#50 Updated by okurz over 2 years ago

merged

#51 Updated by jorauch over 2 years ago

  • Status changed from Feedback to In Progress

https://openqa.suse.de/tests/2531374#step/shutdown/9

Actually it shuts down despite the popup. We could use check_shutdown in the branch, but the timeout would be somewhat arbitrary.

#52 Updated by jorauch over 2 years ago

openSUSE shuts down normally; on SLE I see the following behaviour on the latest build:
http://pinky.arch.suse.de/tests/124#step/shutdown/24

I really do not understand when it just shuts down, when it shuts down with a popup and when the shutdown gets blocked

Maybe a detection for "Authentication is required to shut down while users are logged in" could help?

#53 Updated by jorauch over 2 years ago

We should wait at least until after the RC with this

My current idea would be to check for SHUTDOWN_NEEDS_AUTH, then in the error case use check_shutdown(120), and then shut down manually if that fails
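One possible reading of this idea, sketched in Python with injected stand-ins. The function name `handle_shutdown`, the `popup_seen` flag, and the callables passed in are assumptions for illustration; treating "the error case" as "an authentication popup was seen" is also an assumption, and the real test code is Perl.

```python
def handle_shutdown(settings, popup_seen, check_shutdown, poweroff_on_console, timeout=120):
    """Decide how to finish the shutdown; returns a label for the path taken."""
    needs_auth = settings.get("SHUTDOWN_NEEDS_AUTH", "0") == "1"
    if not (needs_auth and popup_seen):
        # No auth popup expected or seen: stay on the regular path.
        return "regular"
    if check_shutdown(timeout):
        # The machine went down within the timeout despite the popup.
        return "shutdown-despite-popup"
    # Shutdown is blocked: fall back to powering off from the console.
    poweroff_on_console()
    return "console-poweroff"


calls = []
blocked = handle_shutdown(
    {"SHUTDOWN_NEEDS_AUTH": "1"},
    popup_seen=True,
    check_shutdown=lambda t: False,
    poweroff_on_console=lambda: calls.append("poweroff"),
)
```

The 120 second default mirrors the value mentioned above; whether that is long enough on slower machines is left open here.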

#54 Updated by dheidler over 2 years ago

  • Related to action #49067: [functional][u] test fails in shutdown: user is not even logged out of X11 WM session added

#55 Updated by okurz over 2 years ago

  • Priority changed from Normal to Low
  • Target version changed from Milestone 24 to Milestone 25

jorauch wrote:

We should wait at least until after the RC with this

I guess we can even wait for longer and do not need to hurry. If you think it's safer, we can implement it post-SLE15SP1GM

#56 Updated by jorauch over 2 years ago

  • Status changed from In Progress to Workable

I agree, maybe the behaviour will have stabilized by then

Last state:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/7044

#57 Updated by SLindoMansilla over 2 years ago

  • Related to deleted (action #49067: [functional][u] test fails in shutdown: user is not even logged out of X11 WM session)

#58 Updated by SLindoMansilla over 2 years ago

  • Has duplicate action #49067: [functional][u] test fails in shutdown: user is not even logged out of X11 WM session added

#59 Updated by mgriessmeier about 2 years ago

  • Target version changed from Milestone 25 to Milestone 26

#60 Updated by mgriessmeier about 2 years ago

  • Target version changed from Milestone 26 to Milestone 27

@ jojo: let's discuss how to proceed here =)

#61 Updated by mgriessmeier about 2 years ago

  • Status changed from Workable to Resolved

this won't be looked at again - and since apparently no one cares or the issue doesn't happen anymore - I'm cleaning up old mess
