action #64466

[functional][y][hyper-v][timeboxed:16h] test fails in shutdown

Added by syrianidou_sofia over 1 year ago. Updated over 1 year ago.

Status:
Resolved
Priority:
High
Category:
Bugs in existing tests
Target version:
SUSE QA - Milestone 35+
Start date:
2020-03-12
Due date:
2020-05-19
% Done:

0%

Estimated time:
Difficulty:

Description

Sporadic failure in shutdown module. Two types of failures observed:

According to Yanis, it fails consistently on UEFI. This looks like a problem with the openQA setup, since Yanis was able to shut down the system manually and it worked just fine. We need to identify the actions required to fix it.

[ 675.237199] gnome-keyring-daemon[2603]: couldn't initialize slot with master password: The password or PIN is incorrect

[ 675.297939] gdm-password][4597]: gkr-pam: unlocked login keyring

[2020-03-12T12:29:22.895 CET] [debug] tests/shutdown/shutdown.pm:28 called power_action_utils::power_action -> lib/power_action_utils.pm:255 called testapi::select_console -> lib/susedistribution.pm:883 called x11utils::ensure_unlocked_desktop -> lib/x11utils.pm:126 called testapi::wait_still_screen
[2020-03-12T12:29:22.895 CET] [debug] <<< testapi::wait_still_screen(similarity_level=47, timeout=30, stilltime=1)
[ 675.471356] systemd[4428]: Stopped target Current graphical user session.

[ 675.803774] systemd[4428]: Stopped target GNOME X11 Session (session: gnome-login).

[ 675.914330] gdm-Xorg-:1[4408]: (II) event3 - Microsoft Vmbus HID-compliant Mouse: device removed

[ 675.994240] gdm-Xorg-:1[4408]: (II) event4 - Power Button: device removed

[ 676.108023] gdm-Xorg-:1[4408]: (II) event2 - AT Translated Set 2 keyboard: device removed

[ 676.167890] gdm-Xorg-:1[4408]: (II) event0 - AT Translated Set 2 keyboard: device removed

[ 676.223982] gdm-Xorg-:1[4408]: (II) event1 - TPPS/2 IBM TrackPoint: device removed

[ 676.276254] gdm-Xorg-:1[4408]: (II) UnloadModule: "libinput"

[ 676.316927] gdm-Xorg-:1[4408]: (II) UnloadModule: "libinput"

[ 676.358856] gdm-Xorg-:1[4408]: (II) UnloadModule: "libinput"

[ 676.418850] gdm-Xorg-:1[4408]: (II) UnloadModule: "libinput"

[ 676.487558] gdm-Xorg-:1[4408]: (II) UnloadModule: "libinput"

[ 676.546600] gdm-Xorg-:1[4408]: (II) Server terminated successfully (0). Closing log file.

[ 676.648098] gnome-session[4457]: gnome-session-binary[4457]: WARNING: Lost name on bus: org.gnome.SessionManager

[ 676.732171] gdm-launch-environment][4424]: pam_unix(gdm-launch-environment:session): session closed for user gdm

[ 676.834667] systemd[4428]: Stopped target GNOME Session.

[ 676.908222] gnome-session[4457]: Unable to init server: Could not connect: Connection refused
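The openQA debug line in the log above records the call chain that was active when the graphical session went down. A minimal Python sketch for splitting such a line into frames, assuming the `file:line called callee` layout seen in this log holds in general; the helper name `parse_call_chain` is hypothetical and not part of openQA:

```python
import re

# One frame looks like "lib/x11utils.pm:126 called testapi::wait_still_screen".
FRAME = re.compile(r"(?P<file>\S+):(?P<line>\d+) called (?P<callee>[\w:]+)")

def parse_call_chain(debug_line):
    """Return (file, line, callee) tuples from a '[debug] ... called ...' line."""
    # Drop the bracketed timestamp and log-level prefixes, if present.
    body = re.sub(r"^(\[[^\]]*\]\s*)+", "", debug_line)
    frames = []
    for part in body.split(" -> "):
        match = FRAME.fullmatch(part.strip())
        if match:
            frames.append((match["file"], int(match["line"]), match["callee"]))
    return frames

# The debug line copied from the log in this ticket.
SAMPLE = (
    "[2020-03-12T12:29:22.895 CET] [debug] tests/shutdown/shutdown.pm:28 "
    "called power_action_utils::power_action -> lib/power_action_utils.pm:255 "
    "called testapi::select_console -> lib/susedistribution.pm:883 "
    "called x11utils::ensure_unlocked_desktop -> lib/x11utils.pm:126 "
    "called testapi::wait_still_screen"
)

frames = parse_call_chain(SAMPLE)
for file_name, line_no, callee in frames:
    print(f"{file_name}:{line_no} -> {callee}")
```

Read top to bottom, the frames show that the shutdown module reached `wait_still_screen` through `select_console` and `ensure_unlocked_desktop`, i.e. the test was trying to get onto the desktop at the moment the session stopped.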

Observation

openQA test in scenario sle-15-SP2-Full-x86_64-skip_registration@svirt-hyperv fails in shutdown

Reproducible

Fails since (at least) Build 101.1

Expected result

Last good: 154.1

History

#1 Updated by riafarov over 1 year ago

  • Due date set to 2020-04-07
  • Priority changed from Normal to High

#2 Updated by riafarov over 1 year ago

  • Target version set to Milestone 32

#3 Updated by riafarov over 1 year ago

  • Due date deleted (2020-04-07)

We cannot act on this right now, as we lack expertise in hyper-v setup.

#4 Updated by riafarov over 1 year ago

  • Target version changed from Milestone 32 to Milestone 35+

#5 Updated by riafarov over 1 year ago

  • Subject changed from [functional][y] test fails in shutdown to [functional][y][hyper-v] test fails in shutdown
  • Due date set to 2020-05-19

#6 Updated by riafarov over 1 year ago

  • Subject changed from [functional][y][hyper-v] test fails in shutdown to [functional][y][hyper-v][timeboxed:16h] test fails in shutdown
  • Description updated (diff)
  • Status changed from New to Workable

#7 Updated by syrianidou_sofia over 1 year ago

  • Status changed from Workable to In Progress
  • Assignee set to syrianidou_sofia

#8 Updated by syrianidou_sofia over 1 year ago

  • Status changed from In Progress to Feedback

The issue can be reproduced manually. After switching from tty2 or tty6 to tty7 and trying to log in, there are graphics problems: a black screen is shown, and it only clears when something on the screen moves. The test expects the GNOME desktop to appear after login so that the screen can be asserted, but since nothing moves, that never happens and the test fails with "Stall detected". The problem surfaced once the DESKTOP parameter was correctly set to "gnome" (previously it was "textmode"): module suseconnect_scc leaves the system in text mode, and when module shutdown starts it calls power_action->select_console "x11", which causes the switch from text mode to graphical. I have opened bug#1171290.
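The effect of the DESKTOP change can be illustrated with a toy model. This is only a sketch of the behaviour described in this comment, not the actual power_action_utils code: the function names and the "root-console" value are assumptions made for illustration.

```python
def console_for_shutdown(desktop):
    """Console the shutdown step would select (assumed, simplified rule)."""
    # Assumption: textmode jobs stay on a text console,
    # any other DESKTOP value selects the graphical "x11" console.
    return "root-console" if desktop == "textmode" else "x11"

def switches_to_graphical(current_console, desktop):
    """True if shutdown has to move from a text console to the graphical one."""
    target = console_for_shutdown(desktop)
    return target == "x11" and current_console != "x11"

# suseconnect_scc leaves the system on a text console before shutdown starts.
with_textmode = switches_to_graphical("root-console", "textmode")
with_gnome = switches_to_graphical("root-console", "gnome")
print(with_textmode, with_gnome)
```

With DESKTOP set to "textmode" no switch happens, which would explain why the problem stayed hidden; with "gnome" the switch to the graphical console is triggered and the black screen on Hyper-V is hit.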

#9 Updated by okurz over 1 year ago

I would be very careful with the current test results. I already see a problem in https://openqa.suse.de/tests/4203853#step/shutdown/1, where the test detects "screenlock" but the screenshot shows a tty console. Also, after logging in to GNOME in https://openqa.suse.de/tests/4203853#step/shutdown/7, the display manager login prompt appears again. This looks like gdm is crashing, which should be investigated. Unfortunately, the post_fail_hook did not manage to provide a tty console. And then https://openqa.suse.de/tests/4203853#step/shutdown/16 is also a false match.

#10 Updated by riafarov over 1 year ago

okurz wrote:

I would be very careful with the current test results. I already see a problem in https://openqa.suse.de/tests/4203853#step/shutdown/1, where the test detects "screenlock" but the screenshot shows a tty console. Also, after logging in to GNOME in https://openqa.suse.de/tests/4203853#step/shutdown/7, the display manager login prompt appears again. This looks like gdm is crashing, which should be investigated. Unfortunately, the post_fail_hook did not manage to provide a tty console. And then https://openqa.suse.de/tests/4203853#step/shutdown/16 is also a false match.

Hmm, it feels like you didn't see the last comment from Sofia. We filed a bug after deeper investigation. Hyper-V has been causing many issues recently, for multiple reasons. Do you have any applicable suggestions?

#11 Updated by syrianidou_sofia over 1 year ago

  • Status changed from Feedback to Resolved

#12 Updated by okurz over 1 year ago

  • Status changed from Resolved to Feedback

Well, I replied to Sofia's comment, so I have seen it :) https://bugzilla.suse.com/show_bug.cgi?id=1171290 may be valid, but I still saw two issues that are similar to others that have led to invalid bug reports in the past. Actionable suggestions:

#13 Updated by syrianidou_sofia over 1 year ago

Thanks for the hint, Oliver. I manually created a VM on Hyper-V, installed SLES with GNOME, and could still reproduce the issue, so I don't think this has to do with openQA.

#14 Updated by okurz over 1 year ago

That's good and convincing. I merely pointed out other problems in the mentioned openQA jobs; I was not questioning the validity of the bug report. However, manual reproduction is often a good way to show in a bug report that it is not "just a test issue", as some might believe.

#15 Updated by riafarov over 1 year ago

  • Status changed from Feedback to Resolved

@Oliver, let's not mix all the issues in one ticket; this one is timeboxed. We have figured out one issue. We will follow up if we see an issue that differs from what is described in the bug, and we will also collect feedback in the bug.
