action #38537

closed

[functional][u][svirt-xen-pv][easy] test fails in boot_to_desktop - and no post_fail_hook at all…

Added by okurz almost 6 years ago. Updated over 5 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Enhancement to existing tests
Target version:
SUSE QA - Milestone 18
Start date:
2018-07-18
Due date:
2018-09-11
% Done:

0%

Estimated time:
2.00 h
Difficulty:
easy

Description

Observation

openQA test in scenario sle-12-SP4-Server-DVD-x86_64-extra_tests_on_gnome@svirt-xen-pv fails in
boot_to_desktop

Reproducible

Fails since (at least) Build 0255

Expected result

Last good: 0286

Acceptance criteria

  • AC1: A post_fail_hook is executed that tries to investigate why the bootup is stuck even though the display manager screen shows up

Suggestions

  • Find other tickets about improving debugging on bootup problems
  • Make sure that the approach we already have for other backends is also applied here

Further details

Always latest result in this scenario: latest


Related issues 1 (0 open, 1 closed)

Blocks openQA Tests - action #38666: [functional][sle][svirt-xen-pv][u] test fails in consoletest_setup - does not switch to root console | Resolved | michalnowak | 2018-07-20 | 2018-09-11

Actions #2

Updated by okurz almost 6 years ago

  • Blocks action #38666: [functional][sle][svirt-xen-pv][u] test fails in consoletest_setup - does not switch to root console added
Actions #3

Updated by mgriessmeier over 5 years ago

# Test died: System did not boot in 80 seconds. at /var/lib/openqa/cache/tests/sle/tests/boot/boot_to_desktop.pm line 24.

I can see the GNOME login screen, so maybe it's just a race condition and 80 s is not enough; we allow much more on other architectures, IIRC.

Actions #4

Updated by michalnowak over 5 years ago

@mgriessmeier: If you are talking about extra_tests_on_gnome (https://openqa.suse.de/tests/1839212), this has been fixed in https://progress.opensuse.org/issues/38666. I understand this ticket as a call for a working post_fail_hook.

Actions #5

Updated by okurz over 5 years ago

  • Subject changed from [functional][u][svirt-xen-pv] test fails in boot_to_desktop - and no post_fail_hook at all… to [functional][u][svirt-xen-pv][easy] test fails in boot_to_desktop - and no post_fail_hook at all…
  • Status changed from New to Workable
  • Estimated time set to 2.00 h
  • Difficulty set to easy
Actions #6

Updated by zluo over 5 years ago

  • Assignee set to zluo

take over

Actions #7

Updated by zluo over 5 years ago

  • Status changed from Workable to In Progress
Actions #8

Updated by zluo over 5 years ago

At the moment I have a problem with the svirt-xen remote worker; it seems to be a connection issue.

rsync failed at /usr/lib/os-autoinst/consoles/sshVirtsh.pm line 333.
BYTES {"json_cmd_token":"rqCcSLGe","set_current_test":null}
[2018-08-23T14:10:04.0239 CEST] [debug] awaiting death of testpid 4229
[2018-08-23T14:10:04.0241 CEST] [debug] test process exited: 4229
[2018-08-23T14:10:04.0241 CEST] [debug] awaiting death of commands process
[2018-08-23T14:10:04.0242 CEST] [debug] commands process exited: 4228
[2018-08-23T14:10:04.0242 CEST] [debug] isotovideo done
[2018-08-23T14:10:04.0350 CEST] [debug] Connection to root@openqaw5-xen.qa.suse.de established
[2018-08-23T14:10:04.0565 CEST] [debug] Command's stderr:
error: failed to get domain 'openQA-SUT-21'
error: Domain not found

http://e13.suse.de/tests/7612

Actions #9

Updated by michalnowak over 5 years ago

The actual error is:

[2018-08-23T14:10:04.0173 CEST] [debug] Command's stderr:
rsync: link_stat "/var/lib/openqa/share/factory/hdd/SLES-12-SP4-x86_64-Build0342@svirt-xen-pv-gnome.qcow2" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1165) [sender=3.1.0]

Currently no SLES-12-SP4-x86_64-Build*@svirt-xen-pv-gnome.qcow2 images are on openqa.suse.de so the rsync fails. Wait for create_hdd_gnome@svirt-xen-pv in SLES 12 SP4 Functional.
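
The missing-asset condition described above can be checked before cloning a job. This is a minimal sketch, not part of os-autoinst: the helper name find_hdd_assets is hypothetical, and the directory is simply the path from the rsync error, assumed to be readable from wherever the script runs.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# Hypothetical helper (not an os-autoinst function): return the regular
# files in $dir whose names match the shell-style $pattern, sorted.
sub find_hdd_assets {
    my ($dir, $pattern) = @_;
    # bsd_glob expands the wildcard; grep -f drops anything that is not
    # an existing regular file.
    return sort grep { -f $_ } bsd_glob("$dir/$pattern");
}

# Assumption: the asset directory from the rsync error message above.
my $hdd_dir = '/var/lib/openqa/share/factory/hdd';
my @images  = find_hdd_assets($hdd_dir, 'SLES-12-SP4-x86_64-Build*@svirt-xen-pv-gnome.qcow2');

if (@images) {
    print "found: $_\n" for @images;
}
else {
    print "no matching image in $hdd_dir, the rsync step would fail\n";
}
```

If no image is listed, the create_hdd_gnome@svirt-xen-pv job has to finish first, as noted above.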

Actions #10

Updated by zluo over 5 years ago

Since I have the ISO and qcow2 image saved on my local openQA machine, I tried to assign them to the cloned test via ISO and HDD_1, but it still has the same problem and requires the qcow2 image.

http://e13.suse.de/tests/7618#settings

Actions #11

Updated by michalnowak over 5 years ago

Local storage matters very little to svirt; the assets on the host and on openqa.suse.de's NFS are what really matter. See: https://github.com/os-autoinst/os-autoinst/blob/master/consoles/sshVirtsh.pm#L328.

Actions #12

Updated by zluo over 5 years ago

Actions #13

Updated by mgriessmeier over 5 years ago

zluo wrote:

https://openqa.suse.de/tests/latest?flavor=Server-DVD&test=extra_tests_on_gnome&machine=svirt-xen-pv&version=12-SP4&distri=sle&arch=x86_64#downloads

doesn't show a qcow2 image. This is the test run on osd yesterday. Why is the qcow2 image already gone?

It got removed ~3 hours after the job finished... I'll ask santi about how long we keep images here.

[2018-08-23T22:23:31.0039 CEST] [info] GRU: removing /var/lib/openqa/share/factory/hdd/SLES-12-SP4-x86_64-Build0352@svirt-xen-pv-gnome.qcow2

I will restart the creation job.

Actions #14

Updated by michalnowak over 5 years ago

Thanks Matthias. That HDD creation job just finished, so @zluo should be able to use the build 352 Xen PV image.

Actions #15

Updated by zluo over 5 years ago

now I can clone the test from osd :)

http://e13.suse.de/tests/7625

Actions #16

Updated by zluo over 5 years ago

http://e13.suse.de/tests/7636#step/boot_to_desktop/5

opensusebasetest.pm:

if (wait_serial qr/Reached target Shutdown/) {
    record_info 'shutdown', 'At least we reached target Shutdown';
}

works here as well, and I added a check for Startup finished:

if (wait_serial qr/Startup finished/) {
    record_info 'Startup', 'At least system starts up successfully';
}
Actions #17

Updated by zluo over 5 years ago

http://e13.suse.de/tests/7650#step/boot_to_desktop/5

works as before for boot_to_desktop (non svirt_xen)

Actions #18

Updated by zluo over 5 years ago

http://e13.suse.de/tests/7652#step/boot_to_desktop/6

shows that record_info in post_fail_hook is working:

sub post_fail_hook {
    my ($self) = @_;
    return if testapi::is_serial_terminal();    # in case it is VIRTIO_CONSOLE=1 nothing below make sense
    # just output error if selected program doesn't exist instead of collecting all logs
    # set current variables in x11_start_program
    if (get_var('IN_X11_START_PROGRAM')) {
        my $program = get_var('IN_X11_START_PROGRAM');
        select_console 'log-console';
        my $r = script_run "which $program";
        if ($r != 0) {
            record_info("no $program", "Could not find '$program' on the system", result => 'fail') && die "$program does not exist on the system";
        }
    }
    return unless ($self->{in_boot_desktop} || $self->{in_wait_boot});
    if ((get_var('BACKEND') // '') eq 'svirt') {
        record_info('Startup', 'At least Startup is finished.') if (wait_serial qr/Startup finished/);
    }
    elsif (wait_serial qr/Reached target Shutdown/) {
        record_info 'shutdown', 'At least we reached target Shutdown';
    }
}
Actions #19

Updated by zluo over 5 years ago

http://e13.suse.de/tests/7708#step/user_defined_snapshot/21

user_defined_snapshot shows the record_info for shutdown...

Actions #21

Updated by zluo over 5 years ago

PR updated:

sub post_fail_hook {
    my ($self) = @_;
    return if testapi::is_serial_terminal();    # in case it is VIRTIO_CONSOLE=1 nothing below make sense
    # just output error if selected program doesn't exist instead of collecting all logs
    # set current variables in x11_start_program
    if (get_var('IN_X11_START_PROGRAM')) {
        my $program = get_var('IN_X11_START_PROGRAM');
        select_console 'log-console';
        my $r = script_run "which $program";
        if ($r != 0) {
            record_info("no $program", "Could not find '$program' on the system", result => 'fail') && die "$program does not exist on the system";
        }
    }
    return unless ($self->{in_wait_boot} || $self->{in_boot_desktop});
    if ($self->{in_wait_boot}) {
        record_info('shutdown', 'At least we reached target Shutdown') if (wait_serial 'Reached target Shutdown');
    }
    elsif ($self->{in_boot_desktop}) {
        record_info('Startup', 'At least Startup is finished.') if (wait_serial 'Startup finished');
    }
    # In case the system is stuck in shutting down or during boot up, press
    # 'esc' just in case the plymouth splash screen is shown and we can not
    # see any interesting console logs.
    send_key 'esc';
    save_screenshot;
}
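
The branch selection in the updated hook can be looked at in isolation. This is a minimal sketch under stated assumptions: a plain string stands in for the serial log, the log lines are made up for illustration, expected_serial_marker and diagnose are hypothetical names, and nothing here calls the real testapi (a simple substring match replaces wait_serial).

```perl
use strict;
use warnings;

# Hypothetical stand-in for the hook's branch selection: the serial
# message to look for depends on which flag is set on the test object.
# in_wait_boot takes precedence, matching the if/elsif order above.
sub expected_serial_marker {
    my ($self) = @_;
    return 'Reached target Shutdown' if $self->{in_wait_boot};
    return 'Startup finished'        if $self->{in_boot_desktop};
    return;
}

# Returns "seen: <marker>" if the marker occurs in the log text, an
# empty string if it does not, and undef if neither flag is set.
sub diagnose {
    my ($self, $serial_log) = @_;
    my $marker = expected_serial_marker($self);
    return unless defined $marker;
    return index($serial_log, $marker) >= 0 ? "seen: $marker" : '';
}

# Made-up sample log, only for illustration.
my $sample_log = "Started GNOME Display Manager.\nStartup finished in 30s.\n";
my $note = diagnose({in_boot_desktop => 1}, $sample_log);
print "$note\n" if defined $note;
```

The sketch only mirrors which message is expected for which flag; the esc key press and the screenshot from the hook are left out.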
Actions #22

Updated by zluo over 5 years ago

still waiting for PR to be merged...

Actions #23

Updated by mgriessmeier over 5 years ago

  • Due date changed from 2018-08-28 to 2018-09-11
Actions #24

Updated by zluo over 5 years ago

  • Status changed from In Progress to Resolved

PR merged. Since this is a sporadic issue, I am setting it to resolved for now.

Actions #25

Updated by zluo over 5 years ago

I will of course keep an eye on this on osd, even though it happens really seldom...
