action #38537
[functional][u][svirt-xen-pv][easy] test fails in boot_to_desktop - and no post_fail_hook at all…
Added by okurz almost 6 years ago. Updated over 5 years ago.
Description
Observation
openQA test in scenario sle-12-SP4-Server-DVD-x86_64-extra_tests_on_gnome@svirt-xen-pv fails in
boot_to_desktop
Reproducible
Fails since (at least) Build 0255
Expected result
Last good: 0286
Acceptance criteria
- AC1: A post_fail_hook is executed that tries to investigate why the bootup is stuck even though the display manager screen shows up
Suggestions
- Find other tickets about improving debugging on bootup problems
- Make sure that the approach we already have for other backends is also applied here
Further details
Always latest result in this scenario: latest
Updated by michalnowak almost 6 years ago
Updated by okurz almost 6 years ago
- Blocks action #38666: [functional][sle][svirt-xen-pv][u] test fails in consoletest_setup - does not switch to root console added
Updated by mgriessmeier over 5 years ago
# Test died: System did not boot in 80 seconds. at /var/lib/openqa/cache/tests/sle/tests/boot/boot_to_desktop.pm line 24.
I can see the GNOME login screen, so maybe it's just a race condition and 80 seconds is not enough. We allow much more on other archs, IIRC.
Updated by michalnowak over 5 years ago
@mgriessmeier: If you are talking about extra_tests_on_gnome (https://openqa.suse.de/tests/1839212), this has been fixed in https://progress.opensuse.org/issues/38666. I understand this ticket as a call for a working post_fail_hook.
Updated by okurz over 5 years ago
- Subject changed from [functional][u][svirt-xen-pv] test fails in boot_to_desktop - and no post_fail_hook at all… to [functional][u][svirt-xen-pv][easy] test fails in boot_to_desktop - and no post_fail_hook at all…
- Status changed from New to Workable
- Estimated time set to 2.00 h
- Difficulty set to easy
Updated by zluo over 5 years ago
At the moment I have a problem with the svirt-xen remote worker; it seems to be a connection issue.
rsync failed at /usr/lib/os-autoinst/consoles/sshVirtsh.pm line 333.
BYTES {"json_cmd_token":"rqCcSLGe","set_current_test":null}
[2018-08-23T14:10:04.0239 CEST] [debug] awaiting death of testpid 4229
[2018-08-23T14:10:04.0241 CEST] [debug] test process exited: 4229
[2018-08-23T14:10:04.0241 CEST] [debug] awaiting death of commands process
[2018-08-23T14:10:04.0242 CEST] [debug] commands process exited: 4228
[2018-08-23T14:10:04.0242 CEST] [debug] isotovideo done
[2018-08-23T14:10:04.0350 CEST] [debug] Connection to root@openqaw5-xen.qa.suse.de established
[2018-08-23T14:10:04.0565 CEST] [debug] Command's stderr:
error: failed to get domain 'openQA-SUT-21'
error: Domain not found
Updated by michalnowak over 5 years ago
The actual error is:
[2018-08-23T14:10:04.0173 CEST] [debug] Command's stderr:
rsync: link_stat "/var/lib/openqa/share/factory/hdd/SLES-12-SP4-x86_64-Build0342@svirt-xen-pv-gnome.qcow2" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1165) [sender=3.1.0]
Currently no SLES-12-SP4-x86_64-Build*@svirt-xen-pv-gnome.qcow2 images are on openqa.suse.de, so the rsync fails. Wait for create_hdd_gnome@svirt-xen-pv in SLES 12 SP4 Functional.
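Whether any matching image is present can be checked before cloning a job. The following is only a minimal sketch: `find_hdd_assets` is a hypothetical helper, not part of openQA or os-autoinst, and the demo runs against a temporary directory rather than the real share under /var/lib/openqa/share/factory/hdd.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);

# Hypothetical helper: return all HDD images in $hdd_dir whose name matches
# the pattern from the rsync error, for any build number.
sub find_hdd_assets {
    my ($hdd_dir) = @_;
    return bsd_glob("$hdd_dir/SLES-12-SP4-x86_64-Build*\@svirt-xen-pv-gnome.qcow2");
}

# Demo on a throwaway directory (assumption: stands in for the asset share).
my $dir   = tempdir(CLEANUP => 1);
my @found = find_hdd_assets($dir);
print "before: ", scalar(@found), " image(s)\n";

# Create an empty file named like a published image and look again.
open(my $fh, '>', "$dir/SLES-12-SP4-x86_64-Build0352\@svirt-xen-pv-gnome.qcow2")
  or die "cannot create file: $!";
close $fh;
@found = find_hdd_assets($dir);
print "after: ", scalar(@found), " image(s)\n";
```

An empty result would mean the clone is expected to fail in the same way until the HDD creation job has published a new image.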
Updated by zluo over 5 years ago
Since I have the ISO and qcow2 image saved on my local openQA machine, I tried to assign them to the cloned test via ISO and HDD_1, but it still has the same problem and requires the qcow2 image.
Updated by michalnowak over 5 years ago
The local storage matters very little to svirt; the assets on the host and on openqa.suse.de's NFS are what really matter. See: https://github.com/os-autoinst/os-autoinst/blob/master/consoles/sshVirtsh.pm#L328.
Updated by zluo over 5 years ago
doesn't show the qcow2 image. This test ran on osd yesterday. Why is the qcow2 image already gone?
Updated by mgriessmeier over 5 years ago
zluo wrote:
doesn't show the qcow2 image. This test ran on osd yesterday. Why is the qcow2 image already gone?
It got removed ~3 hours after the job was finished... I'll ask santi about how long we keep images here.
[2018-08-23T22:23:31.0039 CEST] [info] GRU: removing /var/lib/openqa/share/factory/hdd/SLES-12-SP4-x86_64-Build0352@svirt-xen-pv-gnome.qcow2
I will restart the creation job.
Updated by michalnowak over 5 years ago
Thanks Matthias. That HDD creation job just finished; @zluo should be able to use the build 352 Xen PV image.
Updated by zluo over 5 years ago
now I can clone the test from osd :)
Updated by zluo over 5 years ago
http://e13.suse.de/tests/7636#step/boot_to_desktop/5
opensusebasetest.pm:
if (wait_serial qr/Reached target Shutdown/) {
    record_info 'shutdown', 'At least we reached target Shutdown';
}
is working here as well, and I added a check for 'Startup finished':
if (wait_serial qr/Startup finished/) {
    record_info 'Startup', 'At least system starts up successfully';
}
Updated by zluo over 5 years ago
http://e13.suse.de/tests/7650#step/boot_to_desktop/5
works as before for boot_to_desktop (non svirt_xen)
Updated by zluo over 5 years ago
http://e13.suse.de/tests/7652#step/boot_to_desktop/6
shows post_fail_hook record_info is working:
sub post_fail_hook {
    my ($self) = @_;
    return if testapi::is_serial_terminal();    # in case it is VIRTIO_CONSOLE=1 nothing below makes sense
    # just output error if selected program doesn't exist instead of collecting all logs
    # set current variables in x11_start_program
    if (get_var('IN_X11_START_PROGRAM')) {
        my $program = get_var('IN_X11_START_PROGRAM');
        select_console 'log-console';
        my $r = script_run "which $program";
        if ($r != 0) {
            record_info("no $program", "Could not find '$program' on the system", result => 'fail') && die "$program does not exist on the system";
        }
    }
    return unless ($self->{in_boot_desktop} || $self->{in_wait_boot});
    if ((get_var('BACKEND') // '') eq 'svirt') {
        record_info('Startup', 'At least Startup is finished.') if (wait_serial qr/Startup finished/);
    }
    elsif (wait_serial qr/Reached target Shutdown/) {
        record_info 'shutdown', 'At least we reached target Shutdown';
    }
}
Updated by zluo over 5 years ago
http://e13.suse.de/tests/7708#step/user_defined_snapshot/21
user_defined_snapshot shows the record_info for shutdown...
Updated by zluo over 5 years ago
Updated by zluo over 5 years ago
PR updated:
sub post_fail_hook {
    my ($self) = @_;
    return if testapi::is_serial_terminal();    # in case it is VIRTIO_CONSOLE=1 nothing below makes sense
    # just output error if selected program doesn't exist instead of collecting all logs
    # set current variables in x11_start_program
    if (get_var('IN_X11_START_PROGRAM')) {
        my $program = get_var('IN_X11_START_PROGRAM');
        select_console 'log-console';
        my $r = script_run "which $program";
        if ($r != 0) {
            record_info("no $program", "Could not find '$program' on the system", result => 'fail') && die "$program does not exist on the system";
        }
    }
    return unless ($self->{in_wait_boot} || $self->{in_boot_desktop});
    if ($self->{in_wait_boot}) {
        record_info('shutdown', 'At least we reached target Shutdown') if (wait_serial 'Reached target Shutdown');
    }
    elsif ($self->{in_boot_desktop}) {
        record_info('Startup', 'At least Startup is finished.') if (wait_serial 'Startup finished');
    }
    # In case the system is stuck in shutting down or during boot up, press
    # 'esc' just in case the plymouth splash screen is shown and we can not
    # see any interesting console logs.
    send_key 'esc';
    save_screenshot;
}
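The flag check in this hook can be looked at in isolation, without a running SUT. The following is only a sketch under assumptions: `pick_boot_marker` is a hypothetical helper that does not exist in the test distribution, and it only mirrors the if/elsif order above, where in_wait_boot is checked before in_boot_desktop.

```perl
use strict;
use warnings;

# Hypothetical helper: given the test object's flags, return which serial
# marker the hook would wait for, or nothing if no boot phase is flagged.
sub pick_boot_marker {
    my ($self) = @_;
    # in_wait_boot is checked first, as in the if/elsif chain of the hook.
    if ($self->{in_wait_boot}) {
        return {title => 'shutdown', pattern => 'Reached target Shutdown'};
    }
    if ($self->{in_boot_desktop}) {
        return {title => 'Startup', pattern => 'Startup finished'};
    }
    return;    # neither flag set: the hook returns early
}

# Print the decision for each combination of flags used in the hook.
for my $flags ({in_wait_boot => 1}, {in_boot_desktop => 1}, {}) {
    my $marker = pick_boot_marker($flags);
    print $marker
      ? "$marker->{title}: $marker->{pattern}\n"
      : "no boot phase flagged\n";
}
```

If both flags are set, only the shutdown marker is waited for, so a test that sets in_boot_desktop should make sure in_wait_boot is not left set from an earlier step.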
Updated by mgriessmeier over 5 years ago
- Due date changed from 2018-08-28 to 2018-09-11
Updated by zluo over 5 years ago
- Status changed from In Progress to Resolved
PR merged. Since this is a sporadic issue, I am setting it to resolved for now.
Updated by zluo over 5 years ago
I will of course keep checking for this on osd when I get the chance, even though it happens really seldom...