action #41960

[sle] test fails in installation: Grub installation failed

Added by michalnowak over 1 year ago. Updated 10 months ago.

Status: Resolved
Start date: 03/10/2018
Priority: Normal
Due date:
Assignee: Julie_CAO
% Done: 100%
Category: Bugs in existing tests
Target version: -
Difficulty:
Duration:

Description

Observation

openQA test in scenario sle-12-SP4-Server-DVD-x86_64-prj2_host_upgrade_sles11sp4_to_developing_kvm@64bit-ipmi fails in
installation

Grub installation failed during the post-installation steps. However, there is not enough information to say more; manual intervention was required for that.

History

#1 Updated by okurz about 1 year ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: prj2_host_upgrade_sles11sp4_to_developing_kvm
https://openqa.suse.de/tests/2441567

#2 Updated by okurz 12 months ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: prj4_guest_upgrade_sles11sp4_on_sles11sp4-kvm
https://openqa.suse.de/tests/2493467

#3 Updated by xlai 12 months ago

  • Assignee set to leli

Actually, all sle11sp4 host installations fail the same way; see https://openqa.nue.suse.com/tests/2493463.

According to the job history, there was a successful run seven days ago, on Feb 19. I checked the commit history; the following two commits are related:

commit eb8c53550bbb3d33314bd6fae7abd8bef91c3cd4
Author: Michal Nowak <mnowak@suse.com>
Date: Wed Feb 20 10:23:19 2019 +0100

Replace wait_boot_on_local_disk

Regression from a42ed584bd400b0a37b6cb6df01b124cbb222edc.

Fails here:
* https://openqa.suse.de/tests/2478650#step/boot_to_desktop/3
* https://openqa.suse.de/tests/overview?distri=sle&version=15-SP1&build=173.1&groupid=111

Validation run: http://nilgiri.suse.cz/tests/300

commit a42ed584bd400b0a37b6cb6df01b124cbb222edc
Author: lemon.li <leli@suse.com>
Date: Wed Feb 20 10:55:13 2019 +0800

Boot from local disk on installation on aarch64

Although Michal already fixed one regression caused by a42ed584bd400b0a37b6cb6df01b124cbb222edc, the root cause is still there.

Lemon, could you please help fix this issue? It blocks the virtualization beta4 milestone test on sle11sp4 host autoyast installation.

#4 Updated by leli 12 months ago

  • Assignee changed from leli to xlai

From the log:

#################
[2019-02-22T13:58:48.711 CET] [debug] >>> testapi::_handle_found_needle: found reboot-after-installation-by-autoyast-sle11sp4-20170724, similarity 1.00 @ 121/353
[2019-02-22T13:58:48.734 CET] [debug] IPMI:
[2019-02-22T13:58:48.735 CET] [debug] /var/lib/openqa/cache/openqa.suse.de/tests/sle/tests/autoyast/installation.pm:252 called opensusebasetest::wait_boot
[2019-02-22T13:58:48.735 CET] [debug] <<< testapi::select_console(testapi_console='sol', await_console=0)
/usr/lib/os-autoinst/consoles/vnc_base.pm:58:{
'hostname' => 'localhost',
'ikvm' => 0,
'port' => 44770
}
XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":48227"
after 15738 requests (15581 known processed) with 0 events remaining.
xterm: fatal IO error 11 (Resource temporarily unavailable) or KillClient on X server ":48227"
[2019-02-22T13:58:49.750 CET] [debug] Driver backend collected unknown process with pid 199683 and exit status: 1
[2019-02-22T13:58:49.751 CET] [debug] Driver backend collected unknown process with pid 199685 and exit status: 84
[2019-02-22T13:58:49.751 CET] [debug] Driver backend collected unknown process with pid 199695 and exit status: 0
[2019-02-22T13:58:49.878 CET] [debug] Connected to Xvnc - PID 205345
icewm PID is 205374
#####################

I compared the failed log with a previous successful log. The job failed because it lost time on the IO error, so it could no longer match the needle 'reboot-after-installation'.
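
A comparison like this can be partly automated by measuring the time between consecutive timestamped log lines. The following is only a minimal sketch: it assumes the timestamp format seen in the excerpt above, and the function name largest_gaps and the sample list are made up for illustration. It is not part of openQA or os-autoinst.

```python
import re
from datetime import datetime

# Matches the timestamp prefix used in the excerpt, e.g. "[2019-02-22T13:58:48.711 CET]".
TIMESTAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}) [A-Z]+\]")


def largest_gaps(lines, top=3):
    """Return the `top` largest (seconds, previous_line, next_line) gaps."""
    stamped = []
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:  # lines without a timestamp (e.g. the X server errors) are skipped
            when = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f")
            stamped.append((when, line))
    gaps = [
        ((later[0] - earlier[0]).total_seconds(), earlier[1], later[1])
        for earlier, later in zip(stamped, stamped[1:])
    ]
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]


# A few lines copied from the excerpt above (hypothetical variable name).
sample = [
    "[2019-02-22T13:58:48.735 CET] [debug] <<< testapi::select_console(testapi_console='sol', await_console=0)",
    'XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":48227"',
    "[2019-02-22T13:58:49.750 CET] [debug] Driver backend collected unknown process with pid 199683 and exit status: 1",
    "[2019-02-22T13:58:49.751 CET] [debug] Driver backend collected unknown process with pid 199685 and exit status: 84",
]

for seconds, before, after in largest_gaps(sample, top=1):
    print(f"{seconds:.3f}s between:\n  {before}\n  {after}")
```

Running the same function over the failed and the successful log and comparing the largest gaps would show where the failed job spent extra time.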
What I think:
1. I can't find anything showing that it is related to my push.
2. The IO error should be checked first.
3. I guess the test flow is not stable: it expects to match the second 'reboot-after-installation' (stage 2) very shortly after matching the first 'reboot-after-installation' (stage 1). (From the video, we can see that the system goes from the needle 'reboot-after-installation' to 'grub' very quickly.)
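
The timing problem suspected in point 3 can be illustrated with a small simulation. This is only a sketch under assumed numbers: the screen durations, the helper first_match, and the idea of accepting either screen are hypothetical and do not describe the actual os-autoinst needle matching or the fix that was eventually submitted.

```python
def first_match(timeline, watch_for, start, timeout, interval=1.0):
    """Poll a simulated screen timeline; return (time, screen) of the first match, or None.

    timeline:  list of (screen_name, duration_seconds), shown one after another
    watch_for: set of screen names the watcher accepts
    start:     time at which polling begins (a delay such as the IO error pushes it later)
    """
    def screen_at(t):
        elapsed = 0.0
        for name, duration in timeline:
            elapsed += duration
            if t < elapsed:
                return name
        return timeline[-1][0]  # the last screen stays up

    t = start
    while t <= start + timeout:
        screen = screen_at(t)
        if screen in watch_for:
            return t, screen
        t += interval
    return None


# Hypothetical durations: the reboot screen is visible for 2 s, then grub takes over.
timeline = [("installing", 10.0), ("reboot-after-installation", 2.0), ("grub", 30.0)]

# Watching only for the transient screen: on time it matches, slightly late it is gone.
print(first_match(timeline, {"reboot-after-installation"}, start=10.0, timeout=20.0))
print(first_match(timeline, {"reboot-after-installation"}, start=12.5, timeout=20.0))
# Accepting either screen tolerates the delay.
print(first_match(timeline, {"reboot-after-installation", "grub"}, start=12.5, timeout=20.0))
```

In this model a watcher that starts even half a second after the transient screen disappears never sees it, which matches the behaviour described above; a watcher that also accepts the following screen is not affected by the delay.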

#5 Updated by xlai 12 months ago

I checked the code again. This should not be related to Lemon's change, sorry :(.

According to git log tests/autoyast/installation.pm, it seems to have been introduced by:

commit 1c0adf0ef1778b5b330d12393378971525eaf4ab
Author: Jan Baier <jbaier@suse.cz>
Date: Mon Feb 18 19:07:25 2019 +0100

Utilize autoyast for baremetal testing

Speed up installation over IPMI by utilizing autoyast installation
procedure with QAM profiles. This should also fix issue with QR images
over svirt backend for s390x architecture.

#6 Updated by xlai 12 months ago

  • Assignee changed from xlai to Julie_CAO

#7 Updated by xlai 12 months ago

I talked with Julie offline; she will handle it with high priority, since it blocks the beta4 milestone test.

#8 Updated by Julie_CAO 12 months ago

I submitted a PR to fix it:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/6920

My local verification has not finished yet. I'll update the status.

#9 Updated by Julie_CAO 12 months ago

Tests passed; waiting for merge.

#10 Updated by Julie_CAO 12 months ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 100

PR merged.

Tests on openQA failed due to other issues. Closing this ticket.

#11 Updated by Julie_CAO 12 months ago

  • Status changed from In Progress to Resolved

#12 Updated by okurz 11 months ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: gi-guest_developing-on-host_sles11sp4-kvm
https://openqa.suse.de/tests/2568528

#13 Updated by okurz 11 months ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: prj2_host_upgrade_sles11sp4_to_developing_xen
https://openqa.suse.de/tests/2756048

#14 Updated by okurz 10 months ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: prj4_guest_upgrade_sles11sp4_on_sles11sp4-xen
https://openqa.suse.de/tests/2823849
