action #38963

[functional][y][fast] qemu backend rewrite: upgrade not possible anymore in many scenarios

Added by dimstar almost 2 years ago. Updated over 1 year ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Bugs in existing tests
Target version: SUSE QA tests - Milestone 18
Start date: 2018-11-09
Due date:
% Done: 0%
Estimated time: (Total: 0.00 h)
Difficulty:
Duration:

Description

Observation

openQA test in scenario opensuse-Tumbleweed-NET-x86_64-zdup-13.1-gnome@64bit fails in yast2_bootloader

Reproducible

Fails since (at least) Build 20180727

Expected result

Last good: 20180726 (or more recent)

Further details

Always latest result in this scenario: latest

The rewrite of the qemu backend has changed various settings, and apparently also how the disks are attached/addressed.

In the case of upgrades from 13.1 to Tumbleweed, the disk is no longer seen as an updatable disk (or, in the case of zdup, as the drive configured for the bootloader).

Potentially we can fix up the 13.1 base HDD so that it has fstab entries that remain valid after the upgrade.


Subtasks

action #43622: [migration] test fails in boot_to_desktop when migrating from sle11sp4 to leap15sp1 (New)


Related issues

Related to openQA Project - action #38813: Qemu backend rewrite fallout (Resolved, 2018-07-25)

History

#1 Updated by dimstar almost 2 years ago

https://openqa.opensuse.org/tests/716475 is an update test (offline, using the DVD to boot) where the disk is not found at all (YaST reports that the disk names do not match what is in fstab).

#2 Updated by okurz almost 2 years ago

  • Assignee set to rpalethorpe

As discussed with rpalethorpe in person

#3 Updated by rpalethorpe almost 2 years ago

2018-07-30 07:42:07 <1> install(3413) [libstorage] SystemCmd.cc:416 stopwatch 0.015783s for "/sbin/udevadm info '/dev/disk/by-id/virtio-1-part2'"
2018-07-30 07:42:07 <1> install(3413) [libstorage] SystemCmd.cc:436 system() Returns:4
2018-07-30 07:42:07 <1> install(3413) [libstorage] SystemCmd.cc:678 stderr:Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.
2018-07-30 07:42:07 <3> install(3413) [libstorage] SystemCmd.cc:97 THROW:    command '/sbin/udevadm info '/dev/disk/by-id/virtio-1-part2'' failed:
2018-07-30 07:42:07 <3> install(3413) [libstorage] SystemCmd.cc:97 
2018-07-30 07:42:07 <3> install(3413) [libstorage] SystemCmd.cc:97 stderr:
2018-07-30 07:42:07 <3> install(3413) [libstorage] SystemCmd.cc:97 Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.
2018-07-30 07:42:07 <3> install(3413) [libstorage] SystemCmd.cc:97 
2018-07-30 07:42:07 <3> install(3413) [libstorage] SystemCmd.cc:97 exit code:
2018-07-30 07:42:07 <3> install(3413) [libstorage] SystemCmd.cc:97 4
2018-07-30 07:42:07 <3> install(3413) [libstorage] BlkDeviceImpl.cc:346 CAUGHT:  command '/sbin/udevadm info '/dev/disk/by-id/virtio-1-part2'' failed:
2018-07-30 07:42:07 <3> install(3413) [libstorage] BlkDeviceImpl.cc:346 
2018-07-30 07:42:07 <3> install(3413) [libstorage] BlkDeviceImpl.cc:346 stderr:
2018-07-30 07:42:07 <3> install(3413) [libstorage] BlkDeviceImpl.cc:346 Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.
2018-07-30 07:42:07 <3> install(3413) [libstorage] BlkDeviceImpl.cc:346 
2018-07-30 07:42:07 <3> install(3413) [libstorage] BlkDeviceImpl.cc:346 exit code:
2018-07-30 07:42:07 <3> install(3413) [libstorage] BlkDeviceImpl.cc:346 4
2018-07-30 07:42:07 <3> install(3413) [libstorage] BlkDeviceImpl.cc:349 THROW:   device not found, name:/dev/disk/by-id/virtio-1-part2

The correct path is probably /dev/disk/by-id/virtio-hd0-part2: hd0 is the new drive serial number and 1 is the old one. I don't think libstorage should rely on the drive ID staying the same. Someone could clone a disk or switch controllers, either of which could change the drive serial number or the transport (virtio).

So I think this is a product bug. However, a workaround would be to simply regenerate the base image in openQA.
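
To make the mismatch concrete, here is a minimal sketch of how the by-id link name follows from the drive serial. It assumes the naming pattern visible in the log above (virtio-<serial>-part<N>); the function name is hypothetical and not part of libstorage or os-autoinst.

```python
from typing import Optional


def virtio_by_id_path(serial: str, partition: Optional[int] = None) -> str:
    """Build the /dev/disk/by-id link name for a virtio disk with the given serial."""
    path = f"/dev/disk/by-id/virtio-{serial}"
    if partition is not None:
        # Partitions get a "-part<N>" suffix appended to the disk link name.
        path += f"-part{partition}"
    return path


# Path stored in the old 13.1 image (serial "1") vs. the path that exists
# after the backend rewrite (serial "hd0").
old_path = virtio_by_id_path("1", 2)
new_path = virtio_by_id_path("hd0", 2)
```

Since the two strings differ only in the serial component, any fstab entry written with the old serial points at a link that no longer exists.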

#4 Updated by rpalethorpe almost 2 years ago

#5 Updated by okurz almost 2 years ago

rpalethorpe hi, how do you plan to continue with this? I am also not sure whether it is a product issue or not, but it is still reproducible at least, e.g. see the latest failure in https://openqa.opensuse.org/tests/720032 . Do you plan to report a bug in bugzilla? In that case we should also think about a workaround in the test so that the test module does not fail. Maybe it is worth crosschecking the test scenario with a custom/old os-autoinst version which does not include your changes to see how it behaves?

#6 Updated by rpalethorpe almost 2 years ago

The workaround is to just recreate the 13.1 images, but possibly we don't want to work around it.

Ideally someone who deals with Yast bugs on a regular basis should take this over.

Maybe it is worth crosschecking the test scenario with a custom/old os-autoinst version which does not include your changes to see how it behaves?

I am pretty certain the problem is with Yast or libstorage; my changes just uncovered it. This should be reproducible by manually changing a drive serial number in virt manager, so there is no need for custom autoinst versions.

#7 Updated by okurz almost 2 years ago

  • Subject changed from qemu backend rewrite: 13.1 upgrade not possible anymore to [functional][y][fast] qemu backend rewrite: 13.1 upgrade not possible anymore
  • Due date set to 2018-08-14
  • Assignee deleted (rpalethorpe)
  • Priority changed from Normal to High
  • Target version set to Milestone 18

hm, I see. Let's ask the QSF y-team then.

@y-team would you be able to take this over? rpalethorpe should be ready to help with questions.

#9 Updated by riafarov almost 2 years ago

  • Subject changed from [functional][y][fast] qemu backend rewrite: 13.1 upgrade not possible anymore to [functional][y][fast] qemu backend rewrite: upgrade not possible anymore in many scenarios

#10 Updated by riafarov almost 2 years ago

rpalethorpe wrote:

The workaround is to just recreate the 13.1 images, but possibly we don't want to work around it.

Ideally someone who deals with Yast bugs on a regular basis should take this over.

Maybe it is worth crosschecking the test scenario with a custom/old os-autoinst version which does not include your changes to see how it behaves?

I am pretty certain the problem is with Yast or libstorage; my changes just uncovered it. This should be reproducible by manually changing a drive serial number in virt manager, so there is no need for custom autoinst versions.

Just to comment on this part: it has nothing to do with YaST, because the udev link has apparently changed from virtio-1 to virtio-hd0, as we basically attached the disk differently. One can question whether udev, or maybe qemu, should behave differently.
We created the image with one path and then expect it to work using a different one.
The openSUSE scenario seems to be slightly different, but YaST takes what is in the system, e.g. fstab, and if someone makes such a change, she or he has to ensure that the system can still boot properly.

#11 Updated by rpalethorpe almost 2 years ago

  • Assignee set to rpalethorpe

OK, so the problem is that a variable ID is used in the fstab file which is then used by Yast. There are probably some heuristics which can be used to guess which mount entries correlate to which drive, but it is not a quick fix. AFAICT this is only a problem on SLE11, Leap 13 and older. SLE12 appears to use UUIDs, at least on the VMs I have seen.

So I will try adding some option to os-autoinst which allows the drive serial number to be set.
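
As a rough illustration of which mount entries are affected, the sketch below scans fstab content for device fields that use a /dev/disk/by-id/ path, as opposed to UUID= entries. The helper name and the sample fstab are hypothetical and only mirror the paths seen in the log above; this is not how YaST or libstorage actually classify entries.

```python
from typing import List


def by_id_entries(fstab_text: str) -> List[str]:
    """Return the device fields of fstab entries that reference /dev/disk/by-id/."""
    devices = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        device = line.split()[0]
        if device.startswith("/dev/disk/by-id/"):
            devices.append(device)
    return devices


# Hypothetical fstab: two entries depend on the drive serial, one uses a UUID.
SAMPLE_FSTAB = """\
# <device> <mount point> <type> <options> <dump> <pass>
/dev/disk/by-id/virtio-1-part1 swap swap defaults 0 0
/dev/disk/by-id/virtio-1-part2 / ext4 acl,user_xattr 1 1
UUID=0a1b2c3d-0000-0000-0000-000000000000 /home ext4 defaults 1 2
"""

affected = by_id_entries(SAMPLE_FSTAB)
```

Only the two virtio-1 entries are reported here; the UUID= entry does not depend on the drive serial.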

#13 Updated by riafarov almost 2 years ago

  • Status changed from New to In Progress
  • Assignee changed from rpalethorpe to riafarov

Richie's change is on osd, so we can fix migration failures. I will take care of it.

#14 Updated by riafarov almost 2 years ago

  • Status changed from In Progress to Feedback

I've updated all affected test suites we are aware of by adding HDDSERIAL=1. We don't know when the change will be deployed to o3, but we should see whether it fixes the migration scenarios from sle11 on osd.
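
For readers unfamiliar with the mechanism, here is a hedged sketch of how a setting like HDDSERIAL=1 could end up as the serial of a virtio disk on the QEMU command line. It is an assumption for illustration, not os-autoinst's actual code: the function, its parameters, and the image file name are hypothetical, and the default of reusing the drive ID as serial is taken only from the hd0 observation earlier in this ticket.

```python
from typing import List, Optional


def virtio_drive_args(image: str, drive_id: str = "hd0",
                      serial: Optional[str] = None) -> List[str]:
    """Build QEMU arguments for one virtio disk, optionally pinning its serial."""
    # Assumption: without an override, the drive ID doubles as the serial.
    effective_serial = serial if serial is not None else drive_id
    return [
        "-drive", f"file={image},if=none,id={drive_id},format=qcow2",
        "-device", f"virtio-blk-pci,drive={drive_id},serial={effective_serial}",
    ]


# Default serial (new behaviour) vs. serial pinned to the old value "1".
default_args = virtio_drive_args("base-13.1.qcow2")
pinned_args = virtio_drive_args("base-13.1.qcow2", serial="1")
```

With the serial pinned to 1, the guest should again see the virtio-1 links that the old images reference in fstab.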

#15 Updated by riafarov almost 2 years ago

  • Status changed from Feedback to Blocked

Issue resolved on OSD: https://openqa.suse.de/tests/overview?distri=sle&version=12-SP4&build=0339&groupid=151
For O3 we are blocked until the new version of os-autoinst is deployed.

#16 Updated by riafarov almost 2 years ago

  • Due date changed from 2018-08-14 to 2018-08-28

#18 Updated by okurz almost 2 years ago

Great! Thank you

#19 Updated by szarate over 1 year ago

openQA test in scenario sle-15-SP1-Installer-DVD-x86_64-autoupgrade_sles11sp4_media@64bit fails in boot_to_desktop
