action #38963

closed

[functional][y][fast] qemu backend rewrite: upgrade not possible anymore in many scenarios

Added by dimstar over 6 years ago. Updated almost 3 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Bugs in existing tests
Target version:
SUSE QA (private) - Milestone 18
Start date:
2018-11-09
Due date:
% Done:

100%

Estimated time:
(Total: 3.00 h)
Difficulty:

Description

Observation

openQA test in scenario opensuse-Tumbleweed-NET-x86_64-zdup-13.1-gnome@64bit fails in
yast2_bootloader

Reproducible

Fails since (at least) Build 20180727

Expected result

Last good: 20180726 (or more recent)

Further details

Always latest result in this scenario: latest

The rewrite of the qemu backend has changed various settings, and apparently also how the disks are attached/addressed.

In case of upgrades from 13.1 to Tumbleweed, the disk is no longer seen as an updatable disk (or, in the case of zdup, as the drive configured for the bootloader).

Potentially we can fix up the 13.1 base HDD to have valid fstab entries that still work after the upgrade.
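One way to do that fixup would be to rewrite the by-id based device fields in the image's fstab to stable UUID references before publishing it. A minimal sketch of the idea (the `rewrite_fstab` helper and the UUID value are illustrative; on a real image the mapping would be built from `blkid` output):

```python
def rewrite_fstab(fstab, uuid_for):
    """Replace /dev/disk/by-id/... device fields with UUID=... entries.

    `uuid_for` maps by-id paths to filesystem UUIDs; on a real image it
    would be derived from `blkid` output. Field separators are
    normalized to tabs; comments and unknown devices pass through.
    """
    out = []
    for line in fstab.splitlines():
        fields = line.split()
        if fields and fields[0] in uuid_for:
            fields[0] = "UUID=" + uuid_for[fields[0]]
            line = "\t".join(fields)
        out.append(line)
    return "\n".join(out)

# Illustrative fstab entry and placeholder UUID:
entry = "/dev/disk/by-id/virtio-1-part2 / ext4 defaults 1 1"
fixed = rewrite_fstab(
    entry,
    {"/dev/disk/by-id/virtio-1-part2":
     "0a1b2c3d-0000-0000-0000-000000000000"})
```

UUID references survive a changed drive serial or controller, which is exactly what breaks the by-id paths here.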


Subtasks 1 (0 open, 1 closed)

action #43622: [migration] test fails in boot_to_desktop when migrating from sle11sp4 to sles 15sp1 (Resolved, leli, 2018-11-09)

Related issues 1 (0 open, 1 closed)

Related to openQA Project (public) - action #38813: Qemu backend rewrite fallout (Resolved, rpalethorpe, 2018-07-25)

Actions #1

Updated by dimstar over 6 years ago

https://openqa.opensuse.org/tests/716475 is an update test (offline, using the DVD to boot) where the disk is not found at all (YaST reports that the disk names do not match what is in fstab).

Actions #2

Updated by okurz over 6 years ago

  • Assignee set to rpalethorpe

As discussed with rpalethorpe in person

Actions #3

Updated by rpalethorpe over 6 years ago

2018-07-30 07:42:07 <1> install(3413) [libstorage] SystemCmd.cc:416 stopwatch 0.015783s for "/sbin/udevadm info '/dev/disk/by-id/virtio-1-part2'"
2018-07-30 07:42:07 <1> install(3413) [libstorage] SystemCmd.cc:436 system() Returns:4
2018-07-30 07:42:07 <1> install(3413) [libstorage] SystemCmd.cc:678 stderr:Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.
2018-07-30 07:42:07 <3> install(3413) [libstorage] SystemCmd.cc:97 THROW:    command '/sbin/udevadm info '/dev/disk/by-id/virtio-1-part2'' failed:
2018-07-30 07:42:07 <3> install(3413) [libstorage] SystemCmd.cc:97 
2018-07-30 07:42:07 <3> install(3413) [libstorage] SystemCmd.cc:97 stderr:
2018-07-30 07:42:07 <3> install(3413) [libstorage] SystemCmd.cc:97 Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.
2018-07-30 07:42:07 <3> install(3413) [libstorage] SystemCmd.cc:97 
2018-07-30 07:42:07 <3> install(3413) [libstorage] SystemCmd.cc:97 exit code:
2018-07-30 07:42:07 <3> install(3413) [libstorage] SystemCmd.cc:97 4
2018-07-30 07:42:07 <3> install(3413) [libstorage] BlkDeviceImpl.cc:346 CAUGHT:  command '/sbin/udevadm info '/dev/disk/by-id/virtio-1-part2'' failed:
2018-07-30 07:42:07 <3> install(3413) [libstorage] BlkDeviceImpl.cc:346 
2018-07-30 07:42:07 <3> install(3413) [libstorage] BlkDeviceImpl.cc:346 stderr:
2018-07-30 07:42:07 <3> install(3413) [libstorage] BlkDeviceImpl.cc:346 Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.
2018-07-30 07:42:07 <3> install(3413) [libstorage] BlkDeviceImpl.cc:346 
2018-07-30 07:42:07 <3> install(3413) [libstorage] BlkDeviceImpl.cc:346 exit code:
2018-07-30 07:42:07 <3> install(3413) [libstorage] BlkDeviceImpl.cc:346 4
2018-07-30 07:42:07 <3> install(3413) [libstorage] BlkDeviceImpl.cc:349 THROW:   device not found, name:/dev/disk/by-id/virtio-1-part2

The correct path is probably /dev/disk/by-id/virtio-hd0-part2: hd0 is the new drive serial number and 1 is the old one. I don't think libstorage should rely on the drive ID staying the same; someone could clone a disk or switch controllers, which would change the drive serial number or transport (virtio).

So I think this is a product bug. However, a workaround would be simply to regenerate the base image in openQA.
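For context, the /dev/disk/by-id/virtio-* links are generated by udev from the virtio drive's serial number, so every stored path breaks when the serial changes. A small sketch of the naming scheme (the helper is hypothetical, not part of os-autoinst or libstorage; the serial values are taken from the log above):

```python
def virtio_by_id(serial, part=None):
    """Return the by-id link udev creates for a virtio-blk drive.

    udev names the link "virtio-<serial>" and appends "-partN" for
    partitions, so a changed serial invalidates every stored path.
    """
    name = "/dev/disk/by-id/virtio-" + serial
    return name if part is None else "%s-part%d" % (name, part)

# Path baked into the 13.1 image's fstab (old serial "1"):
old = virtio_by_id("1", 2)    # /dev/disk/by-id/virtio-1-part2
# Path that actually exists after the backend rewrite (serial "hd0"):
new = virtio_by_id("hd0", 2)  # /dev/disk/by-id/virtio-hd0-part2
```

libstorage looks up the old path, which no longer exists, producing the "device not found" THROW in the log.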

Actions #4

Updated by rpalethorpe over 6 years ago

Actions #5

Updated by okurz over 6 years ago

@rpalethorpe hi, how do you plan to continue with this? I am also not sure whether it is a product issue, but it is still reproducible, e.g. see the latest failure in https://openqa.opensuse.org/tests/720032 . Do you plan to report a bug in Bugzilla? In that case we should also think about a workaround in the test so the test module does not fail. Maybe it is worth cross-checking the test scenario with an old os-autoinst version that does not include your changes, to see how it behaves?

Actions #6

Updated by rpalethorpe over 6 years ago

The workaround is to just recreate the 13.1 images, but possibly we don't want to work around it.

Ideally someone who deals with YaST bugs on a regular basis should take this over.

Maybe it is worth cross-checking the test scenario with an old os-autoinst version that does not include your changes, to see how it behaves?

I am pretty certain the problem is in YaST or libstorage; my changes just uncovered it. This should be reproducible by manually changing a drive serial number in virt-manager, with no need for custom os-autoinst versions.

Actions #7

Updated by okurz over 6 years ago

  • Subject changed from qemu backend rewrite: 13.1 upgrade not possible anymore to [functional][y][fast] qemu backend rewrite: 13.1 upgrade not possible anymore
  • Due date set to 2018-08-14
  • Assignee deleted (rpalethorpe)
  • Priority changed from Normal to High
  • Target version set to Milestone 18

hm, I see. Let's ask the QSF y-team then.

@y-team would you be able to take this over? rpalethorpe should be ready to help with questions.

Actions #8

Updated by riafarov over 6 years ago

Actions #9

Updated by riafarov over 6 years ago

  • Subject changed from [functional][y][fast] qemu backend rewrite: 13.1 upgrade not possible anymore to [functional][y][fast] qemu backend rewrite: upgrade not possible anymore in many scenarios
Actions #10

Updated by riafarov over 6 years ago

rpalethorpe wrote:

The workaround is to just recreate the 13.1 images, but possibly we don't want to work around it.

Ideally someone who deals with YaST bugs on a regular basis should take this over.

Maybe it is worth cross-checking the test scenario with an old os-autoinst version that does not include your changes, to see how it behaves?

I am pretty certain the problem is in YaST or libstorage; my changes just uncovered it. This should be reproducible by manually changing a drive serial number in virt-manager, with no need for custom os-autoinst versions.

Just to comment on this part: it has nothing to do with YaST. The udev link has changed from virtio-1 to virtio-hd0 because the disk is now attached differently. One can question whether udev, or maybe qemu, should behave differently.
We created the image with one path and then expect it to work using a different one.
The openSUSE scenario seems to be slightly different, but YaST acts on what is in the system, e.g. fstab; whoever makes such a change has to ensure that the system can still boot properly.

Actions #11

Updated by rpalethorpe over 6 years ago

  • Assignee set to rpalethorpe

OK, so the problem is that a variable ID is used in the fstab file, which is then used by YaST. There are probably some heuristics that could guess which mount entries correlate to which drive, but that is not a quick fix. AFAICT this is only a problem on SLE 11, openSUSE 13.1 and older; SLE 12 appears to use UUIDs, at least on the VMs I have seen.

So I will try adding an option to os-autoinst that allows the drive serial number to be set.
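On the qemu command line, the serial of a virtio-blk device can be pinned with the serial= device property, so such an option would mainly need to thread a value through to the generated -device argument. A sketch of what the generated arguments could look like (the helper and its naming are assumptions for illustration, not the actual os-autoinst implementation):

```python
def qemu_drive_args(index, image, serial=None):
    """Build qemu -drive/-device arguments for one virtio-blk disk.

    If `serial` is given, it is pinned via the device's serial=
    property, which keeps the /dev/disk/by-id/virtio-<serial> links
    stable across backend changes.
    """
    drive_id = "hd%d" % index
    dev = "virtio-blk-pci,drive=" + drive_id
    if serial is not None:
        dev += ",serial=" + serial
    return [
        "-drive", "file=%s,if=none,id=%s,format=qcow2" % (image, drive_id),
        "-device", dev,
    ]

# Pinning the old serial "1" keeps the 13.1 image's fstab paths valid:
args = qemu_drive_args(0, "opensuse-13.1.qcow2", serial="1")
```

With serial "1" the guest again sees /dev/disk/by-id/virtio-1-part*, matching what the old images recorded in fstab.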

Actions #13

Updated by riafarov over 6 years ago

  • Status changed from New to In Progress
  • Assignee changed from rpalethorpe to riafarov

Richie's change is on osd, so we can fix migration failures. I will take care of it.

Actions #14

Updated by riafarov over 6 years ago

  • Status changed from In Progress to Feedback

I've updated all affected test suites we are aware of by adding HDDSERIAL=1. We don't know when it will be deployed to o3, but we should see whether it is fixed on osd for the migration scenarios from SLE 11.

Actions #15

Updated by riafarov over 6 years ago

  • Status changed from Feedback to Blocked

Issue resolved on osd: https://openqa.suse.de/tests/overview?distri=sle&version=12-SP4&build=0339&groupid=151
For o3 we are blocked on the deployment of the new os-autoinst version.

Actions #16

Updated by riafarov over 6 years ago

  • Due date changed from 2018-08-14 to 2018-08-28
Actions #18

Updated by okurz over 6 years ago

Great! Thank you

Actions #19

Updated by szarate about 6 years ago

openQA test in scenario sle-15-SP1-Installer-DVD-x86_64-autoupgrade_sles11sp4_media@64bit fails in
boot_to_desktop
