action #41093

closed

[opensuse][functional][u] test fails in virt_install - default network not available anymore

Added by mlin7442 over 6 years ago. Updated almost 6 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
Bugs in existing tests
Target version:
SUSE QA (private) - Milestone 22
Start date:
2018-09-14
Due date:
2018-10-23
% Done:

0%

Estimated time:
Difficulty:

Description

Observation

openQA test in scenario opensuse-Tumbleweed-DVD-x86_64-virtualization@64bit fails in virt_install

Reproducible

Fails since (at least) Build 20180911

Expected result

Last good: 20180910 (or more recent)

Suggestions

  • Extend tests to gather the needed logs, see the mentioned bugs
  • Investigate whether we need a workaround that starts the corresponding network, potentially with record_soft_failure pointing to a corresponding bug
  • Fix the test

Further details

Always latest result in this scenario: latest


Related issues 1 (0 open, 1 closed)

Blocks openQA Tests (public) - action #45713: [functional][u] test for nested virtualization / qemu / typing over VNC (Resolved, dheidler, 2019-01-02)

Actions #1

Updated by ggardet_arm over 6 years ago

Fails also on aarch64.
The default network is not created. Bug or feature update?
We could add '--network none' to the virt-install command to get rid of the network configuration problem, if it is not due to a bug.
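
As a rough illustration of the '--network none' idea above, the sketch below rewrites a virt-install argument list so that it carries exactly one "--network none" option. The helper name force_network and the sample arguments are hypothetical and not taken from the openQA test code; only the '--network none' option itself comes from this comment.

```python
def force_network(argv, network="none"):
    """Return a copy of argv with any existing --network option replaced.

    Hypothetical helper for illustration; it handles both the
    "--network VALUE" and "--network=VALUE" spellings.
    """
    result = []
    skip_next = False
    for arg in argv:
        if skip_next:
            # This is the value that belonged to a bare "--network".
            skip_next = False
            continue
        if arg == "--network":
            skip_next = True
            continue
        if arg.startswith("--network="):
            continue
        result.append(arg)
    return result + ["--network", network]


# Sample arguments are placeholders, not the command used by the test.
cmd = force_network(
    ["virt-install", "--name", "demo", "--memory", "1024",
     "--network", "network=default", "--import",
     "--disk", "/var/lib/libvirt/images/demo.qcow2"]
)
```

A guest created this way has no network interface at all, so this only sidesteps the failure; it does not explain why the 'default' network is missing.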

Actions #2

Updated by ggardet_arm over 6 years ago

We should have a default NAT configuration which is in package 'libvirt-daemon-config-network'.

Actions #3

Updated by okurz over 6 years ago

  • Subject changed from test fails in virt_install to [opensuse] test fails in virt_install - default network not available anymore
  • Assignee set to aginies
  • Priority changed from Normal to High

So this is reproducibly failing now and it is pretty obvious that test changes have not caused this.

@aginies as "virtualization expert" as well as the test module maintainer in our tests, what is your judgement: Do we need to change something in tests or should we consider this as a product bug?

Actions #4

Updated by aginies over 6 years ago

okurz wrote:

@aginies as "virtualization expert" as well as the test module maintainer in our tests, what is your judgement: Do we need to change something in tests or should we consider this as a product bug?

The default network should be active by default, so I would say there is something wrong if that is not the case. Sounds like a bug.

Actions #5

Updated by mlin7442 over 6 years ago

Note that NetworkManager (NM), rather than wicked, is now the default network management for the desktop environments.

Actions #6

Updated by aginies over 6 years ago

mlin7442 wrote:

Note that NetworkManager (NM), rather than wicked, is now the default network management for the desktop environments.

NetworkManager? This is not supported at all.
Everything dealing with the hypervisor is on the SLES product, with wicked. NM is desktop only, and we don't support SLED being a hypervisor. So this is not a bug.

Actions #7

Updated by okurz over 6 years ago

Not sure I understand. You mention supported SLE cases which I understand but can we try to find a technical solution here please?

Actions #8

Updated by aginies over 6 years ago

okurz wrote:

Not sure I understand. You mention supported SLE cases which I understand but can we try to find a technical solution here please?

Create a network and use it:
/etc/libvirt/qemu/networks/testing_network.xml

<network>
  <name>testing_network</name>
  <uuid>{UUID}</uuid>
  <forward mode='nat'/>
  <bridge name='{BRIDGE}' stp='on' delay='0'/>
  <mac address='{NETMACHOST}'/>
  <domain name='{NETWORKNAME}'/>
  <ip address='{NETWORK}.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='{NETWORK}.128' end='{NETWORK}.254'/>
      <host mac="{MACA}" name="{NODENAME}" ip="{NETWORK}.101" />
      <!-- etc. -->
    </dhcp>
  </ip>
</network>

virsh net-autostart testing_network
virsh net-start testing_network
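
The template and commands above could be scripted roughly as follows. This is only a sketch under stated assumptions: the function names, the bridge name and the address prefix used in the example are hypothetical, the template is reduced to the elements needed for a NAT network with DHCP, and "virsh net-define" is used to register the XML (an assumption; the comment above places the file under /etc/libvirt/qemu/networks/ directly).

```python
import uuid
import xml.etree.ElementTree as ET

# Reduced version of the template from the comment; placeholders are
# filled in by render_network_xml() below.
NETWORK_TEMPLATE = """\
<network>
  <name>{name}</name>
  <uuid>{net_uuid}</uuid>
  <forward mode='nat'/>
  <bridge name='{bridge}' stp='on' delay='0'/>
  <ip address='{prefix}.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='{prefix}.128' end='{prefix}.254'/>
    </dhcp>
  </ip>
</network>
"""


def render_network_xml(name, bridge, prefix):
    """Fill in the template and check that the result is well-formed XML."""
    xml_text = NETWORK_TEMPLATE.format(
        name=name, net_uuid=uuid.uuid4(), bridge=bridge, prefix=prefix
    )
    ET.fromstring(xml_text)  # raises ParseError if the XML is malformed
    return xml_text


def virsh_commands(name, xml_path):
    """Commands that would define, autostart and start the network."""
    return [
        ["virsh", "net-define", xml_path],
        ["virsh", "net-autostart", name],
        ["virsh", "net-start", name],
    ]
```

On a host with libvirt, the rendered XML would be written to a file and each command list passed to subprocess.run(..., check=True); the sketch stops short of that so it can be read and tried without a hypervisor.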

Actions #9

Updated by mlin7442 over 6 years ago

Another 2 options perhaps:

1) boot from the textmode image
2) boot from the minimalX image (generic desktop), which still has wicked as the default network management

Actions #10

Updated by okurz over 6 years ago

  • Subject changed from [opensuse] test fails in virt_install - default network not available anymore to [opensuse][functional][u] test fails in virt_install - default network not available anymore

All in all I would describe this as a product regression caused by the decision to switch to NetworkManager by default, not a regression in the virtualization stack per se. The test suite is centered around graphical tools, so "textmode" is not really a feasible option. "minimalX" as a starting base is a good idea; we should try that. However, I still think there is something we should do about the default behaviour in the product. At least a popup box stating "NetworkManager detected, please try to define a network explicitly" or something like that could help.

Actions #11

Updated by okurz over 6 years ago

  • Due date set to 2018-10-23
  • Target version set to Milestone 20

@mlin7442 so let's try to share responsibility a bit more: sysrich repeats the mantra "you break it, you fix it!", so who changed the network manager to be default? Let's involve that person or them so that we can find out better what a common preferred approach should be :)

Actions #12

Updated by lnussel over 6 years ago

So what do people expect from virt-install? If I have a desktop do I install virt-manager and use the gui to use virt-install to install VMs? If so virt-install should be made to work with NM without hacks out of the box.

Actions #13

Updated by RBrownSUSE over 6 years ago

@okurz as discussed yesterday, the network manager change was introduced as part of the SLE 15 style system roles. That was coordinated with the SLE functional testing team with the goal of making sure we didn't run into too many test failures due to that significant change in behaviour. This is one that got away, obviously, but I think you're going too far and risk being seen to use this situation as a rod to beat a dead horse into submission.

So, let's stop pointing fingers and look at this holistically.

First, consider that libvirt networking is abstracted away from both wicked and NM.
The 'default' network is provided by "libvirt-daemon-config-network"

The question to me becomes "Why does libvirt network 'default' get activated when wicked is the default network stack, but not with NM?"

Is the libvirt-daemon-config-network package not being installed?
Is this a bug that needs to be fixed in libvirt-daemon-config-network?
If libvirt-daemon-config-network is installed, why is the default.xml not being activated? This is probably a bug that needs to be fixed.

It's been 11 days since Max reported this issue and right now I would say that no one has actually looked at the bug in question. Let's start with that, perhaps with my above points as a head start, before worrying about changing tests or products.

- Rich

Actions #14

Updated by okurz over 6 years ago

So … can I assume aginies will look into it? Not sure I got how exactly you want to coordinate further work.

Actions #15

Updated by RBrownSUSE about 6 years ago

RBrownSUSE wrote:

So, let's stop pointing fingers and look at this holistically.

First, consider that libvirt networking is abstracted away from both wicked and NM.
The 'default' network is provided by "libvirt-daemon-config-network"

The question to me becomes "Why does libvirt network 'default' get activated when wicked is the default network stack, but not with NM?"

Is the libvirt-daemon-config-network package not being installed?
Is this a bug that needs to be fixed in libvirt-daemon-config-network?
If libvirt-daemon-config-network is installed, why is the default.xml not being activated? This is probably a bug that needs to be fixed.

It's been 11 days since Max reported this issue and right now I would say that no one has actually looked at the bug in question. Let's start with that, perhaps with my above points as a head start, before worrying about changing tests or products.

Given no one else seems to have taken the initiative and followed my above advice, I took the tiny step of answering the simple questions above.

Is the libvirt-daemon-config-network package not being installed?

Correct, the libvirt-daemon-config-network package is not installed, neither by "yast vm" nor by "zypper in patterns-server-kvm_server".

Is this a bug that needs to be fixed in libvirt-daemon-config-network?

Maybe

https://build.opensuse.org/package/view_file/Virtualization/libvirt/libvirt.spec?expand=1

libvirt-daemon-config-network is required by libvirt

libvirt is not installed by patterns-server-kvm_server (it is a suggests: https://build.opensuse.org/package/view_file/openSUSE:Factory/patterns-server/patterns-server.spec?expand=1)

yast vm installs patterns-server-kvm_server https://github.com/yast/yast-vm/blob/3fdc972d9323cd4c1c73898994723c9f5b517bd5/src/modules/VirtConfig.rb

This bug seems to be simply a case of libvirt missing from patterns-server-kvm_server; the fix is either to add it there or to find some other, more suitable dependency solution that ensures libvirt-daemon-config-network is actually installed.

The bug has been filed: https://bugzilla.opensuse.org/show_bug.cgi?id=1109832
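
The questions worked through above (is the package installed, is the network defined, is it active) could be checked mechanically along these lines. This is a sketch, not part of the test suite: the function names are hypothetical, and the parser assumes the usual whitespace-separated column layout of "virsh net-list --all" (Name, State, Autostart, Persistent).

```python
import subprocess

PACKAGE = "libvirt-daemon-config-network"


def package_installed(name=PACKAGE):
    """True if `rpm -q NAME` exits with status 0, i.e. the package is installed."""
    result = subprocess.run(["rpm", "-q", name], capture_output=True, text=True)
    return result.returncode == 0


def parse_net_list(output):
    """Map network name -> state from `virsh net-list --all` output."""
    states = {}
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines, the dashed separator and the header row.
        if len(fields) < 2 or fields[0] == "Name":
            continue
        states[fields[0]] = fields[1]
    return states


def diagnose(installed, states, network="default"):
    """Turn the two observations into a short verdict."""
    if not installed:
        return "package missing"
    if network not in states:
        return "network not defined"
    if states[network] != "active":
        return "network defined but not active"
    return "ok"
```

On a real system the inputs would come from package_installed() and from the stdout of subprocess.run(["virsh", "net-list", "--all"], capture_output=True, text=True). Keeping the verdict logic separate from the command calls makes it possible to try the logic without libvirt installed.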

Actions #16

Updated by okurz about 6 years ago

So the bug mentions https://build.opensuse.org/request/show/639622, which was declined in favor of https://build.opensuse.org/request/show/625134. That request was accepted 5 days ago, and the corresponding SR to openSUSE:Factory, https://build.opensuse.org/request/show/639838, was also accepted 5 days ago. However, this does not fix the tests, and I am not even sure it was expected to: https://openqa.opensuse.org/tests/769647

@aginies I assume you are still aware of this? May I ask what you plan about this ticket and the failing test?

Actions #17

Updated by aginies about 6 years ago

okurz wrote:

@aginies I assume you are still aware of this? May I ask what you plan about this ticket and the failing test?

Sorry, I don't follow openSUSE openQA.
Are the needed patterns installed? That is not done by yast2-vm on openSUSE (as the patterns were not available). You need to get patterns_server_kvm_tools installed to get the default network setup (the libvirt one).
yast2-vm should be modified to handle this installation. I updated the bug.

Actions #18

Updated by okurz about 6 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: virtualization
https://openqa.opensuse.org/tests/779516

Actions #19

Updated by okurz about 6 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: virtualization
https://openqa.opensuse.org/tests/787342

Actions #20

Updated by okurz about 6 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: virtualization
https://openqa.opensuse.org/tests/799902

Actions #21

Updated by okurz about 6 years ago

  • Target version changed from Milestone 20 to Milestone 21

Asked in #opensuse-factory for guidance from the openSUSE RMs.

Actions #22

Updated by okurz about 6 years ago

The SR for https://bugzilla.opensuse.org/show_bug.cgi?id=1109832 was accepted 15 days ago, but tests still fail in https://openqa.opensuse.org/tests/805269#step/virt_install/24 with "network 'default' is not active". Both the kvm_tools pattern and libvirt-daemon-config-network seem to be installed, though.

Actions #23

Updated by aginies about 6 years ago

Commented on the bug.
Given that all the packages are there, it sounds like the default network is simply not started.

Actions #24

Updated by okurz about 6 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: virtualization
https://openqa.opensuse.org/tests/815403

Actions #25

Updated by okurz almost 6 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: virtualization
https://openqa.opensuse.org/tests/821229

Actions #26

Updated by okurz almost 6 years ago

  • Blocks action #45713: [functional][u] test for nested virtualization / qemu / typing over VNC added

Actions #27

Updated by aginies almost 6 years ago

No one has replied to my comment on the bug. There is no way to fix this without logs. That comment also describes a workaround (starting the network by hand).
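
For reference, "starting the network by hand" amounts to "virsh net-start default" when the network is defined but inactive. A minimal sketch of that check-then-start logic follows; the function name and return strings are hypothetical, the parsing assumes the usual "Key: value" layout of "virsh net-info", and this is not the code of any workaround in the test repository.

```python
import subprocess


def ensure_network_active(name="default", run=subprocess.run):
    """Start the libvirt network NAME if `virsh net-info` says it is inactive.

    The `run` parameter exists only so the logic can be exercised with a
    stand-in for subprocess.run on machines without libvirt.
    """
    info = run(["virsh", "net-info", name], capture_output=True, text=True)
    if info.returncode != 0:
        # virsh could not find the network, so there is nothing to start.
        return "missing"
    for line in info.stdout.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "Active" and value.strip() == "yes":
            return "already-active"
    run(["virsh", "net-start", name], check=True)
    return "started"
```

A "missing" result would point back at the packaging question (libvirt-daemon-config-network not installed), while "started" matches the situation in this ticket, where the network exists but was not brought up automatically.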

Actions #28

Updated by okurz almost 6 years ago

maxlin plans to take a look this week

Actions #29

Updated by okurz almost 6 years ago

  • Target version changed from Milestone 21 to Milestone 22

Actions #30

Updated by okurz almost 6 years ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: virtualization
https://openqa.opensuse.org/tests/832401

Actions #31

Updated by okurz almost 6 years ago

  • Description updated (diff)
  • Status changed from New to Workable
  • Assignee deleted (aginies)
  • Priority changed from High to Urgent

@aginies this is not going forward, but QSF-u considers the virtualization tests important for not breaking Tumbleweed virtualization even further, so QSF-u will try to help with gathering the necessary logs.

Actions #32

Updated by agraul almost 6 years ago

  • Status changed from Workable to In Progress
  • Assignee set to agraul

Actions #33

Updated by agraul almost 6 years ago

  • Status changed from In Progress to Feedback

Together with @szarate I've created a new bug report (https://bugzilla.opensuse.org/show_bug.cgi?id=1123699) for the 'default' network not being started automatically and provided a workaround (https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/6676).

@aginies, what logs should the tests collect in the future?

Actions #34

Updated by okurz almost 6 years ago

@agraul PR merged, seems to be ok. Now the following module, https://openqa.opensuse.org/tests/843249#step/virt_install/39, failed within the post_fail_hook itself. I think mgriessmeier implemented that "which" part and could help you with that. Please take a look there, as well as at why the test module actually fails.

Actions #35

Updated by agraul almost 6 years ago

  • Status changed from Feedback to In Progress

I've created a new PR which fixes the softfail check mentioned in comment 33 -> https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/6697

virt_install still fails, I am investigating it (looks like a needle mismatch)

Actions #36

Updated by agraul almost 6 years ago

  • Status changed from In Progress to Feedback
  • Assignee deleted (agraul)

okurz updated needles, https://openqa.opensuse.org/tests/844239# will show the results. I am unassigning myself as I have school the next two weeks.

Actions #37

Updated by okurz almost 6 years ago

  • Status changed from Feedback to Resolved
  • Assignee set to okurz

I created one (or two?) more missing needles and we have reached a soft-failed job in https://openqa.opensuse.org/tests/844791 (not failed anymore) which is good. As the bug has received more attention again and we have a soft-fail to remind about the bug I will close this ticket.
