action #121672

closed

openQA Tests - action #107062: Multiple failures due to network issues

[virtualization] Connectivity issues on worker8-vmware.oqa.suse.de

Added by mloviska over 1 year ago. Updated over 1 year ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version:
Start date: 2022-12-07
Due date:
% Done: 100%
Estimated time:

Description

After the security zone migration, we see network instability in test jobs running on worker8-vmware.oqa.suse.de.

Actions #1

Updated by mloviska over 1 year ago

  • Description updated (diff)
Actions #2

Updated by maritawerner over 1 year ago

  • Project changed from openQA Tests to openQA Infrastructure
Actions #4

Updated by okurz over 1 year ago

  • Subject changed from Connectivity issues on worker8-vmware.oqa.suse.de to [virtualization] Connectivity issues on worker8-vmware.oqa.suse.de
  • Assignee set to xlai
  • Target version set to future
Actions #5

Updated by xlai over 1 year ago

  • Status changed from New to Workable
  • Assignee changed from xlai to nanzhang
  • Priority changed from Normal to Urgent

@nanzhang Hi Nan, would you please have a look at this issue? I guess you may have hit the same problems in your VMware tests and already opened tickets. If so, would you please give some info here and share which tickets can be closed, so that we can concentrate on one ticket per issue?
Besides, it would be great if you know how to fix the issue above. But I guess the connectivity issues are infrastructure issues, which you may not be able to fix either, so we would need help from the eng-infra team or the openQA tools team. If so, please also state that clearly here and I will reassign the ticket to the proper team.

Actions #6

Updated by cachen over 1 year ago

Adding my findings:

  • I failed to ping the FQDN worker2.oqa.suse.de with my OpenVPN connected:

PING worker2.oqa.suse.de(worker2.oqa.suse.de (2a07:de40:a203:12:2e60:cff:fe73:2ac)) 56 data bytes
^C
--- worker2.oqa.suse.de ping statistics ---
12 packets transmitted, 0 received, 100% packet loss

I am wondering whether something is wrong in the FQDN setup for those machines (qanet.qa.suse.de is also not pingable for me, but its IP works), or whether something needs to be set in the SUT to connect to these FQDNs. @okurz Do you know if we already have an eng-infra ticket reporting that the FQDNs don't work?

FYI, my /etc/resolv.conf, generated automatically while OpenVPN is connected:
search suse.cz suse.de prv.suse.net suse.asia suse.com
nameserver 10.100.2.10
nameserver 10.100.2.8
nameserver 192.168.5.1
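
When a resolver problem like this is reported, it can help to attach the resolver settings in structured form. The following is only a minimal Python sketch, not something used in this ticket: it extracts the `search` and `nameserver` entries from resolv.conf text. The function name `parse_resolv_conf` is hypothetical.

```python
def parse_resolv_conf(text):
    """Return the search domains and nameservers found in resolv.conf text."""
    search, nameservers = [], []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines ('#' or ';' in the first column).
        if not line or line[0] in "#;":
            continue
        keyword, *values = line.split()
        if keyword == "search":
            # A later "search" line replaces an earlier one.
            search = values
        elif keyword == "nameserver" and values:
            nameservers.append(values[0])
    return {"search": search, "nameservers": nameservers}
```

It could be fed with the content of /etc/resolv.conf, for example `parse_resolv_conf(open("/etc/resolv.conf").read())`, and the result pasted into the infra ticket.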

Actions #7

Updated by okurz over 1 year ago

cachen wrote:

Adding my findings:

  • I failed to ping the FQDN worker2.oqa.suse.de with my OpenVPN connected … @okurz Do you know if we already have an eng-infra ticket reporting that the FQDNs don't work?

FYI, my /etc/resolv.conf, generated automatically while OpenVPN is connected:
search suse.cz suse.de prv.suse.net suse.asia suse.com
nameserver 10.100.2.10
nameserver 10.100.2.8
nameserver 192.168.5.1

I know that at least one person using an OpenVPN connection from the CZ zone managed to resolve FQDNs in the new zones. I suggest you open a ticket via sd.suse.com and put that information there. Maybe the nameservers in the 10.100.x.x range are out of sync or need a manual configuration update.

Actions #8

Updated by cachen over 1 year ago

I got an answer from Martin Caj: the reason I failed to ping/ssh these machines by FQDN is that "gate.suse.cz does not works with ipv6 yet". There are two solutions: 1) use gate[1|2].suse.de, or 2) use ping/ssh with '-4' against these machines' FQDNs. I see room for improvement in communication here: what has changed in the network, what the impact can be, and what needs to be adapted to these changes.
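
To see whether a host name resolves to IPv6 addresses at all (which is when the '-4' workaround matters), something like the following could be used. This is only a sketch under assumptions: the function name `addresses_by_family` and the injectable `resolver` parameter are hypothetical and exist mainly to make the check testable without network access.

```python
import socket


def addresses_by_family(host, resolver=socket.getaddrinfo):
    """Group the addresses a host name resolves to by IP family."""
    result = {"ipv4": [], "ipv6": []}
    names = {socket.AF_INET: "ipv4", socket.AF_INET6: "ipv6"}
    for family, _type, _proto, _canonname, sockaddr in resolver(host, None):
        key = names.get(family)
        # getaddrinfo returns one entry per socket type, so drop duplicates.
        if key and sockaddr[0] not in result[key]:
            result[key].append(sockaddr[0])
    return result
```

If the "ipv6" list is non-empty and the VPN gateway does not route IPv6, tools that prefer IPv6 will time out, as in the ping output above, while `ping -4` or `ssh -4` take the IPv4 path.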

Since both worker8-vmware and worker2 are in the same 10.137.10.x network, the connection issue should not be related to FQDN resolution or IPv6. Reading the errors about failing to resolve the worker2 host and the SCC connection issue, I am wondering whether the test guest successfully set up its network (from DHCP by default?), and if not, what the failure is. It would be good to check.

Actions #9

Updated by nanzhang over 1 year ago

  • Status changed from Workable to Feedback
  • Assignee changed from nanzhang to okurz

From my investigation, the root cause is that the VM can't get an IP address from the DHCP service while booting up.
https://openqa.suse.de/tests/10087101#step/firstrun/11

It should be the same issue as the ticket https://sd.suse.com/servicedesk/customer/portal/1/SD-106572

I am setting it back to Oliver; we need a solution after the VLAN migration.
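
A missing DHCP lease like the one described in this comment could be detected inside the guest by checking for interfaces without a global IPv4 address. The sketch below is not what the test does; it assumes the JSON printed by `ip -j addr show` (iproute2) with the fields `ifname`, `addr_info`, `family` and `scope`, and the function name `interfaces_without_ipv4` is hypothetical.

```python
import json


def interfaces_without_ipv4(ip_json):
    """Return non-loopback interface names that have no global IPv4 address."""
    missing = []
    for iface in json.loads(ip_json):
        name = iface.get("ifname")
        # Ignore the loopback device and entries without a name.
        if not name or name == "lo":
            continue
        has_global_v4 = any(
            addr.get("family") == "inet" and addr.get("scope") == "global"
            for addr in iface.get("addr_info", [])
        )
        if not has_global_v4:
            missing.append(name)
    return missing
```

A non-empty result right after boot would point to the guest not having received an address, which matches the symptom in the linked test step.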

Actions #10

Updated by mloviska over 1 year ago

Works for JeOS on VMware

Actions #11

Updated by nanzhang over 1 year ago

  • Status changed from Feedback to Resolved
  • % Done changed from 0 to 100

The new config allows a pool of 100 IP addresses in oqa.suse.de, which resolves the IP assignment issue for VMs.
