action #121306: [virtualization][hyperv] worker7-hyperv.oqa.suse.de can not be reached - openQA Infrastructure (public) - openSUSE Project Management Tool


action #121306

closed

[virtualization][hyperv] worker7-hyperv.oqa.suse.de can not be reached

Added by rcai over 2 years ago. Updated over 2 years ago.

Status: Resolved
Priority: Immediate
Assignee:
Category:
Target version: QA (public) - future
Start date: 2022-12-01
Due date:
% Done: 100%
Estimated time:

Description

##Background
VLAN Migration
openqaw7-hyperv.qa.suse.de(10.162.0.101) -> worker7-hyperv.oqa.suse.de(10.137.10.7)

I was debugging the Windows network card and disabled ExternalVirtualSwitch.
Since then I cannot ping the host:
ping worker7-hyperv.oqa.suse.de
PING worker7-hyperv.oqa.suse.de (10.137.10.7) 56(84) bytes of data.
From 10.136.0.1 (10.136.0.1) icmp_seq=1 Destination Host Unreachable
From 10.136.0.1 (10.136.0.1) icmp_seq=2 Destination Host Unreachable
From 10.136.0.1 (10.136.0.1) icmp_seq=3 Destination Host Unreachable
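
Note that the replies in the ping output above come from 10.136.0.1 and not from the target 10.137.10.7, which suggests that a router on the path is reporting the host as unreachable. To summarize such output when comparing runs, a minimal Python sketch could look like the following. The function name `unreachable_reports` and the regular expression are illustrative assumptions, not part of any existing tool.

```python
import re

# Matches both reply formats seen in this ticket, for example:
#   From 10.136.0.1 (10.136.0.1) icmp_seq=1 Destination Host Unreachable
#   From 10.136.0.1 icmp_seq=1 Destination Host Unreachable
UNREACHABLE = re.compile(
    r"^From\s+(?P<reporter>\S+).*icmp_seq=(?P<seq>\d+)\s+Destination Host Unreachable"
)


def unreachable_reports(ping_output: str) -> dict:
    """Map each reporting address to the icmp_seq numbers it rejected."""
    reports = {}
    for line in ping_output.splitlines():
        match = UNREACHABLE.match(line.strip())
        if match:
            reports.setdefault(match["reporter"], []).append(int(match["seq"]))
    return reports


# Sample copied from the ping output in this ticket.
SAMPLE = """\
PING worker7-hyperv.oqa.suse.de (10.137.10.7) 56(84) bytes of data.
From 10.136.0.1 (10.136.0.1) icmp_seq=1 Destination Host Unreachable
From 10.136.0.1 (10.136.0.1) icmp_seq=2 Destination Host Unreachable
From 10.136.0.1 (10.136.0.1) icmp_seq=3 Destination Host Unreachable
"""

if __name__ == "__main__":
    print(unreachable_reports(SAMPLE))
```

An empty result would mean that no "Destination Host Unreachable" replies were found; it does not by itself prove that the host answered.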

##Before VLAN Migration
Web access: https://sp.openqaw7-hyperv.qa.suse.de
I could access the Windows desktop and change settings, including the network configuration.
With that access it would be easy to enable the switch again.

##After VLAN Migration
As you know, the machine can now only be accessed through the jumpy server:
ssh -4 jumpy@qe-xxx.suse.de -- ipmitool -I lanplus -H worker7-hyperv.qe-xxx-xx -U ADMIN -P xxxxxxx sol activate
However, I cannot access the Windows desktop with this method.
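
For context, `sol activate` attaches to the serial-over-LAN console, which carries text only, so it would not be expected to show the graphical Windows desktop. If the command needs to be scripted, a minimal Python sketch for assembling it is shown below. The helper name `sol_command` is a hypothetical choice, and the host names and password are the redacted placeholders from this ticket, not real values.

```python
import shlex


def sol_command(jump_host: str, bmc_host: str, user: str, password: str) -> list:
    """Build the ssh argument list that runs ipmitool on the jump host."""
    remote = ["ipmitool", "-I", "lanplus", "-H", bmc_host,
              "-U", user, "-P", password, "sol", "activate"]
    # ssh hands the remote command to a shell on the jump host, so quote it
    # as a single string to keep arguments with special characters intact.
    return ["ssh", "-4", f"jumpy@{jump_host}", "--", shlex.join(remote)]


if __name__ == "__main__":
    # Placeholder values as they appear (redacted) in the ticket.
    argv = sol_command("qe-xxx.suse.de", "worker7-hyperv.qe-xxx-xx", "ADMIN", "xxxxxxx")
    print(argv)
```

The resulting list can be passed to `subprocess.run(argv)` for an interactive session. Passing the password with `-P` exposes it in the process list on the jump host, so this is only a sketch of the command from the ticket, not a recommendation.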

##Expected
It would be better to keep access to the Windows desktop after the VLAN migration, as it is sometimes helpful for debugging the environment.

##Impact
It affects the jobs of build 52.4 running on this server (5 jobs).

Files

worker7-hyperv.png (70.7 KB)

nanzhang, 2022-12-05 13:20


Updated by rcai over 2 years ago

Copied from action #116344: openqaw9-hyperv.qa.suse.de (flexo.qa.suse.cz) can not be reached size:M added


Updated by rcai over 2 years ago

Copied from deleted (action #116344: openqaw9-hyperv.qa.suse.de (flexo.qa.suse.cz) can not be reached size:M)


Updated by rcai over 2 years ago

Start date changed from 2022-09-08 to 2022-12-01


Updated by okurz over 2 years ago

Subject changed from worker7-hyperv.oqa.suse.de can not be reached to [virtualization] worker7-hyperv.oqa.suse.de can not be reached
Assignee set to nanzhang
Target version deleted (~~Ready~~)

nanzhang is the maintainer and managing the migration


Updated by okurz over 2 years ago

Target version set to future


Updated by rcai over 2 years ago

Applied for access to the server (via the jump server qe-jumpy.suse.de).

Submitted MR: https://gitlab.suse.de/rcai/salt/-/merge_requests/1


Updated by rcai over 2 years ago

Merged https://gitlab.suse.de/OPS-Service/salt/-/merge_requests/2926.
Nan now has access to the server worker7-hyperv.oqa.suse.de.


#10

Updated by nanzhang over 2 years ago

File worker7-hyperv.png added

Actually I can do nothing about this problem, as I can't see anything in the Remote Console Preview of the IPMI web console (see screenshot worker7-hyperv.png). Also, the host cannot be pinged.

$ ping worker7-hyperv.oqa.suse.de
PING worker7-hyperv.oqa.suse.de (10.137.10.7) 56(84) bytes of data.
From 10.136.0.1 icmp_seq=1 Destination Host Unreachable
From 10.136.0.1 icmp_seq=2 Destination Host Unreachable
From 10.136.0.1 icmp_seq=3 Destination Host Unreachable


#11

Updated by rcai over 2 years ago

Subject changed from [virtualization] worker7-hyperv.oqa.suse.de can not be reached to [virtualization][hyperv] worker7-hyperv.oqa.suse.de can not be reached


#12

Updated by rcai over 2 years ago

Also filed an SD ticket to track it.
https://sd.suse.com/servicedesk/customer/portal/1/SD-106325


#13

Updated by rcai over 2 years ago

% Done changed from 0 to 20


#14

Updated by rcai over 2 years ago

Status changed from New to In Progress


#15

Updated by xlai over 2 years ago

Priority changed from Urgent to Immediate

nanzhang wrote:

Actually I can do nothing about this problem, as I can't see anything in the Remote Console Preview of the IPMI web console (see screenshot worker7-hyperv.png). Also, the host cannot be pinged.

$ ping worker7-hyperv.oqa.suse.de
PING worker7-hyperv.oqa.suse.de (10.137.10.7) 56(84) bytes of data.
From 10.136.0.1 icmp_seq=1 Destination Host Unreachable
From 10.136.0.1 icmp_seq=2 Destination Host Unreachable
From 10.136.0.1 icmp_seq=3 Destination Host Unreachable

@okurz Hi Oliver, this issue is blocking all hyperv tests, especially for the current 15sp5 beta2 milestone. It needs to be fixed ASAP.

In my understanding, this is an OSD infra issue after the security zone migration. Why does the tools team not fix it? Would you please explain the reasons? Besides, if you think Nan has the relevant infra permissions to do so, we can do it ourselves. But would you please give him some guidance? It seems stuck now. He can do nothing...


#16

Updated by xlai over 2 years ago

Status changed from In Progress to Feedback
Assignee changed from nanzhang to okurz


#17

Updated by rcai over 2 years ago

It is resolved; please refer to ticket: https://sd.suse.com/servicedesk/customer/portal/1/SD-106325


#18

Updated by xlai over 2 years ago

Status changed from Feedback to Resolved
Assignee deleted (~~okurz~~)


#19

Updated by okurz over 2 years ago

xlai wrote:

[…] In my understanding, this is an OSD infra issue after the security zone migration. Why does the tools team not fix it? Would you please explain the reasons? Besides, if you think Nan has the relevant infra permissions to do so, we can do it ourselves. But would you please give him some guidance? It seems stuck now. He can do nothing...

Just to make sure this is answered completely: The SUSE QE tools team does not maintain the machine worker7-hyperv.oqa.suse.de. This is also reflected in https://racktables.nue.suse.com/index.php?page=object&tab=default&object_id=10407 which gives as contact person "qa-apac2@suse.de", not "osd-admins@suse.de".

This is also explained in https://progress.opensuse.org/projects/qa/wiki/Tools#Out-of-scope in the point:

Maintenance of special worker addendums needed for tests, e.g. external hypervisor hosts for s390x, powerVM, xen, hyperv, IPMI, VMWare (Clarification: We maintain the code for all backends but we are no experts in specific domains. So we always try to help but it's a case by case decision based on what we realistically can provide based on our competence. We can't be expected to be experts in everything and also we are limited in what we can actually test.)

So let's explain that in examples:

- The SUSE QE Tools team does not monitor the machine, so the team won't immediately realize if the machine is not reachable or otherwise misbehaving.
- The SUSE QE Tools team can react to problems if they are brought up, as happened in this ticket. The team tries to help as far as their competences and capabilities go. Often frustration comes from the expectation that people would be more experienced than they actually are. It is often better to expect that the ones "on the other side" are overwhelmed, inexperienced, already pre-occupied with other stories and more. You know, mere humans ;)
- The SUSE QE Tools team does not ensure that the machine is properly updated or upgraded, especially because it's "non-standard", as it is not an openSUSE Leap system, for which we have good automation.

One more suggestion: If your workflows rely on the presence of "an hyperv" host so much that any problem needs to be fixed ASAP then I strongly suggest you build up redundancy. openQA is very good with providing worker redundancy. This is why mostly nobody cares if a single openQA worker hardware goes down and is unavailable even for months while the problem is being worked on.
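
To illustrate the redundancy point, here is a minimal Python sketch that flags worker classes served by fewer than two reachable hosts. The inventory format, the function name `classes_without_redundancy` and the sample hosts are all hypothetical and only serve as an example; this is not an openQA API.

```python
from collections import defaultdict


def classes_without_redundancy(workers, minimum=2):
    """Return the worker classes that have fewer than `minimum` reachable hosts."""
    reachable_hosts = defaultdict(set)
    for worker in workers:
        for worker_class in worker["classes"]:
            # Touch the key so that classes with no reachable host still show up.
            hosts = reachable_hosts[worker_class]
            if worker["reachable"]:
                hosts.add(worker["host"])
    return sorted(c for c, hosts in reachable_hosts.items() if len(hosts) < minimum)


# Illustrative inventory only; host names and states are made up.
INVENTORY = [
    {"host": "host-a", "classes": ["hyperv"], "reachable": False},
    {"host": "host-b", "classes": ["qemu"], "reachable": True},
    {"host": "host-c", "classes": ["qemu"], "reachable": True},
]

if __name__ == "__main__":
    print(classes_without_redundancy(INVENTORY))
```

In this made-up inventory the "hyperv" class depends on a single host, so one outage blocks every job that needs it, while the "qemu" class keeps running if one of its hosts goes down.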


#20

Updated by rcai over 2 years ago

% Done changed from 20 to 100
