action #121306
[virtualization][hyperv] worker7-hyperv.oqa.suse.de can not be reached
Status: closed (100% done)
Description
## Background
VLAN Migration
openqaw7-hyperv.qa.suse.de (10.162.0.101) -> worker7-hyperv.oqa.suse.de (10.137.10.7)
While debugging the Windows network card I disabled the external virtual switch (ExternalVirtualSwitch). After that the host can no longer be pinged, most likely because that switch carried the management OS's network connection:
ping worker7-hyperv.oqa.suse.de
PING worker7-hyperv.oqa.suse.de (10.137.10.7) 56(84) bytes of data.
From 10.136.0.1 (10.136.0.1) icmp_seq=1 Destination Host Unreachable
From 10.136.0.1 (10.136.0.1) icmp_seq=2 Destination Host Unreachable
From 10.136.0.1 (10.136.0.1) icmp_seq=3 Destination Host Unreachable
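For what it's worth, the "Destination Host Unreachable" replies coming from 10.136.0.1 (a gateway) rather than the ping simply timing out suggests the last-hop router cannot reach 10.137.10.7 on its local segment, i.e. the host is off the network rather than merely firewalled. A minimal sketch of follow-up checks from a machine that can route into the new VLAN (the hostname is from this ticket; the commands are wrapped in a function so nothing runs unattended):

```shell
HOST=worker7-hyperv.oqa.suse.de   # host from this ticket

net_diag() {
    # Where does the path die? The last responding hop should be the
    # router that is sending the "Destination Host Unreachable" replies.
    traceroute -n "$HOST"
    # Is anything else in 10.137.10.0/24 alive? If neighbours answer,
    # the VLAN itself is fine and only this host is down.
    ping -c 1 10.137.10.1
}
```

If neighbours in the same subnet respond, that points at the host (or its disabled virtual switch) rather than at the VLAN migration itself.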
## Before VLAN Migration
Web access: https://sp.openqaw7-hyperv.qa.suse.de
Through it I could reach the Windows desktop and change its configuration, including the network settings, so re-enabling the virtual switch would have been easy.
## After VLAN Migration
As far as I know, the only remaining access path is the jump server:
ssh -4 jumpy@qe-xxx.suse.de -- ipmitool -I lanplus -H worker7-hyperv.qe-xxx-xx -U ADMIN -P xxxxxxx sol activate
but this method does not give access to the Windows desktop.
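For context, `sol activate` only opens a Serial-over-LAN text console, and Windows does not expose its desktop over a serial line, so SOL cannot replace the earlier web console access. A hedged sketch of what can still be checked over IPMI through the jump host (the `qe-xxx` and `xxxxxxx` placeholders are the ticket's own elided values, not real credentials):

```shell
JUMP=jumpy@qe-xxx.suse.de        # jump host, as elided in the ticket
BMC=worker7-hyperv.qe-xxx-xx     # BMC hostname, as elided in the ticket

ipmi() {
    # Run one ipmitool command against the BMC via the jump host.
    ssh -4 "$JUMP" -- ipmitool -I lanplus -H "$BMC" -U ADMIN -P xxxxxxx "$@"
}

# Typical checks before falling back to the SOL console:
#   ipmi chassis status   # is the machine powered on at all?
#   ipmi sel list         # recent hardware/firmware event log
#   ipmi sol activate     # text console only, no graphical desktop
```

These at least distinguish "machine is powered off or wedged" from "machine is up but off the network".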
## Expected
It should still be possible to reach the Windows desktop after the VLAN migration; this is sometimes needed to debug the environment.
## Impact
It affects the jobs of build 52.4 running on this server (5 jobs).
Files
Updated by rcai about 2 years ago
- Copied from action #116344: openqaw9-hyperv.qa.suse.de (flexo.qa.suse.cz) can not be reached size:M added
Updated by rcai about 2 years ago
- Copied from deleted (action #116344: openqaw9-hyperv.qa.suse.de (flexo.qa.suse.cz) can not be reached size:M)
Updated by rcai about 2 years ago
- Start date changed from 2022-09-08 to 2022-12-01
Updated by okurz about 2 years ago
- Subject changed from worker7-hyperv.oqa.suse.de can not be reached to [virtualization] worker7-hyperv.oqa.suse.de can not be reached
- Assignee set to nanzhang
- Target version deleted (Ready)
nanzhang is the maintainer and is managing the migration.
Updated by rcai about 2 years ago
Applied for access to the server (via jump server qe-jumpy.suse.de).
Submitted MR: https://gitlab.suse.de/rcai/salt/-/merge_requests/1
Updated by rcai about 2 years ago
Merged https://gitlab.suse.de/OPS-Service/salt/-/merge_requests/2926.
Nan now has access to server worker7-hyperv.oqa.suse.de.
Updated by nanzhang about 2 years ago
- File worker7-hyperv.png worker7-hyperv.png added
Actually there is nothing I can do about this problem: the Remote Console Preview in the IPMI web console shows nothing (see screenshot worker7-hyperv.png), and the host does not respond to ping.
$ ping worker7-hyperv.oqa.suse.de
PING worker7-hyperv.oqa.suse.de (10.137.10.7) 56(84) bytes of data.
From 10.136.0.1 icmp_seq=1 Destination Host Unreachable
From 10.136.0.1 icmp_seq=2 Destination Host Unreachable
From 10.136.0.1 icmp_seq=3 Destination Host Unreachable
Updated by rcai about 2 years ago
- Subject changed from [virtualization] worker7-hyperv.oqa.suse.de can not be reached to [virtualization][hyperv] worker7-hyperv.oqa.suse.de can not be reached
Updated by rcai about 2 years ago
Also filed an SD ticket to track it:
https://sd.suse.com/servicedesk/customer/portal/1/SD-106325
Updated by xlai about 2 years ago
- Priority changed from Urgent to Immediate
nanzhang wrote:
Actually there is nothing I can do about this problem: the Remote Console Preview in the IPMI web console shows nothing (see screenshot worker7-hyperv.png), and the host does not respond to ping.
$ ping worker7-hyperv.oqa.suse.de
PING worker7-hyperv.oqa.suse.de (10.137.10.7) 56(84) bytes of data.
From 10.136.0.1 icmp_seq=1 Destination Host Unreachable
From 10.136.0.1 icmp_seq=2 Destination Host Unreachable
From 10.136.0.1 icmp_seq=3 Destination Host Unreachable
@okurz Hi Oliver, this issue is blocking all Hyper-V tests, especially for the current 15-SP5 Beta2 milestone. It needs to be fixed ASAP.
In my understanding, this is an OSD infrastructure issue caused by the security zone migration. Why is the tools team not fixing it? Would you please explain the reasons? Alternatively, if you think Nan has the relevant infrastructure permissions, we can do it ourselves, but would you please give him some guidance? It seems stuck now; he can do nothing...
Updated by xlai about 2 years ago
- Status changed from In Progress to Feedback
- Assignee changed from nanzhang to okurz
Updated by rcai about 2 years ago
It is resolved; please refer to ticket https://sd.suse.com/servicedesk/customer/portal/1/SD-106325
Updated by xlai about 2 years ago
- Status changed from Feedback to Resolved
- Assignee deleted (okurz)
Updated by okurz about 2 years ago
xlai wrote:
[…] In my understanding, this is an OSD infrastructure issue caused by the security zone migration. Why is the tools team not fixing it? Would you please explain the reasons? Alternatively, if you think Nan has the relevant infrastructure permissions, we can do it ourselves, but would you please give him some guidance? It seems stuck now; he can do nothing...
Just to make sure this is answered completely: The SUSE QE tools team does not maintain the machine worker7-hyperv.oqa.suse.de. This is also reflected in https://racktables.nue.suse.com/index.php?page=object&tab=default&object_id=10407 which gives as contact person "qa-apac2@suse.de", not "osd-admins@suse.de".
This is also explained in https://progress.opensuse.org/projects/qa/wiki/Tools#Out-of-scope in the point:
Maintenance of special worker addendums needed for tests, e.g. external hypervisor hosts for s390x, powerVM, xen, hyperv, IPMI, VMWare (Clarification: We maintain the code for all backends but we are no experts in specific domains. So we always try to help but it's a case by case decision based on what we realistically can provide based on our competence. We can't be expected to be experts in everything and also we are limited in what we can actually test.)
So let's explain that in examples:
- The SUSE QE Tools team does not monitor the machine so the team won't immediately realize if the machine is not reachable or otherwise misbehaving
- The SUSE QE Tools team can react to problems when they are brought up, as happened in this ticket. The team tries to help as far as their competence and capabilities go. Often frustration comes from the expectation that people are more experienced than they actually are. It is often better to assume that the ones "on the other side" are overwhelmed, inexperienced, already preoccupied with other work, and more. You know, mere humans ;)
- The SUSE QE Tools team does not ensure that the machine is properly updated or upgraded, especially because it is "non-standard": it is not an openSUSE Leap system, for which we have good automation.
One more suggestion: if your workflows rely on the presence of a Hyper-V host so heavily that any problem needs to be fixed ASAP, then I strongly suggest you build up redundancy. openQA is very good at providing worker redundancy. This is why it mostly goes unnoticed when a single openQA worker's hardware goes down and stays unavailable, even for months, while the problem is being worked on.