action #168106

closed

coordination #167054: [epic] Run more workloads in CC-compliant PRG2 to be less affected by CC related network changes

QE PXE server in PRG2

Added by okurz 2 months ago. Updated about 12 hours ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Feature requests
Start date: 2024-10-10
Due date:
% Done: 0%
Estimated time:
Tags:

Description

Motivation

PRG2-based machines also rely on the non-compliant NUE2-based PXE server qa-jump.qe.nue2.suse.org. If the CC-related network changes due to #165282 include preventing access from non-CC areas to openqa.suse.de or to hosts within CC areas, machines such as PowerPC in PRG2 might no longer be able to access qa-jump.qe.nue2.suse.org, which would pose problems for all bare-metal tests, PowerPC and similar. Setting up a PRG2-local PXE server would be the right approach for this. However, that is already planned in #155524.

Acceptance criteria

  • AC1: PRG2 based openQA tests relying on PXE do not rely on qa-jump.qe.nue2.suse.org
  • AC2: NUE2 based openQA tests relying on PXE still use qa-jump.qe.nue2.suse.org

Suggestions

Actions #1

Updated by okurz about 2 months ago

  • Assignee changed from okurz to mkittler

assigning to mkittler to track blocker tickets for now

Actions #2

Updated by szarate about 2 months ago · Edited

For the sake of completeness

https://gitlab.suse.de/OPS-Service/salt/-/merge_requests/5748
If my very naive understanding is correct, all that is needed to unblock https://jira.suse.com/browse/ENGINFRA-3941 is the VM plus the PR above with the proper changes (see the description).
I updated the title of https://jira.suse.com/browse/ENGINFRA-3941 so that it better reflects what is needed.
@Oliver Kurz can you confirm that this is the case? yes/no?
@Moroni Flores let me know if there's an issue with provisioning the VM (I'd guess same specs/OS as qa-jump).

I’d say we can use qamaster if you can’t provision, but I realized it is in NUE, and likely will have the wireguard tunnel… and that defeats the purpose (I guess, but it does not have the tunnel
Actions #4

Updated by okurz 19 days ago

Still blocked by #155524. There is significant progress but it is not done yet.

Actions #5

Updated by okurz about 20 hours ago

  • Status changed from Blocked to New
  • Priority changed from Normal to High

#155524 was resolved. Please check and ensure that both ACs are covered; if something is not working, please collaborate with dheidler.

Actions #7

Updated by okurz about 18 hours ago

  • Priority changed from High to Normal
Actions #9

Updated by mkittler about 13 hours ago

I'll check the documentation mentioned in #155524#note-43 when GitLab comes back and will check what was changed via https://gitlab.suse.de/OPS-Service/salt/-/merge_requests/5893.

It looks like IPXE_HTTPSERVER already points to http://baremetal-support.qe.prg2.suse.org on the relevant worker slots (on worker33, 34, 35 and 36). The NUE2 URL http://baremetal-support.qe.nue2.suse.org is now only used on sapworker1 slots, so I don't have to update workerconf.sls.

Actions #10

Updated by mkittler about 13 hours ago

  • Status changed from New to In Progress

It looks like the PXE setup generally works. There are passing jobs like https://openqa.suse.de/tests/16210082 using the new PXE server according to https://openqa.suse.de/tests/16210082/file/vars.json. Other examples are https://openqa.suse.de/tests/16210189, https://openqa.suse.de/tests/16238719 and https://openqa.suse.de/tests/16239161.

I suppose with that all ACs are fulfilled. AC2 is also fulfilled because the support server in NUE2 is still up and running (and connectivity problems with it are handled in #173839).
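
The vars.json check described above could be scripted roughly as follows. This is only a sketch under assumptions: it assumes vars.json is a flat JSON object that carries the IPXE_HTTPSERVER setting, and it uses the two hostnames mentioned in this ticket. The helper name pxe_site and the sample data are hypothetical and not part of openQA.

```python
import json
from urllib.parse import urlparse

# Hostnames as quoted in this ticket; the site labels are read off the
# domain names and are an assumption of this sketch.
PXE_HOSTS = {
    "baremetal-support.qe.prg2.suse.org": "PRG2",
    "baremetal-support.qe.nue2.suse.org": "NUE2",
}


def pxe_site(job_vars):
    """Classify the PXE server a job used, based on its settings.

    Returns "PRG2" or "NUE2" for the known hosts, "unknown" for any other
    value, and None if IPXE_HTTPSERVER is not set at all.
    """
    url = job_vars.get("IPXE_HTTPSERVER")
    if not url:
        return None
    host = urlparse(url).hostname
    return PXE_HOSTS.get(host, "unknown")


if __name__ == "__main__":
    # Hypothetical excerpt of a downloaded vars.json file; a real file
    # would contain many more settings.
    sample = '{"IPXE_HTTPSERVER": "http://baremetal-support.qe.prg2.suse.org"}'
    print(pxe_site(json.loads(sample)))
```

With AC1 in mind, a job on a PRG2 worker slot would be expected to classify as PRG2, while a result of NUE2 on such a slot would point to a leftover reference to the old server.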

Actions #11

Updated by dheidler about 13 hours ago

To my knowledge both ACs are already resolved.

Actions #12

Updated by okurz about 12 hours ago

  • Status changed from In Progress to Resolved

Agreed
