action #159048

open

QA - coordination #153655: [saga][epic] Future datacenter and network setup at SUSE

QA - coordination #159543: [epic] PowerPC Power10 setup for QE LSG

Setup new Power10 machine for QE LSG in PRG2 (S/N 7882391)

Added by mgriessmeier 3 months ago. Updated 4 days ago.

Status: Blocked
Priority: Normal
Assignee:
Category: Feature requests
Target version:
Start date: 2024-04-16
Due date:
% Done: 0%
Estimated time:

Description

(JIRA-SD: https://sd.suse.com/servicedesk/customer/portal/1/SD-154392)
According to @horon, two new Power10 machines have arrived in PRG2 (among others). One of these machines (S/N 7882391) is intended for QE LSG and will be integrated into openQA. Ideally this machine should be mounted in PRG2-J-J11

(I am aware that mentioned rack is full at the moment, so one goal of the ticket would also be to find a solution for this)

As for the requirements of the machine, the same apply as for our P9 machines in this rack (e.g. redcurrant.oqa.prg2.suse.org). This means:

AC1 : Machine is racked and connected to power and network in an openQA-assigned rack
AC2 : the machine is reachable over ssh from openqa.suse.de over FQDN (TBD)
AC3 : the ASM of the machine is reachable from powerhmc1.oqa.suse.org
AC4 : racktables is up-to-date with the network connection details
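
AC2 and AC3 can be spot-checked with a simple TCP probe. The following is a minimal sketch, not an agreed procedure: the function names are made up for illustration, the FQDN is still TBD so any hostname passed in is a placeholder, and port 443 for the ASM web interface is an assumption. The probe only tells something useful when run from the host named in the AC (openqa.suse.de for AC2, the HMC network for AC3).

```python
import socket


def is_tcp_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts alike.
        return False


def check_acceptance(checks: dict[str, tuple[str, int]],
                     timeout: float = 5.0) -> dict[str, bool]:
    """Probe every (host, port) pair and map the check name to its result."""
    return {
        name: is_tcp_reachable(host, port, timeout)
        for name, (host, port) in checks.items()
    }
```

For example, `check_acceptance({"AC2 ssh": ("<FQDN once assigned>", 22)})` returns a dict with one boolean per check. An open port is only a necessary condition; an actual ssh login is still needed to confirm AC2.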


Related issues 2 (0 open, 2 closed)

Related to QA - action #132140: Support move of PowerPC machines to PRG2 size:M · Resolved · okurz · 2023-06-29
Copied to openQA Infrastructure - action #162443: Setup of dedicated HMC within qe.prg2.suse.org size:S · Resolved · okurz

Actions #1

Updated by mgriessmeier 3 months ago

  • Description updated (diff)
Actions #2

Updated by okurz 3 months ago

  • Tags set to infra, reactive work
  • Category set to Feature requests
  • Status changed from New to Blocked
  • Assignee set to okurz
  • Target version set to Ready
Actions #3

Updated by okurz 2 months ago · Edited

Following points discussed with mgriessmeier:

  1. Moving machines around in PRG2-J11 to pack the rack 100% sounds like a lot of effort and would only be a time-limited solution, because sooner or later we need to add or change something again, so I recommend against doing that.
  2. legolas might be moved because it’s not used in openqa.suse.de directly, same as huckleberry, soapberry, blackcurrant and cloudberry, but that would mean we need access to the HMC powerhmc1.oqa.prg2.suse.org OR a separate HMC as requested in https://jira.suse.com/browse/ENGINFRA-3764 . If an HMC can be provided in PRG2e then I see no problem with moving those machines to PRG2e. haldir should not be moved as it’s currently used in openqa.suse.de.
  3. If access to powerhmc1.oqa.prg2.suse.org can be provided or the separate HMC can be provided in PRG2e, then the easiest option for now would be to place the new Power10 machine in PRG2e. However, that would be less consistent, as then more “production” oqa.prg2.suse.org machines would be in PRG2e whereas non-oqa.prg2.suse.org machines would be in PRG2.
  4. It would be preferred to have “production” load in PRG2, meaning we need more space in PRG2. Generally we are flexible about where in PRG2 that would be.
  5. You could also put machines into the neighboring rack J12, reserved for openqa.opensuse.org, but only if there are no further restrictions from IT. For this, keep in mind that machines in J12 are in a dedicated DMZ network and so not part of the SUSE internal networks. One could either just put machines in J12 and still connect them to the switch in J11, which is a bit messy, or configure the switch in J12 to serve multiple networks, which is also a bit messy and potentially insecure if one is not careful with the configuration.
  6. Maybe there is also other space in PRG2 that can be used and is easier to connect to than PRG2e, e.g. J5, which is currently empty, or one could move the content from J10, the openSUSE rack, to have a neighboring rack for a similar purpose.

The problem of physical space is also related to #153736, which is about nessberry, which "should" go to J11, but J11 does not have enough free space.

In #132140-4 I took notes regarding the original plan to put PowerPC machines in PRG2 for now:

Meeting with jford, mcaj, horon, mgriessmeier, gpfuetzenreuter. We decided into which racks the PowerPC in PRG2 should go. There are two racks planned for LSG QE: PRG2-J11 for QE&openqa.suse.de, PRG2-J12 for openqa.opensuse.org […] no specific place was planned within the QE/openQA racks so far but we have some space available so we planned all those PowerPC machines to go to PRG2-J11 as well. If there is not enough space those machines can move to other spare racks based on consideration by Eng-Infra

Regarding that, one should also consider that at that time, 2023-07-07, there were no plans regarding PRG2e yet and NUE3 was planned to be a viable hot redundant datacenter. After those plans changed, it was decided to move more PowerPC machines to PRG2/PRG2e.

Actions #4

Updated by okurz 2 months ago

  • Parent task set to #159543
Actions #5

Updated by okurz 2 months ago

  • Related to action #132140: Support move of PowerPC machines to PRG2 size:M added
Actions #6

Updated by okurz 2 months ago

One more idea: We are currently using only 17 HE (rack units) in PRG2e in https://racktables.nue.suse.com/index.php?page=rack&rack_id=24562 . Adding an estimated 2 HE for the new Power10 gives 19 HE. Maybe the easiest solution for all would be to move those 19 HE to PRG2 if IT can provide the space.

Actions #7

Updated by livdywan about 1 month ago

okurz wrote in #note-2:

Thx, will track https://sd.suse.com/servicedesk/customer/portal/1/SD-154392

Looks like work on the HMC is being tracked in https://jira.suse.com/browse/ENGINFRA-4259 now.

Actions #9

Updated by okurz 27 days ago

I overlooked an email from 2024-05-29 with credentials. Created
https://gitlab.suse.de/openqa/password/-/merge_requests/14
accordingly for the second HMC.

Actions #10

Updated by okurz 19 days ago

https://gitlab.suse.de/openqa/password/-/merge_requests/14 now merged myself. gschlotter informed me that tomorrow huckleberry will be moved as decided in https://sd.suse.com/servicedesk/customer/portal/1/SD-154392 . To replicate the latest status from https://sd.suse.com/servicedesk/customer/portal/1/SD-154392 here:

Moroni Flores 5 days ago
Oliver's suggestion is sensible and can be actioned. We will move the machines listed from PRG2 to PRG2e and mount also the new power10 machine with them. DCOps will decide in which rack they will be mounted.

Oliver Kurz 1 week ago
Discussed with mgriessmeier. Based on suggestions by gschlotter and on the wish by IT to move all “QE but not openQA” machines to PRG2e, mgriessmeier and I suggest the following:

  1. We can move the following machines to PRG2e as they are “QE” and not “openQA”: haldir, legolas, huckleberry, soapberry, blackcurrant
  2. Not cloudberry as that machine seems to be free so we would like to repurpose it for openQA directly as openQA production worker so it’s already in the right location.
  3. nessberry is already in PRG2e and should be able to stay there as we now have the separate HMC and we are moving other QE PowerPC machines next to it.
  4. The two new Power10 machines can then also stay or be mounted in PRG2e for now, next to haldir, legolas, huckleberry, soapberry, blackcurrant

The above suggestions hold as long as we can assume that we will still be able to interconnect those machines with openqa.suse.de aka. machines in the oqa.prg2.suse.org domain.

Actions #11

Updated by okurz 12 days ago

  • Copied to action #162443: Setup of dedicated HMC within qe.prg2.suse.org size:S added
Actions #12

Updated by livdywan 4 days ago

okurz wrote in #note-10:

https://gitlab.suse.de/openqa/password/-/merge_requests/14 now merged myself. gschlotter informed me that tomorrow huckleberry will be moved as decided in https://sd.suse.com/servicedesk/customer/portal/1/SD-154392

Server is racked in J11, HMCs are connected to the TORs, power is attached, and the server is powered off. https://racktables.nue.suse.com/index.php?page=object&object_id=28056
Once legolas is moved the drawer will be racked and attached to the server/activated.

Things are happening 😸
