action #159231

closed

coordination #121720: [saga][epic] Migration to QE setup in PRG2+NUE3 while ensuring availability

coordination #123800: [epic] Provide SUSE QE Tools services running in PRG2 aka. Prg CoLo

Bring back worker class "hmc_ppc64le-4disk" on redcurrant or another machine size:M

Added by okurz 8 months ago. Updated 7 months ago.

Status: Resolved
Priority: High
Assignee:
Start date:
Due date:
% Done: 0%
Estimated time:
Description

Motivation

Most PowerPC machines are being set up in PRG2 within #132140 and are at least discoverable from the HMC. Now we can set up redcurrant as a production openQA PowerVM worker in OSD again.

https://suse.slack.com/archives/C02CANHLANP/p1713437618501069

Hello, is there any plan/ticket to bring back the ppc64le HMC worker with 4 disks?
https://openqa.suse.de/tests/overview?state=assigned&state=setup&state=running&state=uploading&state=scheduled&distri=sle&version=15-SP6&build=80.1&groupid=129

Acceptance criteria

Suggestions

Actions #2

Updated by okurz 8 months ago

  • Subject changed from Bring back "hmc_ppc64le-4disk-poo139199" to Bring back worker class "hmc_ppc64le-4disk" on redcurrant or another machine
Actions #3

Updated by nicksinger 8 months ago

  • Subject changed from Bring back worker class "hmc_ppc64le-4disk" on redcurrant or another machine to Bring back worker class "hmc_ppc64le-4disk" on redcurrant or another machine size:M
  • Description updated (diff)
  • Status changed from New to Workable
Actions #4

Updated by okurz 8 months ago

  • Description updated (diff)
Actions #5

Updated by okurz 7 months ago

  • Priority changed from Normal to High
Actions #6

Updated by ybonatakis 7 months ago

  • Status changed from Workable to In Progress
  • Assignee set to ybonatakis
Actions #7

Updated by ybonatakis 7 months ago

  • Status changed from In Progress to Workable
  • Assignee deleted (ybonatakis)

I tried to run some jobs. The ones that ran were assigned to worker29, but they never booted.
I also can't SSH into redcurrant-1.oqa.prg2.suse.org.

From the other ticket, my understanding is that I have to add three disks to the machine, but I am really not sure how.

Actions #9

Updated by nicksinger 7 months ago

  • Assignee set to nicksinger
Actions #10

Updated by nicksinger 7 months ago

  • Status changed from Workable to Feedback

So I added 3 additional disks of 20 GB each to redcurrant-1 and redcurrant-2 via the HMC ( https://10.145.14.33 ). Then I tried to clone https://openqa.suse.de/tests/11163027 with a more recent build and SP. I downloaded the vars.json of a newer job and used:

cat vars.json | grep -v "\"BUILD\"" | grep -E "SP5|102\\.1" | tr -d "\"" | tr -d "," | tr ":" "=" | tr -d " " | tr "\n" " " | sed "s/102\\.1/90\\.1/g" | sed "s/-SP5/-SP6/g" | sed "s/http=/http:/g" | sed "s/ftp=/ftp:/g" | sed "s/https=/https:/g" | sed "s/nfs=/nfs:/g" | sed "s/smb=/smb:/g"

which is very hacky but produced updated variables needed to run a job. I also had to update SCC-codes and then appended the output of the above and SCC-codes to my clone-job call:

openqa-clone-job --within-instance https://openqa.suse.de/tests/11163027 YAML_SCHEDULE=schedule/yast/raid/raid0_sle_gpt_prep_boot_pvm.yaml YAML_SCHEDULE_DEFAULT=schedule/yast/sle/flows/default_ppc64le.yaml _GROUP=0 WORKER_CLASS=redcurrant-2 {TEST,BUILD}+=_poo139199 SCC_REGCODE=[…] [output of adjusted vars.json]

which produced https://openqa.suse.de/tests/14516901 and proves that the setup works as it did before the VIOS reinstallation. I created https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/828 to enable the class in production again.
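
For anyone repeating this, the vars.json rewrite could probably be done less hackily by parsing the JSON instead of stripping quotes, commas and spaces with tr. Below is a minimal sketch, not what was actually run. It assumes vars.json is a flat JSON object of settings. The function names vars_to_settings and to_shell_args are made up for illustration. Unlike the pipeline, it applies the replacements to values only and leaves URLs and spaces inside values intact.

```python
import json
import shlex


def vars_to_settings(vars_json, old_build="102.1", new_build="90.1",
                     old_sp="SP5", new_sp="SP6"):
    """Return KEY=VALUE strings for settings that mention the old build or SP.

    Assumption: vars_json is the text of a flat JSON object (key -> scalar).
    """
    settings = []
    for key, value in json.loads(vars_json).items():
        if key == "BUILD":
            # BUILD is passed separately on the clone-job command line
            continue
        text = str(value)
        # Keep only settings that reference the old build or service pack
        if not any(needle in key or needle in text
                   for needle in (old_build, old_sp)):
            continue
        # Rewrite the value to point at the new build and service pack
        text = text.replace(old_build, new_build)
        text = text.replace(f"-{old_sp}", f"-{new_sp}")
        settings.append(f"{key}={text}")
    return settings


def to_shell_args(settings):
    """Quote the settings so they can be pasted after an openqa-clone-job call."""
    return shlex.join(settings)
```

Calling to_shell_args(vars_to_settings(open("vars.json").read())) would then give a string to append to the clone-job call. SCC codes still have to be updated by hand as described above.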

Actions #11

Updated by okurz 7 months ago

https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/828 merged. Technically both ACs are covered, but since production tests had to be disabled for those worker classes for such a long time, please ensure that there are now production tests using this worker class again before resolving.

Actions #12

Updated by nicksinger 7 months ago

  • Status changed from Feedback to Resolved

I asked our testing squads to enable tests again and Joaquin followed up: https://suse.slack.com/archives/C02CANHLANP/p1717564405852019?thread_ts=1713437618.501069&cid=C02CANHLANP
There doesn't seem to be much interest otherwise, but we now have some production jobs again: https://openqa.suse.de/tests/14525632#dependencies
