action #157777
coordination #121720: [saga][epic] Migration to QE setup in PRG2+NUE3 while ensuring availability
coordination #123800: [epic] Provide SUSE QE Tools services running in PRG2 aka. Prg CoLo
Provide more consistent PowerPC openQA resources by migrating all novalink instances to hmc size:M
Description
Motivation
We have the machines redcurrant running with pvm_hmc and grenache running with spvm. Both do more or less the same work, but because they are separate resources in OSD, both are always scarce. We should switch grenache to pvm_hmc as well.
Acceptance criteria
- AC1: Both redcurrant+grenache run pvm_hmc OSD production jobs
- AC2: No more jobs are scheduled with spvm on OSD
- AC3: No OSD workers remain with spvm
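A minimal sketch of how AC3 could be checked, assuming the worker list has already been fetched as JSON (for example from the openQA workers API route) and that each worker entry carries its classes as a comma-separated `WORKER_CLASS` string under `properties`. The field layout, the helper name `workers_with_class_prefix`, the class names, and the sample data are assumptions for illustration and are not taken from OSD.

```python
def workers_with_class_prefix(workers, prefix="spvm"):
    """Return (host, instance) pairs for workers that still have a
    worker class starting with the given prefix."""
    hits = []
    for worker in workers:
        # Assumed layout: properties -> WORKER_CLASS as "class1,class2"
        classes = worker.get("properties", {}).get("WORKER_CLASS", "")
        names = [c.strip() for c in classes.split(",") if c.strip()]
        if any(name.startswith(prefix) for name in names):
            hits.append((worker.get("host"), worker.get("instance")))
    return hits


# Hypothetical sample data, not real OSD workers
sample = [
    {"host": "grenache-1", "instance": 1,
     "properties": {"WORKER_CLASS": "spvm_ppc_64,example_class"}},
    {"host": "redcurrant", "instance": 2,
     "properties": {"WORKER_CLASS": "hmc_ppc64le"}},
]

remaining = workers_with_class_prefix(sample)
for host, instance in remaining:
    print(f"still spvm: {host}:{instance}")
```

AC3 would be fulfilled once the returned list is empty for the real worker data.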
Suggestions
- Wait until both redcurrant+grenache are available, i.e. wait for #139112 and #139199
- Inform users and ask them to schedule only pvm_hmc jobs
- Remove the machine definition in OSD
- Switch off all grenache spvm instances
- Do
chcomgmt -o setmaster -t norm -m grenache
- Adapt settings like https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls#L1176-1180 but for grenache
- Ensure that both redcurrant+grenache happily work on pvm_hmc OSD production jobs
Updated by okurz about 1 month ago
- Copied from action #139199: Ensure OSD openQA PowerPC machine redcurrant is operational from PRG2 size:M added
Updated by okurz about 1 month ago
- Subject changed from Provide more consistent PowerPC openQA ressources by migrating all novalink instances to hmc to Provide more consistent PowerPC openQA ressources by migrating all novalink instances to hmc size:M
- Description updated (diff)
- Status changed from New to Blocked
- Assignee set to okurz
Updated by okurz about 1 month ago
- Target version changed from Tools - Next to future
Updated by michals about 1 month ago
The way to remove Novalink
- power on the machine, shutdown Novalink if automatically started
- from the HMC take management with chcomgmt
- remove the management flag on the Novalink partition
- delete all partitions and VIOS
- release management from HMC with chcomgmt
- with the management partition gone the machine management should get unlocked
- machine enters recovery state, should recover after a few minutes
- once machine recovers create and install VIOS
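The two chcomgmt steps above could be prepared as command lines like this. This is only a sketch: the `setmaster` form is the one from the ticket description, while `relmaster` as the release operation is an assumption based on the HMC command line reference and should be verified on the HMC before use. The helper name `chcomgmt_cmd` is hypothetical, and nothing is executed here, the commands are only built as strings.

```python
import shlex


def chcomgmt_cmd(operation, managed_system, master_type=None):
    """Build a chcomgmt command line for the HMC shell (not executed)."""
    if operation not in ("setmaster", "relmaster"):
        raise ValueError(f"unsupported operation: {operation}")
    args = ["chcomgmt", "-o", operation]
    if master_type is not None:
        # "-t norm" as used in the ticket description
        args += ["-t", master_type]
    args += ["-m", managed_system]
    return shlex.join(args)


# Take management from the HMC before touching the Novalink partition
take = chcomgmt_cmd("setmaster", "grenache", master_type="norm")
# Release management after all partitions are deleted (assumed operation name)
release = chcomgmt_cmd("relmaster", "grenache")

print(take)
print(release)
```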
Updated by okurz about 1 month ago
Thank you. We should follow those steps eventually when we execute this ticket. The step "once machine recovers create and install VIOS" might be the tricky one: as we encountered on redcurrant, the VIOS installer cannot reach all requested resources unless the network is temporarily switched, which needs IT support with access to the physical network switches.
Updated by michals about 1 month ago
As per SD-150999, no DHCP proxy is needed, but arbitrary traffic must be allowed between the HMC and the target system.
While some individual protocols are well understood, that is not the case for all traffic that happens during installation.
Updated by okurz 26 days ago
- Related to action #153718: Move of LSG QE non-openQA PowerPC machine NUE1 to PRG2 - haldir size:M added