action #153718
coordination #121720: [saga][epic] Migration to QE setup in PRG2+NUE3 while ensuring availability
coordination #123800: [epic] Provide SUSE QE Tools services running in PRG2 aka. Prg CoLo
coordination #137630: [epic] QE (non-openQA) setup in PRG2
Move of LSG QE non-openQA PowerPC machine NUE1 to PRG2 - haldir size:M
0%
Description
Acceptance criteria
- AC1: haldir is usable from PRG2
- AC2: https://racktables.nue.suse.com/index.php?page=object&tab=default&object_id=15308 is up-to-date
Suggestions
- DONE Follow https://jira.suse.com/browse/ENGINFRA-3744
- Ensure machine can be reached
- DONE Ensure machine is used as before the migration -> Apparently nobody claims to need this machine
- Let's use the machine as openQA worker like redcurrant using the pvm_hmc backend
- Add network configuration, e.g. based on #139199-22
- Add partitions in https://powerhmc1.oqa.prg2.suse.org/
- Add corresponding DHCP/DNS entries in https://gitlab.suse.de/OPS-Service/salt/
- Add corresponding config to https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls
- Verify
- Update https://racktables.nue.suse.com/index.php?page=object&tab=default&object_id=15308
Updated by okurz 3 months ago
- Copied from action #153715: Move of LSG QE non-openQA PowerPC machine NUE1 to PRG2 - whale added
Updated by okurz 3 months ago
- Copied to action #153721: Move of LSG QE non-openQA PowerPC machine NUE1 to PRG2 - legolas added
Updated by okurz about 1 month ago
- Due date set to 2024-04-09
- Status changed from Blocked to Feedback
- Target version changed from future to Ready
https://jira.suse.com/browse/ENGINFRA-3744 was set to "Done" but I have no full confirmation that the machine is usable as before.
@dawei_pang can you confirm that haldir.qe.prg2.suse.org is fully usable? The machine is within the HMC https://powerhmc1.oqa.prg2.suse.org/ and you have access there so you could use that access.
Updated by dawei_pang about 1 month ago
Hello Oliver, haldir is not assigned to my squad.
I am not sure if I can check the system or modify any configuration, thanks!
Updated by okurz about 1 month ago
Asked in
https://suse.slack.com/archives/C02CANHLANP/p1711522254790089
@channel apparently nobody claims the PowerPC machine "haldir" for use so unless there are objections in the next days I will plan to use the machine as openQA worker then.
Updated by okurz 24 days ago
- Related to action #139199: Ensure OSD openQA PowerPC machine redcurrant is operational from PRG2 size:M added
Updated by okurz 24 days ago
- Related to action #157777: Provide more consistent PowerPC openQA ressources by migrating all novalink instances to hmc size:M added
Updated by okurz 21 days ago
- Status changed from Workable to Feedback
- Assignee set to okurz
Will need to clarify with acarvajal who mentioned haldir on https://confluence.suse.com/display/qasle/QE-SAP+Power9+Infrastructure
Updated by okurz 18 days ago
- Due date set to 2024-04-23
https://suse.slack.com/archives/C02CANHLANP/p1712665391961199
@Alvaro Carvajal in https://confluence.suse.com/display/qasle/QE-SAP+Power9+Infrastructure you mentioned haldir but previously when I asked I got the answer that QE-SAP doesn't use haldir. So should we prepare haldir as a generic openQA PowerVM host or do you use haldir within QE-SAP? Context: https://progress.opensuse.org/issues/153718
Updated by okurz 18 days ago
- Due date deleted (2024-04-23)
- Status changed from Feedback to Workable
- Assignee deleted (okurz)
Got confirmation from acarvajal that haldir is free
yes. that confluence page is WIP and the haldir part was taken (copy & paste) from the old document at https://gitlab.suse.de/hsehic/qa-css-docs/-/blob/master/infrastructure/power9-configuration.md. I asked internally and haldir should be free
Updated by nicksinger 16 days ago
- Status changed from Workable to In Progress
Recovered the password for the VIOS padmin user and reset it to the default. Created https://gitlab.suse.de/OPS-Service/salt/-/merge_requests/4976 to give the VIOS a basic working network config. Alvaro made me aware of https://sd.suse.com/servicedesk/customer/portal/1/SD-153996. Next I will try to set up a working RMC connection to the HMC to be able to configure a virtual network+disk for the LPARs and connect them with each other.
Updated by nicksinger 16 days ago
- Status changed from In Progress to Blocked
Managed to figure out which interface is connected and statically configured haldir-vios4, which can now be reached:
workstation pillar/domain ‹add_haldir› » ping 10.145.0.103
PING 10.145.0.103 (10.145.0.103) 56(84) bytes of data.
64 bytes from 10.145.0.103: icmp_seq=1 ttl=253 time=218 ms
^C
--- 10.145.0.103 ping statistics ---
2 packets transmitted, 1 received, 50% packet loss, time 1001ms
rtt min/avg/max/mdev = 217.517/217.517/217.517/0.000 ms
I now hit the same issue as described in https://sd.suse.com/servicedesk/customer/portal/1/SD-153996 and will follow that one. But currently we can't do anything more from our side.
Updated by nicksinger 15 days ago
- Status changed from Blocked to Workable
SD ticket resolved and the RMC connection works now. I will continue with a network and disk configuration before defining the SUT LPARs.
Updated by openqa_review 10 days ago
- Due date set to 2024-05-01
Setting due date based on mean cycle time of SUSE QE Tools
Updated by nicksinger 9 days ago
LPAR network configured, disks created, LPARs created, and disks+network attached to each LPAR. Created https://gitlab.suse.de/OPS-Service/salt/-/merge_requests/5009 to give them all a corresponding DHCP+DNS config. After the MR is merged and live we need to validate that the LPARs can reach the PXE server and boot from it. If so, we add them to our workerconf and let them run production jobs.
Updated by nicksinger 8 days ago
- Status changed from In Progress to Feedback
Created https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/782 which can already be merged but can only be tested after the OPS request is merged as well.
Updated by nicksinger 5 days ago
- Status changed from In Progress to Feedback
Started to validate the instances but realized that the OPS request is still not merged. Asking in #dct-migration.
Also, for later reference, my command to create the validation jobs:
for i in {1..10}; do echo openqa-clone-job --within-instance https://openqa.suse.de/tests/14092616 --skip-chained-deps --skip-download TEST+=-poo153718#${i} BUILD=nsinger_validate_poo153718 _GROUP=0 WORKER_CLASS=hmc_ppc64le_poo153718; done
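Note that because of the leading echo, the loop above only prints the ten clone commands and does not schedule anything. A minimal sketch of the same dry run as a reusable bash function, assuming nothing beyond the flags and settings already used above; the function name gen_validation_cmds and its three positional parameters are hypothetical and only there for illustration:

```shell
#!/bin/bash
# Hypothetical helper: print one openqa-clone-job command per validation job.
# This is a dry run, nothing is scheduled. To actually clone the jobs,
# remove the "echo" inside the loop (or pipe the printed lines into a shell).
gen_validation_cmds() {
    local count="$1" ticket="$2" job_url="$3"
    local i
    for ((i = 1; i <= count; i++)); do
        # Quoting keeps "#${i}" part of the TEST suffix in every context.
        echo openqa-clone-job --within-instance "$job_url" \
            --skip-chained-deps --skip-download \
            "TEST+=-poo${ticket}#${i}" \
            "BUILD=nsinger_validate_poo${ticket}" \
            _GROUP=0 "WORKER_CLASS=hmc_ppc64le_poo${ticket}"
    done
}

# Same values as in the one-liner above.
gen_validation_cmds 10 153718 https://openqa.suse.de/tests/14092616
```

The distinct TEST suffix per iteration keeps the ten jobs apart in the build overview, and the ticket-specific WORKER_CLASS restricts them to workers that advertise that class.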
Updated by nicksinger 5 days ago
MR merged, validation jobs running in https://openqa.suse.de/tests/overview?build=nsinger_validate_poo153718 and production MR prepared in https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/790
Updated by nicksinger 5 days ago
- Status changed from Feedback to Workable
All jobs reached PXE and even loaded kernel+initrd, but then silently fail and fall back into grub. Not sure yet what causes this, have to investigate.
Updated by nicksinger about 21 hours ago
- Status changed from Workable to In Progress