action #132143
Updated by okurz over 1 year ago
## Motivation
The openQA webUI VM for o3 will move to PRG2. This will be conducted by Eng-Infra. We must support them.
## Acceptance criteria
* **AC1:** o3 is reachable from the new location for SUSE employees
* **AC2:** Same as AC1 but for community members outside SUSE
* **AC3:** Multi-machine jobs run successfully on o3 after the migration
* **AC4:** We can still log in to the machine over ssh from outside the SUSE network
* **AC5:** https://zabbix.nue.suse.com/ can still monitor o3
## Suggestions
* *DONE* Track https://jira.suse.com/browse/ENGINFRA-2347 "DMZ-OpenQA implementation" (done) so that the o3 network is available
* *DONE* Track https://jira.suse.com/browse/ENGINFRA-2155 "Install Additional links to DMZ-CORE from J12 - openQA-DMZ" (done), something about cabling
* Track https://jira.suse.com/browse/ENGINFRA-1742 "Build OpenQA Environment" for the story of the o3 VM migration
* *DONE* Inform affected users about the planned migration on 2023-07-19
* During migration work closely with Eng-Infra members conducting the actual VM migration
1. *DONE* Join Jitsi and one thread in team-qa-tools and one thread in dct-migration
2. *DONE* Wait for go-no-go meeting at 0700Z
3. Wait for mcaj to give the go from Eng-Infra side, then switch off the openQA scheduler on o3 and disable the authentication. I guess we can try to "break" the code by disabling any authenticated actions.
4. Also switch off other services like gru, scripts, investigation, etc.
5. Prepare old workers to connect over https as soon as o3 comes up again in PRG2
6. Install more new machines in PRG2 while waiting for the VM to come online
7. As soon as the VM is ready in the new location, first ensure that the webUI works in read-only mode
8. Update IP addresses on ariel where necessary in /etc/hosts, also crosscheck /etc/dnsmasq.d/openqa.conf
9. Ask Eng-Infra, mcaj, to switch off the DHCP/DNS/PXE server in the oqa dmz network
10. Try to reboot a worker from the PXE on o3
11. Enable workers to connect to o3 directly, not external https, and use testpoolserver with rsync instead
12. Enable production worker classes on new workers after tests look good
13. Connect old workers from NUE1 over https, in particular everything non-qemu-x86_64 for the time being, e.g. aarch64, ppc64le, s390x, bare-metal, until we have such machines directly in PRG2
14. Test and monitor a lot of o3 tests
15. As soon as everything looks really stable, announce it to users as a response to all the above announcements
* Ensure that o3 is reachable again after migration from the new location
  * for SUSE employees
  * for community members outside SUSE
  * for o3 workers from at least one location (NUE1 or PRG2)
* Ensure that we can still log in to the machine over ssh from outside the SUSE network
* Ensure that https://zabbix.nue.suse.com/ can still monitor o3
* Update https://progress.opensuse.org/projects/openqav3/wiki/ where necessary
* Inform users as soon as migration is complete
* Make sure we know what to watch out for in the later planned OSD VM migration
* As necessary also make sure that BuildOPS knows about caveats of migration as they plan to migrate OBS/IBS after us
* Rename /dev/vg0-new to /dev/vg0
* Ensure IPv6 is fully working
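
The reachability and IPv6 items above could be spot-checked with a small script, run once from inside and once from outside the SUSE network. This is only a sketch under assumptions: it assumes the o3 webUI is served at https://openqa.opensuse.org, that `curl` is available, and that an HTTP 200 on the start page is a good enough signal. The function names and the `O3_URL`/`CURL` variables are made up for illustration.

```shell
#!/bin/sh
# Sketch: check that the o3 webUI answers over IPv4 and IPv6 after the migration.
# Assumptions: o3 is reachable at the URL below and curl is installed.
# O3_URL and CURL are hypothetical knobs, e.g. to point at a test instance.
O3_URL="${O3_URL:-https://openqa.opensuse.org}"
CURL="${CURL:-curl}"

# Print the HTTP status code of the URL ($2) for one address family ($1: -4 or -6).
http_status() {
    "$CURL" "$1" --silent --output /dev/null --max-time 10 \
        --write-out '%{http_code}' "$2"
}

# Report the result for one address family; return non-zero on failure.
check_family() {
    status=$(http_status "$1" "$O3_URL") || status="000"
    if [ "$status" = "200" ]; then
        echo "IPv${1#-}: ok"
    else
        echo "IPv${1#-}: FAILED (HTTP $status)"
        return 1
    fi
}

# Usage: check_family -4 && check_family -6
```

A check like this only covers the webUI; ssh access and Zabbix monitoring still need to be verified separately.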
## Rollback steps
1. On o3 `systemctl unmask --now openqa-auto-update openqa-continuous-update`
2. On o3 re-enable the o3-specific nginx tmp+log paths in /etc/nginx/vhosts.d/openqa.conf
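
To confirm that rollback step 1 took effect, one could list any of the two units that `systemctl is-enabled` still reports as masked. This is a minimal sketch, not part of the agreed procedure; the `still_masked` helper and the `SYSTEMCTL` override are hypothetical names introduced here.

```shell
#!/bin/sh
# Sketch: list which of the update units from rollback step 1 are still masked.
# Assumption: SYSTEMCTL is a hypothetical override for dry runs; it defaults
# to the real systemctl.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Print the name of every given unit that is still reported as masked.
still_masked() {
    for unit in "$@"; do
        # is-enabled prints the state and exits non-zero for masked units,
        # so the exit status is ignored and only the printed state is used.
        state=$("$SYSTEMCTL" is-enabled "$unit" 2>/dev/null) || true
        case "$state" in
            masked*) echo "$unit" ;;
        esac
    done
}

# Usage on o3 after the rollback; no output means nothing is masked anymore:
# still_masked openqa-auto-update openqa-continuous-update
```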