action #157975
Updated by okurz 22 days ago
## Motivation
* Need to upgrade workers before EOL of Leap 15.5 and have a consistent environment
## Acceptance criteria
* **AC1:** All osd worker machines run a cleanly upgraded openSUSE Leap 15.6 (no failed systemd services, no leftover `.rpmnew` files, etc.)
## Acceptance tests
* **AT1-1:** `sudo salt -C 'G@roles:worker and not G@osrelease:15.6' test.ping` is empty
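
The "clean" part of AC1 could be spot-checked on a single worker with something like the following. This is only a minimal sketch: the function names are hypothetical, and it assumes leftover files carry RPM's usual `.rpmnew`, `.rpmsave` or `.rpmorig` suffixes and live under `/etc`.

```shell
#!/bin/sh
# Sketch of a per-host check for AC1. All function names are hypothetical.

# List config files left behind by RPM (assumed suffixes: .rpmnew, .rpmsave, .rpmorig).
# Usage: leftover_rpm_files [DIR]   (DIR defaults to /etc)
leftover_rpm_files() {
    find "${1:-/etc}" \( -name '*.rpmnew' -o -name '*.rpmsave' -o -name '*.rpmorig' \) \
        -print 2>/dev/null | sort
}

# List failed systemd units, one per line, without header or footer.
failed_units() {
    systemctl list-units --state=failed --no-legend --plain --no-pager
}

# Return 0 only if there are neither leftover files nor failed units.
check_host_clean() {
    leftovers="$(leftover_rpm_files "${1:-/etc}")"
    failed="$(failed_units)"
    if [ -z "$leftovers" ] && [ -z "$failed" ]; then
        return 0
    fi
    [ -n "$leftovers" ] && printf 'leftover RPM config files:\n%s\n' "$leftovers"
    [ -n "$failed" ] && printf 'failed units:\n%s\n' "$failed"
    return 1
}
```

Calling `check_host_clean` on a worker prints whatever still needs attention and returns non-zero in that case. The same commands could probably also be pushed to all workers from osd via `salt ... cmd.run`, in the style of AT1-1.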
## Suggestions
* Read https://progress.opensuse.org/projects/openqav3/wiki#Distribution-upgrades
* Reserve some time when the workers are only executing a few or no openQA test jobs
* Keep IPMI interface ready and test that Serial-over-LAN works for potential recovery
* Apply the workaround for #162296, i.e. `zypper al -m "boo#1227616" *firewall*`
* Start with non-ppc64le due to #169939
* After the upgrade, reboot and check that everything works as expected; if not, roll back, e.g. with `snapper rollback`
* Consider also ppc64le but see #169939
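
The per-host sequence from the suggestions above might look roughly like this. It is a hedged sketch, not the procedure from the wiki: the `run` helper, the function names and the `TARGET_RELEASE`/`DRY_RUN` variables are hypothetical, and the `zypper --releasever=... dup` invocation is an assumption that only works if the repositories are defined using `$releasever`. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: prints the commands by default; set DRY_RUN=0 to really execute them.
# TARGET_RELEASE and the zypper dup invocation are assumptions, not taken from the wiki.
TARGET_RELEASE="${TARGET_RELEASE:-15.6}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

upgrade_worker() {
    # Workaround for #162296 as given in the suggestions;
    # the pattern is quoted so the shell does not expand it.
    run zypper al -m "boo#1227616" '*firewall*'
    # Assumption: repositories use $releasever, so switching the release is enough.
    run zypper --releasever="$TARGET_RELEASE" ref
    run zypper --releasever="$TARGET_RELEASE" dup
    run systemctl reboot
}

rollback_worker() {
    # Fall back to the pre-upgrade btrfs snapshot, then reboot into it.
    run snapper rollback
    run systemctl reboot
}
```

Running `upgrade_worker` with the default `DRY_RUN=1` lists the four steps without touching the system, which makes it easy to compare them against the wiki before executing anything for real.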
## Rollback steps
* `hostname=worker31.oqa.prg2.suse.org ssh osd "sudo salt-key -y -a $hostname && sudo salt --state-output=changes $hostname state.apply"`
* `ssh osd "sudo salt -C 'G@roles:worker' cmd.run 'systemctl unmask rebootmgr && systemctl enable --now rebootmgr && rebootmgrctl reboot'"`
## Further details
* Don't worry, everything can be repaired :) If by any chance the worker gets misconfigured, there are btrfs snapshots to recover from and the IPMI Serial-over-LAN for access. A reinstall is possible and not hard, there is no important data on the host (it's only an openQA worker), and there are other machines that can run jobs while one host is down for a little bit longer. And okurz can hold your hand :)