action #130588
closed coordination #130582: [epic] Upgrade all our infrastructure, e.g. o3+osd workers+webui and production workloads, to openSUSE Leap 15.5
Upgrade osd workers to openSUSE Leap 15.5
% Done: 0%
Description
Motivation
- Need to upgrade workers before EOL of Leap 15.4 and have a consistent environment
Acceptance criteria
- AC1: all osd worker machines run a clean upgraded openSUSE Leap 15.5 (no failed systemd services, no leftover .rpmnew files, etc.)
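A minimal sketch of how AC1 could be checked on a single worker follows. The function names are hypothetical, and it assumes that leftover config files carry rpm's usual .rpmnew/.rpmsave suffixes and live under /etc.

```shell
#!/bin/sh
# Hypothetical helpers for checking AC1 on one worker; names are illustrative.

# Print leftover rpm config files (.rpmnew/.rpmsave) below the given directory.
list_rpm_leftovers() {
    find "${1:-/etc}" -type f \( -name '*.rpmnew' -o -name '*.rpmsave' \) 2>/dev/null | sort
}

# Print failed systemd units, one per line; empty output means none failed.
list_failed_units() {
    systemctl --failed --no-legend --plain
}

# Succeed only if both checks come back empty.
check_clean_upgrade() {
    [ -z "$(list_failed_units)" ] && [ -z "$(list_rpm_leftovers /etc)" ]
}
```

Calling check_clean_upgrade on a worker after the reboot would then return success only for a clean host; anything listed by the two helpers needs a manual look.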
Acceptance tests
- AT1-1:
sudo salt -C 'G@roles:worker and not G@osrelease:15.5' test.ping
is empty
Suggestions
- read https://progress.opensuse.org/projects/openqav3/wiki#Distribution-upgrades
- Reserve some time when the workers are only executing a few or no openQA test jobs
- Keep IPMI interface ready and test that Serial-over-LAN works for potential recovery
- After the upgrade, reboot and check that everything works as expected; if not, roll back, e.g. with
snapper rollback
Further details
- Don't worry, everything can be repaired :) If by any chance a worker gets misconfigured, there are btrfs snapshots to recover from and IPMI Serial-over-LAN access, a reinstall is possible and not hard, there is no important data on the host (it's only an openQA worker), and other machines can run jobs while one host is down for a little bit longer. And okurz can hold your hand :)
Updated by okurz over 1 year ago
- Copied from action #111866: Upgrade osd workers and openqa-monitor to openSUSE Leap 15.4 added
Updated by okurz over 1 year ago
- Subject changed from Upgrade osd workers and openqa-monitor to openSUSE Leap 15.4 to Upgrade osd workers and openqa-monitor to openSUSE Leap 15.5
- Category set to Organisational
- Assignee deleted (okurz)
- Target version changed from Ready to future
Updated by okurz over 1 year ago
- Subject changed from Upgrade osd workers and openqa-monitor to openSUSE Leap 15.5 to Upgrade osd workers to openSUSE Leap 15.5
- Description updated (diff)
Updated by okurz over 1 year ago
- Copied to action #130648: Upgrade all other LSG QE salt controlled machines to openSUSE Leap 15.5 added
Updated by okurz over 1 year ago
- Target version changed from future to Tools - Next
Updated by okurz about 1 year ago
- Status changed from New to Blocked
- Assignee set to okurz
- Target version changed from Tools - Next to Ready
I suggest waiting for #130648 and its referenced MRs first.
Updated by okurz about 1 year ago
- Status changed from Blocked to In Progress
#130648 resolved. I can continue with upgrading.
# salt --no-color -C 'not G@osrelease:15.5 and G@roles:worker' test.true
openqaworker1.qe.nue2.suse.org:
True
openqaworker16.qa.suse.cz:
True
openqaworker18.qa.suse.cz:
True
openqaworker17.qa.suse.cz:
True
openqaworker14.qa.suse.cz:
True
petrol.qe.nue2.suse.org:
True
diesel.qe.nue2.suse.org:
True
First checking
salt --no-color -C 'not G@osrelease:15.5 and G@roles:worker' cmd.run 'export new_version=15.5; zypper --releasever=$new_version --gpg-auto-import-keys ref && zypper -n --releasever=$new_version dup --auto-agree-with-licenses --replacefiles --download-only --details --dry-run' | grep -av 'Result: Clean'
our ppc64le machines show the same problem that I saw on kerosene:
diesel.qe.nue2.suse.org:
Warning: Enforced setting: $releasever=15.5
Repository 'NPI' is up to date.
Repository 'SUSE_CA' is up to date.
Repository 'devel_openQA' is up to date.
Repository 'devel_openQA_Modules' is up to date.
Repository 'Update repository of openSUSE Backports' is up to date.
Retrieving repository 'openSUSE-Leap-15.5-Oss' metadata [.error]
Repository 'openSUSE-Leap-15.5-Oss' is invalid.
[repo-oss|http://download.opensuse.org/ports/ppc/distribution/leap/15.5/repo/oss/] Valid metadata not found at specified URL
History:
- [repo-oss|http://download.opensuse.org/ports/ppc/distribution/leap/15.5/repo/oss/] Repository type can't be determined.
Please check if the URIs defined for this repository are pointing to a valid repository.
Skipping repository 'openSUSE-Leap-15.5-Oss' because of the above error.
Repository 'Update repository with updates from SUSE Linux Enterprise 15' is up to date.
Repository 'openSUSE-Leap-15.5-Update' is up to date.
Repository 'openSUSE-Leap-15.5-Update-Non-Oss' is up to date.
Repository 'Server Monitoring Software' is up to date.
Some of the repositories have not been refreshed because of an error.
petrol.qe.nue2.suse.org:
Warning: Enforced setting: $releasever=15.5
Repository 'NPI' is up to date.
Repository 'SUSE_CA' is up to date.
Repository 'devel_openQA' is up to date.
Repository 'devel_openQA_Modules' is up to date.
Repository 'kernel-stable-backport' is up to date.
Repository 'Update repository of openSUSE Backports' is up to date.
Retrieving repository 'openSUSE-Leap-15.5-Oss' metadata [.error]
Repository 'openSUSE-Leap-15.5-Oss' is invalid.
[repo-oss|http://download.opensuse.org/ports/ppc/distribution/leap/15.5/repo/oss/] Valid metadata not found at specified URL
History:
- [repo-oss|http://download.opensuse.org/ports/ppc/distribution/leap/15.5/repo/oss/] Repository type can't be determined.
Please check if the URIs defined for this repository are pointing to a valid repository.
Skipping repository 'openSUSE-Leap-15.5-Oss' because of the above error.
Repository 'Update repository with updates from SUSE Linux Enterprise 15' is up to date.
Repository 'openSUSE-Leap-15.5-Update' is up to date.
Repository 'openSUSE-Leap-15.5-Update-Non-Oss' is up to date.
Repository 'Server Monitoring Software' is up to date.
Some of the repositories have not been refreshed because of an error.
so let's handle x86_64 first.
salt --no-color -C 'not G@osrelease:15.5 and G@roles:worker and G@osarch:x86_64' cmd.run 'export new_version=15.5; zypper --releasever=$new_version --gpg-auto-import-keys ref && systemctl stop openqa-continuous-update.timer telegraf $(systemctl list-units | grep openqa-worker-auto-restart | cut -d . -f 1 | xargs) && zypper -n --releasever=$new_version dup --auto-agree-with-licenses --replacefiles --download-in-advance' | grep -av 'Result: Clean'
# salt --no-color -L 'openqaworker14.qa.suse.cz,openqaworker17.qa.suse.cz,openqaworker16.qa.suse.cz,openqaworker18.qa.suse.cz' cmd.run 'systemctl is-system-running'
openqaworker17.qa.suse.cz:
running
openqaworker16.qa.suse.cz:
running
openqaworker18.qa.suse.cz:
running
openqaworker14.qa.suse.cz:
running
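The x86_64 command above stops the worker units by building their names from the systemctl list-units output. A small sketch of just that pipeline, run against made-up sample lines rather than a live system, shows what ends up on the systemctl stop command line. The function name worker_units and the sample data are assumptions for illustration.

```shell
#!/bin/sh
# Reduce `systemctl list-units`-style lines on stdin to a space-separated
# list of openqa-worker-auto-restart unit names without the ".service" suffix.
worker_units() {
    grep openqa-worker-auto-restart | cut -d . -f 1 | xargs
}

# Sample input modelled on `systemctl list-units` output (made up for illustration).
sample='  openqa-worker-auto-restart@1.service loaded active running worker 1
  openqa-worker-auto-restart@2.service loaded active running worker 2
  telegraf.service loaded active running telegraf'

printf '%s\n' "$sample" | worker_units
# prints: openqa-worker-auto-restart@1 openqa-worker-auto-restart@2
```

cut keeps everything before the first dot, so the suffix and the remaining columns are dropped, and xargs joins the names onto one line. systemctl treats a unit name without a suffix as a .service unit.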
and for the remaining ppc64le machines petrol+diesel
salt --no-color -C 'not G@osrelease:15.5 and G@roles:worker' cmd.run 'export new_version=15.5 && sed -i "s@baseurl=http://download.opensuse.org/ports/ppc/distribution/leap/\$releasever/repo/oss/@baseurl=http://download.opensuse.org/distribution/leap/\$releasever/repo/oss/@" /etc/zypp/repos.d/repo-oss.repo && zypper --releasever=$new_version --gpg-auto-import-keys ref && systemctl stop telegraf $(systemctl list-units | grep openqa-worker-auto-restart | cut -d . -f 1 | xargs) && zypper -n --releasever=$new_version dup --auto-agree-with-licenses --replacefiles --download-in-advance' | grep -av 'Result: Clean'
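The sed expression in the command above swaps the ppc "ports" oss path for the main distribution path. As a hedged sketch, the same substitution can be tried as a plain filter on sample text before applying it in place with sed -i. The function name rewrite_oss_baseurl and the sample repo lines are assumptions, not taken from the hosts.

```shell
#!/bin/sh
# Filter version of the baseurl substitution: reads repo file content on stdin
# and writes the rewritten content to stdout. The literal "$releasever"
# placeholder is left untouched so zypper can still expand it.
rewrite_oss_baseurl() {
    sed 's@baseurl=http://download.opensuse.org/ports/ppc/distribution/leap/$releasever/repo/oss/@baseurl=http://download.opensuse.org/distribution/leap/$releasever/repo/oss/@'
}

# Made-up sample of a repo file, for illustration only.
printf '%s\n' \
    '[repo-oss]' \
    'enabled=1' \
    'baseurl=http://download.opensuse.org/ports/ppc/distribution/leap/$releasever/repo/oss/' \
    | rewrite_oss_baseurl
```

Lines that do not contain the ports URL pass through unchanged, so the filter can be compared against the original file with diff before committing to the in-place edit.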
Updated by okurz about 1 year ago
- Related to action #138038: diesel+petrol missing network, IPMI still reachable added
Updated by okurz 9 months ago
- Copied to action #157975: Upgrade osd workers to openSUSE Leap 15.6 size:S added