action #157975: Upgrade osd workers to openSUSE Leap 15.6 size:S - openQA Infrastructure (public) - openSUSE Project Management Tool


action #157975

open

openQA Project (public) - coordination #157969: [epic] Upgrade all our infrastructure, e.g. o3+osd workers+webui and production workloads, to openSUSE Leap 15.6

Upgrade osd workers to openSUSE Leap 15.6 size:S

Added by okurz 9 months ago. Updated 9 minutes ago.

Status: In Progress
Priority: High
Assignee: ybonatakis
Category: Organisational
Target version: openQA Project (public) - Ready
Start date:
Due date: 2024-12-26 (Due in 7 days)
% Done:
Estimated time:
Tags: infra

Description

Motivation¶

We need to upgrade the workers before Leap 15.5 reaches EOL and to keep a consistent environment across all machines.

Acceptance criteria¶

AC1: All osd worker machines run a cleanly upgraded openSUSE Leap 15.6 (no failed systemd services, no leftover .rpmnew files, etc.)
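The checks named in AC1 could be scripted roughly as follows. This is a minimal sketch to run on a single worker, not the team's actual procedure: all function names are hypothetical, and including .rpmsave and .rpmorig alongside .rpmnew is an assumption about what "etc." should cover.

```shell
#!/bin/sh
# Sketch only: all helper names below are hypothetical.

# Print VERSION_ID from an os-release style file (default: /etc/os-release).
# Pass a path containing a slash, since the file is sourced with ".".
os_version_id() {
    # Source in a subshell so the variables do not leak into the caller.
    ( . "${1:-/etc/os-release}" && printf '%s\n' "$VERSION_ID" )
}

# List leftover package config files below a directory (default: /etc).
list_rpm_leftovers() {
    find "${1:-/etc}" -type f \
        \( -name '*.rpmnew' -o -name '*.rpmsave' -o -name '*.rpmorig' \) \
        2>/dev/null | sort
}

# Print the number of failed systemd units on the local host.
count_failed_units() {
    systemctl --failed --no-legend --plain | grep -c . || true
}

# Combine the three checks; returns non-zero if any of them fails.
check_clean_upgrade() {
    status=0
    [ "$(os_version_id)" = "15.6" ] || { echo "not on Leap 15.6"; status=1; }
    [ "$(count_failed_units)" -eq 0 ] || { echo "failed systemd units present"; status=1; }
    [ -z "$(list_rpm_leftovers /etc)" ] || { echo "leftover rpm config files in /etc"; status=1; }
    return "$status"
}
```

Calling `check_clean_upgrade` on a worker prints one line per failed check and returns a non-zero status if any check fails.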

Acceptance tests¶

AT1-1: sudo salt -C 'G@roles:worker and not G@osrelease:15.6' test.ping matches no minions, i.e. its list of responding hosts is empty
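To turn AT1-1 into a number that can be checked, the minion IDs in the command output can be counted. The sketch below assumes the default salt output format, where each responding minion appears as "<id>:" at the start of a line followed by indented return data; the function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical.

# Count minion IDs in default salt output read from stdin.
# Assumption: an ID line starts at column 0 and ends with a colon.
count_minions() {
    grep -c '^[^[:space:]].*:$' || true
}

# Run the acceptance test on osd itself and print how many workers
# are still not on 15.6. Requires salt and sudo rights.
workers_pending_upgrade() {
    sudo salt -C 'G@roles:worker and not G@osrelease:15.6' test.ping 2>&1 | count_minions
}
```

AT1-1 passes when `workers_pending_upgrade` prints 0.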

Suggestions¶

Read https://progress.opensuse.org/projects/openqav3/wiki#Distribution-upgrades
Reserve a time slot when the workers are executing few or no openQA test jobs
Keep the IPMI interface ready and test that Serial-over-LAN works for potential recovery
Apply the workaround for #162296, i.e. zypper al -m "boo#1227616" *firewall*
Start with the non-ppc64le workers due to #169939
After the upgrade, reboot and check that everything works as expected; if not, roll back, e.g. with snapper rollback
Also consider ppc64le, but see #169939
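The suggestions above can be sketched as a dry-run helper that only prints the commands for one host. The authoritative procedure is the wiki page linked in the first suggestion; the `zypper --releasever=15.6` lines are an assumption based on the usual zypper distribution upgrade, the function name and example host names are hypothetical, and the firewall pattern is quoted here to keep the shell from expanding it.

```shell
#!/bin/sh
# Sketch only: prints a plan, does not change anything.
# print_upgrade_plan is a hypothetical helper name.

print_upgrade_plan() {
    # $1 = host name, $2 = architecture as reported by "uname -m"
    if [ "$2" = "ppc64le" ]; then
        printf 'skip %s: ppc64le is deferred, see #169939\n' "$1"
        return 0
    fi
    printf '# on %s (%s)\n' "$1" "$2"
    cat <<'EOF'
sudo zypper al -m "boo#1227616" '*firewall*'
sudo zypper --releasever=15.6 ref
sudo zypper --releasever=15.6 dup
sudo reboot
# if the host misbehaves after the reboot:
#   sudo snapper rollback && sudo reboot
EOF
}
```

For example, `print_upgrade_plan worker-a.example.org x86_64` prints the command list, while passing `ppc64le` as the architecture prints only the skip notice.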

Rollback steps¶

hostname=worker31.oqa.prg2.suse.org; ssh osd "sudo salt-key -y -a $hostname && sudo salt --state-output=changes $hostname state.apply"
ssh osd "sudo salt -C 'G@roles:worker' cmd.run 'systemctl unmask rebootmgr && systemctl enable --now rebootmgr && rebootmgrctl reboot'"
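When the host name is kept in a shell variable, it has to be set before the local shell expands the line containing the ssh command. Building the remote command in a small function avoids that ordering problem. This is a sketch with hypothetical function names; the remote commands themselves are taken unchanged from the steps above.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical.

# Print the command that accepts the salt key of a worker and
# applies the salt states on it. $1 = worker host name.
build_readd_command() {
    printf 'sudo salt-key -y -a %s && sudo salt --state-output=changes %s state.apply\n' "$1" "$1"
}

# Print the command that re-enables rebootmgr on all workers
# and triggers a reboot through it.
build_rebootmgr_command() {
    printf '%s\n' "sudo salt -C 'G@roles:worker' cmd.run 'systemctl unmask rebootmgr && systemctl enable --now rebootmgr && rebootmgrctl reboot'"
}
```

Usage would then be `ssh osd "$(build_readd_command worker31.oqa.prg2.suse.org)"` followed by `ssh osd "$(build_rebootmgr_command)"`.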

Further details¶

Don't worry, everything can be repaired :) If by any chance the worker gets misconfigured, there are btrfs snapshots to recover from and the IPMI Serial-over-LAN console, and a reinstall is possible and not hard. There is no important data on the host (it's only an openQA worker), and there are other machines that can run jobs while one host is down for a little longer. And okurz can hold your hand :)

Related issues 8 (2 open — 6 closed)

Related to openQA Tests (public) - action #162239: [s390x] test fails in bootloader_start due to slow response from z/VM hypervisor and/or changed response on "cp i cms" command [Status: Blocked; Assignee: okurz; Date: 2024-06-13]

Related to openQA Infrastructure (public) - action #162260: auto-update.service fails on various workers due to a package conflict [Status: Resolved; Assignee: okurz; Date: 2024-06-14]

Related to openQA Project (public) - action #169939: Upgrade Power8 o3 workers to openSUSE Leap 15.6 [Status: New; Date: 2024-11-14]

Related to openQA Infrastructure (public) - action #174319: https://gitlab.suse.de/openqa/salt-pillars-openqa/-/jobs/3520298#L74 fails with "File './x86_64/glibc-2.38-150600.14.17.2.x86_64.rpm' not found on medium 'http://download.opensuse.org/update/leap/15.5/sle/'" size:S [Status: Resolved; Assignee: ybonatakis; Date: 2024-12-12]

Copied from openQA Project (public) - action #130588: Upgrade osd workers to openSUSE Leap 15.5 [Status: Resolved; Assignee: okurz]

Copied to openQA Project (public) - action #162284: Prevent multi-machine tests to be picked up if os-autoinst-openvswitch service does not work size:M [Status: Resolved; Assignee: mkittler; Date: 2024-06-14]

Copied to openQA Infrastructure (public) - action #162293: SMART errors on bootup of worker31, worker32 and worker34 size:M [Status: Resolved; Assignee: nicksinger; Date: 2024-06-14]

Copied to openQA Project (public) - action #163472: Upgrade a single osd worker to openSUSE Leap 15.6 [Status: Resolved; Assignee: okurz; Date: 2024-07-08]


Updated by okurz 9 months ago

Updated by okurz 9 months ago

Updated by okurz 8 months ago

Updated by okurz 6 months ago

Updated by okurz 6 months ago

Updated by okurz 6 months ago

Updated by okurz 6 months ago

Updated by okurz 6 months ago

Updated by okurz 6 months ago

Updated by okurz 6 months ago · Edited

Updated by okurz 6 months ago

Updated by okurz 6 months ago

Updated by okurz 6 months ago

Updated by okurz 6 months ago

Updated by okurz 6 months ago

Updated by okurz 6 months ago

Updated by okurz 5 months ago

Updated by okurz 2 months ago

Updated by okurz about 1 month ago

Updated by okurz 22 days ago

Updated by okurz 22 days ago

Updated by mkittler 16 days ago

Updated by mkittler 16 days ago

Updated by dheidler 10 days ago

Updated by dheidler 10 days ago

Updated by okurz 10 days ago

Updated by ybonatakis 8 days ago

Updated by ybonatakis 8 days ago

Updated by openqa_review 7 days ago

Updated by ybonatakis 7 days ago

Updated by ybonatakis 7 days ago

Updated by ybonatakis 6 days ago

Updated by ybonatakis 6 days ago

Updated by ybonatakis 3 days ago

Updated by okurz 1 day ago

Updated by livdywan about 24 hours ago

Updated by ybonatakis about 22 hours ago

Updated by ybonatakis about 20 hours ago

Updated by okurz about 20 hours ago

Updated by ybonatakis 10 minutes ago

Updated by ybonatakis 9 minutes ago