action #160089

closed

coordination #157969: [epic] Upgrade all our infrastructure, e.g. o3+osd workers+webui and production workloads, to openSUSE Leap 15.6

Handle uncommented package lock on "kernel-default" and "kernel-default-base" on openqa-piworker

Added by okurz 7 months ago. Updated 7 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Regressions/Crashes
Target version:
Start date:
2024-05-08
Due date:
% Done:

0%

Estimated time:
Tags:

Description

Motivation

While upgrading machines from Leap 15.5->15.6 I encountered package locks for "kernel-default" and "kernel-default-base" on openqa-piworker, which I left in place while conducting the upgrade. Neither package lock has a comment, so it's not clear why we have them.

# zypper ll

# | Name                | Type    | Repository | Comment
--+---------------------+---------+------------+--------
1 | kernel-default      | package | (any)      | 
2 | kernel-default-base | package | (any)      | 

Acceptance criteria

  • AC1: There are no uncommented package locks on openqa-piworker
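
One way AC1 could be checked is to filter the lock table for rows with an empty Comment column. The following is a minimal sketch, not something taken from the ticket: the function name `uncommented_locks` is hypothetical, and it assumes the five-column `zypper ll` layout shown above (number, Name, Type, Repository, Comment).

```shell
# Hypothetical helper: print the names of locks whose Comment column is empty.
# Reads `zypper ll`-style table text on stdin; assumes the 5-column layout above.
uncommented_locks() {
    awk -F'|' '
        # Only data rows have a plain lock number in the first column;
        # the header row ("#") and the separator row are skipped.
        $1 ~ /^[[:space:]]*[0-9]+[[:space:]]*$/ {
            name = $2
            comment = $5
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", name)
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", comment)
            if (comment == "") print name
        }
    '
}
```

Used as `zypper ll | uncommented_locks`, any output would list locks that still lack a comment; empty output would suggest AC1 is met.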

Suggestions

  • Ask around and look in old tickets why we have those package locks
  • As necessary, remove the package locks and ensure that we run an upgraded kernel without problems, or add a comment with a bug or ticket reference explaining why the package lock is in place and what would be necessary to remove it again

Related issues 2 (0 open, 2 closed)

Related to openQA Infrastructure (public) - action #132860: openqa-piworker is unstable and needs regular power-cycles size:M (Resolved, dheidler, 2023-07-17, 2024-02-27)
Copied from openQA Project (public) - action #157996: Upgrade all other LSG QE salt controlled machines to openSUSE Leap 15.6 (Resolved, okurz)

Actions #1

Updated by okurz 7 months ago

  • Copied from action #157996: Upgrade all other LSG QE salt controlled machines to openSUSE Leap 15.6 added
Actions #2

Updated by jbaier_cz 7 months ago

  • Assignee set to jbaier_cz
Actions #3

Updated by jbaier_cz 7 months ago

  • Related to action #132860: openqa-piworker is unstable and needs regular power-cycles size:M added
Actions #4

Updated by jbaier_cz 7 months ago

  • Status changed from New to Resolved

Comment #132860#note-32 suggests that the locks should be there in the first place (this is probably a "race condition" with https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/718).

The locks are probably safe to remove: https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/806

In the meantime:

# zypper ll

# | Name                | Type    | Repository | Comment
--+---------------------+---------+------------+-----------
1 | kernel-default      | package | (any)      | poo#132860
2 | kernel-default-base | package | (any)      | poo#132860
Actions #5

Updated by okurz 7 months ago

  • Due date set to 2024-05-27
  • Status changed from Resolved to Feedback

Let's please be explicit about the kernel upgrade then. https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/806 was removed but so far openqa-piworker still runs the old kernel. I would like to have an explicit open ticket in case there are problems with the new kernel or if the system does not survive the next reboot. If openQA jobs are executed fine after rebooting into the new kernel then everything should be good and you can resolve.

Actions #6

Updated by livdywan 7 months ago

$ zypper ll | grep poo
1 | kernel-default      | package | (any)      | poo#132860                       
2 | kernel-default-base | package | (any)      | poo#132860

$ systemctl cat auto-update | grep Exec                
ExecStart=/bin/sh -c 'zypper -n --non-interactive-include-reboot-patches patch --replacefiles --auto-agree-with-licenses --download-in-advance && needs-restarting --reboothint >/dev/null || (command -v rebootmgrctl >/dev/null && rebootmgrctl reboot ||:)'

I took a quick look as I was surprised the kernel wasn't replaced. It looks like it's not running auto-update.sh and the auto-update timer is inactive. Some inconsistency?

Actions #7

Updated by okurz 7 months ago

because openqa-piworker is not an openQA worker the service that should be running is auto-upgrade, not auto-update. And I checked that auto-upgrade.timer is enabled and last ran successfully today in the morning.

Actions #8

Updated by livdywan 7 months ago

okurz wrote in #note-7:

because openqa-piworker is not an openQA worker the service that should be running is auto-upgrade, not auto-update. And I checked that auto-upgrade.timer is enabled and last ran successfully today in the morning.

Right. I guess I was wrong to assume we manage locks on that machine, so all (not) working as expected.

Actions #9

Updated by okurz 7 months ago

So openqa-piworker has the new kernel installed but hasn't rebooted yet. @jbaier_cz please either explicitly reboot or just monitor what happens after the next automatic reboot. sudo rebootmgrctl status says

Status: Reboot requested, waiting for maintenance window

which would be next Sunday.
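
To confirm after the reboot that the machine actually runs the newly installed kernel, the running release could be compared with the newest installed one. This is only a sketch under assumptions not stated in the ticket: the function name `runs_newest_kernel` is hypothetical, the directory names under /lib/modules are taken as the list of installed kernels, and GNU `sort -V` is assumed to be available for version ordering.

```shell
# Hypothetical helper: succeed if the release given as $1 is the highest
# version among the kernel releases listed on stdin (one per line).
# Assumes GNU sort with -V (version sort).
runs_newest_kernel() {
    running=$1
    newest=$(sort -V | tail -n 1)
    [ "$running" = "$newest" ]
}
```

A possible invocation would be `ls /lib/modules | runs_newest_kernel "$(uname -r)" && echo "running newest kernel"`. Leftover module directories of removed kernels could skew the result, so it is a hint rather than proof.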

Actions #10

Updated by jbaier_cz 7 months ago

Yeah, I also checked the machine this morning and saw that the kernel was successfully installed after I removed the locks manually yesterday. As the reboot window is near, I will leave it alone for the weekend and we will see on Monday.

Actions #11

Updated by jbaier_cz 7 months ago

  • Status changed from Feedback to Resolved

I guess all good here.

openqa-piworker:~>  uname -a
Linux openqa-piworker 6.4.0-150600.20-default #1 SMP PREEMPT_DYNAMIC Thu May  9 20:34:03 UTC 2024 (ed3fdbb) aarch64 aarch64 aarch64 GNU/Linux
Actions #12

Updated by okurz 7 months ago

  • Due date deleted (2024-05-27)