action #150965

closed

At least diesel+petrol+mania fail to auto-update due to kernel locks preventing patches size:M

Added by okurz 6 months ago. Updated 5 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version:
Start date: 2023-11-16
Due date: 2023-12-22
% Done: 0%
Estimated time:

Description

Observation

petrol:~ # systemctl status auto-update
 auto-update.service - Automatically patch system packages.
     Loaded: loaded (/etc/systemd/system/auto-update.service; static)
     Active: inactive (dead) since Thu 2023-11-16 02:34:18 CET; 18h ago
TriggeredBy: auto-update.timer
   Main PID: 99487 (code=exited, status=0/SUCCESS)

Nov 16 02:34:15 petrol sh[99764]: Loading repository data...
Nov 16 02:34:16 petrol sh[99764]: Reading installed packages...
Nov 16 02:34:18 petrol sh[99764]: Resolving package dependencies...
Nov 16 02:34:18 petrol sh[99764]: Problem: the to be installed patch:openSUSE-SLE-15.5-2023-4375-1.noarch conflicts with 'kernel-default.ppc64le < 5.14.21>
Nov 16 02:34:18 petrol sh[99764]:  Solution 1: deinstallation of kernel-default-5.3.18-150300.59.93.1.ppc64le
Nov 16 02:34:18 petrol sh[99764]:  Solution 2: do not install patch:openSUSE-SLE-15.5-2023-4375-1.noarch
Nov 16 02:34:18 petrol sh[99764]:  Solution 3: remove lock to allow installation of kernel-default-5.14.21-150500.55.36.1.ppc64le[repo-sle-update]
Nov 16 02:34:18 petrol sh[99764]:  Solution 4: remove lock to allow installation of kernel-default-6.5.9-lp155.4.1.g1823166.ppc64le[kernel-stable-backport]
Nov 16 02:34:18 petrol sh[99764]: Choose from above solutions by number or cancel [1/2/3/4/c/d/?] (c): c
Nov 16 02:34:18 petrol systemd[1]: auto-update.service: Deactivated successfully.

The update is cancelled because of these package locks:

petrol:~ # zypper ll

# | Name             | Type    | Repository | Comment
--+------------------+---------+------------+------------------------------------------
1 | kernel*          | package | (any)      | poo#119008, kernel regression boo#1202138
2 | qemu-ovmf-x86_64 | package | (any)      | poo#116812
3 | util-linux       | package | (any)      | poo#119008, kernel regression boo#1202138
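
To compare locks across several machines, the table above can be reduced to just the locked names. This is a minimal sketch that assumes the `zypper ll` layout shown above; the function name `list_lock_names` is hypothetical and not part of zypper.

```shell
#!/bin/sh
# Hypothetical helper: reads `zypper ll` table output on stdin and prints
# one locked name per line. Assumes the "# | Name | Type | ..." layout
# shown in the observation; header and separator rows are skipped.
list_lock_names() {
    awk -F'|' '
        # only rows whose first column is a lock number are data rows
        $1 ~ /^[[:space:]]*[0-9]+[[:space:]]*$/ {
            name = $2
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", name)  # trim column padding
            print name
        }
    '
}

# Example usage on a machine with zypper installed:
#   zypper ll | list_lock_names
```

On petrol this would be expected to list kernel*, qemu-ovmf-x86_64 and util-linux, one per line.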

In #131249 we may already have applied an approach that worked for us. Should we apply it here as well?

On petrol I ran zypper patch --dry-run manually and added each conflicting patch to the package locks in turn, ending up with

zypper al -t patch -m "poo#119008, kernel regression boo#1202138" openSUSE-SLE-15.5-2023-4375
zypper al -t patch -m "poo#119008, kernel regression boo#1202138" openSUSE-SLE-15.5-2023-4071
zypper al -t patch -m "poo#119008, kernel regression boo#1202138" openSUSE-SLE-15.5-2023-3971
zypper al -t patch -m "poo#119008, kernel regression boo#1202138" openSUSE-SLE-15.5-2023-3311
zypper al -t patch -m "poo#119008, kernel regression boo#1202138" openSUSE-SLE-15.5-2023-3172
zypper al -t patch -m "poo#119008, kernel regression boo#1202138" openSUSE-SLE-15.5-2023-2871

but I doubt this is maintainable in the long term. We should find a better way to do this, e.g. by researching zypper further or asking SUSE domain experts.
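
For reference, the manual procedure above could be sketched as a small filter. It only automates the same workaround and does not address the maintainability concern. The function name `print_patch_lock_commands` is hypothetical, and the parsing assumes the "do not install patch:NAME-VERSION.ARCH" wording seen in the solver output above. The commands are printed, not executed, so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: reads zypper solver output on stdin and prints (does
# not run) one "zypper al" command per patch named in a
# "do not install patch:" line. The lock comment is the first argument.
print_patch_lock_commands() {
    comment=$1
    # strip the trailing "-VERSION.ARCH" (e.g. "-1.noarch") from the patch name
    sed -n 's/.*do not install patch:\(.*\)-[^-]*\.[^.]*$/\1/p' |
        sort -u |
        while IFS= read -r patch; do
            printf 'zypper al -t patch -m "%s" %s\n' "$comment" "$patch"
        done
}

# Example usage (assumption: a single dry run may report only the first
# conflict, so this may need repeating as in the manual procedure):
#   zypper --non-interactive patch --dry-run 2>&1 |
#       print_patch_lock_commands "poo#119008, kernel regression boo#1202138"
```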

Acceptance criteria

  • AC1: Machines using auto-update still regularly update despite having package locks in place
  • AC2: Package locks are still respected during automatic updates
  • AC3: We still don't automatically upgrade devel:openQA packages
  • AC4: We still get a reasonable OSD changelog, at most once a day, covering all relevant changes since the last explicit deployment
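
In the observation, the unit ends with status=0/SUCCESS and "Deactivated successfully" even though zypper cancelled the transaction, so the unit state alone does not show whether AC1 holds. A minimal sketch of a journal check, assuming the "Problem: " wording from the log above; the function name `has_solver_conflict` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads log lines on stdin and succeeds (exit 0) if any
# line contains a zypper solver "Problem: " message, fails otherwise.
has_solver_conflict() {
    grep -q 'Problem: '
}

# Example usage (assumes the unit is named auto-update, as in the observation):
#   journalctl -u auto-update --since yesterday | has_solver_conflict \
#       && echo "auto-update hit a solver conflict"
```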

Suggestions

  • Research more about zypper or ask SUSE domain experts on that
  • Try to make zypper patch not complain about locks
  • Research why we came up with a separate auto-update service for OSD openQA machines at all (or if we can ditch that by now)
  • As a fallback, rely on updates happening when the openQA deployment pipeline runs zypper dup
  • Check whether it helps to make the package lock more specific (currently it uses a glob, which might be problematic). Note that making kernel locks more specific can itself be problematic, because other packages like kernel-default-base might be installed instead.
  • Consider switching to openqa-auto-update https://github.com/os-autoinst/openQA/blob/master/script/openqa-auto-update as used on o3 and adapt osd-deployment so that we still receive reasonable changelogs

Related issues 2 (0 open, 2 closed)

Related to openQA Infrastructure - action #131249: [alert][ci][deployment] OSD deployment failed, grenache-1, worker5, worker2 salt-minion does not return, error message "No response" size:M (Resolved, okurz, 2023-06-22)

Related to openQA Infrastructure - action #152092: Handle all package downgrades in OSD infrastructure properly in salt size:M (Resolved, nicksinger, 2023-12-05)
