action #177393
salt pipelines fail due to missing shadow on w16+17
Description
Observation
https://gitlab.suse.de/openqa/salt-states-openqa/-/jobs/3827684#L446 fails because "useradd", which is part of shadow, is missing. shadow should be part of a default installation, but we did not specify that dependency in our salt states. https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/1377 added the dependency, but it may still be unresolvable due to:
openqaworker17:~ # zypper in shadow
Retrieving repository 'Agama development packages (15.6)' metadata .................................................................................[done]
Building repository 'Agama development packages (15.6)' cache ......................................................................................[done]
Loading repository data...
Reading installed packages...
Resolving package dependencies...
Problem: 1: the to be installed shadow-4.8.1-150600.17.9.1.x86_64 requires 'libsemanage.so.2()(64bit)', but this requirement cannot be provided
not installable providers: libsemanage2-3.5-150600.1.48.x86_64[repo-oss]
Solution 1: deinstallation of libsemanage1-3.1-150400.1.65.x86_64
Solution 2: do not install shadow-4.8.1-150600.17.9.1.x86_64
Solution 3: break shadow-4.8.1-150600.17.9.1.x86_64 by ignoring some of its dependencies
Choose from above solutions by number or cancel [1/2/3/c/d/?] (c):
See https://suse.slack.com/archives/C02AJ1E568M/p1739802156666479 for context
Acceptance criteria
- AC1: Stable salt pipelines on all current OSD machines including w16+17
Suggestions
- Explicitly specify the dependencies
- Fix the root cause
- Retrigger failed salt pipelines including osd-deployment as applicable
Rollback actions
- Add back w16+w17 to salt and ensure a clean salt state
Updated by jbaier_cz 2 months ago
- Related to action #177066: Prevent _openqa-worker to install random packages size:S added
Updated by nicksinger 2 months ago
- Status changed from New to In Progress
- Assignee set to nicksinger
The zypper error message is a little misleading. After selecting solution 1, zypper can perfectly resolve all deps:
openqaworker16:~ # zypper in shadow
Loading repository data...
Reading installed packages...
Resolving package dependencies...
Problem: 1: the to be installed shadow-4.8.1-150600.17.9.1.x86_64 requires 'libsemanage.so.2()(64bit)', but this requirement cannot be provided
not installable providers: libsemanage2-3.5-150600.1.48.x86_64[repo-oss]
Solution 1: deinstallation of libsemanage1-3.1-150400.1.65.x86_64
Solution 2: do not install shadow-4.8.1-150600.17.9.1.x86_64
Solution 3: break shadow-4.8.1-150600.17.9.1.x86_64 by ignoring some of its dependencies
Choose from above solutions by number or cancel [1/2/3/c/d/?] (c): 1
Resolving dependencies...
Resolving package dependencies...
The following 4 NEW packages are going to be installed:
libsemanage-conf libsemanage2 libsepol2 shadow
The following package is going to be REMOVED:
libsemanage1
4 new packages to install, 1 to remove.
Package download size: 1003.5 KiB
Package install size change:
| 4.2 MiB required by packages that will be installed
3.9 MiB | - 265.6 KiB released by packages that will be removed
Backend: classic_rpmtrans
Continue? [y/n/v/...? shows all options] (y):
This also matches with a different system I checked for this lib:
openqa:~ # rpm -qf /usr/lib64/libsemanage.so.2
libsemanage2-3.5-150600.1.48.x86_64
openqa:~ # zypper info libsemanage2
Loading repository data...
Reading installed packages...
Information for package libsemanage2:
-------------------------------------
Repository : repo-oss
Name : libsemanage2
Version : 3.5-150600.1.48
Arch : x86_64
Vendor : SUSE LLC <https://www.suse.com/>
Installed Size : 268.1 KiB
Installed : Yes (automatically)
Status : up-to-date
Source package : libsemanage-3.5-150600.1.48.src
Using worker17 as "known bad" I found that libsemanage1 is still installed from the (still enabled) Leap 15.4 repo:
openqaworker17:~ # zypper info libsemanage1
Loading repository data...
Reading installed packages...
Information for package libsemanage1:
-------------------------------------
Repository : openSUSE-Leap-15.4-1
Name : libsemanage1
Version : 3.1-150400.1.65
Arch : x86_64
Vendor : SUSE LLC <https://www.suse.com/>
Installed Size : 265.6 KiB
Installed : Yes (automatically)
Status : up-to-date
Source package : libsemanage-3.1-150400.1.65.src
Upstream URL : https://github.com/SELinuxProject/selinux/wiki/Releases
Summary : SELinux policy management library
Description :
libsemanage is the policy management library. Using libsepol and
libselinux to interact with the SELinux system, it also calls helper
programs for loading policy and for checking whether the
file_contexts configuration is valid.
(Security-enhanced Linux is a feature of the kernel and some
utilities that implement mandatory access control policies, such as
Type Enforcement, Role-based Access Control and Multi-Level
Security.)
openqaworker17:~ # zypper lr -u
Repository priorities in effect: (See 'zypper lr -P' for details)
85 (raised priority) : 1 repository
90 (raised priority) : 1 repository
95 (raised priority) : 1 repository
99 (default priority) : 8 repositories
110 (lowered priority) : 1 repository
# | Alias | Name | Enabled | GPG Check | Refresh | URI
---+-------------------------------+---------------------------------------------------------------------------------------------+---------+-----------+---------+---------------------------------------------------------------------------------
1 | SUSE_CA | SUSE_CA | Yes | (r ) Yes | Yes | https://download.opensuse.org/repositories/SUSE:/CA/15.6/
2 | devel_openQA | devel_openQA | Yes | (r ) Yes | Yes | http://download.opensuse.org/repositories/devel:/openQA/15.6/
3 | devel_openQA_Modules | devel_openQA_Modules | Yes | (r ) Yes | Yes | http://download.opensuse.org/repositories/devel:/openQA:/Leap:/15.6/15.6/
4 | home_favogt_stagingovmf_repo | home_favogt_stagingovmf | Yes | (r ) Yes | Yes | https://download.opensuse.org/repositories/home:/favogt:/stagingovmf/15.6/
5 | openSUSE-Leap-15.4-1 | openSUSE-Leap-15.4-1 | Yes | (r ) Yes | Yes | http://download.opensuse.org/distribution/leap/15.4/repo/oss/
so apparently we have some kind of incomplete upgrade here. I'm checking why this repo is still here and how to get rid of it.
Updated by nicksinger 2 months ago
After disabling the 15.4 repo and running "zypper ref; zypper dup" I saw no problems reported. However, zypper also did not really remove anything from the old repos (as I would have expected). I messed around a little more and found:
openqaworker17:/etc/zypp/repos.d # zypper packages --orphaned
Loading repository data...
Reading installed packages...
S | Repository | Name | Version | Arch
---+------------+-----------------------------------+-----------------------+-------
i+ | @System | golang-github-google-jsonnet | 0.20.0-lp156.3.1 | x86_64
i | @System | libavif13 | 0.9.3-150400.1.9 | x86_64
i | @System | libcpupower0 | 5.14-150400.1.8 | x86_64
i | @System | libdpdk-20_0 | 19.11.10-150400.2.10 | x86_64
i | @System | libgupnp-1_2-1 | 1.4.3-150400.1.6 | x86_64
i | @System | libgupnp-igd-1_0-4 | 1.2.0-150400.1.10 | x86_64
i | @System | libopencv405 | 4.5.5-150400.1.28 | x86_64
i | @System | libopencv_aruco405 | 4.5.5-150400.1.28 | x86_64
i | @System | libopencv_face405 | 4.5.5-150400.1.28 | x86_64
i | @System | libopencv_highgui405 | 4.5.5-150400.1.28 | x86_64
i | @System | libopencv_imgcodecs405 | 4.5.5-150400.1.28 | x86_64
i | @System | libopencv_objdetect405 | 4.5.5-150400.1.28 | x86_64
i | @System | libopencv_superres405 | 4.5.5-150400.1.28 | x86_64
i | @System | libopencv_videoio405 | 4.5.5-150400.1.28 | x86_64
i | @System | libopencv_videostab405 | 4.5.5-150400.1.28 | x86_64
i | @System | libopencv_ximgproc405 | 4.5.5-150400.1.28 | x86_64
i | @System | libopenvswitch-2_14-0 | 2.14.2-150400.22.23 | x86_64
i | @System | libpoppler126 | 23.01.0-150500.3.11.1 | x86_64
i | @System | libprocps7 | 3.3.15-7.22.1 | x86_64
i | @System | librav1e0 | 0.5.1+0-150400.1.10 | x86_64
i | @System | libsemanage1 | 3.1-150400.1.65 | x86_64
i | @System | libsrt1 | 1.3.4-1.45 | x86_64
i | @System | libwacom2 | 1.12-150400.1.10 | x86_64
i | @System | libwireplumber-0_4-0 | 0.4.9-150400.1.5 | x86_64
i | @System | openSUSE-release-appliance-custom | 15.4-lp154.166.1 | x86_64
i | @System | python3-bind | 9.16.20-150400.3.6 | noarch
I guess the "@System" is wrong here because I previously removed the Agama repos as well. I am not sure whether this output means golang-github-google-jsonnet was installed explicitly and everything else was pulled in as a dependency from the wrong repo, or whether these other libs came from something else. While looking for a command similar to "apt autoremove" I stumbled over https://forums.opensuse.org/t/is-there-a-way-to-remove-orphaned-packages/143213/3 and eventually used the proposed solution:
openqaworker17:/etc/zypp/repos.d # zypper packages --orphaned | grep @System | cut -d '|' -f3 | xargs echo zypper rm
[…]
openqaworker17:/etc/zypp/repos.d # zypper rm libavif13 libcpupower0 libdpdk-20_0 libgupnp-1_2-1 libgupnp-igd-1_0-4 libopencv405 libopencv_aruco405 libopencv_face405 libopencv_highgui405 libopencv_imgcodecs405 libopencv_objdetect405 libopencv_superres405 libopencv_videoio405 libopencv_videostab405 libopencv_ximgproc405 libopenvswitch-2_14-0 libpoppler126 libprocps7 librav1e0 libsemanage1 libsrt1 libwacom2 libwireplumber-0_4-0 openSUSE-release-appliance-custom python3-bind
Reading installed packages...
Resolving package dependencies...
The following 25 packages are going to be REMOVED:
libavif13 libcpupower0 libdpdk-20_0 libgupnp-1_2-1 libgupnp-igd-1_0-4 libopencv405 libopencv_aruco405 libopencv_face405 libopencv_highgui405
libopencv_imgcodecs405 libopencv_objdetect405 libopencv_superres405 libopencv_videoio405 libopencv_videostab405 libopencv_ximgproc405
libopenvswitch-2_14-0 libpoppler126 libprocps7 librav1e0 libsemanage1 libsrt1 libwacom2 libwireplumber-0_4-0 openSUSE-release-appliance-custom
python3-bind
25 packages to remove.
Package install size change:
| 0 B required by packages that will be installed
-60.4 MiB | - 60.4 MiB released by packages that will be removed
Backend: classic_rpmtrans
Continue? [y/n/v/...? shows all options] (y): y
( 1/25) Removing: libavif13-0.9.3-150400.1.9.x86_64 ................................................................................................[done]
( 2/25) Removing: libcpupower0-5.14-150400.1.8.x86_64 ..............................................................................................[done]
( 3/25) Removing: libgupnp-igd-1_0-4-1.2.0-150400.1.10.x86_64 ......................................................................................[done]
( 4/25) Removing: libopencv_aruco405-4.5.5-150400.1.28.x86_64 ......................................................................................[done]
( 5/25) Removing: libopencv_face405-4.5.5-150400.1.28.x86_64 .......................................................................................[done]
( 6/25) Removing: libopencv_highgui405-4.5.5-150400.1.28.x86_64 ....................................................................................[done]
( 7/25) Removing: libopencv_superres405-4.5.5-150400.1.28.x86_64 ...................................................................................[done]
( 8/25) Removing: libopencv_videostab405-4.5.5-150400.1.28.x86_64 ..................................................................................[done]
( 9/25) Removing: libopenvswitch-2_14-0-2.14.2-150400.22.23.x86_64 .................................................................................[done]
(10/25) Removing: libpoppler126-23.01.0-150500.3.11.1.x86_64 .......................................................................................[done]
(11/25) Removing: libprocps7-3.3.15-7.22.1.x86_64 ..................................................................................................[done]
(12/25) Removing: libsemanage1-3.1-150400.1.65.x86_64 ..............................................................................................[done]
(13/25) Removing: libsrt1-1.3.4-1.45.x86_64 ........................................................................................................[done]
(14/25) Removing: libwacom2-1.12-150400.1.10.x86_64 ................................................................................................[done]
(15/25) Removing: libwireplumber-0_4-0-0.4.9-150400.1.5.x86_64 .....................................................................................[done]
(16/25) Removing: openSUSE-release-appliance-custom-15.4-lp154.166.1.x86_64 ........................................................................[done]
(17/25) Removing: python3-bind-9.16.20-150400.3.6.noarch ...........................................................................................[done]
(18/25) Removing: librav1e0-0.5.1+0-150400.1.10.x86_64 .............................................................................................[done]
(19/25) Removing: libgupnp-1_2-1-1.4.3-150400.1.6.x86_64 ...........................................................................................[done]
(20/25) Removing: libopencv_objdetect405-4.5.5-150400.1.28.x86_64 ..................................................................................[done]
(21/25) Removing: libopencv_videoio405-4.5.5-150400.1.28.x86_64 ....................................................................................[done]
(22/25) Removing: libdpdk-20_0-19.11.10-150400.2.10.x86_64 .........................................................................................[done]
(23/25) Removing: libopencv405-4.5.5-150400.1.28.x86_64 ............................................................................................[done]
(24/25) Removing: libopencv_ximgproc405-4.5.5-150400.1.28.x86_64 ...................................................................................[done]
(25/25) Removing: libopencv_imgcodecs405-4.5.5-150400.1.28.x86_64 ..................................................................................[done]
To ensure the system's dependencies are now in a consistent state, I ran:
rpm -qa --qf '%{NAME}\n' | grep -v gpg-pubkey | grep -v x3270 | xargs zypper in -f --no-recommends --dry-run
This showed no dependency conflicts and only reinstalls, so worker17 should be fine again. Doing the same for worker16 now.
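The result of that dry run could also be classified automatically instead of read by eye. The following is only a sketch: check_dry_run is a hypothetical helper name, and it relies purely on the "Problem:" and "NEW packages are going to be installed" lines that zypper printed in the sessions above.

```shell
#!/bin/sh
# Sketch: classify zypper dry-run output read from stdin.
# check_dry_run is a hypothetical helper, not part of zypper or our salt states.
check_dry_run() {
    out=$(cat)
    # A "Problem:" line means the solver hit a dependency conflict.
    if printf '%s\n' "$out" | grep -q '^Problem: '; then
        echo "conflict"
        return 2
    fi
    # NEW packages in a pure reinstall run mean something required is missing.
    if printf '%s\n' "$out" | grep -Eq 'NEW packages? (is|are) going to be installed'; then
        echo "missing"
        return 1
    fi
    echo "consistent"
    return 0
}
```

The output of the rpm/xargs pipeline above could be piped into check_dry_run. Running zypper with its global --non-interactive option is probably needed so that it does not wait for an answer.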
Updated by okurz 2 months ago
sounds reasonable. IIRC "@System" can also mean that there is no online repository anymore for that exact package, e.g. if "devel:openQA" has a new version of os-autoinst available and we did not yet upgrade then the local "os-autoinst" package might also show up under "@System"
Updated by nicksinger 2 months ago
Right, that makes perfect sense because I explicitly searched for orphaned packages. So doing a full update beforehand, or closely examining the packages to be removed, is a good idea.
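To make that review a bit safer, the name extraction could skip packages that were installed on explicit request. In zypper's status column "i+" marks a package installed by user request and plain "i" one installed automatically. This is a minimal sketch: orphaned_auto_names is a hypothetical helper name, and it assumes the table layout shown in the "zypper packages --orphaned" output above.

```shell
#!/bin/sh
# Sketch: read the "zypper packages --orphaned" table on stdin and print the
# names of automatically installed packages only (status "i", not "i+").
# orphaned_auto_names is a hypothetical helper name.
orphaned_auto_names() {
    awk -F'|' '
        NF >= 5 {
            status = $1
            name = $3
            # Trim the padding zypper adds around each table cell.
            gsub(/^[ \t]+|[ \t]+$/, "", status)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            if (status == "i") print name
        }
    '
}
```

For example, "zypper packages --orphaned | orphaned_auto_names" prints one candidate per line for inspection before anything is handed to "zypper rm".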
The cause was the same and list of packages was similar on worker16:
The following 29 packages are going to be REMOVED:
atk-lang golang-github-google-jsonnet libavif13 libcpupower0 libdpdk-20_0 libgupnp-1_2-1 libgupnp-igd-1_0-4 liblept5 libopencv405 libopencv_aruco405
libopencv_face405 libopencv_highgui405 libopencv_imgcodecs405 libopencv_objdetect405 libopencv_superres405 libopencv_videoio405 libopencv_videostab405
libopencv_ximgproc405 libopenvswitch-2_14-0 libpoppler126 libprocps7 librav1e0 libsemanage1 libsrt1 libtesseract4 libwacom2 libwireplumber-0_4-0
openSUSE-release-appliance-custom python3-bind
The fancy rpm command from above also confirmed that it catches any missing packages on that host, because it proposed:
The following 4 NEW packages are going to be installed:
libsemanage-conf libsemanage2 libsepol2 shadow
I went on to install shadow on both hosts and checked /etc/zypp/repos.d: all remaining repos use $releasever, so this should be fine. I enabled both machines in salt again and ran a manual salt 'openqaworker16.qa.suse.cz' state.apply. While this is running I will check for any pipelines we need to restart.
Updated by nicksinger 2 months ago
- Status changed from In Progress to Resolved
all states applied successfully on both hosts. The latest state-pipeline was already successful and pillars now also succeeded: https://gitlab.suse.de/openqa/salt-pillars-openqa/-/pipelines/1572282
I can't come up with a proper improvement, to be honest. Removing old repos is certainly a good thing, but I have no idea how we could automate such checks. #177066 also helps.
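As a starting point for such a check, a script could flag repo files whose baseurl hardcodes a release other than the running one, like the openSUSE-Leap-15.4-1 repo found on worker17. This is only a heuristic sketch: stale_repos is a hypothetical helper name, it treats any "/NN.N/" path segment in a baseurl as a release version, and it does not look at whether the repo is enabled.

```shell
#!/bin/sh
# Sketch: list .repo files whose baseurl contains a hardcoded version that
# differs from the current one. stale_repos is a hypothetical helper name.
# Usage: stale_repos <repos.d directory> <current version, e.g. 15.6>
stale_repos() {
    dir=$1
    current=$2
    for f in "$dir"/*.repo; do
        [ -e "$f" ] || continue
        # Extract version-like path segments such as /15.4/ from baseurl lines.
        grep -E '^baseurl=' "$f" |
            grep -Eo '/[0-9]+\.[0-9]+(/|$)' | tr -d '/' | sort -u |
            while read -r v; do
                [ "$v" = "$current" ] || echo "$f: $v"
            done
    done
}
```

On a worker this could be called as ". /etc/os-release; stale_repos /etc/zypp/repos.d "$VERSION_ID"". Repos that use $releasever produce no output because they contain no literal version.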