action #107638

closed

OSD deployment failed due to openqaworker-arm-1 with "rpmdb2solv: inconsistent rpm database" despite our repair attempts

Added by okurz about 2 years ago. Updated about 2 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version:
Start date: 2022-02-25
Due date:
% Done: 0%
Estimated time:

Description

Observation

https://gitlab.suse.de/openqa/osd-deployment/-/jobs/856121 failed with

openqaworker-arm-1.suse.de:
    Loading repository data...
 …
    (15/15) Installing: dpdk-kmp-default-19.11.4_k5.3.18_59.5-6.1.aarch64 [......done]
…
    .....done]
    Problem occurred during or after installation or removal of packages:
    Failed to cache rpm database (1).
    History:
     - 'rpmdb2solv' '-r' '/' '-D' '/usr/lib/sysimage/rpm' '-X' '-p' '/etc/products.d' '/var/cache/zypp/solv/@System/solv' '-o' '/var/cache/zypp/solv/@System/solvoOLjYZ'
       rpmdb2solv: inconsistent rpm database, key 2095 not found. run 'rpm --rebuilddb' to fix.

    Please see the above error message for a hint.

On the machine /var/log/salt/minion says:

2022-02-25 05:43:16,678 [salt.state       :2467][WARNING ][33939] State is set to retry, but a valid dict for retry configuration was not found.  Using retry defaults
2022-02-25 06:27:43,224 [salt.loaded.int.module.cmdmod:844 ][ERROR   ][37171] Command 'if' failed with return code: 8
2022-02-25 06:27:43,225 [salt.loaded.int.module.cmdmod:846 ][ERROR   ][37171] stdout: Loading repository data...
Reading installed packages...
Warning: You are about to do a distribution upgrade with all enabled repositories. Make sure these repositories are compatible before you continue. See 'man zypper' for more information about this command.
Computing distribution upgrade...

The following 10 packages are going to be upgraded:
  openQA-client openQA-common openQA-worker os-autoinst os-autoinst-devel os-autoinst-distri-opensuse-deps os-autoinst-openvswitch os-autoinst-swtpm perl-Code-TidyAll perl-DBD-Pg

The following 5 NEW packages are going to be installed:
  crash-kmp-default dpdk-kmp-default kernel-default-5.3.18-150300.59.49.1 kernel-default-extra kernel-default-optional

The following package requires a system reboot:
  kernel-default-5.3.18-150300.59.49.1

10 packages to upgrade, 5 new.
Overall download size: 0 B. Already cached: 96.0 MiB. After the operation, additional 137.0 MiB will be used.

    Note: System reboot required.
Continue? [y/n/v/...? shows all options] (y): y
In cache os-autoinst-4.6.1645700100.d410cc0d-lp153.1097.1.aarch64.rpm (1/15), 320.7 KiB (968.9 KiB unpacked)
…
In cache perl-DBD-Pg-3.15.1-lp153.2.1.aarch64.rpm (8/15), 214.4 KiB (637.7 KiB unpacked)
dracut: *** Creating initramfs image file '/boot/initrd-5.3.18-150300.59.49-default' done ***
.....done]
Problem occurred during or after installation or removal of packages:
Failed to cache rpm database (1).
History:
 - 'rpmdb2solv' '-r' '/' '-D' '/usr/lib/sysimage/rpm' '-X' '-p' '/etc/products.d' '/var/cache/zypp/solv/@System/solv' '-o' '/var/cache/zypp/solv/@System/solvoOLjYZ'
   rpmdb2solv: inconsistent rpm database, key 2095 not found. run 'rpm --rebuilddb' to fix.

Please see the above error message for a hint.
2022-02-25 06:27:43,226 [salt.loaded.int.module.cmdmod:850 ][ERROR   ][37171] retcode: 8
2022-02-25 06:27:43,227 [salt.loaded.int.module.cmdmod:1216][ERROR   ][37171] Command 'if' failed with return code: 8
2022-02-25 06:27:43,228 [salt.loaded.int.module.cmdmod:1221][ERROR   ][37171] output: Loading repository data...
Computing distribution upgrade...
…
(15/15) Installing: dpdk-kmp-default-19.11.4_k5.3.18_59.5-6.1.aarch64 [......done]
…
Failed to cache rpm database (1).
History:
 - 'rpmdb2solv' '-r' '/' '-D' '/usr/lib/sysimage/rpm' '-X' '-p' '/etc/products.d' '/var/cache/zypp/solv/@System/solv' '-o' '/var/cache/zypp/solv/@System/solvoOLjYZ'
   rpmdb2solv: inconsistent rpm database, key 2095 not found. run 'rpm --rebuilddb' to fix.

Please see the above error message for a hint.
2022-02-25 06:43:16,424 [salt.state       :2467][WARNING ][2130] State is set to retry, but a valid dict for retry configuration was not found.  Using retry defaults

So it looks like two installation attempts were made, which suggests the retry was active. But was a repair of the RPM database really attempted? I can reproduce problems right now when manually calling zypper dup.

Actions #1

Updated by okurz about 2 years ago

  • Status changed from New to In Progress

I executed:

sudo zypper --no-refresh -n dup --force-resolution --replacefiles; if grep --perl-regexp -q "^$(date +%F).*(inconsistent rpm database|DB_PAGE_NOTFOUND)" /var/log/zypper.log; then rpm --rebuilddb || exit 1; fi && zypper --no-refresh -n dup --force-resolution --replacefiles

Output:

Loading repository data...
Reading installed packages...
Warning: You are about to do a distribution upgrade with all enabled repositories. Make sure these repositories are compatible before you continue. See 'man zypper' for more information about this command.
Computing distribution upgrade...
Nothing to do.
Loading repository data...
Reading installed packages...
Warning: You are about to do a distribution upgrade with all enabled repositories. Make sure these repositories are compatible before you continue. See 'man zypper' for more information about this command.
Computing distribution upgrade...
Nothing to do.

So the RPM database is actually rebuilt, but the output gives no indication of that.

So either we add an explicit echo:

sudo zypper --no-refresh -n dup --force-resolution --replacefiles; if grep --perl-regexp -q "^$(date +%F).*(inconsistent rpm database|DB_PAGE_NOTFOUND)" /var/log/zypper.log; then echo "# Rebuilding RPM database due to corruption" && rpm --rebuilddb || exit 1; fi && zypper --no-refresh -n dup --force-resolution --replacefiles

which outputs:

Loading repository data...
Reading installed packages...
Warning: You are about to do a distribution upgrade with all enabled repositories. Make sure these repositories are compatible before you continue. See 'man zypper' for more information about this command.
Computing distribution upgrade...
Nothing to do.
# Rebuilding RPM database due to corruption
Loading repository data...
…

Or we run the command with bash -x, so that every executed step shows up in the output.
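For readability, the repair-and-retry one-liner can be split into small shell functions. This is only a sketch, not what the deployment job actually runs: the function names are made up, the log path can be overridden to try the check against a sample file, and grep -E is used in place of --perl-regexp because the pattern needs no Perl-specific syntax.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical; the zypper, grep and rpm
# invocations are the ones quoted in this ticket.

# Succeeds if today's entries in the zypper log report RPM database corruption.
# $1: log file to inspect (defaults to the path used in the ticket).
rpmdb_corruption_logged() {
    grep -E -q "^$(date +%F).*(inconsistent rpm database|DB_PAGE_NOTFOUND)" \
        "${1:-/var/log/zypper.log}"
}

# The distribution upgrade as invoked in the ticket.
run_dup() {
    zypper --no-refresh -n dup --force-resolution --replacefiles
}

# First attempt, optional database rebuild, then a second attempt.
dup_with_rpmdb_repair() {
    # A failing first attempt is tolerated, like the ';' in the one-liner.
    run_dup || true
    if rpmdb_corruption_logged; then
        # Make the repair visible in the job output.
        echo "# Rebuilding RPM database due to corruption"
        rpm --rebuilddb || return 1
    fi
    run_dup
}
```

Calling dup_with_rpmdb_repair as root would behave like the one-liner with the explicit echo. Running the script with sh -x instead would trace every command, which is the second option mentioned above.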

https://gitlab.suse.de/openqa/osd-deployment/-/merge_requests/47

I created the above MR and retriggered the failed deployment job.

Actions #2

Updated by okurz about 2 years ago

  • Status changed from In Progress to Resolved

I merged the MR. Deployment done.
