action #115484
closed
[alert] OSD deployment failed on 18.08.22 size:M
0%
Description
Observation
Jobs in the deployment pipeline failed. Apparently the salt command
cannot be found on openqa.suse.de:
++ echo '$ ssh $TARGET \ # collapsed multi-line command'
++ ssh openqa.suse.de 'set -xo pipefail; sudo salt -C '\''G@roles:worker'\'' cmd.run '\''for i in {1..7}; do zypper -n dup --download-only --details && break || (echo Retry' after sleep '... && sleep 30); done'\'''
+ sudo salt -C G@roles:worker cmd.run 'for i in {1..7}; do zypper -n dup --download-only --details && break || (echo Retry after sleep ... && sleep 30); done'
sudo: salt: command not found
+++ kill %1
Cleaning up project directory and file based variables 00:03
Using docker image sha256:81cd3442a3502ceb7f516b7ce7a0f02634449600ddbb28a2d5c0e192564d0dcf for registry.opensuse.org/home/darix/apps/containers/gitlab-runner-helper:x86_64-latest with digest registry.opensuse.org/home/darix/apps/containers/gitlab-runner-helper@sha256:6dc8b8a636f4dde3ae54e781e487219ce27c5ee46b8376b495c2bc66050222ba ...
Acceptance criteria
- AC1: salt can be found again
- AC2: The cause of the issue is known
Suggestions
- See messages on 18.08.22 on the osd-admins list.
- See https://gitlab.suse.de/openqa/osd-deployment/-/pipelines/461261
- Retrying didn't help.
- Check if we might have some force resolution behavior or similar that removed packages
Updated by livdywan about 2 years ago
- Subject changed from [alert] OSD deployment failed on 18.08.22 to [alert] OSD deployment failed on 18.08.22 size:M
- Description updated (diff)
- Status changed from New to Workable
Marius mitigated the issue by installing salt{,-master}, but we don't know what caused the removal.
Updated by mkittler about 2 years ago
- Status changed from Workable to In Progress
- Assignee set to mkittler
Updated by mkittler about 2 years ago
/var/log/zypp/history:
2022-08-18 03:00:41|command|root@openqa|'zypper' '-n' '--non-interactive-include-reboot-patches' 'patch' '--replacefiles' '--auto-agree-with-licenses' '--force-resolution' '--download-in-advance'|
# 2022-08-18 03:00:46 salt-minion-3004-150400.8.8.1.x86_64 removed ok
# Additional rpm output:
# Removed /etc/systemd/system/multi-user.target.wants/salt-minion.service.
# warning: /etc/salt/minion saved as /etc/salt/minion.rpmsave
#
2022-08-18 03:00:46|remove |salt-minion|3004-150400.8.8.1|x86_64||
2022-08-18 03:00:47|remove |salt-ssh|3004-150400.8.8.1|x86_64||
# 2022-08-18 03:00:54 salt-master-3004-150400.8.8.1.x86_64 removed ok
# Additional rpm output:
# Removed /etc/systemd/system/multi-user.target.wants/salt-master.service.
# warning: /etc/salt/master saved as /etc/salt/master.rpmsave
#
2022-08-18 03:00:54|remove |salt-master|3004-150400.8.8.1|x86_64||
2022-08-18 03:00:56|remove |salt|3004-150400.8.8.1|x86_64||
2022-08-18 03:01:01|remove |python3-salt|3004-150400.8.8.1|x86_64||
2022-08-18 03:01:01|remove |python3-requests|2.24.0-1.24|noarch||
2022-08-18 03:01:01|remove |python3-py|1.8.1-5.6.1|noarch||
2022-08-18 03:01:01|patch |openSUSE-SLE-15.4-2022-2119|1|noarch|repo-sle-update|important|recommended|applied|not-needed|
2022-08-18 03:01:01|patch |openSUSE-SLE-15.4-2022-2831|1|noarch|repo-sle-update|moderate|security|needed|not-needed|
2022-08-18 03:01:01|patch |openSUSE-SLE-15.4-2022-2304|1|noarch|repo-sle-update|important|security|applied|not-needed|
2022-08-18 11:55:43|command|root@openqa|'zypper' 'in' 'salt'|
2022-08-18 11:55:43|install|python3-py|1.8.1-5.6.1|noarch||repo-oss|31ddc63bc7178278d7d72abe6bdeb2a97c486cdd56234292f56a0fa1a562c324|
2022-08-18 11:55:43|install|python3-requests|2.24.0-1.24|noarch||repo-oss|89d71c20ba199d5ab01e24ca8d9d8ad8d64e95f16983c118399a7b4eb93f272c|
2022-08-18 11:55:46|install|python3-salt|3004-150400.8.8.1|x86_64||repo-sle-update|cd0f786c1fc1be720eaa4c98f371f9ec4dfb9d8f70426cfa1b9477f9665f3110|
2022-08-18 11:55:46|install|salt|3004-150400.8.8.1|x86_64|root@openqa|repo-sle-update|39a52e8809f6cf55c7add5408136e57b55a1cc799dca212c51d818496c5f2349|
2022-08-18 11:55:46|patch |openSUSE-SLE-15.4-2022-2119|1|noarch|repo-sle-update|important|recommended|not-needed|applied|
2022-08-18 11:55:46|patch |openSUSE-SLE-15.4-2022-2831|1|noarch|repo-sle-update|moderate|security|not-needed|needed|
2022-08-18 11:55:46|patch |openSUSE-SLE-15.4-2022-2304|1|noarch|repo-sle-update|important|security|not-needed|applied|
2022-08-18 11:55:58|command|root@openqa|'zypper' 'in' 'salt-master'|
2022-08-18 11:56:01|install|salt-master|3004-150400.8.8.1|x86_64|root@openqa|repo-sle-update|8a88fab48d176667c908ee6e248d4a82a5089d7abbe2a1f3122894be52e8de7c|
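The history excerpt above can be summarized mechanically to see which command removed which packages. The following is only a minimal sketch, assuming the pipe-separated layout visible in the excerpt (timestamp, action, then action-specific fields) and that lines starting with # are comments. The function name removals_by_command is hypothetical and not part of zypper or libzypp.

```python
def removals_by_command(history_text):
    """Group removed packages under the zypper command that removed them.

    Assumes the layout seen in the excerpt: 'timestamp|action|fields...'.
    """
    entries = []  # one dict per 'command' line, in log order
    for raw in history_text.splitlines():
        line = raw.strip()
        # Skip blank lines and '#' comment lines (rpm output in the excerpt).
        if not line or line.startswith("#"):
            continue
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 3:
            continue
        timestamp, action = fields[0], fields[1]
        if action == "command":
            # Field 3 holds the quoted argument list of the invocation.
            command = fields[3] if len(fields) > 3 else ""
            entries.append({
                "timestamp": timestamp,
                "command": command,
                "forced": "--force-resolution" in command,
                "removed": [],
            })
        elif action == "remove" and entries:
            # Field 2 is the package name; attribute it to the latest command.
            entries[-1]["removed"].append(fields[2])
    return entries


if __name__ == "__main__":
    # Sample lines shortened from the excerpt above (command arguments trimmed).
    sample = "\n".join([
        "2022-08-18 03:00:41|command|root@openqa|'zypper' '-n' 'patch' '--force-resolution'|",
        "# 2022-08-18 03:00:46 salt-minion-3004-150400.8.8.1.x86_64 removed ok",
        "2022-08-18 03:00:46|remove |salt-minion|3004-150400.8.8.1|x86_64||",
        "2022-08-18 03:00:56|remove |salt|3004-150400.8.8.1|x86_64||",
        "2022-08-18 11:55:43|command|root@openqa|'zypper' 'in' 'salt'|",
    ])
    for entry in removals_by_command(sample):
        print(entry["timestamp"], entry["forced"], entry["removed"])
```

Run against the full history file, a sketch like this would make it easy to spot any other unattended patch run that used --force-resolution and removed packages.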
Updated by mkittler about 2 years ago
That's the more detailed log from the uninstallation:
2022-08-18 03:00:39 <1> openqa(30384) [libsolv++] PoolImpl.cc(logSat):131 normal: 28427, 79923 literals
2022-08-18 03:00:39 <1> openqa(30384) [libsolv++] PoolImpl.cc(logSat):131 pkg rule memory used: 1160 K
2022-08-18 03:00:39 <1> openqa(30384) [libsolv++] PoolImpl.cc(logSat):131 pkg rule creation took 155 ms
2022-08-18 03:00:39 <1> openqa(30384) [libsolv] PoolImpl.cc(logSat):133 job: blacklist providing retracted-patch-package()
2022-08-18 03:00:39 <1> openqa(30384) [libsolv] PoolImpl.cc(logSat):133 job: blacklist providing ptf()
2022-08-18 03:00:39 <1> openqa(30384) [libsolv] PoolImpl.cc(logSat):133 job: install patch:openSUSE-SLE-15.4-2022-2831-1.noarch
2022-08-18 03:00:39 <1> openqa(30384) [libsolv] PoolImpl.cc(logSat):133 - job Rule #52783:
2022-08-18 03:00:39 <1> openqa(30384) [libsolv] PoolImpl.cc(logSat):133 patch:openSUSE-SLE-15.4-2022-2831-1.noarch [180057] (w1)
2022-08-18 03:00:39 <1> openqa(30384) [libsolv] PoolImpl.cc(logSat):133 job: install providing glibc
2022-08-18 03:00:39 <1> openqa(30384) [libsolv] PoolImpl.cc(logSat):133 - job Rule #52784:
2022-08-18 03:00:39 <1> openqa(30384) [libsolv] PoolImpl.cc(logSat):133 glibc-2.31-150300.20.7.x86_64 [90300] (w1)
2022-08-18 03:00:39 <1> openqa(30384) [libsolv] PoolImpl.cc(logSat):133 glibc-2.31-150300.26.5.x86_64 [171973] (w2)
2022-08-18 03:00:39 <1> openqa(30384) [libsolv] PoolImpl.cc(logSat):133 glibc-2.31-150300.31.2.x86_64 [171978]
2022-08-18 03:00:39 <1> openqa(30384) [libsolv] PoolImpl.cc(logSat):133 glibc-2.31-150300.37.1.x86_64 [171983]
2022-08-18 03:00:39 <1> openqa(30384) [libsolv] PoolImpl.cc(logSat):133 glibc-2.31-150300.37.1.x86_64 [180578]I
2022-08-18 03:00:39 <1> openqa(30384) [libsolv] PoolImpl.cc(logSat):133 job: erase openQA-local-db-4.6.1660648257.0cc7a55-lp154.5211.5.noarch
2022-08-18 03:00:39 <1> openqa(30384) [libsolv] PoolImpl.cc(logSat):133 - job Rule #52785:
2022-08-18 03:00:39 <1> openqa(30384) [libsolv] PoolImpl.cc(logSat):133 !openQA-local-db-4.6.1660648257.0cc7a55-lp154.5211.5.noarch [25] (w1)
2022-08-18 03:00:39 <1> openqa(30384) [libsolv] PoolImpl.cc(logSat):133 job: erase openQA-local-db-4.6.1639414134.aa9bed13e-bp154.1.64.noarch
2022-08-18 03:00:39 <1> openqa(30384) [libsolv] PoolImpl.cc(logSat):133 - job Rule #52786:
2022-08-18 03:00:39 <1> openqa(30384) [libsolv] PoolImpl.cc(logSat):133 !openQA-local-db-4.6.1639414134.aa9bed13e-bp154.1.64.noarch [130809] (w1)
2022-08-18 03:00:39 <1> openqa(30384) [libsolv++] PoolImpl.cc(logSat):131 choice rule creation took 22 ms
2022-08-18 03:00:39 <1> openqa(30384) [libsolv++] PoolImpl.cc(logSat):131 49520 pkg rules, 2 * 1631 update rules, 4 job rules, 1 infarch rules, 0 dup rules, 2 choice rules, 0 best rules, 0 yumobs rules
2022-08-18 03:00:39 <1> openqa(30384) [libsolv++] PoolImpl.cc(logSat):131 0 black rules, 0 recommends rules, 0 repo priority rules
2022-08-18 03:00:39 <1> openqa(30384) [libsolv++] PoolImpl.cc(logSat):131 overall rule memory used: 1237 K
2022-08-18 03:00:39 <1> openqa(30384) [libsolv++] PoolImpl.cc(logSat):131 solver statistics: 0 learned rules, 7 unsolvable, 0 minimization steps
2022-08-18 03:00:39 <1> openqa(30384) [libsolv++] PoolImpl.cc(logSat):131 done solving.
2022-08-18 03:00:39 <1> openqa(30384) [libsolv++] PoolImpl.cc(logSat):131
2022-08-18 03:00:39 <1> openqa(30384) [libsolv++] PoolImpl.cc(logSat):131 solver took 21 ms
2022-08-18 03:00:39 <1> openqa(30384) [libsolv++] PoolImpl.cc(logSat):131 final solver statistics: 0 problems, 0 learned rules, 7 unsolvable
2022-08-18 03:00:39 <1> openqa(30384) [libsolv++] PoolImpl.cc(logSat):131 solver_solve took 255 ms
2022-08-18 03:00:39 <1> openqa(30384) [zypp::solver] SATResolver.cc(solving):556 ....Solver end
2022-08-18 03:00:39 <1> openqa(30384) [zypp::solver] SATResolver.cc(resolvePool):816 SATResolver::resolvePool() done. Ret:1
2022-08-18 03:00:39 <1> openqa(30384) [zypper] solve-commit.cc(solve_and_commit):675 got solution, showing summary
2022-08-18 03:00:39 <1> openqa(30384) [zypper] Summary.cc(readPool):103 Pool contains 66307 items.
2022-08-18 03:00:39 <1> openqa(30384) [zypper++] Summary.cc(readPool):104 Install summary:
2022-08-18 03:00:39 <1> openqa(30384) [zypper++] Summary.cc(readPool):144 <install> UNTu_(180057)patch:openSUSE-SLE-15.4-2022-2831-1.noarch(repo-sle-update)
2022-08-18 03:00:39 <1> openqa(30384) [zypper++] Summary.cc(readPool):144 <uninstall> I_Ts_(181756)python3-py-1.8.1-5.6.1.noarch(@System)
2022-08-18 03:00:39 <1> openqa(30384) [zypper++] Summary.cc(readPool):144 <uninstall> I_Ts_(181771)python3-requests-2.24.0-1.24.noarch(@System)
2022-08-18 03:00:39 <1> openqa(30384) [zypper++] Summary.cc(readPool):144 <uninstall> I_Ts_(181773)python3-salt-3004-150400.8.8.1.x86_64(@System)
2022-08-18 03:00:39 <1> openqa(30384) [zypper++] Summary.cc(readPool):144 <uninstall> I_Ts_(181849)salt-3004-150400.8.8.1.x86_64(@System)
2022-08-18 03:00:39 <1> openqa(30384) [zypper++] Summary.cc(readPool):144 <uninstall> I_Ts_(181850)salt-master-3004-150400.8.8.1.x86_64(@System)
2022-08-18 03:00:39 <1> openqa(30384) [zypper++] Summary.cc(readPool):144 <uninstall> I_Ts_(181851)salt-minion-3004-150400.8.8.1.x86_64(@System)
2022-08-18 03:00:39 <1> openqa(30384) [zypper++] Summary.cc(readPool):144 <uninstall> I_Ts_(181852)salt-ssh-3004-150400.8.8.1.x86_64(@System)
2022-08-18 03:00:39 <1> openqa(30384) [zypper] Summary.cc(readPool):327 package update candidates: 5
2022-08-18 03:00:39 <1> openqa(30384) [zypper] Summary.cc(readPool):328 to be actually updated: 0
2022-08-18 03:00:39 <1> openqa(30384) [zypper] Summary.cc(readPool):327 product update candidates: 0
2022-08-18 03:00:39 <1> openqa(30384) [zypper] Summary.cc(readPool):328 to be actually updated: 0
I couldn't find any interesting clues in the preceding log entries (such as a repo download error).
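To check other runs for the same pattern (a patch job whose solution uninstalls packages), the install summary lines can be extracted from the detailed log. This is a rough sketch that assumes the summary format shown above, i.e. a tag such as <uninstall> followed by status flags, a numeric id in parentheses, the package identifier, and the repository in parentheses. The function name planned_changes is hypothetical.

```python
import re

# Matches e.g. '<uninstall> I_Ts_(181849)salt-3004-150400.8.8.1.x86_64(@System)'
SUMMARY_RE = re.compile(
    r"<(install|uninstall)>\s+\S*?\((\d+)\)(\S+)\(([^()]+)\)\s*$"
)


def planned_changes(log_text):
    """Collect (package, repository) pairs per action from summary lines."""
    changes = {"install": [], "uninstall": []}
    for line in log_text.splitlines():
        match = SUMMARY_RE.search(line)
        if match:
            action, _solvable_id, package, repo = match.groups()
            changes[action].append((package, repo))
    return changes


if __name__ == "__main__":
    # Two lines copied from the log above.
    sample = "\n".join([
        "2022-08-18 03:00:39 <1> openqa(30384) [zypper++] Summary.cc(readPool):144 "
        "<install> UNTu_(180057)patch:openSUSE-SLE-15.4-2022-2831-1.noarch(repo-sle-update)",
        "2022-08-18 03:00:39 <1> openqa(30384) [zypper++] Summary.cc(readPool):144 "
        "<uninstall> I_Ts_(181849)salt-3004-150400.8.8.1.x86_64(@System)",
    ])
    for action, items in planned_changes(sample).items():
        print(action, items)
```

Like the log itself, this only shows what the solver decided to remove, not why.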
Updated by mkittler about 2 years ago
- Tags deleted (alert)
- Assignee deleted (mkittler)
- Target version deleted (Ready)
SR for removing --force-resolution: https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/726
Btw, this upstream issue may be related: https://github.com/openSUSE/zypper/issues/446
However, this time I couldn't spot any repository refresh errors in the logs. Unfortunately, journalctl -u auto-update.service only shows messages from May, not current ones.
Updated by mkittler about 2 years ago
- Assignee set to mkittler
- Target version set to Ready
Updated by mkittler about 2 years ago
I reinstalled all packages that had been accidentally removed and retried the deployment, which now worked.
Not sure why only these few packages were uninstalled. Unfortunately, the logs only state which packages were removed, not why.
Updated by openqa_review about 2 years ago
- Due date set to 2022-09-02
Setting due date based on mean cycle time of SUSE QE Tools
Updated by tinita about 2 years ago
Regarding the missing journal logs: we fixed this in #115208 so from today on we should have more logs again.
Updated by mkittler about 2 years ago
- Status changed from In Progress to Resolved
The SR has been merged, and I think https://github.com/openSUSE/zypper/issues/446 is still good enough as the upstream bug report.