action #112232
closed
[tools] Multiple recurring failures due to zypper failing to download packages temporarily
0%
Description
Observation
openQA test in scenario openqa-Tumbleweed-dev-x86_64-openqa_install+publish@64bit-2G fails in openqa_webui
This is also reported in https://github.com/openSUSE/zypper/issues/420#issuecomment-1150843963 on the issue https://github.com/openSUSE/zypper/issues/420, which I opened months ago.
This problem hits us multiple times a week, and we already try to handle it with downstream retries on multiple levels. Other LSG QE squads are hit by the same problem repeatedly, and seemingly much more often than months or years ago. There are also reports by users. So far I see good and helpful responses regarding the mirroring infrastructure, e.g. from @Andrii Nikitin (thanks for that), but no useful reaction from anyone else responsible, e.g. for zypper, the infrastructure, or the product as a whole. Can we please get a reaction from someone who feels responsible for the overall user experience of openSUSE/SLE?
Reproducible
Fails since (at least) Build :TW.11585 but also in quite different cases like https://gitlab.suse.de/openqa/osd-deployment/-/jobs/1007219
Expected result
It would be great if both the interactive and the non-interactive mode of zypper offered automatic retries.
Further details
Always latest result in this scenario: latest
Updated by okurz over 2 years ago
- Project changed from openQA Tests (public) to QA (public)
- Category deleted (Bugs in existing tests)
Updated by okurz over 2 years ago
- Assignee set to okurz
Updated by okurz over 2 years ago
- Due date set to 2022-06-23
- Status changed from New to Feedback
Updated by livdywan over 2 years ago
okurz wrote:
https://github.com/os-autoinst/os-autoinst-distri-openQA/pull/89
It seems like retrying zypper in is not enough in the case where the metadata is outdated, as in https://openqa.opensuse.org/tests/2410284#step/openqa_worker/7
This probably needs to retry all zypper calls, i.e. for i in {1..3}; do zypper -n --gpg-auto-import-keys ref -f; zypper --no-cd --non-interactive in os-autoinst; zypper --no-cd --non-interactive in openQA-worker && break; done
or similar
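To make the suggestion above concrete, here is a minimal sketch of such a loop as a reusable shell function. The names retry_cmd and install_worker are hypothetical and not taken from any of the linked pull requests; the zypper calls are the ones quoted above, chained with && so that any failing step leads to a fresh metadata refresh on the next attempt. That chaining is an assumption about the desired behaviour, not what the one-liner above does.

```shell
# Hypothetical helper (not from the ticket): run a command up to $1 times,
# sleeping $2 seconds between failed attempts.
retry_cmd() {
    _retry_max=$1
    _retry_delay=$2
    shift 2
    _retry_n=1
    while [ "$_retry_n" -le "$_retry_max" ]; do
        # Stop as soon as the command succeeds
        "$@" && return 0
        echo "attempt $_retry_n/$_retry_max failed: $*" >&2
        # Only sleep if another attempt will follow
        [ "$_retry_n" -lt "$_retry_max" ] && sleep "$_retry_delay"
        _retry_n=$((_retry_n + 1))
    done
    return 1
}

# Hypothetical wrapper: refresh metadata and install both packages as one
# unit, so a failure in any step repeats the refresh as well.
install_worker() {
    zypper -n --gpg-auto-import-keys ref -f &&
        zypper --no-cd --non-interactive in os-autoinst &&
        zypper --no-cd --non-interactive in openQA-worker
}

# Usage (commented out so that sourcing this snippet has no side effects):
# retry_cmd 3 10 install_worker
```

Wrapping the whole sequence rather than each call separately is a deliberate choice here: if the install fails because the metadata is outdated, retrying only the install would keep using the same stale metadata.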
Updated by okurz over 2 years ago
Updated by okurz over 2 years ago
There is also good movement in https://github.com/openSUSE/zypper/issues/420
On a related note there is also https://hackweek.opensuse.org/21/projects/generic-retry-command-in-opensuse with https://build.opensuse.org/package/show/Base:System/retry and a pending SR to oSC:Factory
Updated by okurz over 2 years ago
- Status changed from Feedback to Workable
"retry" is now included in openSUSE:Factory and therefore also in Tumbleweed, so we can consider using that as well. In the meantime https://openqa.opensuse.org/tests/2417194#step/test_distribution/4 showed that even after three retries with sleeping in between we can still hit a temporarily non-existent package. I have to try out whether retrying more often and waiting longer is the way to go.
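One way to experiment with "more retries, longer waits" is a delay that doubles after every failed attempt. The following is only a sketch under that assumption; retry_backoff and the RETRY_SLEEP override are hypothetical names, and this is not how the packaged "retry" tool is implemented or invoked.

```shell
# Hypothetical helper (not from the ticket): retry a command up to $1 times,
# starting with a delay of $2 seconds and doubling it after each failure.
# RETRY_SLEEP can be set to another command to dry-run the delay schedule.
retry_backoff() {
    _rb_max=$1
    _rb_delay=$2
    shift 2
    _rb_n=1
    while [ "$_rb_n" -le "$_rb_max" ]; do
        # Stop as soon as the command succeeds
        "$@" && return 0
        # No point in sleeping after the final attempt
        [ "$_rb_n" -eq "$_rb_max" ] && break
        echo "attempt $_rb_n/$_rb_max failed, retrying in ${_rb_delay}s" >&2
        ${RETRY_SLEEP:-sleep} "$_rb_delay"
        _rb_delay=$((_rb_delay * 2))
        _rb_n=$((_rb_n + 1))
    done
    return 1
}

# Usage (commented out so that sourcing this snippet has no side effects):
# retry_backoff 7 30 zypper -n in os-autoinst-distri-opensuse-deps
```

With 7 attempts and an initial delay of 30 seconds, the waits would be 30, 60, 120, 240, 480 and 960 seconds, so the command gets roughly half an hour for a missing package to reappear on the mirrors before the job fails.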
Updated by okurz over 2 years ago
- Related to action #112595: continous deployment installed old version of openQA due to timeout accessing a repo size:M added
Updated by okurz over 2 years ago
- Blocks action #111446: openQA-in-openQA tests fail due to corrupted downloaded rpm auto_review:"Test died: command '.*zypper -n in os-autoinst-distri-opensuse-deps' failed at openqa//tests/install/test_distribution.pm line 1.*":retry added
Updated by okurz over 2 years ago
- Due date changed from 2022-06-23 to 2022-07-07
I have planned to work on "retry" during hackweek, so I might get something done next week
Updated by okurz over 2 years ago
- Due date deleted (2022-07-07)
- Status changed from New to Resolved
https://github.com/os-autoinst/os-autoinst-distri-openQA/pull/92 fixed everything, at least for openQA-in-openQA. https://openqa.opensuse.org/tests/2453475#next_previous looks stable.