action #111446

openQA-in-openQA tests fail due to corrupted downloaded rpm auto_review:"Test died: command '.*zypper -n in os-autoinst-distri-opensuse-deps' failed at openqa//tests/install/test_distribution.pm line 1.*":retry

Added by okurz about 1 month ago. Updated 9 days ago.

Status:
Blocked
Priority:
Low
Assignee:
Target version:
Start date:
2022-05-23
Due date:
% Done:

0%

Estimated time:

Description

Observation

https://openqa.opensuse.org/tests/2364769#step/test_distribution/8 shows in the screenshot

canvas.png

so a digest check failed for an RPM package. I wonder what users are expected to do. zypper asks whether to "retry retrieval", which is of course answered negatively in non-interactive mode. We try to work around that with another "retry" on the outer level, but this becomes annoying quickly, so we should aim to get a proper solution into the OS itself, e.g. into zypper.

Acceptance criteria

  • AC1: Common openSUSE-related openQA test distributions (e.g. os-autoinst-distri-opensuse and os-autoinst-distri-openQA) automatically retry on related errors

Suggestions

There are already upstream feature requests, such as the already resolved https://github.com/openSUSE/zypper/issues/177 and https://github.com/openSUSE/zypper/issues/312, but there is also at least one open feature request, https://github.com/openSUSE/zypper/issues/420, which we can support, e.g. by proposing an implementation or asking nicely for it to be prioritized.

canvas.png (44.2 KB) okurz, 2022-05-23 09:40

Related issues

Blocked by QA - action #112232: [tools] Multiple recurring failures due to zypper failing to download packages temporarily | New | 2022-06-09 | 2022-07-07

History

#1 Updated by okurz about 1 month ago

  • Tags set to reactive work

#2 Updated by andriinikitin about 1 month ago

To me, these "digest failed" and "file not found" errors in the next job look like concurrency issues where an rpm is replaced on the server during the download or after zypper refresh.
There is no protection against that in the infrastructure, and it is likely to happen if the source gets released / published often.
The best idea I have at the moment is to just retry zypper ref -f && zypper in ... again.
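
The retry idea above could be sketched roughly as follows. This is only a minimal illustration, not the change that went into the test code: the helper names `retry` and `install_deps`, the attempt count and the delay are assumptions, and the package name is taken from the ticket subject.

```shell
#!/bin/sh
# Minimal sketch of an outer retry loop (hypothetical helper, not actual test code).
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Note: plain POSIX sh has no "local", so the variables below are global.
retry() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while true; do
        rc=0
        "$@" || rc=$?            # run the command and remember its exit code
        if [ "$rc" -eq 0 ]; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts (exit code $rc)" >&2
            return "$rc"
        fi
        echo "attempt $attempt failed (exit code $rc), retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}

# Force a metadata refresh before every install attempt, so that a package
# replaced on the server in the meantime is looked up again.
install_deps() {
    zypper -n ref -f && zypper -n in os-autoinst-distri-opensuse-deps
}
```

A call such as `retry 3 30 install_deps` would then try the refresh plus install up to three times with a 30 second pause in between. The refresh is part of the retried unit on purpose: retrying only the install would keep using the stale metadata that caused the digest mismatch.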

#3 Updated by okurz about 1 month ago

andriinikitin wrote:

To me, these "digest failed" and "file not found" errors in the next job look like concurrency issues where an rpm is replaced on the server during the download or after zypper refresh.
There is no protection against that in the infrastructure, and it is likely to happen if the source gets released / published often.
The best idea I have at the moment is to just retry zypper ref -f && zypper in ... again.

Yes, but IMHO that is something zypper should allow in non-interactive mode. It already offers to retry interactively, so I see no point in having to implement the same thing separately in non-interactive scripts.

#4 Updated by okurz about 1 month ago

  • Priority changed from Normal to Low

#5 Updated by okurz 29 days ago

  • Status changed from New to Blocked
  • Assignee set to okurz

#6 Updated by okurz 9 days ago

  • Blocked by action #112232: [tools] Multiple recurring failures due to zypper failing to download packages temporarily added

#7 Updated by okurz 9 days ago

  • Subject changed from openQA-in-openQA tests fail due to corrupted downloaded rpm auto_review:"Test died: command 'zypper -n in os-autoinst-distri-opensuse-deps' failed at openqa//tests/install/test_distribution.pm line 13":retry to openQA-in-openQA tests fail due to corrupted downloaded rpm auto_review:"Test died: command '.*zypper -n in os-autoinst-distri-opensuse-deps' failed at openqa//tests/install/test_distribution.pm line 1.*":retry

In the meantime we have updated the test code to use retries, but that is not enough; see the blocker #112232. Updated the auto-review expression accordingly.
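
To illustrate the difference between the two expressions, here is a small check using `grep -E` as a stand-in for the real auto-review matching, which may use a different regex flavour. The sample log line is made up: the `retry -s 30 --` prefix and the line number 17 are assumptions chosen only to show a command prefix and a shifted line number.

```shell
#!/bin/sh
# Old and new auto_review expressions, copied from the subject change above.
old_expr="Test died: command 'zypper -n in os-autoinst-distri-opensuse-deps' failed at openqa//tests/install/test_distribution.pm line 13"
new_expr="Test died: command '.*zypper -n in os-autoinst-distri-opensuse-deps' failed at openqa//tests/install/test_distribution.pm line 1.*"

# Hypothetical log line: the command prefix and the line number are invented.
sample="Test died: command 'retry -s 30 -- zypper -n in os-autoinst-distri-opensuse-deps' failed at openqa//tests/install/test_distribution.pm line 17"

# matches REGEX TEXT: succeeds if the extended regex matches the text.
matches() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}

matches "$old_expr" "$sample" && echo "old expression matches" || echo "old expression does not match"
matches "$new_expr" "$sample" && echo "new expression matches" || echo "new expression does not match"
```

The old expression is pinned to a command starting with `zypper` and to line 13, so it misses the sample, while the new one tolerates any prefix before `zypper` and any line number starting with 1.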
