action #163112
closed
test fails in openqa_webui due to repeated and reproducible errors in reading from the devel:openQA repository "repodata…filelists-ext.xml.gz not found on medium" size:S
Added by okurz 6 months ago.
Updated about 1 month ago.
Category:
Regressions/Crashes
Description
Observation
openQA test in scenario openqa-Tumbleweed-dev-x86_64-openqa_install_nginx@64bit-2G fails in openqa_webui due to repeated and reproducible errors in reading from the devel:openQA repository ("repodata…filelists-ext.xml.gz not found on medium") on the command
retry -e -s 30 -- zypper -n --gpg-auto-import-keys ref
I assume the problem happens when devel:openQA is in the process of being refreshed due to frequent updates in devel:openQA; however, there should be a better way to ensure consistent and, at best, atomic updates of the repo content.
Expected result
Last good: :TW.29599 (or more recent)
Acceptance criteria
- AC1: The latest runs of the scenario pass consistently even if devel:openQA is frequently updated
Suggestions
- We "only" do 3 retries (see the retry sketch after this list)
- Consider how often we retry in other cases and make it consistent
- Keep in mind the script timeout
- Research upstream whether there is a better way to handle this, e.g. look into github.com/openSUSE/zypper/, mailing lists or forums regarding OBS/mirror/zypper behaviour. Also engage with domain experts in corresponding chat channels to find best practices and apply them. According to livdywan she already did all of that, so maybe we need to come up with ideas ourselves.
- Maybe we need to set something cool in the OBS project config to keep older data intact until new repository content is completely available?
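As an illustration of the first two suggestions, a minimal bash sketch of what a longer retry loop around the refresh could look like; the attempt count and sleep value are hypothetical and would have to stay within the overall script timeout:
#!/bin/bash
# Hypothetical sketch, not the current test code: retry the repository refresh
# more often than the current 3 attempts; attempts * sleep_s must stay below
# the surrounding script timeout.
attempts=6   # assumed value
sleep_s=30   # same sleep as the existing "retry -e -s 30" invocation
for ((i = 1; i <= attempts; i++)); do
    zypper -n --gpg-auto-import-keys ref && exit 0
    echo "zypper ref failed (attempt $i/$attempts), retrying in ${sleep_s}s" >&2
    sleep "$sleep_s"
done
exit 1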
Further details
Always latest result in this scenario: latest
- Subject changed from test fails in openqa_webui due to repeated and reproducible errors in reading from the devel:openQA repository "repodata…filelists-ext.xml.gz not found on medium" to test fails in openqa_webui due to repeated and reproducible errors in reading from the devel:openQA repository "repodata…filelists-ext.xml.gz not found on medium" size:S
- Description updated (diff)
- Status changed from New to Workable
- Related to action #162848: webui-docker-compose tests failing on GitHub PR's size:S added
- Tags changed from alert, infra to alert, infra, reactive work
- Project changed from openQA Tests (public) to openQA Project (public)
- Category deleted (Bugs in existing tests)
- Priority changed from Normal to High
- Related to deleted (action #162848: webui-docker-compose tests failing on GitHub PR's size:S)
- Blocks action #162848: webui-docker-compose tests failing on GitHub PR's size:S added
- Category set to Regressions/Crashes
- Blocks deleted (action #162848: webui-docker-compose tests failing on GitHub PR's size:S)
- Status changed from Workable to In Progress
- Assignee set to mkittler
- Related to action #162848: webui-docker-compose tests failing on GitHub PR's size:S added
- Due date set to 2024-07-24
Setting due date based on mean cycle time of SUSE QE Tools
- Status changed from In Progress to Feedback
- Related to action #161729: [sporadic] test fails in containers/build of openqa-in-openqa probably due to temporary download.opensuse.org and zypper issues added
I asked again on our internal channel. I guess there were two main suggestions:
- Add the -vvv flag (zypper -vvv ref openQA) to see any further details.
- Collect something like tail -n 400 /var/log/zypper.log on failure to see whether any mirror is involved (see the sketch below).
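A rough, untested sketch of how both suggestions could be combined; only the -vvv flag and the log path come from the suggestions above, the rest is illustrative:
#!/bin/bash
# Rough sketch: verbose refresh, and on failure dump the end of the zypper log
# so any involved mirror shows up in the collected job logs.
if ! zypper -n -vvv --gpg-auto-import-keys ref; then
    echo "zypper ref failed, collecting zypper log tail" >&2
    tail -n 400 /var/log/zypper.log >&2
    exit 1
fi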
Neither sounds really promising, and the second is also in conflict with AC1 because I would need to remove the retry again. (Otherwise we would probably not be aware of the relevant jobs and would never look into those logs after all.) I guess I'll leave checking the zypper log for when I encounter the issue while updating my local system or one of our servers manually.
But isn't an openQA test the perfect candidate to do this reproduction and log collection? IMHO the issue is more likely to happen when devel:openQA is rebuilt, so consider triggering the tests just after/while devel:openQA content is building, or triggering that recurringly to provoke the issue.
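For example, a simple reproduction loop along those lines could look like this (just a sketch; the repository alias devel_openQA is an assumption and may differ per setup):
#!/bin/bash
# Sketch of a recurring reproduction attempt: refresh only the devel:openQA
# repository in a loop and stop once the transient metadata error shows up.
# The repository alias "devel_openQA" is an assumption and may differ.
while true; do
    out=$(zypper -n ref devel_openQA 2>&1) || {
        echo "$out"
        if grep -q "not found on medium" <<<"$out"; then
            echo "Reproduced the repodata error" >&2
            break
        fi
    }
    sleep 60
done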
- Status changed from Feedback to Workable
- Status changed from Workable to Resolved
I think this is too much effort for this specific and not so frequently occurring problem, especially because those ideas are actually not that promising (they're just the only thing that came to mind).
- Due date deleted (2024-07-24)
- Status changed from Resolved to In Progress
- Assignee changed from mkittler to okurz
- Priority changed from High to Low
OK, interesting. I think I will try to build in some of the mentioned debugging and try some things in openQA tests.
- Due date set to 2024-07-26
- Status changed from In Progress to Feedback
- Due date changed from 2024-07-26 to 2024-12-31
- Target version changed from Ready to Tools - Next
More progress:
(Oliver Kurz) Thx. Yes, we can. https://github.com/os-autoinst/os-autoinst-distri-openQA/pull/195 is now merged. We observed the issue also in many other places, e.g. GitLab CI jobs that we use for automatically deploying openQA and such, but it's easier to reproduce in openQA-in-openQA tests. You stated "deployed a hotpatch and downloadcontent should now be used only for versioned files", so let's see if we hit the problem again at all.
- Related to action #165399: Unable to use openqa-single-instance due to "Valid metadata not found at specified URL" reproducing often size:S added
- Due date deleted (2024-12-31)
- Status changed from Feedback to Resolved
It seems that with the changes to the mirror infrastructure and the retries we have applied on multiple levels, we have not been running into related problems recently.
- Target version changed from Tools - Next to Ready