tickets #137045

open

download.opensuse.org issues seen in our build pipelines

Added by gameboy974 7 months ago. Updated 7 months ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
Mirrors
Target version:
-
Start date:
2023-09-26
Due date:
% Done:

0%

Estimated time:

Description

Hi there!

I would like to report an issue my team saw yesterday during the build of our project: https://github.com/aquarist-labs/s3gw.

It seems that http://download.opensuse.org/distribution/leap/15.4/repo/oss/ was erratic for about 8 hours. Could you confirm that we were facing an infrastructure issue during that time? We are also all ears for any tips or advice on how to avoid being slowed down by issues like this.

Thank you very much!

Attempt#1: https://github.com/aquarist-labs/s3gw/actions/runs/6296237263/job/17090845649
2023-09-25T07:44:07.4929222Z #8 725.1 Timeout exceeded when accessing 'http://download.opensuse.org/distribution/leap/15.4/repo/oss/x86_64/libpmem1-1.11.1-150400.1.10.x86_64.rpm'.
2023-09-25T07:44:07.6436902Z #8 725.1 Retrying in 30 seconds...
2023-09-25T07:47:40.6257739Z #8 756.8 Retrieving: libpmem1-1.11.1-150400.1.10.x86_64.rpm […………error]

Attempt#2: https://github.com/aquarist-labs/s3gw/actions/runs/6296237263/job/17095244395
2023-09-25T10:09:42.8647089Z #9 726.1 Timeout exceeded when accessing 'http://download.opensuse.org/distribution/leap/15.4/repo/oss/noarch/perl-Error-0.17025-1.20.noarch.rpm'.
2023-09-25T10:13:15.7906489Z #9 757.8 Retrieving: perl-Error-0.17025-1.20.noarch.rpm [............error]

Attempt#3: https://github.com/aquarist-labs/s3gw/actions/runs/6296237263/job/17101805355
2023-09-25T13:35:10.6202386Z #9 725.1 Timeout exceeded when accessing 'http://download.opensuse.org/distribution/leap/15.4/repo/oss/x86_64/cpp-7-3.9.1.x86_64.rpm'.
2023-09-25T13:35:10.7708850Z #9 725.1 Retrying in 30 seconds...
2023-09-25T13:38:43.7541710Z #9 756.8 Retrieving: cpp-7-3.9.1.x86_64.rpm [............error]

Attempt#4: https://github.com/aquarist-labs/s3gw/actions/runs/6296237263/job/17103131938
2023-09-25T14:22:18.9547935Z #5 1478.1 - Timeout exceeded when accessing 'http://download.opensuse.org/update/leap/15.4/sle/repodata/a59d72a8e7993bf4f2020827e1f8dcfef9c8dc68d40138476b13b69a4079ec79-susedata.xml.gz'.

Attempt#5: https://github.com/aquarist-labs/s3gw/actions/runs/6296237263/job/17105018495
2023-09-25T14:58:37.2113891Z #9 724.8 Timeout exceeded when accessing 'http://download.opensuse.org/distribution/leap/15.4/repo/oss/x86_64/gettext-runtime-0.20.2-1.43.x86_64.rpm'.
2023-09-25T15:02:10.1384258Z #9 756.5 Retrieving: gettext-runtime-0.20.2-1.43.x86_64.rpm [............error]

Attempt#6: https://github.com/aquarist-labs/s3gw/actions/runs/6296237263/job/17106177406
2023-09-25T15:27:59.4680967Z #9 720.3 Timeout exceeded when accessing 'http://download.opensuse.org/update/leap/15.4/sle/x86_64/sqlite3-devel-3.39.3-150000.3.20.1.x86_64.rpm'.
2023-09-25T15:31:32.5434586Z #9 752.0 Retrieving: sqlite3-devel-3.39.3-150000.3.20.1.x86_64.rpm

Passing Attempt#7: https://github.com/aquarist-labs/s3gw/actions/runs/6296237263/job/17107238110
2023-09-25T15:47:57.5634617Z #9 148.6 (310/310) Installing: python3-Sphinx-2.3.1-150400.5.69.noarch [.....done]

Regards,

Vincent Moutoussamy
Engineering Manager
Paris, France


Files

image001.png (3.88 KB) image001.png gameboy974, 2023-09-26 14:37
Actions #1

Updated by crameleon 7 months ago

  • Category set to Mirrors
  • Private changed from Yes to No
Actions #2

Updated by luc14n0 7 months ago

Hi there Vincent,

First of all, I'd like to point out that download.opensuse.org is basically a redirector.
So, those CI run logs won't get us much closer to the truth. Also, our monitoring metrics
for the download.opensuse.org host over the last 48 hours don't show anything unusual to
me. Thus, to get to the bottom of this we'd need to see the Zypper log files
(/var/log/zypper.log) of those failed runs -- which I understand might not be possible
because the CI is using containers.

That said, we also have regional redirectors around the world -- mirrorcache-eu.opensuse.org
in EU, and there might be other secondary redirectors around EU that I'm not aware of
(yet) -- that send people to mirrors in their regions. This means that nowadays not as
many requests actually reach download.opensuse.org as in the past.

Looking at those CI runs I can see that all attempts failed at the package download step
of the package installation, except attempt number 4, which failed while refreshing the
repository metadata. They all fail with "Timeout exceeded when accessing ...", which makes
me think that the mirror those requests ended up at had stability issues that caused those
connection timeouts -- Zypper can time out either while establishing a connection to a
server or during downloads that take too long.

So, my best guess is the CI runs were hitting a bad mirror. Usually, depending on the
issue, bad mirrors are (temporarily) removed from the list of mirrors to avoid people
being redirected to them until their issues are sorted out. However, I imagine there
are cases where a bad mirror can't be distinguished from a good one, by our checks.

Unfortunately, in this case I don't think there's much you can do to mitigate the
situation, since you are using GitHub Actions. Otherwise, I'd advise temporarily using
an alternate mirror directly instead of download.opensuse.org in the "baseurl" of the
Zypper repo files under /etc/zypp/repos.d/. To find out which mirror you get redirected
to, you can use the URL of a failing package download, such as:

curl -IL 'http://download.opensuse.org/distribution/leap/15.4/repo/oss/x86_64/gettext-runtime-0.20.2-1.43.x86_64.rpm'
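Since the output of "curl -IL" lists the headers of every hop in the redirect chain, a small filter can help pick out the mirror that finally served the request. The following is only a sketch: the function name final_location is made up for illustration, and it assumes each redirect carries an absolute URL in its Location header.

```shell
# Print the last Location header found in HTTP response headers read on stdin.
# "final_location" is a hypothetical helper name, not part of curl or Zypper.
final_location() {
    # Strip carriage returns, then remember the most recent Location value.
    tr -d '\r' | awk 'tolower($1) == "location:" { loc = $2 } END { if (loc != "") print loc }'
}

# Usage (needs network access, so it is left commented out here):
#   curl -sIL 'http://download.opensuse.org/distribution/leap/15.4/repo/oss/x86_64/gettext-runtime-0.20.2-1.43.x86_64.rpm' | final_location
```

If no redirect happened, the function prints nothing. curl can also report the final URL by itself with -w '%{url_effective}'.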

And you can see a list of mirrors serving that package by visiting the URL:

http://download.opensuse.org/distribution/leap/15.4/repo/oss/x86_64/gettext-runtime-0.20.2-1.43.x86_64.rpm.mirrorlist
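For setups where the repo files can be edited (for example inside a container image build), pinning a repository to one mirror through its "baseurl" might look roughly like the sketch below. The helper name write_repo_file, the repo alias, and the host mirror.example.org are assumptions for illustration only; pick a real mirror from the mirror list above.

```shell
# Write a minimal Zypper .repo file whose baseurl points at one specific mirror.
# Arguments: repo alias, base URL, output file.
# "write_repo_file" is a hypothetical helper name.
write_repo_file() {
    alias_name=$1
    base_url=$2
    out_file=$3
    cat > "$out_file" <<EOF
[$alias_name]
name=$alias_name (pinned mirror)
enabled=1
autorefresh=1
baseurl=$base_url
EOF
}

# Usage (mirror.example.org is a placeholder, not a real mirror):
#   write_repo_file repo-oss 'http://mirror.example.org/distribution/leap/15.4/repo/oss/' /etc/zypp/repos.d/repo-oss.repo
```

This is meant as a temporary workaround; switching the baseurl back to download.opensuse.org once the mirror issue is resolved restores the normal redirection.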

Kind regards,
Luciano

Actions #3

Updated by luc14n0 7 months ago

Interestingly enough, I've run across the same issue, but in my case I strongly believe it happened because I was using a USB WiFi adapter that is poorly supported by the in-tree Linux driver, which severely cripples my download speed.

And the "Timeout exceeded when accessing" output is a bit misleading, as it really times out while trying to download the RPM binary (Timeout exceeded when accessing 'http://mirrorcache-br-2.opensuse.org/repositories/GNOME:/Next/openSUSE_Factory/x86_64/libwebkit2gtk-4_1-0-2.42.1-1038.1.x86_64.rpm'). What's more, I have "download.transfer_timeout = 300" in /etc/zypp/zypp.conf, instead of the default 180. However, from what I can see Zypper is not honoring that configuration and is still timing out after only 3 minutes.
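To rule out a typo or a still-commented-out line, one could check which value the configuration file actually contains. This is only a sketch: the function name transfer_timeout is made up, and it reports what the file says, not what libzypp ends up applying at run time.

```shell
# Print the download.transfer_timeout value set in a zypp.conf-style file,
# falling back to the default of 180 when the key is unset or commented out.
# "transfer_timeout" is a hypothetical helper name.
transfer_timeout() {
    conf=${1:-/etc/zypp/zypp.conf}
    # Only match uncommented assignments; the last one in the file wins.
    value=$(sed -n 's/^[[:space:]]*download\.transfer_timeout[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$conf" 2>/dev/null | tail -n 1)
    echo "${value:-180}"
}

# Usage:
#   transfer_timeout                      # reads /etc/zypp/zypp.conf
#   transfer_timeout /path/to/zypp.conf   # reads another file
```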

Regards,
Luciano
