tickets #137345

open

CDN redirecting to bad mirror (opensusemirror.vod.comcast.com)

Added by mikebirdgeneau about 1 year ago. Updated about 1 year ago.

Status: Feedback
Priority: Normal
Assignee: -
Category: Mirrors
Target version: -
Start date: 2023-10-03
Due date:
% Done: 0%
Estimated time:
Description

Hi!
Been running into issues with repos in MicroOS/Aeon recently; it looks like for the failed URLs, the CDN is redirecting me to a subdirectory of:
http://opensusemirror.vod.comcast.com/tumbleweed/repo/

This server doesn't appear to be responding (timeout).
The first time I've run into the issue was on Sept 29, 2023, but the problem seems to be persisting.

This URL from the mirror status page returns a "raptor not found" message:
http://download.opensuse.org/app/server/opensusemirror.vod.comcast.com

Thanks for looking into it!
Mike


Files

vod.comcast.jpeg (66.8 KB), pjessen, 2023-10-03 07:11
Screenshot from 2023-10-03 06-43-24.png (23.5 KB), "Screenshot of mirror from my location", mikebirdgeneau, 2023-10-03 12:43
cdn_error.png (164 KB), mikebirdgeneau, 2023-10-04 20:09
Actions #1

Updated by pjessen about 1 year ago

Been running into issues with repos in MicroOS/Aeon recently; it looks like for the failed URLs, the CDN is
redirecting me to a subdirectory of:
http://opensusemirror.vod.comcast.com/tumbleweed/repo/

This server doesn't appear to be responding (timeout).
The first time I've run into the issue was on Sept 29, 2023, but the problem seems to be persisting.

Hi Mike
Unable to reproduce - seen from here, that mirror works fine. See attached.

Actions #2

Updated by mikebirdgeneau about 1 year ago

Thanks - is it possible that the mirror is down from certain geographic locations?

I ran tracepath to see if I could gather any additional information, and here's what I found:

$ tracepath opensusemirror.vod.comcast.com
 1?: [LOCALHOST]                      pmtu 1500
[ ... omitted local / ISP traceroute ... ]
 9:  be-207-pe12.seattle.wa.ibone.comcast.net             34.045ms asymm  4 
10:  be-2412-cs04.seattle.wa.ibone.comcast.net            37.223ms asymm  5 
11:  be-1411-cr11.seattle.wa.ibone.comcast.net            36.600ms asymm  6 
12:  be-301-cr11.champa.co.ibone.comcast.net              64.818ms 
13:  be-1411-cs04.champa.co.ibone.comcast.net             67.000ms asymm 11 
14:  be-1314-cr14.champa.co.ibone.comcast.net             68.062ms asymm 10 
15:  be-304-cr13.1601milehigh.co.ibone.comcast.net        67.917ms asymm  9 
16:  be-1414-cs04.1601milehigh.co.ibone.comcast.net       68.008ms asymm  8 
17:  be-1111-cr11.1601milehigh.co.ibone.comcast.net       67.683ms asymm  7 
18:  be-304-cr21.350ecermak.il.ibone.comcast.net          70.848ms asymm  6 
19:  be-1221-cs22.350ecermak.il.ibone.comcast.net         69.404ms asymm  5 
20:  ae33-ar02-d.northlake.il.ndcchgo.comcast.net         72.509ms asymm  6 
21:  et-0-0-25-sas01-d.northlake.il.ndcchgo.comcast.net   90.459ms asymm 11 
22:  lo0-t1s8016-d.northlake.il.ndcchgo.comcast.net       69.922ms asymm  8 
23:  lo0-t2s8007-d.northlake.il.ndcchgo.comcast.net       68.901ms asymm  9 
24:  lo0-t1s8025-d.northlake.il.ndcchgo.comcast.net       73.599ms asymm 13 
25:  no reply
26:  no reply
27:  no reply
28:  no reply
29:  no reply
30:  no reply
     Too many hops: pmtu 1500
     Resume: pmtu 1500 

It looks like the vod.comcast mirror isn't accessible from my location - but it looks to be something within the Comcast infrastructure / routing?

Actions #3

Updated by pjessen about 1 year ago

mikebirdgeneau wrote in #note-2:

Thanks - is it possible that the mirror is down from certain geographic locations?

It is possible, some mirrors do restrict access to their own country only. In this case, it sounds unlikely though.

I ran tracepath to see if I could gather any additional information, and here's what I found:
[snip]
It looks like the vod.comcast mirror isn't accessible from my location - but it looks to be something within the Comcast infrastructure / routing?

Yes, that would also be my guess.

Actions #4

Updated by pjessen about 1 year ago

Oh, about "raptor not found" - I have no idea, it works for my own mirror: http://download.opensuse.org/app/server/mirror.hostsuisse.com

Actions #5

Updated by mikebirdgeneau about 1 year ago

I think we've narrowed it down to a CDN issue related to a specific mirror and a specific geographic / network routing location.

Now, fortunately I was able to figure out a workaround, but a less experienced user would certainly struggle in this situation (especially when it occurs during the system install process). It brings up an interesting aspect of the CDN for me: while local mirror selection can result in higher latency, the CDN creates the potential for these types of issues, whereas local mirror selection logic would simply have fallen back to a different mirror.

CDNs are definitely not something I'm very familiar with, but that's why I posted this issue here - hopefully someone with more experience can figure out what's going on, and how to ensure other users don't run into the same problems!

The only mention I could find of the 'raptor' error was in this repo:
https://github.com/drdrew42/renderer/issues/63
Perhaps the 'renderer' is being used somewhere in the infrastructure.

Cheers!
Mike

Actions #6

Updated by luc14n0 about 1 year ago

Hi there Mike,

I'm in Brazil and I can access http://opensusemirror.vod.comcast.com/tumbleweed/repo/:

$ curl -IL http://opensusemirror.vod.comcast.com/tumbleweed/repo/oss/repodata/repomd.xml
HTTP/1.1 200 OK
Server: nginx/1.21.5
Date: Tue, 03 Oct 2023 20:13:46 GMT
Content-Type: text/xml
Content-Length: 10725
Last-Modified: Sun, 01 Oct 2023 23:40:17 GMT
Connection: keep-alive
ETag: "651a0361-29e5"
Accept-Ranges: bytes

But, like in your case, traceroute/tracepath shows that the mirror is unreachable, with packets dying on Northlake, IL.

Regards,
Luciano

Actions #7

Updated by mikebirdgeneau about 1 year ago

Thanks for the extra input / datapoint, Luciano.

I'm now seeing the same thing as you: the mirror appears reachable by URL, but tracepath is still showing packets dying at the same location.

The link from the mirrors.opensuse.org site: http://download.opensuse.org/app/server/opensusemirror.vod.comcast.com still returns the 'raptor not found' message, instead of the repo listing.

But... visiting the mirror directly is now working (yesterday it was timing out).

I guess what this problem likely comes down to is: Is there a better way to handle failures associated with geo-location specific failures in CDN urls? E.g. Fallback to a different mirror instead of CDN when this occurs?

When I ran into this from the system installation GUI, it was a bit of a pain.

I'll leave this to people far smarter / more experienced than me with how openSUSE handles the mirrors, CDN, etc. and how this works with zypper - but thought it worth bringing to attention.

Cheers,
Mike

Actions #8

Updated by luc14n0 about 1 year ago

Not a problem.

About the http://download.opensuse.org/app/server/... URL: I suspect that not every mirror is configured for that; otherwise there's something going on, as there are several others in the same state:

http://download.opensuse.org/app/server/mirror.clarkson.edu
http://download.opensuse.org/app/server/mirror.ette.biz
http://download.opensuse.org/app/server/mirror.fcix.net
http://download.opensuse.org/app/server/mirror.math.princeton.edu

And probably there are more than that.

I guess what this problem likely comes down to is: Is there a better way to handle failures associated with geo-location specific failures in CDN urls? E.g. Fallback to a different mirror instead of CDN when this occurs?

I'm not sure the CDN is the one to blame here. At the beginning of the CDN experiment setup, CDN nodes were caching rpm packages rather than redirecting to mirrors -- just like download.opensuse.org. However, I can't tell whether we've already reached the point where we're caching redirections, as was initially the aim for when the CDN reached production level.

Now, talking more specifically about the issue you ran into. As far as I can tell the MirrorCache instance behind download.opensuse.org is the one scanning openSUSE mirrors from time to time, not only to check if they're up but also to check the freshness of their content. So, download.o.o is in Germany. And being in Germany, I'd suppose it wouldn't hit this issue like you did, since pjessen -- who's also in EU -- didn't hit it. Plus, if it had hit the issue, the mirror would've been excluded from mirrors.opensuse.org until the issue got sorted out.

Unfortunately, I'm afraid this kind of situation can't be easily prevented when there aren't many routes to a given mirror and one of these routes gets hit by some issue (troubled DNS, a broken optic cable, some BGP fault that ends up ignoring that specific route), so that in the end only some people can't reach the host.

You said you found a workaround, what was it?

Regards,
Luciano

Actions #9

Updated by mikebirdgeneau about 1 year ago

This all makes good sense to me.
You're right, it's not specifically a CDN issue - what I was getting at is that an external machine (especially in a different geo location) can't check if my local machine can access a specific mirror when making the request. Without being able to do that, this doesn't really seem preventable.

My workaround is nothing fancy, I basically just edited the repo URLs temporarily to point at a specific mirror (bypassing the CDN). I also considered a very temporary edit to /etc/hosts (but wasn't sure how SSL certificates would be handled).
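For anyone hitting the same problem, the temporary edit could look roughly like the sketch below. It assumes the repo files under /etc/zypp/repos.d/ use a baseurl starting with http(s)://cdn.opensuse.org, and that the chosen mirror exposes the same directory layout under the base URL you pass in. The function name, file name and mirror URL are placeholders, not values from this ticket.

```shell
# Print a repo file with the CDN host in baseurl= lines replaced by a mirror
# base URL. Writes to stdout and leaves the original file untouched, so the
# result can be reviewed before it is copied into place.
rewrite_baseurl() {
    repo_file=$1
    mirror_base=$2   # e.g. http://mirror.example.org/opensuse (hypothetical)
    sed -E "s#^(baseurl=)https?://cdn\.opensuse\.org#\1${mirror_base}#" "$repo_file"
}

# Usage (placeholder names):
#   rewrite_baseurl /etc/zypp/repos.d/some-repo.repo http://mirror.example.org/opensuse
```

Keeping a copy of the original repo file makes it easy to switch back to the CDN once the mirror is reachable again.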

Seems all good now. I definitely learned some things!

Actions #10

Updated by pjessen about 1 year ago

@luc14n0 wrote in #note-6:

But, like in your case, traceroute/tracepath shows that the mirror is unreachable, with packets dying on Northlake, IL.

It is not too unusual for ICMPs to stop being accepted at some ingress point.
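Since tracepath relies on ICMP replies coming back, a silent hop does not by itself prove that the mirror is down. Testing the TCP port that the download would actually use is more telling. A minimal sketch, assuming bash and the coreutils timeout command are installed; the function name tcp_reachable is hypothetical:

```shell
# Return 0 if a TCP connection to host:port succeeds within the timeout
# (in seconds, default 5), non-zero otherwise.
tcp_reachable() {
    # bash's /dev/tcp pseudo-device opens a TCP socket; timeout(1) bounds the wait.
    # Host and port are passed as $0 and $1 of the inner bash script.
    timeout "${3:-5}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage:
#   tcp_reachable opensusemirror.vod.comcast.com 80 && echo reachable || echo unreachable
```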

At the beginning of the CDN experiment setup, CDN nodes were ...

I had automatically read "CDN" to mean "mirrorcache", I don't think the experimental CDN is involved here?

Mike wrote:

I guess what this problem likely comes down to is: Is there a better way to handle failures associated with geo-location specific failures in CDN urls?
E.g. Fallback to a different mirror instead of CDN when this occurs?
When I ran into this from the system installation GUI, it was a bit of a pain.

If a package cannot be fetched from location#1, zypper will always try the next one, etc. If a single mirror causes a problem, that suggests some package was only available from that one mirror. Mike, if you have a specific package that is causing an issue, we can look at the list of mirrors for that.
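The fallback idea can be illustrated with a small loop that walks a list of candidate URLs and stops at the first one that answers. This is not zypper's actual code, just a sketch of the behaviour described above; first_working and head_ok are hypothetical names, and the short connect timeout is an assumption chosen so that an unresponsive mirror does not stall the whole attempt.

```shell
# Try each URL in order with the given check command; print the first URL
# that works and return 0, or return 1 if none of them do.
first_working() {
    check=$1
    shift
    for url in "$@"; do
        if "$check" "$url"; then
            printf '%s\n' "$url"
            return 0
        fi
    done
    return 1
}

# One possible check: a HEAD request that gives up connecting after 10 seconds
# and treats HTTP error statuses as failures.
head_ok() {
    curl -sfI --connect-timeout 10 -o /dev/null "$1"
}

# Usage (placeholder URLs):
#   first_working head_ok http://mirror-a.example.org/file http://mirror-b.example.org/file
```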

Actions #11

Updated by luc14n0 about 1 year ago

pjessen wrote in #note-10:

@luc14n0 wrote in #note-6:

But, like in your case, traceroute/tracepath shows that the mirror is unreachable, with packets dying on Northlake, IL.

It is not too unusual for ICMPs to stop being accepted at some ingress point.

Oh, I wasn't aware of that. What a bummer. In fact, running an online traceroute service that uses multiple locations, none reaches opensusemirror.vod.comcast.com. So, it might be ICMP saying "You shall not pass!"

At the beginning of the CDN experiment setup, CDN nodes were ...

I had automatically read "CDN" to mean "mirrorcache", I don't think the experimental CDN is involved here?

As far as I can tell it is. MicroOS (based) distro(s) have been using it by default for a while now. And just to emphasize my point, it wouldn't matter whether one's using download.opensuse.org or cdn.opensuse.org as baseurl in repo files if the CDN is not caching packages anymore -- which doesn't seem to be the case, as Mike was redirected to a mirror.

In any case, it would be really good to have a sample of the error output.

Actions #12

Updated by mikebirdgeneau about 1 year ago

I wish I'd done a better job at capturing additional information about the error; however, I did capture one URL that was causing the error, maybe it will be helpful in this context:

If a package cannot be fetched from location#1, zypper will always try the next one, etc. If a single mirror causes a problem, that suggests some package was only available from that one mirror. Mike, if you have a specific package that is causing an issue, we can look at the list of mirrors for that.

http://cdn.opensuse.org/tumbleweed/repo/non-oss/repodata/876557493f31abad37825b61e24c91ddc5859b8a65315e16258318341df1dcc2-appdata.xml.gz

This redirected to this URL (which was timing out):
http://opensusemirror.vod.comcast.com/tumbleweed/repo/non-oss/repodata/876557493f31abad37825b61e24c91ddc5859b8a65315e16258318341df1dcc2-appdata.xml.gz
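A quick way to see which mirror the CDN hands out for a given URL is to follow the redirects and print each Location header. The sketch below is only an illustration: the helper name redirect_targets is made up here, and it simply filters the response headers that curl -sIL prints.

```shell
# Print every redirect target found in HTTP response headers read from stdin.
redirect_targets() {
    # Header names are case-insensitive; HTTP header lines end in CR, so strip it.
    awk 'tolower($1) == "location:" { sub(/\r$/, "", $2); print $2 }'
}

# Usage: follow redirects with HEAD requests and list where each hop points.
#   curl -sIL --max-time 20 "$url" | redirect_targets
# No output means no redirect was issued for that URL.
```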

Fortunately I'm still getting errors with the opensusemirror.vod.comcast.com, so here's some fresh terminal output:

$ curl -v http://cdn.opensuse.org/tumbleweed/repo/non-oss/repodata/876557493f31abad37825b61e24c91ddc5859b8a65315e16258318341df1dcc2-appdata.xml.gz >> /dev/null
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 91.193.113.70:80...
* Connected to cdn.opensuse.org (91.193.113.70) port 80
> GET /tumbleweed/repo/non-oss/repodata/876557493f31abad37825b61e24c91ddc5859b8a65315e16258318341df1dcc2-appdata.xml.gz HTTP/1.1
> Host: cdn.opensuse.org
> User-Agent: curl/8.3.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Server: nginx
< Date: Wed, 04 Oct 2023 20:13:55 GMT
< Content-Type: application/octet-stream
< Content-Length: 2662
< Last-Modified: Sun, 01 Oct 2023 23:08:21 GMT
< Connection: keep-alive
< ETag: "6519fbe5-a66"
< Accept-Ranges: bytes
< 
{ [2662 bytes data]
100  2662  100  2662    0     0  13582      0 --:--:-- --:--:-- --:--:-- 13651
* Connection #0 to host cdn.opensuse.org left intact

And the opensusemirror.vod.comcast.com mirror:

$ curl -v http://opensusemirror.vod.comcast.com/tumbleweed/repo/non-oss/repodata/876557493f31abad37825b61e24c91ddc5859b8a65315e16258318341df1dcc2-appdata.xml.gz
*   Trying 96.106.45.130:80...
* connect to 96.106.45.130 port 80 failed: Connection timed out
* Failed to connect to opensusemirror.vod.comcast.com port 80 after 133234 ms: Couldn't connect to server
* Closing connection
curl: (28) Failed to connect to opensusemirror.vod.comcast.com port 80 after 133234 ms: Couldn't connect to server
Actions #13

Updated by luc14n0 about 1 year ago

mikebirdgeneau wrote in #note-12:

...
And the opensusemirror.vod.comcast.com mirror:

$ curl -v http://opensusemirror.vod.comcast.com/tumbleweed/repo/non-oss/repodata/876557493f31abad37825b61e24c91ddc5859b8a65315e16258318341df1dcc2-appdata.xml.gz
*   Trying 96.106.45.130:80...
* connect to 96.106.45.130 port 80 failed: Connection timed out
* Failed to connect to opensusemirror.vod.comcast.com port 80 after 133234 ms: Couldn't connect to server
* Closing connection
curl: (28) Failed to connect to opensusemirror.vod.comcast.com port 80 after 133234 ms: Couldn't connect to server

That's useful, as I'm not running into this issue. So, it might be as I feared: it's a localized failure and MirrorCache isn't running into the issue like you are. In such cases I suspect there's not much to be done to improve the regular mirror scans.


P.S.

Now, talking more specifically about the issue you ran into. As far as I can tell the MirrorCache instance behind download.opensuse.org is the one scanning openSUSE mirrors from time to time, not only to check if they're up but also to check the freshness of their content. So, download.o.o is in Germany. And being in Germany, I'd suppose it wouldn't hit this issue like you did, since pjessen -- who's also in EU -- didn't hit it. Plus, if it had hit the issue, the mirror would've been excluded from mirrors.opensuse.org until the issue got sorted out.

Amending myself. After reading bits and pieces here and there, I've found out that regional MirrorCache instances, e.g. North America, do scan mirrors from their region and report back to the "central" MirrorCache (behind download.opensuse.org). So, it seems that the regional MirrorCache instance that takes care of opensusemirror.vod.comcast.com isn't having issues either; more specifically, the checks/scans being done aren't catching the issues, otherwise you wouldn't be redirected to this mirror.

Actions #14

Updated by pjessen about 1 year ago

luc14n0 wrote in #note-11:

As far as I can tell it is. MicroOS (based) distro(s) have been using it by default for a while now. And just to emphasize my point, it wouldn't matter whether one's using download.opensuse.org or cdn.opensuse.org as baseurl in repo files if the CDN is not caching packages anymore -- which doesn't seem to be the case, as Mike was redirected to a mirror.

Mea culpa, I am not up-to-date on the CDN setup.
Sorry Mike, I thought we were talking about mirrorcache, where a zypper request would have been given multiple mirrors.

Actions #15

Updated by luc14n0 about 1 year ago

pjessen wrote in #note-14:

luc14n0 wrote in #note-11:

As far as I can tell it is. MicroOS (based) distro(s) have been using it by default for a while now. And just to emphasize my point, it wouldn't matter whether one's using download.opensuse.org or cdn.opensuse.org as baseurl in repo files if the CDN is not caching packages anymore -- which doesn't seem to be the case, as Mike was redirected to a mirror.

Mea culpa, I am not up-to-date on the CDN setup.
Sorry Mike, I thought we were talking about mirrorcache, where a zypper request would have been given multiple mirrors.

No worries. Anyway, I got confirmation from Bernhard that the CDN is caching content up to 100K; anything else is redirected. As far as I can tell, that's the same behavior as download.opensuse.org / regional MirrorCache instances. Both CDN and MirrorCache instances should have similar behavior on when and where to redirect requests.
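Put as a rule of thumb, the behaviour Bernhard confirmed looks roughly like the sketch below. The exact threshold and boundary are assumptions: "100K" is read here as 100 KiB (102400 bytes), inclusive, and the function name cdn_action is made up for illustration.

```shell
# Decide how a request would be handled under the "cache up to 100K" rule.
cdn_action() {
    size_bytes=$1
    limit=$((100 * 1024))   # assumption: "100K" means 100 KiB, limit inclusive
    if [ "$size_bytes" -le "$limit" ]; then
        echo cache      # small file: answered by the CDN node itself
    else
        echo redirect   # larger file: client is sent on to a mirror
    fi
}

# Usage:
#   cdn_action 2662
```

Under that reading, the 2662-byte appdata.xml.gz from note-12 would be answered by the CDN itself, which matches the 200 OK seen there, while anything larger would be redirected to a mirror.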

Actions #16

Updated by pjessen about 1 year ago

luc14n0 wrote in #note-15:

Both CDN and MirrorCache instances should have similar behavior on when and where to redirect requests.

Thanks for the explanation, I did not imagine the CDN would also be doing redirections.
I would expect to see multiple mirrors listed then. Mike should not be ending up in a situation where one unavailable mirror stops the process, except in the very exceptional case where only a single mirror actually has the package.
