tickets #81908

Request for VM machines in NUE and PRV

Added by anikitin@suse.de 4 months ago. Updated 4 months ago.

Status: Closed
Priority: Normal
Assignee: lrupp
Category: Servers hosted in NBG
Target version: -
Start date: 2021-01-08
Due date: 2021-01-27
% Done: 100%
Estimated time: 2.00 h

Description

Hi,

The MirrorCache project aims to address current limitations of
download.opensuse.org, especially for users outside Europe.

Currently it shows promising results when zypper is configured to
get packages from mirrorcache.opensuse.org.

The next step is to set up dedicated production machines in Nuremberg and
Provo and start more aggressive usage.

So this is a formal request to set up such machines with 4-8G or 16G RAM,
~12G disk space, an external IP address and ssh access from the heroes
networks. (The current host behind mirrorcache.opensuse.org will remain
as the development/test setup.)

Please let me know if you need more info, or what I should
do to have this request satisfied.

Regards,

Andrii Nikitin anikitin@suse.de
DevOPS Automation and Build Service Engineer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5
90409 Nuremberg
Germany

(HRB 247165, AG München)
Managing Director: Felix Imendörffer

History

#1 Updated by pjessen 4 months ago

  • Status changed from New to Feedback
  • Private changed from Yes to No

anikitin@suse.de wrote:

Hi,

MirrorCache project is aimed to address current limitations of
download.opensuse.org, especially for users outside Europe.

As my role currently includes that of mirror admin for openSUSE, I think I would like to understand what limitations we are talking about.
Especially before we start allocating significant resources to alleviate those "limitations".
I also find it somewhat strange that you should be working on this without even consulting me.

#2 Updated by anikitin@suse.de 4 months ago

Hi,

On Fri, 08 Jan 2021 17:39:24 +0000
redmine@opensuse.org wrote:

As my role currently includes that of mirror admin for openSUSE, I
think I would like to understand what limitations we are talking
about. Especially before we start allocating significant resources to
alleviate those "limitations".

Sure, below are the main goals here:

  1. MirrorCache may be hosted geographically close to end users, without
    having access to actual files. So, e.g. users in Australia will not need
    to send requests to Germany and then be redirected to a close mirror, as
    they currently do.

  2. (* - this should probably be configurable). MirrorCache makes sure
    that the selected mirror does have the requested file prior to redirecting.
    (Currently there is a chance that download.opensuse.org may redirect to
    mirrors on which the requested file has been gone for up to ~24h.)
    (This is probably an example of such an issue:
    https://bugzilla.opensuse.org/show_bug.cgi?id=1159688 )

  3. Currently download.opensuse.org doesn't distinguish https/http
    requests. MirrorCache prioritizes mirrors which support the
    client's scheme (http/https). So if the client uses https,
    MirrorCache will make sure that the selected mirror does support https.
    https://github.com/openSUSE/mirrorbrain/issues/3

  4. Currently download.opensuse.org doesn't distinguish ipv4/ipv6
    requests. This may be a problem for clients which support only ipv4 or
    only ipv6:
    https://github.com/openSUSE/mirrorbrain/issues/4

  5. MirrorCache has a web interface, so admins use the WebUI to maintain
    mirrors. It is planned to extend the WebUI so everyone can add their own
    mirrors and maintain them in the UI.

The same, and a little more, is described here:
https://github.com/andrii-suse/MirrorCache/blob/master/doc/mb_compare.md
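
The mirror-selection rules listed above (the mirror must hold the requested file, must be reachable over the client's IP family, and should match the client's http/https scheme) can be sketched roughly as follows. This is a minimal illustration and not MirrorCache's actual code: the Mirror record, its fields and pick_mirror are hypothetical names, and the set of files on each mirror is assumed to be already known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mirror:
    """Hypothetical mirror record; field names are illustrative only."""
    host: str
    schemes: frozenset   # e.g. frozenset({"http", "https"})
    families: frozenset  # e.g. frozenset({"ipv4", "ipv6"})
    files: frozenset     # paths the mirror is believed to hold right now


def pick_mirror(mirrors, path, scheme, family):
    """Pick a mirror for one request, or None if no mirror qualifies."""
    # Only mirrors that hold the file and speak the client's IP family.
    usable = [m for m in mirrors if path in m.files and family in m.families]
    if scheme == "https":
        # An https client must not be redirected to an http-only mirror.
        usable = [m for m in usable if "https" in m.schemes]
    # Stable sort: mirrors supporting the client's scheme come first.
    usable.sort(key=lambda m: scheme not in m.schemes)
    return usable[0] if usable else None
```

Returning None here stands in for whatever fallback the real service would use, for example serving the file from the origin host.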


#3 Updated by pjessen 4 months ago

  • Status changed from Feedback to New

Thanks for the information. I have added some comments below.
Please don't be offended, but I think this is shooting sparrows with a cannon.

  1. MirrorCache may be hosted geographically close to end users, without having access to actual files. So, e.g. users in Australia will not need to send requests to Germany, and then redirected to close mirror, as they currently do.

Okay - is that a significant advantage? Maybe ten-fifteen years ago I could understand the idea, but today?
If we have the need, I suggest it would be more sensible to run more copies of mirrorbrain with an anycast IP address.

  2. (* - this should be configurable probably). MirrorCache makes sure that selected mirror does have requested file prior to redirecting. (currently there is a chance that download.opensuse.org may redirect to mirrors, on which requested file has gone up to ~24h ago).

The scanner runs continually, but with our ever-growing amount of data, maybe it does sometimes fall behind.
I think improving on the scanner setup would have been a better idea than creating a new, separate solution.

  3. Currently download.opensuse.org doesn't distinguish https/http requests. MirrorCache does prioritize those mirrors, which support corresponding client's schema (http/https). So if the client uses https, MirrorCache will make sure that selected mirror does support https. https://github.com/openSUSE/mirrorbrain/issues/3
  4. Currently download.opensuse.org doesn't distinguish ipv4/ipv6 requests. This may be a problem for clients which support only ipv4 or only ipv6 : https://github.com/openSUSE/mirrorbrain/issues/4

Well, in my opinion the above items 3 and 4 are the only worthwhile improvements, but I don't understand why they were not implemented in mirrorbrain instead. (in fact, I thought there was work underway already ?).

Setting up two new machines for the above seems to me to be way overkill. Still, not my decision, only my opinion :-)

#4 Updated by andriinikitin 4 months ago

pjessen wrote:

  1. MirrorCache may be hosted geographically close to end users, without ... Okay - is that a significant advantage? Maybe ten-fifteen years ago I could understand the idea, but today?

If a user needs to update 3K packages, that is 3K cross-continental requests to mirrorbrain, resulting in an additional ~10 min of processing time (but in practice closer to 20 min, as far as I understand).
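
As a rough check of that estimate, the sketch below assumes one sequential redirect request per package and a round trip of about 200 ms per cross-continental request. The 200 ms value is an assumption chosen because it reproduces the ~10 min figure; it was not measured in this ticket, and the function name is hypothetical.

```python
def redirect_overhead_minutes(packages, round_trip_ms):
    """Time spent waiting on redirect requests, in minutes, assuming one
    sequential redirect request per package."""
    return packages * round_trip_ms / 1000 / 60


# 3000 packages at an assumed 200 ms per cross-continental round trip.
print(redirect_overhead_minutes(3000, 200))  # 10.0
# Doubling the assumed round trip gives the "closer to 20 min" figure.
print(redirect_overhead_minutes(3000, 400))  # 20.0
```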

If we have the need, I suggest it would be more sensible to run more copies of mirrorbrain with an anycast IP address.

Currently mirrorbrain needs many more resources (the scanner machine uses 84G RAM), and maintaining physical copies of download.opensuse.org is not an option as far as I understand.

  2. (* - this should be configurable probably). MirrorCache makes sure that selected mirror does have requested file prior to redirecting. (currently there is a chance that download.opensuse.org may redirect to mirrors, on which requested file has gone up to ~24h ago).

The scanner runs continually, but with our ever-growing amount of data, maybe it does sometimes fall behind.

We cannot do a full scan more often than every 24h, so by design a broken mirror can wreck the user's experience. At the same time, if we have a proper job queue, verifying recent requests becomes much simpler, so problems should be detected more quickly.
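
The idea of verifying recent requests through a queue, instead of waiting for the next full scan, could look roughly like the sketch below. It is only an illustration of the technique: the class and method names are hypothetical, the has_file check is supplied by the caller (for example an HTTP HEAD request against the mirror), and nothing here describes MirrorCache's real job queue.

```python
from collections import deque


class RecentRequestVerifier:
    """Hypothetical sketch: queue recently redirected (mirror, path) pairs
    and re-check them soon, rather than waiting for a full scan."""

    def __init__(self, has_file):
        self._has_file = has_file  # callable(mirror, path) -> bool
        self._queue = deque()      # pending checks, oldest first
        self._queued = set()       # avoids queuing the same pair twice
        self.broken = set()        # pairs found missing on the mirror

    def record_redirect(self, mirror, path):
        """Enqueue a pair after a client was redirected to it."""
        key = (mirror, path)
        if key not in self._queued:
            self._queued.add(key)
            self._queue.append(key)

    def run_pending(self):
        """Verify everything currently queued; return the number of checks."""
        checked = 0
        while self._queue:
            key = self._queue.popleft()
            self._queued.discard(key)
            if self._has_file(*key):
                self.broken.discard(key)
            else:
                self.broken.add(key)
            checked += 1
        return checked
```

Because only pairs that clients actually requested are queued, the amount of work follows request traffic rather than the total size of the tree.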

I think improving on the scanner setup would have been a better idea than creating a new, separate solution.

Properly improving the scanner setup would need a proper job queue, which leads to a completely new system, and MirrorCache is exactly the result of this.

  3. Currently download.opensuse.org doesn't distinguish https/http ...
  4. Currently download.opensuse.org doesn't distinguish ipv4/ipv6 ...

Well, in my opinion the above items 3 and 4 are the only worthwhile improvements, but I don't understand why they were not implemented in mirrorbrain instead. (in fact, I thought there was work underway already ?).

The same answer as above: mirrorbrain needs a job queue and a proper web application for the admin UI. In my understanding it will be an even bigger mess if we try to add that to the current project, because effectively we would end up with two projects in a single repository.

Setting up two new machines for the above seems to me to be way overkill. Still, not my decision, only my opinion :-)

I am not sure how it can be overkill compared to "multiple copies of mirrorbrain", which will be much more hungry for resources.

In my understanding, the conceptual architecture must look as follows, and I don't see many alternatives (please also keep in mind that the enterprise setup suffers from similar problems as well):

  • there must be a host on each continent which doesn't hold the physical files but is able to quickly find a proper "local" mirror.
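
One simple way to picture "find the closest host" is a great-circle distance comparison, sketched below. This is an assumption made for illustration and not a description of how MirrorCache locates clients; the function names are hypothetical and the coordinates for the two sites named in this ticket are approximate.

```python
import math


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    radius = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


def nearest(client, sites):
    """Return the name of the site closest to client = (lat, lon)."""
    return min(sites, key=lambda name: distance_km(*client, *sites[name]))


# Approximate coordinates, for illustration only.
SITES = {"Nuremberg": (49.45, 11.08), "Provo": (40.23, -111.66)}

print(nearest((48.86, 2.35), SITES))    # client near Paris
print(nearest((41.88, -87.63), SITES))  # client near Chicago
```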

If you think that the requested 4G is a problem, we can start with 1-2G, but that may need an increase at some point.

#5 Updated by lrupp 4 months ago

  • Category set to Servers hosted in NBG
  • Status changed from New to In Progress
  • Assignee set to lrupp
  • % Done changed from 0 to 30

mirrorcache2.infra.opensuse.org => 192.168.47.28

TODO:

  • create additional machine in the Provo setup
  • assign external IP addresses and DNS entries for both machines
  • saltify the setup

#6 Updated by andriinikitin 4 months ago

I've realized that the machines will need new DNS names. Would it be possible to add the following to the TODO:

  • Provide new DNS names for the machines: mirrorcache-eu.opensuse.org and mirrorcache-us.opensuse.org

#7 Updated by andriinikitin 4 months ago

Sorry for the misunderstanding: the machines don't need an external IP, they can use the existing proxies.
All looks set now, feel free to close the call
mirrorcache-eu.opensuse.org
mirrorcache-us.opensuse.org

#8 Updated by lrupp 4 months ago

  • Due date set to 2021-01-27
  • Status changed from In Progress to Closed
  • % Done changed from 30 to 100
  • Estimated time set to 2.00 h

andriinikitin wrote:

All looks set now, feel free to close the call

Perfect, thanks!
