tickets #81831

OpenSUSE infrastructure migration

Added by esujskaja 5 months ago. Updated 15 days ago.

Status: New
Priority: Normal
Assignee: opensuse-admin
Category: Core services and infra
Target version: -
Start date: 2021-01-06
Due date:
% Done: 0%
Estimated time:

Description

Hello all,

Based on the discussion at the Heroes meeting on Jan 6th, here is the tracker for migrating the OpenSUSE infrastructure away from the SUSE-based DC, probably to the public cloud.

As initial steps, we have defined two tasks: collecting information about and a description of the existing infrastructure, and starting to develop an initial concept for the move.

We also need to understand the financial side of the matter.

Zhenya

Evženie Šujskaja (esujskaja@suse.com)
+420 702 285 979
Engineering Infrastructure team lead
Křižíkova 148/34
186 00 Praha, CZ


History

#1 Updated by bmwiedemann 5 months ago

There is mostly one internal network, 192.168.47.0/24, with a salt-master (minnie.infra.o.o) managing 64 minions.

Most machines have no public IP.
Public services point their DNS to login2.opensuse.org. (SUSE-only proxy-pair handling auth via UCS/IDP) or to proxy.opensuse.org. (heroes-managed haproxy pair = anna+elsa)
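As a small illustration of the network layout described above, the following sketch uses Python's standard `ipaddress` module to check whether an address belongs to the internal network and how many hosts the /24 can hold. The function name and the sample addresses are hypothetical; only the network range comes from the comment.

```python
import ipaddress

# Internal network as stated in the comment above.
INTERNAL_NET = ipaddress.ip_network("192.168.47.0/24")


def is_internal(address: str) -> bool:
    """Return True if the address lies inside the internal network."""
    return ipaddress.ip_address(address) in INTERNAL_NET


# A /24 has 256 addresses; network and broadcast are not usable for hosts.
usable_hosts = INTERNAL_NET.num_addresses - 2

if __name__ == "__main__":
    # Sample addresses are made up for illustration.
    print(is_internal("192.168.47.20"), is_internal("192.168.48.1"))
    print(usable_hosts)
```

With 64 minions, the range still has plenty of room for additional machines.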

#2 Updated by bmwiedemann 5 months ago

Concept for a cloud-ready workload: static.opensuse.org

Workload: web server with static content updated from git

  • Can use openSUSE Leap/Tumbleweed images built in OBS, auto-updated in the cloud via publish hooks
  • Provisioning of VMs:
      • setup via salt-ssh: sets up VPN, salt-minion
      • setup via salt (master+minion)
  • For maintenance:
      • fire up a new VM
      • provision
      • test
      • move the fail-over IP to the new VM via the cloud API
      • test
      • delete the old VM after a grace period
  • Can have multiple VMs in different locations, with all their floating IPs in DNS, for handling higher load
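The maintenance steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Cloud` interface, `FakeCloud`, and `rollover` are hypothetical names, not any real provider's API, and the in-memory fake exists only so the sketch runs without cloud access. The rollback on a failed post-switch test is an addition not mentioned in the steps.

```python
from typing import Callable, Dict, Protocol, Set


class Cloud(Protocol):
    """Hypothetical provider interface; real cloud APIs will differ."""

    def create_vm(self, image: str) -> str: ...
    def move_failover_ip(self, ip: str, vm_id: str) -> None: ...
    def delete_vm(self, vm_id: str) -> None: ...


class FakeCloud:
    """In-memory stand-in so the sketch runs without any provider."""

    def __init__(self) -> None:
        self.vms: Set[str] = set()
        self.ip_target: Dict[str, str] = {}
        self._counter = 0

    def create_vm(self, image: str) -> str:
        self._counter += 1
        vm_id = f"vm-{self._counter}"
        self.vms.add(vm_id)
        return vm_id

    def move_failover_ip(self, ip: str, vm_id: str) -> None:
        if vm_id not in self.vms:
            raise KeyError(vm_id)
        self.ip_target[ip] = vm_id

    def delete_vm(self, vm_id: str) -> None:
        self.vms.discard(vm_id)


def rollover(cloud: Cloud, image: str, failover_ip: str, old_vm: str,
             provision: Callable[[str], None],
             healthy: Callable[[str], bool]) -> str:
    """Replace old_vm behind failover_ip; return the new VM's id."""
    new_vm = cloud.create_vm(image)              # fire up a new VM
    provision(new_vm)                            # provision (e.g. via salt)
    if not healthy(new_vm):                      # test before switching
        cloud.delete_vm(new_vm)
        raise RuntimeError("new VM failed the pre-switch test")
    cloud.move_failover_ip(failover_ip, new_vm)  # move the fail-over IP
    if not healthy(new_vm):                      # test after switching
        cloud.move_failover_ip(failover_ip, old_vm)  # roll back (assumption)
        cloud.delete_vm(new_vm)
        raise RuntimeError("new VM failed the post-switch test")
    # The old VM is left running; delete it after the grace period.
    return new_vm


if __name__ == "__main__":
    cloud = FakeCloud()
    old = cloud.create_vm("leap")
    cloud.move_failover_ip("203.0.113.10", old)  # documentation-range IP
    new = rollover(cloud, "leap", "203.0.113.10", old,
                   provision=lambda vm: None, healthy=lambda vm: True)
    print(new, cloud.ip_target)
```

Keeping the old VM until the grace period has passed means the fail-over IP can be moved back if problems show up only under real traffic.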

#3 Updated by esujskaja 5 months ago

So, in a nutshell, what would we need as a testing environment? Number of instances, compute capacity, services (like RDS)?

Let's spell out a request, and I'll check if we can get it.

#4 Updated by cboltz 5 months ago

static.opensuse.org is currently served by 3 "boring" (and salted) VMs (narwal[5-7].i.o.o), so you can look up the details in the salt roles static_master and web_static.

Short version:

  • One of these VMs (narwal5) is (also) static_master which means a cronjob fetching the content from github *) and rsyncing it to all web_static VMs. Note that it's doing rsync over ssh, which means all web_static VMs must be listed in .ssh/known_hosts which makes Bernhard's idea of "throw away old VMs and create new ones" a bit hard ;-)
  • It shouldn't be a surprise that the web_static VMs (narwal[5-7]) run nginx to serve that content to the world ;-) - behind haproxy (anna/elsa) of course.

*) fontinfo.o.o data isn't on github, IIRC it gets manually generated somewhere[tm] once a year and then uploaded to the VMs. And I just noticed that someone also used those VMs to host a bugzilla maintenance page during the bugzilla migration last year. (Unfortunately fontinfo and the bugzilla maintenance page are not salted.)

Current disk usage (besides the OS): 2.5 GB for static_master data (basically the cloned git repos), and 4.3 GB for /srv on all VMs.

So far, so good.

To make things less boring - static.o.o has its own IP on anna/elsa, and IIRC last time I changed it to the common proxy-nue IP, it broke outgoing mails from anna/elsa because of a then-wrong reverse DNS. anna/elsa still handle outgoing mail, so we'll probably run into the same problem again if we change the IP of static.o.o :-/

#5 Updated by esujskaja 5 months ago

Here is the proposal for an AWS sandbox for OpenSUSE from Artem Chernikov:

"We can set up a sub-account in our AWS infra and set a budget alert there for 100 USD per month.

Then we will need to decommission it, say in three months from now.

That account should not be integrated with Okta, as the target users are OpenSUSE Heroes; the account will not be connected to any SUSE infra bits and will be kept isolated.

Who from SUSE will be accountable for creating user credentials in it and seeing that the budgets are respected?
Once we agree on the above, I can kick-start account creation."

If we're interested in that offer, we'd need to define who'll drive it.
How can we share credentials and coordinate forces?
Should we maybe define a team driving that?

#6 Updated by bmwiedemann 5 months ago

How can we share credentials and coordinate forces?

We either use a single set of Heroes team credentials and store it in https://gitlab.infra.opensuse.org/infra/pass , or per-user credentials distributed to individuals with GPG.

Then we will need to decommission it - say in three months from now.

Decommission what and why? If this is indeed working, it would be used in production instead of the current static.o.o VMs.

cboltz wrote:

Note that it's doing rsync over ssh

That could be replaced with a git clone, either from GitHub or from the other VM. We can check GPG signatures for authenticity (GitHub signs all merge commits).
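A minimal sketch of what such a git-based sync with signature checking might look like, as an alternative to rsync over ssh. It assumes an existing clone at `checkout`, a `master` branch, and that the signing key is already in the local GPG keyring; the function names, URL, and path are hypothetical. `git verify-commit` exits non-zero for unsigned or badly signed commits, so commits pushed directly without a signature would stop the update.

```python
import subprocess
from typing import List


def sync_commands(repo_url: str, checkout: str,
                  branch: str = "master") -> List[List[str]]:
    """Build the git commands for a verified fast-forward update."""
    git = ["git", "-C", checkout]
    return [
        git + ["fetch", repo_url, branch],         # sets FETCH_HEAD
        git + ["verify-commit", "FETCH_HEAD"],     # check the GPG signature
        git + ["merge", "--ff-only", "FETCH_HEAD"],
    ]


def sync(repo_url: str, checkout: str, branch: str = "master") -> None:
    """Run the commands in order; stop at the first failure."""
    for cmd in sync_commands(repo_url, checkout, branch):
        # check=True raises CalledProcessError, so a failed signature
        # check prevents the merge from running.
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder URL and path, for illustration only.
    for cmd in sync_commands("https://example.org/static.git", "/srv/static"):
        print(" ".join(cmd))
```

Since every web_static VM would pull on its own, no VM needs to be listed in another VM's known_hosts.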

Another question is whether it is worth setting up CloudFront for it and paying 85 EUR per TB of traffic. It could give a nicer user experience through low-latency file serving.
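To put the traffic price in perspective, here is a small estimate helper. The 85 EUR/TB rate is the figure quoted above; the traffic volumes are hypothetical, since no traffic numbers for static.o.o appear in this ticket, and actual pricing may vary by region and volume.

```python
EUR_PER_TB = 85.0  # rate quoted in the comment above


def monthly_cdn_cost_eur(traffic_tb: float,
                         eur_per_tb: float = EUR_PER_TB) -> float:
    """Estimate the monthly CDN traffic cost in EUR."""
    if traffic_tb < 0:
        raise ValueError("traffic must not be negative")
    return traffic_tb * eur_per_tb


if __name__ == "__main__":
    # Hypothetical monthly traffic volumes in TB.
    for tb in (0.5, 1.0, 2.0):
        print(f"{tb} TB -> {monthly_cdn_cost_eur(tb):.2f} EUR")
```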

#7 Updated by esujskaja 5 months ago

An update from the assessment call with Accenture last week: the OpenSUSE landing has been changed to colocation instead of cloud for now (meaning a lift-and-shift approach). Attaching the PDF with the detailed minutes.

One more point was raised too: OpenSUSE independence should be driven as a separate and independent project, not tightly coupled to the DC move. Sounds reasonable to me.

We need to ensure that the community stays updated and involved in the move, and I would appreciate thoughts on that matter. I'm not sure that the Heroes meeting once a month is good enough.

In parallel, we can still use the AWS testing offer from the SUSE infrastructure: 3 months of testing with a 100 USD limit.

I can't be on the Heroes call tomorrow, unfortunately. :(

#8 Updated by cboltz 5 months ago

esujskaja wrote:

An update from the assessment call with Accenture last week: the OpenSUSE landing has been changed to colocation instead of cloud for now (meaning a lift-and-shift approach). Attaching the PDF with the detailed minutes.

Thanks for the update! Unfortunately the PDF didn't arrive in the ticket, can you please attach it?

Bonus question: is anything in this ticket confidential, or can we make it public?

#9 Updated by lrupp 4 months ago

  • Category set to Core services and infra
  • Assignee set to opensuse-admin

#10 Updated by lrupp 15 days ago

  • Private changed from Yes to No

As some months have passed in the meantime: is there a status update?
At least about the testing instance at AWS?
