UPDATE: all systems are back online.
During the maintenance window this Thursday, 2017-12-07, we will not only do the regular maintenance on all machines: this time we will also migrate the machine hosting download.opensuse.org to a completely new system running openSUSE Leap 42.3. Together with this switch, we will bring the new PostgreSQL database cluster into production, which has already been running for a while, also on openSUSE Leap 42.3. As some of the old configurations and services will be changed at the same time (for example: switching from lighttpd to nginx with TLS 1.2 and HTTP/2 support for our "last resort mirror"), we will use this week for extended testing to make the migration as smooth and quick as possible. But as always: bad things can happen, so we would like to inform you in advance that there might be some longer downtimes during the switch. If you need to upgrade or update your machines this Thursday morning, please check for a mirror server on our mirror page.
The maintenance of login2.opensuse.org will not last longer than 30 minutes: we will fire up a second machine that will (in a first version) act as failover (using keepalived) in case the main machine is under maintenance or has a bigger issue. As we want to test the failover, please expect small hiccups during those 30 minutes. After that, we hope that this service will also be highly available, like a couple of other services we set up during the last weeks.
New Galera cluster running in production
As we reported in one of our recent news items, we set up a new Galera cluster for all our applications that make use of MySQL. This cluster should allow us to do maintenance on one of the cluster nodes at any time - and should also distribute the workload between the nodes, via the HAProxy in front.
One problem that affected us, for example in the case of progress.opensuse.org, is MyISAM tables: Galera is not really ready (yet?) to sync the content of such tables (even if you can enable the "experimental" feature, if you don't care much about your data). As a result, we not only need to look at and migrate each and every single MyISAM table - even worse, we also need to look at the application code to identify problematic SQL statements (like DELAYED inserts, for example) - and patch it where needed.
But the good news for the two databases mentioned: so far everything (still) seems to work. Other applications will follow one by one (as some, like connect, need adaptations).
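The MyISAM audit described above could be sketched roughly as follows. This is only an illustration, not the procedure the team actually used: the function names are hypothetical, and the table list is assumed to have been fetched beforehand (for example from `information_schema.TABLES`) as pairs of table name and engine.

```python
import re

# INSERT DELAYED / REPLACE DELAYED are not supported by InnoDB, so such
# statements in the application code have to be found and patched.
DELAYED_RE = re.compile(r"\b(?:INSERT|REPLACE)\s+DELAYED\b", re.IGNORECASE)


def alter_statements(tables):
    """Return one ALTER TABLE statement per MyISAM table.

    `tables` is an iterable of (table_name, engine) pairs (assumed input).
    """
    return [
        f"ALTER TABLE `{name}` ENGINE=InnoDB;"
        for name, engine in tables
        if engine and engine.lower() == "myisam"
    ]


def find_delayed(statements):
    """Return the SQL statements that still use the DELAYED modifier."""
    return [sql for sql in statements if DELAYED_RE.search(sql)]
```

The generated statements are meant to be reviewed by hand before running them; the DELAYED scan only catches literal SQL strings, not statements assembled at runtime.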
...and of course some more. Once we are done with all the migration topics, we will try to get something out of this data to present you with some nice statistics.
One major step towards a reliable infrastructure was done last week: we implemented a new Galera Cluster, which should provide a highly available environment for all services that rely on MySQL/MariaDB. Instead of simply migrating the old Master-Master setup, we decided to implement something new - giving us not only the ability to grow, but also the chance to show how reliable an openSUSE-driven infrastructure is (Note: the new cluster is of course using Leap 42.3 as base).
During the next days, we will fine-tune the database setup and migrate the production workload from the old to the new cluster. Some of the lessons we learn during that migration might end up in articles on news.opensuse.org - so stay tuned! :-)
The main issue here is the way MirrorBrain is used: instead of delivering a file directly, download.opensuse.org redirects the requests of our customers to a mirror server which hosts the file and is nearer to their location. While this normally has benefits for both sides, it becomes problematic if MirrorBrain should redirect users who would like to get their files delivered via an encrypted (https) channel.
First: our mirrors need to support SSL for this. While some mirrors have had SSL enabled for a long time, others don't - and want to keep it that way in the future to avoid overloading their systems.
Second: MirrorBrain not only needs to know whether a mirror server supports SSL before it can redirect a user requesting a file via SSL to this mirror - to avoid confusing error messages, we also need to make sure that the SSL setup on the mirrors is correct and, at the very least (just to give an example), provides a valid SSL certificate.
Third: MirrorBrain itself was never developed to differentiate between encrypted and unencrypted requests. As such, this "new" feature needs to be implemented properly. Volunteers needed...
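To make the three points above more concrete, here is a minimal sketch of what a scheme-aware redirect decision could look like. It is not MirrorBrain code: the `Mirror` class, its `https_verified` flag and the `redirect_target` function are hypothetical names, and the mirror list is assumed to be already sorted by preference.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Mirror:
    host: str                     # hypothetical field: mirror host name
    https_verified: bool = False  # hypothetical flag: certificate was checked


def redirect_target(request_url, mirrors, fallback_host="download.opensuse.org"):
    """Pick the first suitable mirror for the scheme the client used.

    `mirrors` is assumed to be ordered by preference (closest first).
    HTTPS requests are only sent to mirrors with a verified SSL setup;
    if none qualifies, the file is served from the fallback host.
    """
    parts = urlsplit(request_url)
    wants_https = parts.scheme == "https"
    for mirror in mirrors:
        if wants_https and not mirror.https_verified:
            continue  # never downgrade or risk a certificate error
        return f"{parts.scheme}://{mirror.host}{parts.path}"
    return f"{parts.scheme}://{fallback_host}{parts.path}"
```

With such a rule, plain http requests keep using every mirror, while https requests fall back to the "last resort" host whenever no verified mirror is available.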
Did you know that download.opensuse.org is used by nearly all openSUSE systems to get their updates and to download new software? The Apache (worker) process running on this machine serves (under normal conditions) 300 to 500 requests per second for this reason alone. In addition to that, a dedicated Nginx service on the same host is used to quickly free up Apache resources and deliver files (like RPMs and ISOs) as fast as possible, without blocking Apache from handling more requests. This setup avoids database locks, as each request for a file on the Apache side results in a database request to MirrorBrain to find the best mirror for the file. As Apache cannot free up the DB connection until the request is handled, the "hand over" (aka redirect) to the Nginx service frees up Apache quickly, so it is ready to handle more requests.
But the Nginx on the machine is just used as "last resort": under normal circumstances, openSUSE benefits from over 180 mirrors worldwide that offer files for our users. The redirection is based on the GeoIP location of the requester and the mirrors closest to that location. If you ever want to know how many mirrors host a specific file, just click on the "details" link on download.opensuse.org (have a look at the details for the Leap 42.3 ISO as an example). We even provide a Google Map for you (see: "Map showing the closest mirrors") that shows the location of your client and the mirror servers around you.
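The idea of "closest mirrors" can be illustrated with a small sketch that ranks mirrors by great-circle distance from the client. This is a simplification for illustration and not MirrorBrain's actual selection logic; the function names, the mirror names and the coordinates are made up, and the client position is assumed to come from a GeoIP lookup done elsewhere.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def distance_km(a, b):
    """Great-circle (haversine) distance between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def closest_mirrors(client, mirrors, limit=3):
    """Return the names of the `limit` mirrors nearest to `client`.

    `mirrors` maps a mirror name to its (lat, lon) position.
    """
    ranked = sorted(mirrors, key=lambda name: distance_km(client, mirrors[name]))
    return ranked[:limit]
```

A real redirector would also weigh mirror capacity, network topology and whether the mirror actually has the file, which this sketch ignores.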
While more and more people are asking for an encrypted line to download their packages, we - as openSUSE admins - often ask ourselves: "why"?
- During the installation of a client machine, it gets the public signing keys for official packages installed (one of the reasons why you really should "Verify Your Download Before Use")
- Each and every package in the official repositories is signed with such a key. In addition, each RPM also includes checksums for every file it contains.
So what happens if a mirror provides you with some malicious packages?
- first of all: our MirrorBrain scanner might detect a size mismatch and exclude the file from any redirect
- during installation, you will get warned that either the signing key does not match and/or the (internal) checksums of the package are wrong
- if you add a new repository from the Open Build Service, you should also verify the provided key
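As an illustration of the checksum part of "Verify Your Download Before Use", the following sketch compares a downloaded file against a published SHA-256 value. The function names are hypothetical, and the sketch assumes the expected checksum was obtained from a source you already trust (for example a checksum file whose signature you verified); it does not replace the signature check itself.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected_hex):
    """Return True if the file's digest equals the published value."""
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_of(path), expected)
```

This check gives the same answer no matter which mirror or which transport delivered the file.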
Does that change, if you download the same file via SSL? - No.
Does an encrypted download help you to mask what you are doing? - Only partly. An attacker or undercover agent might not know exactly what you download - but keep in mind that your DNS queries are visible, as are the IP addresses of the machines you connect to. This thins out the fog you want to produce.
Would TOR help? - Probably yes. With regard to the anonymity that TOR provides, only you and your entry server know what you are looking for. Interestingly, the traffic inside the TOR network is already encrypted. So you don't gain much from an encrypted endpoint at download.opensuse.org.
So while we are looking for developers who would like to extend MirrorBrain with the features needed for proper SSL redirection - and waiting for our mirror servers to prepare their infrastructure for SSL traffic - stay tuned and keep in mind that the verification of keys and installation media will not change, even if we can officially provide you with completely SSL-encrypted traffic in the near future.
A scheduled power outage in the Nuremberg office will affect a number of openSUSE services from Oct. 13 at 4 p.m. to Oct. 14 at 4 p.m.
The scheduled maintenance on the building’s electricity will affect most services. The only services that will be normally operating are:
The rest of the services will be fully online on Oct. 15. The Heroes team will try to keep you updated on the situation, and will also send a few reminders (on the opensuse-announce mailing list) before the incident.
Due to technical constraints, the above services will not be available through IPv6 during the outage.
Thank you for your understanding.
On behalf of the openSUSE Heroes Team and the SUSE-IT team.
In the old setup, only the MF IT team could make DNS changes for us in their DNS appliance, so we always had to go through tickets for changes.
Now we have a FreeIPA instance for the openSUSE cluster to manage the DNS zone. We use FreeIPA not only for DNS, but that is a topic for another article.
After we fixed all the technical problems 3 weeks ago (you can read about it here), we finally got the approval for the change from upper management.
Now their change control team has also agreed, and we finally completed the change.
So let's all welcome home the opensuse.org zone!
The new status page of the openSUSE infrastructure team provides updates on how the systems of the openSUSE community are doing. If there are interruptions to service, we will post a note here.
The idea behind the new status page is to inform our users via a central point about outages or service interruptions, so there should be no need to check other resources if there is an outage or problem. By using the open source status page system Cachet, we benefit from the work of another great community, while we try to contribute back by doing marketing and pushing our changes upstream.
Cachet has - among others - the following wonderful features:
* Email subscriptions: users can subscribe with their email address to get informed about incidents via personal email
* RSS and Atom feeds: just put the links to the feeds (provided at the bottom of the page) into your favorite newsreader to get incident updates via RSS feeds
* Nice overview of the provided services: as the openSUSE community is providing so many services, the status page might be a good starting point for everyone to get an overview
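If you prefer a script over a newsreader, an Atom feed like the one mentioned above can be read with a few lines of Python. This is only a sketch: the function name is hypothetical, and it expects the feed XML as a string that you have already downloaded from the feed link at the bottom of the status page.

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace (RFC 4287)
ATOM = {"atom": "http://www.w3.org/2005/Atom"}


def incident_updates(feed_xml):
    """Return (updated, title) pairs for every entry in an Atom feed."""
    root = ET.fromstring(feed_xml)
    return [
        (
            entry.findtext("atom:updated", default="", namespaces=ATOM),
            entry.findtext("atom:title", default="", namespaces=ATOM),
        )
        for entry in root.findall("atom:entry", ATOM)
    ]
```

The exact set of fields in the feed entries depends on the Cachet version, so treat the two fields read here as an assumption.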
At the moment, the page is not fully operational: we might change some settings and/or add some more features. But it might already be good enough to get the idea behind it.
As always, if you are experiencing any issues with the openSUSE infrastructure, don't hesitate to get in touch with us at email@example.com or via irc.opensuse.org/#opensuse-admin and we'll get back to you as soon as we can.
What a start to the new year: the server running rsync.opensuse.org died with two broken hard disks on 2016-01-10.
As the hardware is located in the data center of our sponsor IP Exchange, we apologize for the time it will take to fix the problem: we need not only the correct replacement hard drives, but also a field worker at the location who has the appropriate permissions and skills.
During the downtime (and maybe also a good tip afterward), please check http://mirrors.opensuse.org/ for the mirror closest to your location that also offers rsync.
All backend servers now run on one of three virtualization hosts using KVM:
* 48 Cores
* 512 GB RAM
* 20 Gb Ethernet (incl. FCoE)
The eight virtual servers running on this hardware use the resources very well - while we still have the ability to run on just two of the virtualization hosts for continuous operation while one host is being serviced. We hope to improve the availability of the openSUSE Build Service with this new setup and reduce the overall downtime for you.
At the moment, we are trying to fix the last small issues (like long live migration times or synchronization of the configuration between the machines).