We are happy to pre-announce a new service entering the openSUSE world:
debuginfod is an HTTP file server that serves debugging resources to debugger-like tools.
Instead of the old way of installing the needed debugging packages one by one as root, the new debuginfod service lets you debug anywhere, anytime.
Right now the service serves only openSUSE Tumbleweed packages for the x86_64 architecture and runs in an experimental mode.
The simplest way to use debuginfod on openSUSE Tumbleweed is:
export DEBUGINFOD_URLS="https://debuginfod.opensuse.org/"
gdb ...
For every lookup, the client sends a query to the debuginfod server and gets back the requested information, so you download only the debugging binaries you really need.
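To illustrate what such a lookup looks like on the wire, here is a minimal sketch in Python. It assumes the standard debuginfod web API, where artifacts are addressed by the build ID of a binary under `/buildid/<BUILDID>/debuginfo` or `/buildid/<BUILDID>/executable`. The helper name `debuginfod_url` is hypothetical and only builds the URL; it does not contact the server.

```python
DEBUGINFOD_URL = "https://debuginfod.opensuse.org/"


def debuginfod_url(build_id: str, artifact: str = "debuginfo",
                   base: str = DEBUGINFOD_URL) -> str:
    """Build the lookup URL for one build ID (hypothetical helper).

    Assumes the debuginfod web API layout /buildid/<BUILDID>/<artifact>.
    """
    if artifact not in ("debuginfo", "executable"):
        raise ValueError(f"unsupported artifact type: {artifact!r}")
    build_id = build_id.strip().lower()
    # Build IDs are hexadecimal strings, e.g. as printed by `readelf -n`.
    if not build_id or any(c not in "0123456789abcdef" for c in build_id):
        raise ValueError(f"not a hexadecimal build ID: {build_id!r}")
    return f"{base.rstrip('/')}/buildid/{build_id}/{artifact}"
```

A debugger-like client would issue a plain HTTP GET on the resulting URL and cache the response locally, which is why only the files that are actually needed get downloaded.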
While we have monitored the basic functionality of our MariaDB (running as a Galera cluster) and PostgreSQL databases for years, we lacked an easy overview of what is really happening within our databases in production. Peaks that slow down the response times are especially hard to detect.
That's why we set up our own Grafana instance. The dashboard is public and allows everyone to have a look at:
- The PostgreSQL cluster behind download.opensuse.org. Around 230 average and up to 500 queries per second are not that bad...
- The Galera cluster behind the opensuse.org wikis and other MariaDB driven applications like Matomo or Etherpad. One interesting detail here is - for example - the archiving job of Matomo, triggering some peaks every hour.
- The Elasticsearch cluster behind the wiki search. Here we have a relatively high JVM memory footprint. Something to look at...
Both the Grafana dashboard and the databases drive big parts of the openSUSE infrastructure. And while everything is still up and running, we would love to hear from experts how we could improve. If you are an expert or know someone who is, feel free to contact us via Email or in our IRC channel.
As you may know, every single Email to email@example.com is forwarded into our ticket system at https://progress.opensuse.org/. As this Email address has become widely known on the public Internet, we see a lot of Spam in our ticket system. So far, we have mostly ignored that stuff and simply deleted the Email/Ticket.
But our ticket system was never really meant to be a ticket system: we run Redmine, which was originally intended as project management software. The ability to create issues (or tickets, as we call them) by sending an Email was not part of the original design. So the ability to detect and mark Spam Emails as such simply does not exist. Even worse: every Email results in an automatically created user, so that we can send an Email to this person in answer to their ticket.
All of this is not really problematic: you learn to deal with it. But with over 14,000 "users" in the database (and over 17,000 real tickets), the system started to become slow. So we invested a bit of our time and looked into the user list. Good for us: most of the Spammers seem to have special days to submit their stuff. And even more interesting: they do it at the same time from multiple accounts!
So we ended up setting huge blocks of users to "locked", which prevents them from using the same Email account again to send their Spam to us. As a side effect, this speeds up our database, as most queries only search for "active" users (which is the default). Maybe we can use the gathered Email addresses to feed a Spam filter later, once we have one.
As good and simple as this message is: there is a small potential that we might have blocked/locked some real user accounts in our Redmine instance with this simple workaround. We tried our best and already excluded a lot of domains we trust (like '@opensuse.org') in the query. But we can not guarantee that we did not block your account at the moment, as there are simply too many (to us) unknown openSUSE users. And we want to spend more time on fixing your tickets than on finding out if one of the 10,000 now locked accounts is a false positive.
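The selection logic described above can be sketched roughly as follows. This is not the query we actually ran; it is a minimal Python illustration under stated assumptions: accounts are given as `(email, created_on)` pairs, a "burst" means several accounts created within the same minute, and the names `accounts_to_lock`, `TRUSTED_DOMAINS` and `BURST_SIZE` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

TRUSTED_DOMAINS = {"opensuse.org"}  # assumption: domains never locked
BURST_SIZE = 3                      # assumption: accounts per minute that count as a burst


def accounts_to_lock(users, burst_size=BURST_SIZE, trusted=TRUSTED_DOMAINS):
    """Return the sorted Email addresses of accounts created in bursts.

    `users` is an iterable of (email, created_on) pairs, where created_on
    is a datetime. Accounts from trusted domains are skipped entirely.
    """
    buckets = defaultdict(list)
    for email, created_on in users:
        domain = email.rsplit("@", 1)[-1].lower()
        if domain in trusted:
            continue
        # Group accounts by the minute in which they were created.
        minute = created_on.replace(second=0, microsecond=0)
        buckets[minute].append(email)
    return sorted(
        email
        for group in buckets.values()
        if len(group) >= burst_size
        for email in group
    )
```

A heuristic like this cannot tell a real user from a Spammer who registered in the same minute, which is exactly where the false positives mentioned above come from.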
The information below might fall into the "unsung heroes of openSUSE" category - we think it is clearly worth mentioning and deserves some applause (not saying that every user should owe the author a beer at the next conference ;-).
- You are searching for a nice font for the next document?
- You want to install such a font directly via 1-click-install once you have had a closer look?
- You want to know more about rendering or language information or the character set for a font you want to install?
Just have a look at https://fontinfo.opensuse.org/, which provides all this information for you, plus some more. Special thanks to Petr Gajdos, who has maintained the page and the package with the same name for years.
But this time, we did not only upgrade the package (which lives, btw, in our openSUSE:infrastructure project), we also migrated the underlying database.
As often, the initial deployment was done with a "just for testing" mindset by someone who afterward left their little project. And - also as often - this kind of deployment suddenly became productive. This means - in turn - that our openSUSE heroes team suddenly gets tickets for services we originally neither set up nor maintained.
For Etherpad, this means that we suddenly faced a "dirty.db" file of over 2GB in size, filling up the root-fs of the machine. Upstream even has a warning in their boot script, telling everyone that a dirty.db is NOT for production... :-/
The first try, using the dirty-db-cleaner.py script to reduce the size, did not finish after 2 days. So we decided to dump the data directly from the dirty.db into our Galera cluster. After changing the storage engine of the initially created table from MyISAM to InnoDB (Galera does not like MyISAM), the migration script took "only" 16 hours.
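To give an idea why such a file grows so large, here is a minimal Python sketch of compacting a dirty.db. It assumes the usual append-only layout of that format: one JSON object per line with a `key` and a `val` field, where a later line for the same key overrides earlier ones and a line without `val` marks a deletion. This is neither the dirty-db-cleaner.py script nor our migration script, and `compact_dirty_lines` is a hypothetical name.

```python
import json


def compact_dirty_lines(lines):
    """Reduce append-only dirty.db lines to the latest value per key.

    Assumes one JSON object per line with "key" and (optionally) "val".
    A record without "val" is treated as a deletion of that key.
    """
    latest = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        key = record["key"]
        if "val" in record:
            latest[key] = record["val"]  # newer entries win
        else:
            latest.pop(key, None)        # key was removed later on
    return latest
```

With the data reduced to one value per key, each remaining pair could be written as a single row into the target table. For a file of over 2GB you would stream the file line by line, for example with `compact_dirty_lines(open("dirty.db", encoding="utf-8"))`, instead of loading it as a whole.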
With this final migration, we hope to be prepared for the next update - and hope that this only takes minutes again.
After some back and forth, I'm happy to announce that more machines in the Provo data center use IPv6 in addition to their IPv4 address. Namely:
- provo-mirror.opensuse.org (main mirror for US/Pacific regions)
- status2.opensuse.org (fallback for status.opensuse.org)
- proxy-prv.opensuse.org (fallback for proxy.opensuse.org)
- provo-ns.opensuse.org (new DNS server for opensuse.org - not yet in production)
Sadly, neither the forums nor the WordPress instances are IPv6 enabled. But we are hoping for the best: this is something we would like to work on next year...
Around 16:00 CET on 2019-12-14, one of the Open Build Service (OBS) virtualization servers (which run some of the backend machines) stopped operating. Reason: a power failure in one of the UPS systems. Unlike the others, this single server had both power supplies connected to the same UPS, resulting in a complete power loss, while all other servers were still powered via their redundant power supply.
In turn, the communication between the API and those backend machines stopped. The API queued up the incoming requests until it was not able to handle any more.
By moving the backends over to another virtualization server, the problem was temporarily fixed (since ~19:00) and the API started working through the backlog. The cabling on the problematic server has since been fixed and the machine is online again. So we are confident that this specific problem will not happen again.
You might know that Piwik was renamed to Matomo more than a year ago. While everything is still compatible and even the scripts and other (internal) data are still named piwik, the rename is affecting more and more areas. Upstream is working hard to finalize their rename - while trying not to break too much on the other side. But even the file names will be renamed in some future version.
Time - for us - to do some maintenance and start following upstream with the rename. Luckily, our famous distribution already has matomo packages in the main repository (which currently still lack AppArmor profiles, but hey: we can and will help here). So the main things left to do are a database migration and the adjustment of all the small bits and bytes here and there, where we still use the old name.
While the database migration has already happened silently, the other, "small" adjustments will take some time - especially as we need to find all the places that need to be adjusted and also need to identify the contact persons who can do the final change. But we are on it - way before Matomo upstream does the final switch. :-)
Our infrastructure status page at https://status.opensuse.org/ is using Cachet under the hood. While the latest update brought a couple of bugfixes, it also deprecated the RSS and Atom feeds that could be used to integrate the information easily into other applications.
While we are somewhat sad to see such a feature go, we also have to admit that the decision of the developers is not really bad - as the generation of those feeds had some problems (bugs) in the old Cachet versions. Instead of fixing them, the developers decided to move on and focus on other areas. So it's understandable that they cut off something that is not in their focus to save resources.
As an alternative, you might want to subscribe to status changes and incident updates via Email or use the API that is included in the software for your own notification system. And who knows: maybe someone provides us with an RSS feed generator that utilizes the API?
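As a starting point for such a generator, here is a minimal Python sketch that turns a list of incidents into an RSS 2.0 document. It assumes the incidents were already fetched from the Cachet API (for example from its `/api/v1/incidents` endpoint) and that each incident is a dict with `id`, `name` and `message` fields. The function name `incidents_to_rss` and the channel texts are hypothetical.

```python
import xml.etree.ElementTree as ET


def incidents_to_rss(incidents, site_url="https://status.opensuse.org/"):
    """Render incident dicts (id, name, message) as an RSS 2.0 string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "openSUSE status incidents"
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = (
        "Incidents exported from the status page API"
    )
    for incident in incidents:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = incident["name"]
        ET.SubElement(item, "description").text = incident.get("message", "")
        # The guid is not a URL, so mark it as a non-permalink identifier.
        guid = ET.SubElement(item, "guid", isPermaLink="false")
        guid.text = f"incident-{incident['id']}"
    return ET.tostring(rss, encoding="unicode")
```

Fetching the JSON and serving the result on a schedule is left out on purpose; the sketch only covers the conversion step, which is the part the old feeds used to provide.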
Sometimes it's a good idea to follow best practices. This is what we did by following the recommendations for "general-purpose servers with a variety of clients, recommended for almost all systems" from https://ssl-config.mozilla.org/.
With this, our services accept only TLS 1.2 connections and the latest elliptic curve ciphers. If your client or browser does not support these settings, it's definitely time for you to consider an update.
While we are looking forward to TLS 1.3 support, the openssl version on our systems (currently running Leap 15.1) does not support it - yet. Once there is an update, we'll let you know.
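If you want to check whether your own client side can cope with these settings, the following Python sketch might help. It uses the standard `ssl` module to build a client context that refuses anything older than TLS 1.2 and reports the protocol version negotiated with a host. The function names are hypothetical, and the host you pass in is up to you.

```python
import socket
import ssl
from typing import Optional


def tls12_client_context() -> ssl.SSLContext:
    """Create a client context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def negotiated_version(host: str, port: int = 443,
                       timeout: float = 10.0) -> Optional[str]:
    """Connect to host:port and return the negotiated TLS version string."""
    context = tls12_client_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()
```

Calling `negotiated_version("status.opensuse.org")` performs a real network connection; if the handshake fails with an `ssl.SSLError`, the local TLS stack is a likely candidate for an update.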