The openSUSE heroes use Grafana to monitor important databases.
While we monitor basic functionality of our MariaDB (running as Galera-Cluster) and PostgreSQL databases since years, we missed a way to get an easy overview of what's really happening within our databases in production. Especially peaks, that slow down the response times, are not so easy to detect.
That's why we set up our own Grafana instance. The dashboard is public and allows everyone to have a look at:
- The PostgreSQL cluster behind download.opensuse.org. Around 230 average and up to 500 queries per second are not that bad...
- The Galera cluster behind the opensuse.org wikis and other MariaDB driven applications like Matomo or Etherpad. One interesting detail here is - for example - the archiving job of Matomo, triggering some peaks every hour.
- The Elasticsearch cluster behind the wiki search. Here we have a relatively high JVM memory foodprint. Something to look at...
Both: the Grafana dashboard and the databases are driving big parts of the openSUSE infrastructure. And while everything is still up and running, we would love to hear from experts how we could improve. If you are an expert or know someone, feel free to contact us via Email or in our IRC channel.