action #57476
closed
Recurring partitions full and logrotate fails, possibly due to disabling /var/log/openqa as log target
0%
Description
Observation
There have been several alerts about space depletion on either / or /srv in the past few days. This seems to have started on the afternoon of 2019-09-25, when /srv started to fill up. From 2019-09-26 on, there are hourly spikes in the space usage on /, growing in size over time:
https://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?orgId=1&fullscreen&panelId=74&from=1569378790693&to=1569669597083
I assume that since coolo switched /etc/openqa/openqa.ini to no longer log messages to /var/log/openqa, openQA debug messages, including SQL debugging, end up not only in the system journal but also in /var/log/messages. The logrotate service has failed because of out-of-space conditions, and /var/spool/mail/root shows why / is running out of space on an hourly basis:
Subject: Cron <root@openqa> /usr/sbin/logwatch --service dmeventd
…
cat: write error: No space left on device
system 'cat '/var/log/messages-20190928' >> /var/cache/logwatch/logwatch.hghdmAm9/messages-archive' failed: 256 at /usr/sbin/logwatch line 772.
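To narrow down which files are eating the space, a quick check along the following lines might help. This is only a sketch: the helper names `usage_percent` and `largest_files` are made up for illustration, and the mount points and /var/log are simply the paths mentioned in this ticket.

```python
import os
import shutil


def usage_percent(path):
    """Return the used share of the filesystem holding `path`, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def largest_files(root, limit=10):
    """Return up to `limit` (size_in_bytes, path) tuples under `root`, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                continue  # do not count symlink targets twice
            try:
                sizes.append((os.path.getsize(full), full))
            except OSError:
                continue  # file vanished or is unreadable, e.g. during rotation
    sizes.sort(reverse=True)
    return sizes[:limit]


if __name__ == "__main__":
    # Mount points taken from the alerts above; skipped if absent on this host.
    for mount in ("/", "/srv"):
        if os.path.exists(mount):
            print(f"{mount}: {usage_percent(mount):.1f}% used")
    for size, path in largest_files("/var/log"):
        print(f"{size / 2**20:10.1f} MiB  {path}")
```

Run as root on the affected host, this should show whether /var/log/messages and its rotated copies dominate the usage on /.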
Suggestions
- As we do not have logwatch covered in http://gitlab.suse.de/openqa/salt-states-openqa I assume we do not need it anymore
- We should consider removing rsyslog and syslog-service and instead configure a persistent journal by creating the directory /var/log/journal/, as we already do in https://gitlab.suse.de/openqa/salt-states-openqa/blob/master/openqa/worker.sls#L324
- Crosscheck after the above two points how the space usage behaves, e.g. whether openQA SQL debug information is still written to /var/log/messages
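For the crosscheck in the last point, something like the following could give a rough number. It is a sketch under assumptions: the exact format of the openQA SQL debug lines is not recorded in this ticket, so `SQL_PATTERN` is only a guess at typical SQL keywords, and `count_sql_lines` is a hypothetical helper name.

```python
import re

# Assumption: SQL debug output contains plain SQL keywords. Adjust the pattern
# to whatever the real openQA debug lines look like.
SQL_PATTERN = re.compile(r"\b(SELECT|INSERT INTO|UPDATE|DELETE FROM)\b")


def count_sql_lines(path, pattern=SQL_PATTERN):
    """Return (matching_lines, total_lines) for the log file at `path`."""
    hits = 0
    total = 0
    # errors="replace" keeps the scan going if the log holds invalid bytes
    with open(path, "r", errors="replace") as fh:
        for line in fh:
            total += 1
            if pattern.search(line):
                hits += 1
    return hits, total
```

Calling `count_sql_lines("/var/log/messages")` before and after the change should show whether the share of SQL lines drops; reading that file usually requires root.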