action #60923
closed
[alert] /srv about to run full, postgres logs very big due to repeated error "duplicate key value violates unique constraint "screenshots_filename", Key (filename)=(8ca/3c9/98a00d8bb2ccba5a2de1d403b5.png) already exists. INSERT INTO screenshots …"
Description
Observation
On Wednesday, 11 December 2019 20.55.01 CET Grafana wrote:
[Alerting] File systems alert
One of the file systems is too full
Metric name: /srv: Used Percentage
Value: 90.151
/srv is close to running full. Usage appears to have been growing since the start of October:
https://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?fullscreen&edit&tab=alert&panelId=74&orgId=1&refresh=30s&from=now-6M&to=now
On osd I can see that /srv/PSQL10/data/log has 39G of log files.
One error message is repeated multiple times per second. Its first occurrence is in postgresql-2019-10-23_000000.log:
2019-10-23 08:29:38.235 CEST openqa geekotest [30552]ERROR: duplicate key value violates unique constraint "screenshots_filename"
2019-10-23 08:29:38.235 CEST openqa geekotest [30552]DETAIL: Key (filename)=(68c/0ee/3ceb7071b7b9364aa721b3421e.png) already exists.
2019-10-23 08:29:38.235 CEST openqa geekotest [30552]STATEMENT: INSERT INTO screenshots ( filename, t_created) VALUES ( $1, $2 ) RETURNING id
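To check how often the error really occurs, a small script could bucket the ERROR lines by second. This is only a sketch: the regular expression is derived from the three sample lines above and assumes every log line uses the same prefix layout; the function name `errors_per_second` is made up for illustration.

```python
import re
from collections import Counter

# Prefix layout assumed from the sample lines above:
# "<date> <time>.<ms> <tz> <db> <user> [<pid>]ERROR: duplicate key value ..."
ERROR_RE = re.compile(
    r'^(?P<second>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ .*'
    r'ERROR:\s+duplicate key value violates unique constraint "screenshots_filename"'
)


def errors_per_second(lines):
    """Count "screenshots_filename" duplicate key errors per one-second bucket."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.match(line)
        if match:
            counts[match.group("second")] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_errors.py /srv/PSQL10/data/log/postgresql-2019-10-23_000000.log
    with open(sys.argv[1], errors="replace") as log_file:
        for second, count in errors_per_second(log_file).most_common(10):
            print(second, count)
```

Only the ERROR line is counted; the DETAIL and STATEMENT lines that follow each error are ignored so that one failed INSERT counts once.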
Problem
We have had a number of postgres errors in the past that we apparently did not pay attention to, but this one is severe: it makes the log files grow so fast that multiple files are written per day. Either postgres never cleans up old rotated log files, or the retention limit has not been reached yet. The same happens on o3, but there the partition holding the database has a bit more space.
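To see how many log files are written per day and how much space each day takes, the log directory could be summarized roughly like this. It is a sketch under assumptions: the file name pattern is inferred from the single name postgresql-2019-10-23_000000.log mentioned in this ticket, and the helper names `summarize_by_day` and `scan_log_dir` are hypothetical.

```python
import re
from collections import defaultdict
from pathlib import Path

# File name pattern assumed from one example: postgresql-2019-10-23_000000.log
NAME_RE = re.compile(r'^postgresql-(\d{4}-\d{2}-\d{2})_\d{6}\.log$')


def summarize_by_day(files):
    """Take (name, size_in_bytes) pairs and return {day: (file_count, total_bytes)}."""
    summary = defaultdict(lambda: [0, 0])
    for name, size in files:
        match = NAME_RE.match(name)
        if match:
            entry = summary[match.group(1)]
            entry[0] += 1
            entry[1] += size
    return {day: tuple(entry) for day, entry in summary.items()}


def scan_log_dir(path):
    """Summarize the regular files found directly inside the given log directory."""
    return summarize_by_day(
        (p.name, p.stat().st_size) for p in Path(path).iterdir() if p.is_file()
    )


if __name__ == "__main__":
    # Directory taken from the observation in this ticket.
    for day, (count, total) in sorted(scan_log_dir("/srv/PSQL10/data/log").items()):
        print(f"{day}: {count} file(s), {total / 2**20:.1f} MiB")
```

Days with more than one file would confirm that rotation is being triggered more than once per day, and the oldest day listed shows how far back rotated files are kept.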