action #115208

closed

failed-systemd-services: logrotate-openqa alerting on and off size:M

Added by livdywan over 2 years ago. Updated over 2 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Start date:
Due date:
% Done: 0%
Estimated time:

Description

Motivation

For the third time this week, failed-systemd-services has been alerting in the middle of the European night and then recovering on its own:

2022-08-11 02:59:52  openqa           logrotate-openqa                   1
2022-08-10 17:43:01  QA-Power8-4-kvm  openqa-worker-cacheservice-minion  1

Previously only logrotate-openqa was failing.

sudo journalctl -fu logrotate-openqa
Aug 11 09:00:02 openqa logrotate[26982]:   Last rotated at 2022-08-11 04:00
Aug 11 09:00:02 openqa logrotate[26982]:   log does not need rotating (log size is below the 'size' threshold)
Aug 11 09:00:02 openqa logrotate[26982]: rotating pattern: /var/log/apache2/openqa.access_log  307200000 bytes (20 rotations)
Aug 11 09:00:02 openqa logrotate[26982]: empty log files are not rotated, old logs are removed
Aug 11 09:00:02 openqa logrotate[26982]: considering log /var/log/apache2/openqa.access_log
Aug 11 09:00:02 openqa logrotate[26982]:   Now: 2022-08-11 09:00
Aug 11 09:00:02 openqa logrotate[26982]:   Last rotated at 2022-08-11 05:00
Aug 11 09:00:02 openqa logrotate[26982]:   log does not need rotating (log size is below the 'size' threshold)
Aug 11 09:00:02 openqa systemd[1]: logrotate-openqa.service: Deactivated successfully.
Aug 11 09:00:02 openqa systemd[1]: Finished Rotate openQA log files.

Acceptance criteria

  • AC1: logrotate-openqa is known to work reliably
  • AC2: failed-systemd-services is not alerting on a regular basis

Suggestions

  • Look into what's causing logrotate-openqa to fail
  • Check if there's enough disk space / if the logs are deleted before rotation
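
The two suggestions above could be approached with a few small shell helpers. This is only a sketch: the function names (disk_usage_pct, log_present, unit_journal) are hypothetical, the time window is a guess around the alert, and the alert timestamps may be in a different timezone than the journal on the host.

```shell
#!/bin/sh
# Hypothetical helpers for investigating the intermittent failure.
# Nothing here is taken from the openQA setup; adjust paths and times.

# Print the usage percentage (digits only) of the filesystem holding a path.
# Uses POSIX "df -P" output, where column 5 is the capacity, e.g. "42%".
# Assumes the filesystem name contains no spaces.
disk_usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed only if the log file exists and is non-empty, i.e. it was not
# deleted or truncated before logrotate got to it.
log_present() {
    [ -s "$1" ]
}

# Show the unit's journal for a fixed time window instead of following it,
# so the failed run itself is visible rather than the latest successful one.
unit_journal() {
    journalctl -u logrotate-openqa --since "$1" --until "$2" --no-pager
}
```

Possible usage on the affected host: `disk_usage_pct /var/log`, `log_present /var/log/apache2/openqa.access_log && echo present`, and `unit_journal "2022-08-11 00:00" "2022-08-11 06:00"`. Unlike `journalctl -fu`, which only shows new entries as they arrive, the fixed window should include the run that triggered the alert.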

Related issues 2 (0 open, 2 closed)

Related to openQA Infrastructure (public) - action #114565: recover qa-power8-4+qa-power8-5 size:M, Resolved, okurz, 2022-12-19

Related to openQA Infrastructure (public) - action #116722: openqa.suse.de is not reachable 2022-09-18, no ping response, postgreSQL OOM and kernel panics size:M, Resolved, mkittler, 2022-09-18
