action #124412

Updated by mkittler about 1 year ago

## Observation 
 See https://stats.openqa-monitor.qa.suse.de/d/KToPYLEWz/failed-systemd-services?orgId=1&from=1676150994873&to=1676257956411 for the time frame. (It also shows some mount units, but they were only failing very briefly.) 

 This is the log of logrotate-openqa on OSD: 
 ``` 
 Feb 13 00:00:03 openqa logrotate[30440]:     Now: 2023-02-13 00:00 
 Feb 13 00:00:03 openqa logrotate[30440]:     Last rotated at 2023-02-12 05:00 
 Feb 13 00:00:03 openqa logrotate[30440]:     log does not need rotating (log size is below the 'size' threshold) 
 Feb 13 00:00:03 openqa systemd[1]: logrotate-openqa.service: Deactivated successfully. 
 Feb 13 00:00:03 openqa systemd[1]: Finished Rotate openQA log files. 
 Feb 13 01:00:00 openqa systemd[1]: Starting Rotate openQA log files... 
 Feb 13 01:00:01 openqa logrotate[11051]: reading config file /etc/logrotate.d/openqa 
 Feb 13 01:00:01 openqa logrotate[11051]: warning: 'size' overrides previously specified 'hourly' 
 Feb 13 01:00:01 openqa logrotate[11051]: compress_prog is now /usr/bin/xz 
 Feb 13 01:00:01 openqa logrotate[11051]: compress_ext was changed to .xz 
 Feb 13 01:00:01 openqa logrotate[11051]: uncompress_prog is now /usr/bin/xzdec 
 Feb 13 01:00:01 openqa logrotate[11051]: warning: 'size' overrides previously specified 'hourly' 
 Feb 13 01:00:01 openqa logrotate[11051]: compress_prog is now /usr/bin/xz 
 Feb 13 01:00:01 openqa logrotate[11051]: compress_ext was changed to .xz 
 Feb 13 01:00:01 openqa logrotate[11051]: uncompress_prog is now /usr/bin/xzdec 
 Feb 13 01:00:01 openqa logrotate[11051]: reading config file /etc/logrotate.d/openqa-apache 
 Feb 13 01:00:01 openqa logrotate[11051]: warning: 'size' overrides previously specified 'hourly' 
 Feb 13 01:00:01 openqa logrotate[11051]: compress_prog is now /usr/bin/xz 
 Feb 13 01:00:01 openqa logrotate[11051]: compress_ext was changed to .xz 
 Feb 13 01:00:01 openqa logrotate[11051]: uncompress_prog is now /usr/bin/xzdec 
 Feb 13 01:00:01 openqa logrotate[11051]: error: state file /var/lib/misc/logrotate.status is already locked 
 Feb 13 01:00:01 openqa logrotate[11051]: logrotate does not support parallel execution on the same set of logfiles. 
 Feb 13 01:00:01 openqa systemd[1]: logrotate-openqa.service: Main process exited, code=exited, status=3/NOTIMPLEMENTED 
 Feb 13 01:00:01 openqa systemd[1]: logrotate-openqa.service: Failed with result 'exit-code'. 
 Feb 13 01:00:01 openqa systemd[1]: Failed to start Rotate openQA log files. 
 Feb 13 02:00:01 openqa systemd[1]: Starting Rotate openQA log files... 
 Feb 13 02:00:01 openqa logrotate[12997]: reading config file /etc/logrotate.d/openqa 
 ``` 

 The log for logrotate on the Pi is unfortunately empty. So there's likely not much we can do about the Pi at this point. 

 I haven't paused the alerts. Let's see whether this is happening again. 
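
 To illustrate the `already locked` error in the OSD log above: the following is a minimal sketch of how two overlapping runs would collide on a shared state file. It assumes the lock is a non-blocking exclusive `flock()`, which is not confirmed by the log itself; `try_lock` and `simulate_overlap` are hypothetical helper names, and a temporary file stands in for `/var/lib/misc/logrotate.status`. Linux only.

 ```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Open path and try a non-blocking exclusive lock; return the fd, or None if locked."""
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Another open file description already holds the lock.
        os.close(fd)
        return None
    return fd


def simulate_overlap():
    """Return whether a first, an overlapping second, and a later third run get the lock."""
    # Assumption: a temporary file stands in for the real state file.
    with tempfile.NamedTemporaryFile() as state:
        first = try_lock(state.name)   # first run takes the lock
        second = try_lock(state.name)  # overlapping run is refused
        os.close(first)                # first run finishes and releases the lock
        third = try_lock(state.name)   # a later run succeeds again
        result = (first is not None, second is not None, third is not None)
        if third is not None:
            os.close(third)
        return result


if __name__ == "__main__":
    print(simulate_overlap())
 ```

 If this model is right, the failure would only need a second logrotate instance (for example another unit or a cron job using the same state file) to still be running when the hourly logrotate-openqa run starts.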

 ## Acceptance criteria 
 * **AC1**: The pi worker no longer triggers the failed systemd services alert 
 * **AC2**: OSD no longer triggers the failed systemd services alert 

 ~~Note that the issue on OSD was a one-time issue and is at this point no concern anymore.~~ It happened again on OSD as well: 

 ``` 
 Feb 26 00:00:03 openqa systemd[1]: Finished Rotate openQA log files. 
 Feb 26 01:00:00 openqa systemd[1]: Starting Rotate openQA log files... 
 Feb 26 01:00:00 openqa logrotate[16932]: reading config file /etc/logrotate.d/openqa 
 Feb 26 01:00:00 openqa logrotate[16932]: warning: 'size' overrides previously specified 'hourly' 
 Feb 26 01:00:00 openqa logrotate[16932]: compress_prog is now /usr/bin/xz 
 Feb 26 01:00:00 openqa logrotate[16932]: compress_ext was changed to .xz 
 Feb 26 01:00:00 openqa logrotate[16932]: uncompress_prog is now /usr/bin/xzdec 
 Feb 26 01:00:00 openqa logrotate[16932]: warning: 'size' overrides previously specified 'hourly' 
 Feb 26 01:00:00 openqa logrotate[16932]: compress_prog is now /usr/bin/xz 
 Feb 26 01:00:00 openqa logrotate[16932]: compress_ext was changed to .xz 
 Feb 26 01:00:00 openqa logrotate[16932]: uncompress_prog is now /usr/bin/xzdec 
 Feb 26 01:00:00 openqa logrotate[16932]: reading config file /etc/logrotate.d/openqa-apache 
 Feb 26 01:00:00 openqa logrotate[16932]: warning: 'size' overrides previously specified 'hourly' 
 Feb 26 01:00:00 openqa logrotate[16932]: compress_prog is now /usr/bin/xz 
 Feb 26 01:00:00 openqa logrotate[16932]: compress_ext was changed to .xz 
 Feb 26 01:00:00 openqa logrotate[16932]: uncompress_prog is now /usr/bin/xzdec 
 Feb 26 01:00:00 openqa logrotate[16932]: error: state file /var/lib/misc/logrotate.status is already locked 
 Feb 26 01:00:00 openqa logrotate[16932]: logrotate does not support parallel execution on the same set of logfiles. 
 Feb 26 01:00:00 openqa systemd[1]: logrotate-openqa.service: Main process exited, code=exited, status=3/NOTIMPLEMENTED 
 Feb 26 01:00:00 openqa systemd[1]: logrotate-openqa.service: Failed with result 'exit-code'. 
 Feb 26 01:00:00 openqa systemd[1]: Failed to start Rotate openQA log files. 
 Feb 26 02:00:00 openqa systemd[1]: Starting Rotate openQA log files... 
 ``` 

 ## Suggestions 
 * Look into the log files on OSD for details (as above) 
 * Check what happens if logrotate is called manually. Calling logrotate again should be rather safe, but it might trigger the error shown above 
 * An unclean shutdown likely caused this. Try to reproduce the failure manually. 
 * Ask @dheidler to fix the Pi-worker
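
 For the first suggestion, a small helper like the following could pull the lock failures out of a saved journal excerpt. This is only a sketch: `find_lock_failures` is a hypothetical name, and the regular expression assumes the line format of the OSD logs quoted above.

 ```python
import re

# Matches journal lines in the format quoted above, e.g.
# "Feb 13 01:00:01 openqa logrotate[11051]: error: state file ... is already locked"
LINE_RE = re.compile(
    r"^\s*(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"(?P<unit>[\w.-]+)\[(?P<pid>\d+)\]: (?P<msg>.*)$"
)
LOCK_RE = re.compile(r"error: state file (\S+) is already locked")


def find_lock_failures(journal_text):
    """Return (timestamp, pid, state_file) for each 'already locked' error from logrotate."""
    failures = []
    for line in journal_text.splitlines():
        m = LINE_RE.match(line)
        if not m or m.group("unit") != "logrotate":
            continue  # skip systemd lines and anything in another format
        lock = LOCK_RE.search(m.group("msg"))
        if lock:
            failures.append((m.group("ts"), int(m.group("pid")), lock.group(1)))
    return failures


# Sample lines copied from the OSD log above.
sample = """\
Feb 13 01:00:01 openqa logrotate[11051]: uncompress_prog is now /usr/bin/xzdec
Feb 13 01:00:01 openqa logrotate[11051]: error: state file /var/lib/misc/logrotate.status is already locked
Feb 13 01:00:01 openqa systemd[1]: logrotate-openqa.service: Failed with result 'exit-code'.
"""

if __name__ == "__main__":
    for ts, pid, state_file in find_lock_failures(sample):
        print(ts, pid, state_file)
 ```

 Comparing the resulting timestamps with other logrotate runs on the same host might show which invocation was holding the state file at that moment.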