action #62306

osd logrotate fails sporadically on "error opening /var/log/salt/master: Permission denied", only at 00:00, i.e. midnight every day.

Added by okurz over 1 year ago. Updated over 1 year ago.

Status: Resolved
Priority: Normal
Assignee:
Target version: -
Start date: 2020-01-19
Due date: 2020-04-14
% Done: 0%
Estimated time:

Description

> sudo systemctl status logrotate
● logrotate.service - Rotate log files
   Loaded: loaded (/usr/lib/systemd/system/logrotate.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Sun 2020-01-19 00:08:34 CET; 22h ago
     Docs: man:logrotate(8)
           man:logrotate.conf(5)
  Process: 5232 ExecStart=/usr/sbin/logrotate /etc/logrotate.conf (code=exited, status=1/FAILURE)
 Main PID: 5232 (code=exited, status=1/FAILURE)

Jan 19 00:00:00 openqa systemd[1]: Starting Rotate log files...
Jan 19 00:08:34 openqa logrotate[5232]: error: error opening /var/log/salt/master: Permission denied
Jan 19 00:08:34 openqa systemd[1]: logrotate.service: Main process exited, code=exited, status=1/FAILURE
Jan 19 00:08:34 openqa systemd[1]: Failed to start Rotate log files.
Jan 19 00:08:34 openqa systemd[1]: logrotate.service: Unit entered failed state.
Jan 19 00:08:34 openqa systemd[1]: logrotate.service: Failed with result 'exit-code'.

Related issues

Related to openQA Infrastructure - action #93195: [Alerting] Failed systemd services alert (except openqa.suse.de) on 2021-05-28, logrotate.service on openqaworker-arm-1 (Resolved, 2021-05-28, 2021-06-11)

Copied to openQA Infrastructure - action #62309: logrotate fails on QA-Power8-4-kvm (and powerqaworker-qam-1) with "error: destination /var/log/messages-20200118.xz already exists, skipping rotation" (Resolved, 2020-01-19, 2020-04-14)

History

#1 Updated by okurz over 1 year ago

  • Copied to action #62309: logrotate fails on QA-Power8-4-kvm (and powerqaworker-qam-1) with "error: destination /var/log/messages-20200118.xz already exists, skipping rotation" added

#2 Updated by okurz over 1 year ago

  • Subject changed from osd logrotate fails on "error opening /var/log/salt/master: Permission denied" to osd logrotate fails sporadically on "error opening /var/log/salt/master: Permission denied", only at 00:00, i.e. midnight every day.
  • Due date set to 2020-03-13
  • Status changed from New to Feedback
  • Assignee set to okurz

Received (again?) an alert about this: http://mailman.suse.de/mailman/private/osd-admins/2020-February/000855.html

logrotate fails on permissions. /etc/logrotate.d/salt specifies "su salt salt" for /var/log/salt/master, but on osd the file is owned by root:salt. I cross-checked on o3: there it is salt:salt, hence no problem there. In a clean container environment, starting "salt-master" also creates the file as salt:root, so I assume that root:salt on osd is just a leftover from migrating from a very old OS version. I will correct the ownership manually and monitor:

chown salt /var/log/salt/master
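
The ownership mismatch described above can also be checked mechanically before and after the chown. A minimal sketch, assuming GNU stat and a logrotate config whose first "su <user> <group>" directive is the relevant one; the function names expected_owner, actual_owner and owner_matches are hypothetical helpers, not part of logrotate or salt:

```shell
# Print the "user group" named by the first `su` directive in a logrotate config.
expected_owner() {
    awk '$1 == "su" { print $2, $3; exit }' "$1"
}

# Print the actual "user group" owning a file (GNU coreutils stat).
actual_owner() {
    stat -c '%U %G' "$1"
}

# Return 0 if the file's ownership matches the config's `su` directive.
# Arguments: <logrotate config> <log file>
owner_matches() {
    [ "$(expected_owner "$1")" = "$(actual_owner "$2")" ]
}
```

For example, `owner_matches /etc/logrotate.d/salt /var/log/salt/master || echo "ownership mismatch"` could be run periodically to see whether the owner changes back. This is only a diagnostic aid; it does not explain what resets the ownership.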

If this still fails, we could simply ignore the exit status of logrotate, e.g. with a systemd service override that prefixes the command in ExecStart with "-" so that the exit code is ignored.
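
A minimal sketch of such an override, assuming the ExecStart line shown in the systemctl status output in the description and the conventional drop-in directory /etc/systemd/system/logrotate.service.d; the function name write_logrotate_override and the file name override.conf are illustrative. The empty "ExecStart=" clears the command inherited from the packaged unit before the "-"-prefixed one is set:

```shell
# Write a drop-in that makes systemd ignore logrotate's exit status.
# Optional argument: target directory (defaults to the system drop-in path).
write_logrotate_override() {
    dir="${1:-/etc/systemd/system/logrotate.service.d}"
    mkdir -p "$dir" || return 1
    cat > "$dir/override.conf" <<'EOF'
[Service]
# Clear the ExecStart inherited from the packaged unit, then set it again
# with a leading "-" so a non-zero exit status is not treated as a failure.
ExecStart=
ExecStart=-/usr/sbin/logrotate /etc/logrotate.conf
EOF
}
```

After writing the file as root, `systemctl daemon-reload` is needed for the change to take effect. Note that this would also hide genuine rotation failures, so it is a workaround rather than a fix.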

Potentially helpful bugs for this issue: https://bugzilla.suse.com/show_bug.cgi?id=1030009 and https://bugzilla.suse.com/show_bug.cgi?id=1071322

#3 Updated by okurz over 1 year ago

The owner changed back to "root"; I am not sure who or what did that. There are now also alerts for the same problem on openqaworker8. I don't know why that file exists on openqaworker8. I have deleted it there and reset the systemd service with systemctl reset-failed.

On openqa-monitor.qa this seems to be more tricky:

okurz@openqa-monitor:~> sudo journalctl --since=today -u logrotate
-- Logs begin at Sun 2020-03-01 09:35:15 CET, end at Wed 2020-03-04 22:51:09 CET. --
Mar 04 00:00:24 openqa-monitor systemd[1]: Starting Rotate log files...
Mar 04 00:00:24 openqa-monitor logrotate[28470]: [61B blob data]
Mar 04 00:00:24 openqa-monitor logrotate[28470]: error: 'Access denied for user 'root'@'localhost' (using password: NO)'
Mar 04 00:00:24 openqa-monitor logrotate[28470]: /etc/logrotate.d/mariadb failed, probably because
Mar 04 00:00:24 openqa-monitor logrotate[28470]: the root acount is protected by password.
Mar 04 00:00:24 openqa-monitor logrotate[28470]: See comments in /etc/logrotate.d/mariadb on how to fix this
Mar 04 00:00:24 openqa-monitor logrotate[28470]: error: error running non-shared postrotate script for /var/log/mysql/mysqld.log of '/var/log/mysql/*.log '
Mar 04 00:00:33 openqa-monitor systemd[1]: logrotate.service: Main process exited, code=exited, status=1/FAILURE

And yes, /etc/logrotate.d/mariadb has more information. But I don't know why this seems to have started happening now.

#4 Updated by okurz over 1 year ago

  • Due date deleted (2020-03-13)
  • Status changed from Feedback to Workable
  • Assignee deleted (okurz)

So simply changing the permissions did not help. I don't know right now what else I can do, so I am leaving this for others :)

#5 Updated by okurz over 1 year ago

  • Status changed from Workable to Feedback
  • Assignee set to okurz

#6 Updated by okurz over 1 year ago

  • Due date set to 2020-04-14

#7 Updated by okurz over 1 year ago

  • Status changed from Feedback to Resolved

It seems we have not seen this problem lately.

#8 Updated by okurz 4 months ago

  • Related to action #93195: [Alerting] Failed systemd services alert (except openqa.suse.de) on 2021-05-28, logrotate.service on openqaworker-arm-1 added
