action #125132

closed

[alert] logrotate failed on OSD

Added by osukup over 1 year ago. Updated over 1 year ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
Start date:
2023-02-28
Due date:
% Done:

0%

Estimated time:

Description

from journalctl:

Feb 15 00:00:07 openqa logrotate[12569]: logrotate does not support parallel execution on the same set of logfiles.
Feb 15 00:00:07 openqa logrotate[12569]: error: state file /var/lib/misc/logrotate.status is already locked
Feb 15 00:00:00 openqa systemd[1]: Starting Rotate log files...

Related issues 1 (0 open, 1 closed)

Related to openQA Infrastructure - action #124412: [alert] logrotate services failed on openqa-piworker.qa.suse.de and OSD size:M (Resolved, mkittler, 2023-02-13)

Actions #1

Updated by okurz over 1 year ago

  • Related to action #124412: [alert] logrotate services failed on openqa-piworker.qa.suse.de and OSD size:M added
Actions #2

Updated by okurz over 1 year ago

  • Tags set to infra, osd, logrotate
  • Assignee set to mkittler
  • Priority changed from Normal to Urgent

@mkittler as you are working on that within #124412 please take this into account

Actions #3

Updated by mkittler over 1 year ago

  • Status changed from New to Feedback

I haven't received any further mails about failing systemd services today. I suppose this ticket was created for a mail from yesterday.

The log mentioned in the description is even from 15 Feb, so it happened before my changes for #124412 were deployed. The most recent occurrence is:

sudo journalctl --since '1 day ago' -fu logrotate.service -u logrotate-openqa.service
…
Feb 27 15:00:00 openqa logrotate[10366]: uncompress_prog is now /usr/bin/xzdec
Feb 27 15:00:00 openqa logrotate[10366]: error: state file /var/lib/misc/logrotate.status is already locked
Feb 27 15:00:00 openqa logrotate[10366]: logrotate does not support parallel execution on the same set of logfiles.
Feb 27 15:00:00 openqa systemd[1]: logrotate-openqa.service: Main process exited, code=exited, status=3/NOTIMPLEMENTED
Feb 27 15:00:00 openqa systemd[1]: logrotate-openqa.service: Failed with result 'exit-code'.
Feb 27 15:00:00 openqa systemd[1]: Failed to start Rotate openQA log files.
Feb 27 15:48:25 openqa logrotate[29411]: error: destination /var/log/rsyncd.log-20230227.xz already exists, skipping rotation
Feb 27 15:48:25 openqa logrotate[29411]: error: destination /var/log/zypper.log-20230227.xz already exists, skipping rotation
Feb 27 15:48:25 openqa systemd[1]: logrotate.service: Deactivated successfully.
Feb 27 15:48:25 openqa systemd[1]: Finished Rotate log files.
…

and that was also (2 hours) before the MR was merged.

I suppose we can keep this ticket open (in addition to #124412) as it is specific to the OSD problem only (although it is basically duplicating #124412).

Looks like my MR still has one mistake in it:

Feb 27 18:00:00 openqa bash[19337]: ++ /usr/bin/systemctl --user is-active logrotate.service
Feb 27 18:00:01 openqa bash[19340]: Failed to connect to bus: $DBUS_SESSION_BUS_ADDRESS and $XDG_RUNTIME_DIR not defined (consider using --machine=<user>@.host --user to connect to bus of other user)
Feb 27 18:00:01 openqa bash[19322]: + is_active=
Feb 27 18:00:01 openqa bash[19322]: + [[ '' == active ]]
Feb 27 18:00:01 openqa bash[19322]: + [[ '' == activating ]]
Feb 27 18:00:01 openqa bash[19322]: + exit 0

The --user flag shouldn't have been used: with it the query fails to connect to a bus, is_active stays empty and the check always exits 0. I've created an MR to fix this (https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/800) and have also applied the change manually on OSD.
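For illustration, here is a minimal sketch of such a check without --user. It is reconstructed only from the trace above and is not the actual script from salt-states-openqa; the function names unit_is_busy and wait_for_unit and the SYSTEMCTL override are hypothetical. Whether the real script waits for logrotate.service or takes some other action is not visible in the trace, so the polling loop is an assumption.

```shell
#!/bin/sh
# Hypothetical sketch; not the script from salt-states-openqa.
# SYSTEMCTL is an illustrative override hook so the logic can be tried without systemd.
SYSTEMCTL=${SYSTEMCTL:-/usr/bin/systemctl}

# Succeeds if the given system unit is currently running or starting.
unit_is_busy() {
    # No --user here: the unit is queried from the system manager.
    # `is-active` exits non-zero for any state other than "active",
    # so the exit status is ignored and only the printed state is used.
    busy_state=$("$SYSTEMCTL" is-active "$1" 2>/dev/null) || true
    case $busy_state in
        active|activating) return 0 ;;
        *) return 1 ;;
    esac
}

# Polls until the unit is idle; retries at most $2 times, $3 seconds apart.
wait_for_unit() {
    wait_tries=${2:-60}
    wait_delay=${3:-1}
    while unit_is_busy "$1"; do
        [ "$wait_tries" -gt 0 ] || return 1
        wait_tries=$((wait_tries - 1))
        sleep "$wait_delay"
    done
    return 0
}
```

A caller could run wait_for_unit logrotate.service before starting the openQA-specific rotation, so that both services do not compete for the lock on /var/lib/misc/logrotate.status.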

Actions #4

Updated by okurz over 1 year ago

  • Priority changed from Urgent to Normal
Actions #5

Updated by okurz over 1 year ago

  • Status changed from Feedback to Resolved

We will know if the problem reappears from alerting.
