action #181304

closed

systemd service+timer for cleanup of OSD database dumps instead of cron size:S

Added by okurz about 1 month ago. Updated 12 days ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Feature requests
Start date: 2025-04-23
Due date:
% Done: 0%
Estimated time:
Tags:

Description

Motivation

Based on #181301, we would benefit from a systemd service+timer for the cleanup of OSD database dumps instead of cron.

Acceptance Criteria

  • AC1: The openqa.suse.de database dump backup and cleanup is called from systemd timers
  • AC2: No duplicate cron job does the same
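For illustration, here is a minimal sketch of what moving the daily cron schedule into a systemd service+timer pair could look like. It is not the unit content shipped by the openQA pull request, which is not shown in this ticket. The schedule `40 23 * * *`, the user `postgres` and the command name `openqa-dump-db` come from the cron entry quoted in #note-24, and the description string comes from the `systemctl status` output in #note-22. The function names, the `/usr/bin` path and the exact unit layout are assumptions.

```python
def cron_to_oncalendar(cron_schedule: str) -> str:
    """Translate a simple daily/hourly cron schedule into a systemd OnCalendar value.

    Sketch only: minute and hour may be a plain number or '*'; the
    day-of-month, month and day-of-week fields must all be '*'.
    """
    fields = cron_schedule.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(fields)}")
    minute, hour, day_of_month, month, day_of_week = fields
    if (day_of_month, month, day_of_week) != ("*", "*", "*"):
        raise ValueError("only schedules that run every day are supported")

    def two_digits(value: str) -> str:
        # int() raises ValueError for ranges, steps and lists, which this sketch rejects
        return "*" if value == "*" else f"{int(value):02d}"

    return f"*-*-* {two_digits(hour)}:{two_digits(minute)}:00"


def render_units(name: str, description: str, command: str, user: str,
                 cron_schedule: str) -> dict:
    """Return hypothetical unit file contents keyed by file name."""
    service = (
        "[Unit]\n"
        f"Description={description}\n"
        "\n"
        "[Service]\n"
        "Type=oneshot\n"
        f"User={user}\n"
        f"ExecStart={command}\n"
    )
    timer = (
        "[Unit]\n"
        f"Description={description}\n"
        "\n"
        "[Timer]\n"
        f"OnCalendar={cron_to_oncalendar(cron_schedule)}\n"
        "\n"
        "[Install]\n"
        "WantedBy=timers.target\n"
    )
    return {f"{name}.service": service, f"{name}.timer": timer}


if __name__ == "__main__":
    units = render_units(
        name="openqa-dump-db",
        description="Daily openQA database dump and cleanup task",
        command="/usr/bin/openqa-dump-db",  # assumed install path
        user="postgres",
        cron_schedule="40 23 * * *",
    )
    for file_name, content in units.items():
        print(f"# {file_name}\n{content}")
```

Note that the `1>/dev/null` redirection from the cron entry has no counterpart here: output of a systemd service goes to the journal by default, so nothing is mailed.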

Suggestions

Mitigations

  • DONE Adjust cron job on o3 and in salt-states to avoid non-actionable daily emails

Related issues 1 (0 open, 1 closed)

Blocked by openQA Infrastructure (public) - action #181301: Dangerous cleanup of OSD database dumps size:S (Resolved, nicksinger)

Actions #1

Updated by okurz about 1 month ago

  • Copied from action #181301: Dangerous cleanup of OSD database dumps size:S added
Actions #2

Updated by nicksinger about 1 month ago

we might want to block on #181301, otherwise the AC allow a completely unmanaged script on OSD which is hardly an improvement.

Actions #3

Updated by nicksinger about 1 month ago

  • Copied from deleted (action #181301: Dangerous cleanup of OSD database dumps size:S)
Actions #4

Updated by nicksinger about 1 month ago

  • Blocked by action #181301: Dangerous cleanup of OSD database dumps size:S added
Actions #5

Updated by okurz about 1 month ago

  • Target version changed from Tools - Next to Ready
Actions #6

Updated by okurz about 1 month ago

  • Subject changed from systemd service+timer for cleanup of OSD database dumps instead of cron to systemd service+timer for cleanup of OSD database dumps instead of cron size:S
  • Description updated (diff)
  • Status changed from New to Workable
Actions #7

Updated by gpathak 25 days ago · Edited

@okurz @livdywan Do we want email alerts for failed runs? cron is capable of sending email when a job or script fails.

Actions #8

Updated by gpathak 25 days ago

  • Assignee set to gpathak
Actions #9

Updated by okurz 25 days ago

gpathak wrote in #note-7:

@okurz @livdywan Do we want email alerts for failed runs? cron is capable of sending email when a job or script fails.

Yes but we already have alerts for failed systemd services so I want that, not any custom solution :)
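How the existing alerts for failed systemd services are implemented is not described in this ticket. Purely as an illustration of why a systemd unit makes failures visible without cron mail, the following sketch lists failed units by parsing the output of `systemctl list-units --state=failed`. The function names are hypothetical, and the parser assumes the plain, legend-free output format in which the unit name is the first column.

```python
import subprocess


def parse_failed_units(output: str) -> list[str]:
    """Extract unit names from `systemctl list-units --plain --no-legend` output."""
    units = []
    for line in output.splitlines():
        tokens = line.split()
        # Without --plain, systemctl prefixes failed units with a bullet; skip it.
        if tokens and tokens[0] == "●":
            tokens = tokens[1:]
        if tokens:
            units.append(tokens[0])
    return units


def failed_units() -> list[str]:
    """Ask systemd for all units currently in the failed state."""
    result = subprocess.run(
        ["systemctl", "list-units", "--state=failed", "--plain", "--no-legend"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_failed_units(result.stdout)


if __name__ == "__main__":
    for unit in failed_units():
        print(f"failed: {unit}")
```

If the dump service exits with a non-zero status, it would show up in this list like any other failed unit, which is the property the comment above relies on.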

Actions #10

Updated by gpathak 25 days ago

  • Status changed from Workable to In Progress
Actions #11

Updated by openqa_review 25 days ago

  • Due date set to 2025-05-23

Setting due date based on mean cycle time of SUSE QE Tools

Actions #12

Updated by gpathak 24 days ago

I did a quick test as suggested by @okurz: on a fresh Tumbleweed (TW) VM I installed openQA from the OBS repository of the pull request: https://download.opensuse.org/repositories/devel:/openQA:/GitHub:/os-autoinst:/openQA:/PR-6437/openSUSE_Tumbleweed/devel:openQA:GitHub:os-autoinst:openQA:PR-6437.repo

The timer and service units did not get activated automatically, as can be seen in the screenshot.

The service units depend on openqa-setup-db.service and postgresql.service. Since we instruct the user to enable and start the postgresql service after a fresh installation, the units that depend on it are started automatically on every boot, and the timers kick in as soon as the service units are activated.

Actions #14

Updated by gpathak 21 days ago

Continuing with suggestions on the PR: https://github.com/os-autoinst/openQA/pull/6437

Actions #15

Updated by livdywan 20 days ago

  • Priority changed from Normal to High

I saw two emails again this morning. If it takes longer to implement a clean solution it should be mitigated. Raising priority accordingly.

https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/1458

Actions #16

Updated by livdywan 20 days ago

  • Description updated (diff)

I saw two emails again this morning. If it takes longer to implement a clean solution it should be mitigated. Raising priority accordingly.

Applied the same to /etc/cron.d/dump-openqa-db on o3.

Actions #17

Updated by gpathak 20 days ago

gpathak wrote in #note-10:

These two merge requests should be merged one after another.

Actions #18

Updated by livdywan 20 days ago

livdywan wrote in #note-15:

I saw two emails again this morning. If it takes longer to implement a clean solution it should be mitigated. Raising priority accordingly.

https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/1458

FYI merged this one, too.

Actions #19

Updated by livdywan 20 days ago

  • Description updated (diff)
  • Priority changed from High to Normal

Applied the same to /etc/cron.d/dump-openqa-db on o3.

https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/1458
FYI merged this one, too.

Lowering priority again as per the above mitigations.

Actions #20

Updated by gpathak 20 days ago

gpathak wrote in #note-17:

gpathak wrote in #note-10:

These two merge requests should be merged one after another.

One more merge request in backup-server-salt repository: https://gitlab.suse.de/qa-sle/backup-server-salt/-/merge_requests/25

Actions #21

Updated by okurz 19 days ago

The update was deployed to o3. I enabled the timer and executed the service once. I moved /var/lib/openqa/SQL-DUMPS/* to /var/lib/openqa/backup/. Merged https://gitlab.suse.de/qa-sle/backup-server-salt/-/merge_requests/25. Deployed changes to backup.qa.suse.de.

Actions #22

Updated by gpathak 19 days ago

okurz wrote in #note-21:

The update was deployed to o3. I enabled the timer and executed the service once. I moved /var/lib/openqa/SQL-DUMPS/* to /var/lib/openqa/backup/. Merged https://gitlab.suse.de/qa-sle/backup-server-salt/-/merge_requests/25. Deployed changes to backup.qa.suse.de.

On o3 the timer is enabled but inactive:

gpathak@ariel:~> sudo systemctl status openqa-dump-db.timer
○ openqa-dump-db.timer - Daily openQA database dump and cleanup task
     Loaded: loaded (/usr/lib/systemd/system/openqa-dump-db.timer; enabled; preset: disabled)
     Active: inactive (dead)
    Trigger: n/a
   Triggers: ● openqa-dump-db.service
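One plausible reading of the status above is that the timer was enabled without being started: `systemctl enable` only arranges for a unit to start at boot, and a timer schedules its next run only once it is active (`systemctl start`, or `systemctl enable --now`). The sketch below detects that state from the `Key=Value` output of `systemctl show`. The helper names are hypothetical; `UnitFileState` and `ActiveState` are the properties queried.

```python
import subprocess


def parse_show_output(output: str) -> dict:
    """Parse the Key=Value lines printed by `systemctl show`."""
    properties = {}
    for line in output.splitlines():
        key, separator, value = line.partition("=")
        if separator:
            properties[key] = value
    return properties


def timer_needs_start(properties: dict) -> bool:
    """True if the unit is enabled for boot but not currently active."""
    return (
        properties.get("UnitFileState") == "enabled"
        and properties.get("ActiveState") != "active"
    )


def query_unit(unit: str) -> dict:
    """Fetch the two relevant properties of a unit from systemd."""
    result = subprocess.run(
        ["systemctl", "show", unit,
         "--property=UnitFileState", "--property=ActiveState"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_show_output(result.stdout)


if __name__ == "__main__":
    unit = "openqa-dump-db.timer"
    if timer_needs_start(query_unit(unit)):
        print(f"{unit} is enabled but not active; consider: systemctl start {unit}")
```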
Actions #23

Updated by gpathak 19 days ago

  • Status changed from In Progress to Feedback
Actions #24

Updated by okurz 17 days ago

https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/1457 deployed. I removed the file OSD:/etc/cron.d/dump-openqa with content

40 23 * * * postgres openqa-dump-db 1>/dev/null

manually. Called `systemctl list-timers`, which shows that openqa-dump-timer is next triggered in 11h, which looks good.

Because the SQL database backup and the sync to the backup hosts are critical, I suggest monitoring this over the next few days, i.e. checking again after the weekend.
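A simple way to support this kind of follow-up check is to verify that the dump directory contains a recently written file. The sketch below is an assumption-laden illustration: `/var/lib/openqa/backup/` is the directory mentioned for o3 in #note-21 (the location on OSD is not stated here), the 26-hour threshold is an arbitrary choice for a daily job with some slack, and the function names are hypothetical. It treats any regular file as a dump because the dump file naming is not given in this ticket.

```python
import time
from pathlib import Path


def newest_file_age_hours(directory, now=None):
    """Return the age in hours of the most recently modified regular file.

    Returns None if the directory contains no regular files.
    """
    now = time.time() if now is None else now
    mtimes = [p.stat().st_mtime for p in Path(directory).iterdir() if p.is_file()]
    if not mtimes:
        return None
    return (now - max(mtimes)) / 3600


def backup_is_fresh(directory, max_age_hours=26.0, now=None):
    """True if at least one file in the directory is newer than max_age_hours."""
    age = newest_file_age_hours(directory, now)
    return age is not None and age <= max_age_hours


if __name__ == "__main__":
    backup_dir = "/var/lib/openqa/backup"  # path taken from #note-21 (o3)
    print("fresh" if backup_is_fresh(backup_dir) else "stale or missing")
```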

Actions #25

Updated by gpathak 15 days ago

clipboard-202505191021-r56fp.png

Looks fine to me:

Resolving this ticket.

Actions #26

Updated by okurz 14 days ago

  • Due date deleted (2025-05-23)
Actions #27

Updated by okurz 12 days ago

I saw the file /etc/cron.d/dump-openqa on OSD again and removed it.
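Since the cron file reappeared once, AC2 could be re-checked mechanically. The following sketch scans a cron directory for files with an active (non-comment) line that invokes a given command. The command name `openqa-dump-db` and the directory `/etc/cron.d` come from this ticket; the function name and the token-based matching are assumptions made for illustration.

```python
from pathlib import Path


def cron_files_calling(command, cron_dir="/etc/cron.d"):
    """Return cron files that contain an active line invoking the command."""
    matches = []
    for path in sorted(Path(cron_dir).iterdir()):
        if not path.is_file():
            continue
        for line in path.read_text(errors="replace").splitlines():
            stripped = line.strip()
            if not stripped or stripped.startswith("#"):
                continue  # skip blank lines and comments
            # Match the command as a whole whitespace-separated token
            if command in stripped.split():
                matches.append(path)
                break
    return matches


if __name__ == "__main__":
    for cron_file in cron_files_calling("openqa-dump-db"):
        print(f"duplicate cron job found: {cron_file}")
```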
