action #134519

closed

QA (public) - coordination #121720: [saga][epic] Migration to QE setup in PRG2+NUE3 while ensuring availability

QA (public) - coordination #131525: [epic] Up-to-date and usable LSG QE NUE1 machines

We were not notified that backup.qa.suse.de did not create backups size:M

Added by tinita over 1 year ago. Updated over 1 year ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Start date: 2023-08-23
Due date:
% Done: 0%
Estimated time:
Tags:

Description

Motivation

We did not notice, e.g. through alerts, that backups had not been updated since July 26.

See #134489

Acceptance criteria

  • AC1: Alerts are received when backup jobs fail

Suggestions

  • ~cron.service was failing~ The cron job was failing, but we were never notified about it. cron.service itself does not fail when an individual job fails.
  • Use a systemd timer instead, so that a failing backup run shows up as a failed systemd service and triggers the alerts for failed systemd services
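
A minimal sketch of the systemd timer suggestion, not the configuration that was actually deployed. The unit file names and the schedule are hypothetical; only the command `/usr/bin/rsnapshot alpha` is taken from the log output in this ticket. It assumes rsnapshot exits non-zero on errors such as a broken /etc/rsnapshot.conf, which would leave the service in the "failed" state, where `systemctl --failed` and any alerting on failed units can see it.

```ini
# /etc/systemd/system/rsnapshot@.service  (hypothetical file name)
[Unit]
Description=rsnapshot %i backup

[Service]
Type=oneshot
# %i is the instance name, e.g. "alpha" for rsnapshot@alpha.service.
# A non-zero exit status marks this unit as failed.
ExecStart=/usr/bin/rsnapshot %i

# /etc/systemd/system/rsnapshot-alpha.timer  (hypothetical file name)
[Unit]
Description=Run rsnapshot alpha on a schedule

[Timer]
# Placeholder schedule (every 4 hours); use the interval from the existing crontab entry.
OnCalendar=*-*-* 00/4:00:00
Unit=rsnapshot@alpha.service
# Catch up on a run that was missed while the machine was down.
Persistent=true

[Install]
WantedBy=timers.target
```

With units like these, `systemctl enable --now rsnapshot-alpha.timer` would replace the corresponding cron entry, and `systemctl status rsnapshot@alpha.service` would show the result of the last run.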

Out of scope

  • Trying out a simple check for the existence of recent backups
% journalctl -u cron.service
Aug 20 12:00:01 backup-vm rsnapshot[15218]: /usr/bin/rsnapshot alpha: ERROR: Errors were found in /etc/rsnapshot.conf, rsnapshot can not continue.

Related issues 3 (0 open, 3 closed)

  • Related to openQA Project (public) - action #134837: SLE test repo not updated on OSD, cron service was not running since 2023-08-29, fetchneedles not called size:M (Resolved, livdywan)
  • Related to openQA Infrastructure (public) - action #136370: systemd service rsnapshot@beta on backup-vm.qe.nue2.suse.org failed due to process conflict (Resolved, okurz, 2023-09-23)
  • Copied from openQA Infrastructure (public) - action #134489: backup.qa.suse.de does not create backups (Resolved, tinita, 2023-08-22)
