action #160481

closed

backup-vm: partitions usage (%) alert & systemd services alert size:S

Added by tinita 7 months ago. Updated 7 months ago.

Status: Resolved
Priority: High
Assignee:
Category: Regressions/Crashes
Start date: 2024-05-17
Due date: 2024-06-06
% Done: 0%
Estimated time:

Description

Observation

Fri, 17 May 2024 04:01:33 +0200

1 firing alert instance

📁 GROUPED BY 

hostname=backup-vm

  🔥 1 firing instances

Firing [stats.openqa-monitor.qa.suse.de]
backup-vm: partitions usage (%) alert
View alert [stats.openqa-monitor.qa.suse.de]
Values
A0=86.0003690373683 
Labels
alertname
backup-vm: partitions usage (%) alert
grafana_folder

http://stats.openqa-monitor.qa.suse.de/alerting/grafana/partitions_usage_alert_backup-vm/view?orgId=1

Also, possibly related:
https://stats.openqa-monitor.qa.suse.de/d/KToPYLEWz/failed-systemd-services?orgId=1

Not related: this was about backup-qam (not backup-vm, which this ticket is about):

Failed systemd services
2024-05-16 15:27:50    backup-qam    check-for-kernel-crash, kdump-notify

Suggestions

  • Check partition usage and which component uses the most space
  • Check what caused this short surge in usage
  • Consider increasing the size of the virtually attached storage
  • Consider tweaking our backup rules to either include less data or shorten retention. (Not useful: the affected partition was the root partition, while backups are on the separate partition /dev/vdb1.)
  • Or maybe don't do anything if this only happened once and is not likely to happen again based on monitoring data investigation
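
For the first suggestion, a minimal sketch of how partition usage and the biggest contributors could be checked by hand on the host. It is not taken from the monitoring setup; the function names `usage_percent`, `tree_size` and `largest_entries` are made up for illustration, and only the Python standard library is used. The percentage may differ slightly from what `df` or the Grafana panel reports, since reserved blocks can be counted differently.

```python
import os
import shutil


def usage_percent(path="/"):
    """Return used space on the filesystem containing `path`, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def _same_dev(path, dev):
    """True if `path` lives on the filesystem identified by device id `dev`."""
    try:
        return os.lstat(path).st_dev == dev
    except OSError:
        return False


def tree_size(path, dev):
    """Sum apparent file sizes below `path`, staying on one filesystem."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(path):
        # Prune subdirectories on other filesystems (similar to `du -x`),
        # so a separately mounted backup partition is not counted.
        dirnames[:] = [d for d in dirnames
                       if _same_dev(os.path.join(dirpath, d), dev)]
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return total


def largest_entries(root="/", top=5):
    """Return the `top` largest direct children of `root` as (size, path)."""
    dev = os.lstat(root).st_dev
    sizes = []
    with os.scandir(root) as entries:
        for entry in entries:
            try:
                st = entry.stat(follow_symlinks=False)
                if st.st_dev != dev:
                    continue  # other mount point, e.g. /proc or a data disk
                if entry.is_dir(follow_symlinks=False):
                    size = tree_size(entry.path, dev)
                else:
                    size = st.st_size
            except OSError:
                continue
            sizes.append((size, entry.path))
    return sorted(sizes, reverse=True)[:top]


if __name__ == "__main__":
    print(f"/ is {usage_percent('/'):.1f}% full")
    for size, path in largest_entries("/"):
        print(f"{size / 2**20:10.1f} MiB  {path}")
```

Run as root on the affected host, this would print the fill level of the root partition and its five largest top-level entries. Sizes are apparent file sizes, so sparse files and hard links can make the numbers differ from `du`.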