action #128417

closed

[alert][grafana] openqaw5-xen: partitions usage (%) alert fired and quickly after recovered again size:M

Added by nicksinger over 1 year ago. Updated over 1 year ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Start date:
Due date:
% Done: 0%
Estimated time:
Tags:

Description

Observation

On 2023-04-28 at 16:30 the partition usage of openqaw5-xen jumped to >90% (https://stats.openqa-monitor.qa.suse.de/d/GDopenqaw5-xen/dashboard-for-openqaw5-xen?orgId=1&viewPanel=65090&from=1682657429086&to=1682699823248) and an alert fired shortly afterwards. Someone or something cleaned up a short time later, bringing usage back down to a reasonable 40%.

Suggestions

  • DONE: Check with e.g. @okurz if this was maybe a one-time thing because somebody moved around stuff manually
  • DONE: Manual cleanup of files in /var/lib/libvirt/images, ask in #eng-testing what the stuff is needed for
  • Plug in more SSDs. Likely we have some spare in FC Basement shelves
  • Check virsh XMLs to crosscheck openQA jobs before deleting anything for good
  • Adjust the alert to allow longer periods over the threshold: we decided that our thresholds are feasible
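
The crosscheck of virsh XMLs against the files in /var/lib/libvirt/images could be sketched roughly as below. This is only an illustration, not something that was run on openqaw5-xen: the function names `referenced_disk_paths` and `unreferenced_images` are hypothetical, and it assumes the domain XML has already been collected as text, e.g. from `virsh dumpxml <domain>` for each defined domain. It only looks at `<disk><source file=.../>` entries, so images used solely as qcow2 backing files, or by domains that openQA jobs define temporarily, would not be detected and still need a manual check before deleting anything for good.

```python
import xml.etree.ElementTree as ET


def referenced_disk_paths(domain_xml):
    """Return the set of file paths used by file-backed disks in one domain XML."""
    root = ET.fromstring(domain_xml)
    paths = set()
    # libvirt domain XML lists disks as <devices><disk><source file="..."/></disk></devices>
    for source in root.findall("./devices/disk/source"):
        file_attr = source.get("file")
        if file_attr:
            paths.add(file_attr)
    return paths


def unreferenced_images(image_paths, domain_xmls):
    """Return image paths (sorted) that no given domain XML refers to.

    These are only candidates for cleanup, not a list that is safe to delete.
    """
    referenced = set()
    for xml_text in domain_xmls:
        referenced |= referenced_disk_paths(xml_text)
    return sorted(p for p in image_paths if p not in referenced)
```

The list of image paths could come from a plain directory listing of /var/lib/libvirt/images; anything the sketch reports should still be confirmed in #eng-testing before removal.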

Related issues 1 (1 open, 0 closed)

Related to openQA Infrastructure (public) - action #128222: [virtualization] The Xen specific host configuration on openqaw5-xen can be re-created from salt size:M (New, 2023-04-24)
