action #169750: [alert] backup-vm (backup-vm: partitions usage (%) alert Generic partitions_usage_alert_backup-vm generic) - openQA Infrastructure - openSUSE Project Management Tool

action #169750

closed

[alert] backup-vm (backup-vm: partitions usage (%) alert Generic partitions_usage_alert_backup-vm generic)

Added by ybonatakis 6 days ago. Updated 4 days ago.

Status:

Resolved

Priority:

High

Assignee:

nicksinger

Category:

Regressions/Crashes

Target version:

openQA Project - Ready

Start date:

2024-11-12

Due date:

2024-11-27

% Done:

Estimated time:

Tags:

infra, reactive work

Description

I checked the machine and found the following:
du -ah /path/to/directory | sort -rh | head -n 10

/dev/mapper/system-root   24G   14G  9.6G  59% /opt
/dev/vdb1                2.0T  1.7T  295G  86% /home
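
For reference, a minimal sketch of how rows like the two above could be checked against a usage threshold. It assumes the usual df -h column order (filesystem, size, used, available, use%, mount point) and mount points without spaces. The function names and the 85% value in the note below are hypothetical; the actual threshold of the alert is not stated in this ticket.

```python
def parse_df_line(line):
    """Split one df -h data row into (filesystem, use_percent, mountpoint)."""
    fields = line.split()
    # Assumed columns: Filesystem Size Used Avail Use% Mounted-on
    return fields[0], int(fields[4].rstrip("%")), fields[5]


def partitions_over(lines, threshold_percent):
    """Return (mountpoint, use_percent) for rows at or above the threshold."""
    result = []
    for line in lines:
        if not line.strip():
            # Skip blank lines between rows
            continue
        _filesystem, used, mount = parse_df_line(line)
        if used >= threshold_percent:
            result.append((mount, used))
    return result
```

Called with the two rows above and a hypothetical threshold of 85, only the /home entry would be reported.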

○ logrotate.service - Rotate log files
     Loaded: loaded (/usr/lib/systemd/system/logrotate.service; static)
     Active: inactive (dead) since Tue 2024-11-12 00:00:02 CET; 15h ago
TriggeredBy: ● logrotate.timer
       Docs: man:logrotate(8)
             man:logrotate.conf(5)
   Main PID: 29592 (code=exited, status=0/SUCCESS)
        CPU: 163ms

Warning: some journal files were not opened due to insufficient permissions.

Action taken: I restarted logrotate and ran sudo du -ah /home | sort -rh | head -n 10
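
A rough Python counterpart to the du pipeline above, in case the result needs to be post-processed in a script. This is only a sketch: unlike du -a it lists regular files only (no directory totals) and reports apparent file sizes rather than allocated disk blocks. The name largest_files is hypothetical.

```python
import heapq
import os


def largest_files(root, limit=10):
    """Return up to `limit` (size_in_bytes, path) pairs, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat does not follow symlinks, so link targets are not counted
                sizes.append((os.lstat(path).st_size, path))
            except OSError:
                # Files that vanish or cannot be read are skipped
                continue
    return heapq.nlargest(limit, sizes)
```

For example, largest_files("/home") would return the ten largest files below /home, subject to the permissions of the calling user.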

Related issues 1 (1 open — 0 closed)

Updated by ybonatakis 6 days ago

I also noticed that disk I/O is significantly higher in the relevant panels when looking at the last 90 days: https://stats.openqa-monitor.qa.suse.de/d/GDbackup-vm/dashboard-for-backup-vm?orgId=1&from=now-90d&to=now&timezone=browser&var-datasource=000000001&refresh=1m

After restarting logrotate I found Active: inactive (dead) since Tue 2024-11-12 16:33:35 CET; 37min ago

Updated by ybonatakis 6 days ago

Is duplicate of action #167722: Efficient use of monitoring data within influxdb on monitor.qe.nue2.suse.org size:M added

Updated by ybonatakis 6 days ago

Status changed from New to Closed

Updated by okurz 5 days ago

Tags changed from infra to infra, reactive work
Due date set to 2024-11-27
Category set to Regressions/Crashes
Status changed from Closed to Feedback
Assignee set to ybonatakis
Priority changed from Normal to High
Target version set to Ready

@ybonatakis I don't understand. Please clarify why you see this ticket as duplicate of #167722. Checking on backup-vm I still see 86% usage of /dev/vdb1 and https://stats.openqa-monitor.qa.suse.de/d/GDbackup-vm/dashboard-for-backup-vm?orgId=1&from=now-90d&to=now&timezone=browser&var-datasource=000000001&refresh=1m&viewPanel=panel-65090 still shows the alerting state

Updated by okurz 4 days ago

Assignee changed from ybonatakis to nicksinger

as discussed in the weekly coordination call

Updated by nicksinger 4 days ago

Status changed from Feedback to Resolved

okurz wrote in #note-4:

@ybonatakis I don't understand. Please clarify why you see this ticket as duplicate of #167722. Checking on backup-vm I still see 86% usage of /dev/vdb1 and https://stats.openqa-monitor.qa.suse.de/d/GDbackup-vm/dashboard-for-backup-vm?orgId=1&from=now-90d&to=now&timezone=browser&var-datasource=000000001&refresh=1m&viewPanel=panel-65090 still shows the alerting state

That was based on a comment from me that the backup I made in #167722 caused the disk on our backup-vm to fill up. I have silenced the alert for now ( https://stats.openqa-monitor.qa.suse.de/alerting/silences ) and will take care of removing my backup again as part of #167722.
