action #164979: [alert][grafana] File systems alert for WebUI /results size:S - openQA Infrastructure (public) - openSUSE Project Management Tool

Actions

Copy link

action #164979

closed

[alert][grafana] File systems alert for WebUI /results size:S

Added by nicksinger 8 months ago. Updated 8 months ago.

Status:

Resolved

Priority:

High

Assignee:

mkittler

Category:

Regressions/Crashes

Target version:

openQA Project (public) - Ready

Start date:

Due date:

2024-08-21

% Done:

Estimated time:

Tags:

alert, infra

Description

Observation¶

Observed at 2024-08-06 07:17:00 +0200 CEST
~~One of the file systems~~ /results is too full (> 90%)
See http://stats.openqa-monitor.qa.suse.de/d/WebuiDb?orgId=1&viewPanel=74

Current usage:

Filesystem      Size  Used Avail Use% Mounted on
/dev/vdd        7.0T  6.4T  681G  91% /results

Acceptance criteria¶

AC1: There is enough space and headroom on the affected file system /results, i.e. considerably more than 20%

Suggestions¶

Check job group "logs" retention settings for "not-important" / "groupless" result and consider reducing the period
Consider extending the silence period if fixing takes too long: https://stats.openqa-monitor.qa.suse.de/alerting/silence/9ee9b299-3d06-4234-97bf-6b84e2ad9a24/edit?alertmanager=grafana
Reconsider the design of scheduling openqa-investigate for unreviewed jobs and possibly plan in a separate ticket
Tell the security squad that their test scenario(s) are problematic and should fail less or be properly reviewed
Tell the security squad about their test scenario(s) which is significantly bigger than other jobs and consider reducing the space usage, e.g. save less or compress stuff

Rollback steps¶

DONE ~~Remove silence https://stats.openqa-monitor.qa.suse.de/alerting/silence/9ee9b299-3d06-4234-97bf-6b84e2ad9a24/edit?alertmanager=grafana~~

Out of scope¶

Better accounting e.g. linking of investigation jobs to their original groups -> #164988

Files

Download all files

Screenshot_20240807_093724_results_usage_6_months.png (33.2 KB) Screenshot_20240807_093724_results_usage_6_months.png		okurz, 2024-08-07 07:41
clipboard-202408071619-oi5o1.png (73.6 KB) clipboard-202408071619-oi5o1.png		livdywan, 2024-08-07 14:19

Related issues 2 (0 open — 2 closed)

Actions

Copy link

Also available in: Atom PDF

Project

General

Profile

QA (public) » openQA Project (public) » openQA Infrastructure (public)

Tags

Custom queries

action #164979

[alert][grafana] File systems alert for WebUI /results size:S

Observation¶

Acceptance criteria¶

Suggestions¶

Rollback steps¶

Out of scope¶

Updated by nicksinger 8 months ago

Updated by okurz 8 months ago

Updated by mkittler 8 months ago

Updated by mkittler 8 months ago

Updated by mkittler 8 months ago · Edited

Updated by mkittler 8 months ago

Updated by okurz 8 months ago

Updated by livdywan 8 months ago

Updated by okurz 8 months ago

Updated by mkittler 8 months ago

Updated by okurz 8 months ago

Updated by mkittler 8 months ago

Updated by openqa_review 8 months ago

Updated by okurz 8 months ago · Edited

Updated by livdywan 8 months ago

Updated by mkittler 8 months ago

Updated by mkittler 8 months ago

Updated by mkittler 8 months ago

Updated by livdywan 8 months ago

Updated by mkittler 8 months ago

Updated by mkittler 8 months ago

Updated by livdywan 8 months ago

Updated by livdywan 8 months ago

Updated by livdywan 8 months ago

Updated by mkittler 8 months ago

Updated by mkittler 8 months ago

Updated by viktors.trubovics 8 months ago

Updated by mkittler 8 months ago

Updated by mkittler 8 months ago

Updated by mkittler 8 months ago · Edited

Updated by mkittler 8 months ago

Updated by mkittler 8 months ago