action #89821
alert: PROBLEM Service Alert: openqa.suse.de/fs_/srv is WARNING (flaky, partial recovery with OK messages)
Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version:
Start date: 2021-03-10
Due date:
% Done: 0%
Estimated time:
Description
Observation
Multiple alert emails were received, for example:
Notification: PROBLEM
Host: openqa.suse.de
State: WARNING
Date/Time: Tue Mar 9 13:17:18 UTC 2021
Info: WARN - 80.1% used (64.06 of 79.99 GB), trend: +573.77 MB / 24 hours
Service: fs_/srv
See Online: https://thruk.suse.de/thruk/cgi-bin/extinfo.cgi?type=2&host=openqa.suse.de&service=fs_%2Fsrv
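To put the numbers from the alert into perspective, the following is a minimal sketch that parses the "Info" line quoted above and estimates how long the remaining space would last if the reported trend stayed constant. The function names are hypothetical, the regular expression is only tailored to this one message, and the conversion of 1 GB = 1024 MB is an assumption about how the monitoring reports its units.

```python
import re

# Info line copied verbatim from the alert email above.
INFO = "WARN - 80.1% used (64.06 of 79.99 GB), trend: +573.77 MB / 24 hours"


def parse_info(info):
    """Extract used GB, total GB and the 24 h trend in MB from the Info line."""
    match = re.search(
        r"\(([\d.]+) of ([\d.]+) GB\), trend: ([+-][\d.]+) MB / 24 hours", info
    )
    if match is None:
        raise ValueError("unexpected Info format")
    used_gb, total_gb, trend_mb = (float(x) for x in match.groups())
    return used_gb, total_gb, trend_mb


def days_until_full(used_gb, total_gb, trend_mb_per_day, mb_per_gb=1024):
    """Linear extrapolation; mb_per_gb=1024 is an assumption about the units."""
    if trend_mb_per_day <= 0:
        return float("inf")  # usage is flat or shrinking
    return (total_gb - used_gb) * mb_per_gb / trend_mb_per_day


used, total, trend = parse_info(INFO)
print(f"{100 * used / total:.1f}% used")
print(f"about {days_until_full(used, total, trend):.0f} days until full")
```

With the values from this alert, the extrapolation gives roughly four weeks until /srv is full, so the WARNING is not immediately critical but should not be ignored either.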
Acceptance criteria
- AC1: /srv on osd (openqa.suse.de) has enough free space
- AC2: The alert is handled
- AC3: The icinga alert only triggers if the internal grafana alert is not handled or not effective
Suggestions
- Follow the above thruk link to understand the monitoring data
- Cross-check the icinga alert limit of "80%" against the limit we have in grafana
- Make sure the grafana limit is lower than the icinga limit, so that the grafana alert fires first
- Ensure there is enough space, e.g. ask EngInfra to increase the volume size or clean up
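The threshold cross-check and the cleanup target from the suggestions above can be sketched as follows. This is only an illustration: the grafana limit of 75% is a placeholder, since the real value is not stated in this ticket, and both function names are hypothetical.

```python
ICINGA_WARN_PCT = 80.0   # limit quoted in the alert
GRAFANA_WARN_PCT = 75.0  # placeholder, the real grafana limit must be looked up


def grafana_fires_first(grafana_pct, icinga_pct):
    """True if the grafana alert would trigger before the icinga one."""
    return grafana_pct < icinga_pct


def gb_to_free(used_gb, total_gb, target_pct):
    """Space to free (in GB) so that usage drops to target_pct, never negative."""
    return max(0.0, used_gb - total_gb * target_pct / 100)


if not grafana_fires_first(GRAFANA_WARN_PCT, ICINGA_WARN_PCT):
    print("grafana limit should be lowered below the icinga limit")

# Values taken from the alert email: 64.06 GB used of 79.99 GB.
print(f"free at least {gb_to_free(64.06, 79.99, GRAFANA_WARN_PCT):.2f} GB")
```

Under the assumed 75% limit, about 4 GB would have to be freed (or the volume enlarged accordingly) before the grafana alert stops firing as well.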