action #70885
Updated by okurz over 3 years ago
## Observation
Received an alert email on 2020-09-02 at 14:27Z:
```
/*[Alerting] File systems alert*/
One of the file systems is too full
*Metric name*
*Value*
/assets: Used Percentage
94.207
```
30 minutes later the status switched back to "OK", but I expect we can easily hit the limit again.
The panel can be found at
https://monitor.qa.suse.de/d/WebuiDb/webui-summary?viewPanel=74&orgId=1
## Problem
The alert is flaky as it went back to "ok" without explicit user action.
## Suggestions
* Make sure some assets are cleaned up, as we cannot keep that many and 4.7 TB for assets is too much.
* Research whether a better hysteresis can be implemented in Grafana, e.g. the alert would trigger when usage reaches 94% but only recover once usage drops below 92%.
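
The hysteresis idea from the last suggestion can be sketched in plain Python. This only illustrates the intended behaviour and does not use any Grafana API; whether and how Grafana supports this is still to be researched. The function name `hysteresis_states` and the sample values are hypothetical; the 94% and 92% thresholds are the ones proposed above.

```python
def hysteresis_states(samples, trigger=94.0, recover=92.0):
    """Return the alert state after each usage sample (in percent).

    The alert fires once usage reaches `trigger` and only clears
    again once usage drops below `recover`.
    """
    alerting = False
    states = []
    for used_percent in samples:
        if not alerting and used_percent >= trigger:
            alerting = True
        elif alerting and used_percent < recover:
            alerting = False
        states.append("alerting" if alerting else "ok")
    return states


# Hypothetical usage values hovering around the threshold
samples = [93.5, 94.2, 93.8, 93.1, 91.9, 93.0]
for used, state in zip(samples, hysteresis_states(samples)):
    print(f"{used:5.1f}% -> {state}")
```

With a single 94% threshold the alert would already recover at the 93.8% sample. With the two thresholds it stays in "alerting" until usage drops to 91.9%.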
## Further notes
I did not pause the alert as it is currently "ok" and we need to be careful that the available disk space is not completely depleted.
94% usage on a filesystem is already a lot. We must not increase the alert threshold further.