action #158377
Updated by okurz 7 months ago
https://progress.opensuse.org/issues/158377
Detect from monitoring data which monitored machines show too low CPU usage over time size:M
## Motivation
Among the machines which we *do* monitor, we can also look for those that are effectively "unused".
## Acceptance criteria
* **AC1:** An alert fires for physical machines with too low system usage over a longer period of time
## Suggestions
* Experiment with an alert for multiple machines
* A reference could be https://monitor.qa.suse.de/d/WDopenqaworker1/worker-dashboard-openqaworker1 which shows a load that is not too high but is certainly high enough from time to time
* Virtual machines like tumblesle https://monitor.qa.suse.de/d/GDtumblesle/dashboard-for-tumblesle?orgId=1&refresh=1m&viewPanel=54694&from=now-7d&to=now can have a very low load but are not a problem, so do not include those
* As necessary, adjust the telegraf config to be able to distinguish between physical and virtual machines. A simple shortcut could be to just select machines with more than N CPUs, e.g. N = 2 or 4
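
The selection logic suggested above could be prototyped offline before turning it into an alert rule. The following is only a minimal sketch under assumptions: the `HostSamples` type, the function name, the host names, and the 5% threshold are all hypothetical, and it takes the peak CPU usage over the window as the criterion so that a machine with occasional load, like the reference worker, is not flagged. The real data would have to come from the monitoring backend, which is not shown here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class HostSamples:
    """Hypothetical container for one host's monitoring data."""
    n_cpus: int                     # number of CPUs reported for the host
    cpu_usage_percent: List[float]  # CPU usage samples over the time window


def find_underused_hosts(
    hosts: Dict[str, HostSamples],
    min_cpus: int = 4,             # assumption: "more than N CPUs" shortcut, N = 4
    usage_threshold: float = 5.0,  # assumption: peak usage below 5% counts as unused
) -> List[str]:
    """Return the names of hosts whose peak CPU usage stays below the threshold."""
    underused = []
    for name, host in sorted(hosts.items()):
        # Shortcut to exclude (mostly virtual) small machines.
        if host.n_cpus <= min_cpus:
            continue
        # Without samples we cannot judge the host, so skip it.
        if not host.cpu_usage_percent:
            continue
        # Peak rather than mean: load "from time to time" is enough.
        if max(host.cpu_usage_percent) < usage_threshold:
            underused.append(name)
    return underused


# Made-up example data, not taken from the real dashboards.
example_hosts = {
    "busy-host": HostSamples(n_cpus=32, cpu_usage_percent=[1.0, 2.0, 60.0]),
    "idle-host": HostSamples(n_cpus=32, cpu_usage_percent=[0.5, 1.0, 2.0]),
    "small-vm": HostSamples(n_cpus=2, cpu_usage_percent=[0.1, 0.2]),
}
print(find_underused_hosts(example_hosts))  # prints ['idle-host']
```

Using the peak instead of the mean is a design choice for this sketch; a mean or percentile over a longer window may turn out to be more robust against short monitoring spikes.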