action #137522
Updated by livdywan 11 months ago
## Observations

Fri, 06 Oct 2023 09:16:02 +0200
https://stats.openqa-monitor.qa.suse.de/alerting/grafana/d74e764d-6097-4d14-b77c-76c8d1da6ff0/view?orgId=1

It seems to be all host: sushil-linux-tw-kde

## Suggestions

* Likely sushil just sends data over telegraf to our grafana instance. Prevent that!
* Investigate where the list of machines we check here is taken from
* Introduce an additional telegraf data tag to our salt-controlled machines and adjust grafana queries/alerts to match this tag
  * In queries/panels to only show "our" hosts
  * In the alerts (maybe? Do we want to provide alerts for others as well?)
  * In the notification channels to only receive mails for hosts we care about

## Out of scope

* Confirm why it is allowed to push telegraf data from anywhere - should/can this be dropped?
* Is there going to be a lot of (big) data unaccounted for?

## Rollback actions

* Remove pause alert for `host=sushil-linux-tw-kde`
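
The suggestion above to tag salt-controlled machines and match on that tag could be sketched roughly as follows. This is only an illustration of the filtering logic in Python, not the actual telegraf or grafana configuration: the tag name `managed_by`, its value `salt`, and the host `worker1` are hypothetical, and only `sushil-linux-tw-kde` is taken from this ticket.

```python
# Hypothetical tag that salt-controlled machines would attach to their telegraf data.
# The name and value are assumptions for illustration, not taken from the ticket.
OUR_TAG = ("managed_by", "salt")


def is_our_host(point: dict) -> bool:
    """Return True if the data point carries the tag marking salt-controlled hosts."""
    key, value = OUR_TAG
    return point.get("tags", {}).get(key) == value


def hosts_to_alert_on(points: list[dict]) -> list[str]:
    """Return the sorted, de-duplicated host names of all tagged data points."""
    return sorted({p["tags"]["host"] for p in points if is_our_host(p)})


points = [
    {"tags": {"host": "worker1", "managed_by": "salt"}},  # hypothetical managed host
    {"tags": {"host": "sushil-linux-tw-kde"}},  # untagged host from this ticket
]
print(hosts_to_alert_on(points))
```

With such a tag in place, queries, alerts and notification channels could each apply the same condition, so hosts that merely push data to the instance would no longer trigger alerts.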