action #133130

Updated by okurz 12 months ago

## Observation 

 Received the following alert emails: 
 * sapworker1: host up alert 
 * sapworker1: OpenQA Ping time alert 
 * sapworker2: host up alert 
 * ... 
 * sapworker3: OpenQA Ping time alert 
 * sapworker3: Ping time alert 
 * Average Ping time (ms) alert 

 All of these had a single cause: a problem with the Frankencampus network. Can we group alerts, and also avoid getting host up, openQA ping time *and* ping time alerts for the same outage?

 ## Acceptance criteria 
 * **AC1:** Alerts are grouped (Grafana supports this)
 * **AC2:** No separate ping time alerts are sent if there is a corresponding host up alert; at the very least, the ping time alert should fire much later than the host up alert
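
 To make the intended behavior of AC1 and AC2 concrete, here is a minimal sketch in Python. It is not Grafana configuration; the alert records, the `host` and `name` keys and the function `group_and_suppress` are hypothetical and only illustrate one possible reading of "grouped" and "no separate ping time alerts".

```python
from collections import defaultdict

# Hypothetical sample alerts, loosely modelled on the emails in the observation.
alerts = [
    {"host": "sapworker1", "name": "host up"},
    {"host": "sapworker1", "name": "OpenQA Ping time"},
    {"host": "sapworker2", "name": "host up"},
    {"host": "sapworker3", "name": "OpenQA Ping time"},
    {"host": "sapworker3", "name": "Ping time"},
]


def group_and_suppress(alerts):
    """Group alert names per host and drop ping time alerts for hosts reported down."""
    by_host = defaultdict(list)
    for alert in alerts:
        by_host[alert["host"]].append(alert["name"])

    grouped = {}
    for host, names in by_host.items():
        if "host up" in names:
            # AC2: a host up alert makes the ping time alerts for that host redundant.
            names = [n for n in names if "ping time" not in n.lower()]
        grouped[host] = names
    return grouped


# AC1: one grouped notification per host instead of one email per alert.
notifications = group_and_suppress(alerts)
```

 With this sample data, hosts that have a host up alert would only report that alert, while a host with only ping time alerts would still report them.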

 ## Suggestions 
 * Read and 
 * Look into "grafana alert grouping" and configure alerts accordingly 
 * Crosscheck the alerting time thresholds, e.g. pick sensible values for "host up" vs. "ping time" or "packet loss".
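
 The threshold suggestion could be reasoned about as in the sketch below. The pending periods (5 and 30 minutes) are made-up assumptions and not the values currently configured; `firing_alerts` is a hypothetical helper that only shows why a longer pending period for "ping time" makes it fire after "host up", or not at all for short outages.

```python
from datetime import timedelta

# Assumed pending periods, for illustration only.
PENDING = {
    "host up": timedelta(minutes=5),
    "ping time": timedelta(minutes=30),
}


def firing_alerts(outage_duration, pending=PENDING):
    """Return the names of alerts whose pending period the outage has reached."""
    return sorted(name for name, wait in pending.items() if outage_duration >= wait)


short_outage = firing_alerts(timedelta(minutes=10))  # only "host up" fires
long_outage = firing_alerts(timedelta(minutes=45))   # both alerts fire
```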

 ## Rollback steps 
 * Remove the corresponding silences, either those referencing this ticket or any concerning "host up" or "ping time"