coordination #113674

Updated by livdywan almost 2 years ago

# Summary 
 With #112733 (https://progress.opensuse.org/issues/112733) we got new I/O panels for the webui. Because these are **repeating panels**, we cannot add an alert for the I/O time with the alerting backend we currently use. This should be possible with unified alerting: https://grafana.com/blog/2021/06/14/the-new-unified-alerting-system-for-grafana-everything-you-need-to-know/ 

 ## Acceptance criteria 
 * **AC1:** alerts for each disk on the webui with appropriate thresholds 
 * **AC2:** grouping of alerts is properly configured and understood 
 * **AC3:** alerts can be configured across multiple panels (using repeated panels) 

 ## Suggestions 
 * Take a look at our previous alerting rule: https://gitlab.suse.de/openqa/salt-states-openqa/-/blob/1c505df5e92420d0f266e7ea4b3a049aae892dd5/monitoring/grafana/webui.dashboard.json#L3757-3842 
 * Find out how to migrate to the new system, automatically or manually 
 * Repeating panels are important here because they let Grafana create multiple panels from different variable values, as opposed to having to copy and duplicate panels via salt 
   * Currently we have panels that rely on variables, which can't support alerts 
   * Ask Nick in case it's unclear 
 * Try it out on an official test instance of Grafana available from their website 
 * Test with a container 
 * Confirm what we end up with, e.g. new JSON or a different layout 
 * Keep in mind this is the default for Grafana 10 and our current setup may not be supportable long-term
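
 To make AC1 and AC2 more concrete, here is a minimal Python sketch of the behaviour we would expect from a single multi-dimensional alert rule: one query returns a series per disk, each series is checked against the threshold on its own, and the resulting alert instances are grouped by a label. This is only a model for discussion and not Grafana's API or rule format. The label names (`device`, `host`), the device names, the `io_time_percent` values and the threshold of 80 are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One series returned by the (hypothetical) I/O time query."""
    labels: tuple            # (key, value) pairs identifying the series
    io_time_percent: float   # latest value of the series


def evaluate_rule(samples, threshold):
    """Return one alert instance (a label dict) per series above the threshold."""
    return [dict(s.labels) for s in samples if s.io_time_percent > threshold]


def group_alerts(instances, group_by):
    """Group alert instances by the values of the given label names."""
    groups = {}
    for instance in instances:
        key = tuple(instance.get(name) for name in group_by)
        groups.setdefault(key, []).append(instance)
    return groups


# Hypothetical data: three disks on one host, two of them busy.
samples = [
    Sample((("device", "vda"), ("host", "webui")), 12.0),
    Sample((("device", "vdb"), ("host", "webui")), 93.5),
    Sample((("device", "vdc"), ("host", "webui")), 81.0),
]

firing = evaluate_rule(samples, threshold=80.0)
groups = group_alerts(firing, group_by=["host"])

for key, members in groups.items():
    devices = ", ".join(m["device"] for m in members)
    print(f"group {key}: {len(members)} firing alert(s) for {devices}")
```

 The point of the sketch is that no panel needs to be duplicated per disk: adding a disk only adds a series, and grouping decides whether we get one notification per host or one per disk.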
