coordination #113674
openQA Project (public) - coordination #109846 (closed): [epic] Ensure all our database tables accommodate enough data, e.g. bigint for ids
[epic] Configure I/O alerts again for the webui after migrating to the "unified alerting" in grafana size:M
Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
Start date:
2023-01-09
Due date:
% Done:
100%
Estimated time:
(Total: 0.00 h)
Tags:
Description
Summary
With #112733 we got new I/O panels for the webui. Because these are repeating panels, we cannot add an alert for the I/O time with the alerting backend we currently use. This should be possible with unified alerting: https://grafana.com/blog/2021/06/14/the-new-unified-alerting-system-for-grafana-everything-you-need-to-know/
Acceptance criteria
- AC1: alerts for each disk on the webui with appropriate thresholds
- AC2: grouping of alerts is properly configured and understood
- AC3: alerts can be configured across multiple panels (using repeated panels)
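To make AC1 and AC2 more concrete, here is a minimal Python sketch of the idea behind multi-dimensional alerting: a single rule is evaluated against every per-disk series and yields one alert instance per disk, and the instances are then grouped by label for notification. This only models the concept, it is not Grafana code. The sample values, device names, thresholds and the helper names `evaluate` and `group_by` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical per-disk I/O time samples (percent busy). Real values would
# come from the datasource query behind the webui I/O panels.
samples = [
    {"labels": {"host": "webui", "device": "vda"}, "io_time_pct": 12.0},
    {"labels": {"host": "webui", "device": "vdb"}, "io_time_pct": 97.5},
    {"labels": {"host": "webui", "device": "vdc"}, "io_time_pct": 91.0},
]

# Hypothetical thresholds: a per-disk override plus a default for the rest.
THRESHOLDS = {"vdb": 95.0}
DEFAULT_THRESHOLD = 90.0


def evaluate(series):
    """Return one alert instance per series whose value exceeds its threshold."""
    firing = []
    for s in series:
        device = s["labels"]["device"]
        limit = THRESHOLDS.get(device, DEFAULT_THRESHOLD)
        if s["io_time_pct"] > limit:
            firing.append(
                {"labels": s["labels"], "value": s["io_time_pct"], "threshold": limit}
            )
    return firing


def group_by(instances, keys):
    """Group alert instances by the given label names (one notification per group)."""
    groups = defaultdict(list)
    for inst in instances:
        groups[tuple(inst["labels"][k] for k in keys)].append(inst)
    return dict(groups)


firing = evaluate(samples)
groups = group_by(firing, ["host"])
for key, members in groups.items():
    print(key, [m["labels"]["device"] for m in members])
```

Grouping by `host` alone bundles all firing disks of the webui into one notification, while grouping by `host` and `device` would send one notification per disk. Which of the two we want is part of what AC2 asks us to understand.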
Suggestions
- Take a look at our previous alerting rule: https://gitlab.suse.de/openqa/salt-states-openqa/-/blob/1c505df5e92420d0f266e7ea4b3a049aae892dd5/monitoring/grafana/webui.dashboard.json#L3757-3842
- Find out how to migrate to the new system, automatically or manually
- Repeating panels are important here because they let Grafana create multiple panels from different variable values, as opposed to copying and duplicating panels via salt
- Currently we have panels that rely on variables, and such panels can't support alerts
- Ask Nick in case it's unclear
- Try out with an official test instance of Grafana available from their website
- Test with a container
- Confirm what we end up with, e.g. new JSON or a different layout
- Keep in mind this is the default for Grafana 10 and our current setup may not be supportable long-term
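As a starting point for the "new JSON" question, the following Python sketch builds a rule definition shaped like Grafana's file provisioning format for unified alerting, as far as I understand it. The field names should be verified against the documentation of the Grafana version we deploy. The function name `io_time_rule`, the group, folder and rule names, the datasource UID, the query string and the threshold are all placeholders and not taken from our salt states.

```python
import json


def io_time_rule(datasource_uid, query, threshold):
    """Build a provisioning-style rule group with a single multi-dimensional rule.

    Assumption: the query returns one series per disk (e.g. grouped by a
    device tag), so one rule covers all disks without repeated panels.
    """
    return {
        "apiVersion": 1,
        "groups": [
            {
                "orgId": 1,
                "name": "webui-io",          # placeholder group name
                "folder": "openQA",          # placeholder folder
                "interval": "1m",
                "rules": [
                    {
                        "uid": "webui_disk_io_time",  # placeholder UID
                        "title": "webui: disk I/O time",
                        "condition": "C",
                        "for": "5m",
                        "data": [
                            {   # A: raw per-disk query against the datasource
                                "refId": "A",
                                "datasourceUid": datasource_uid,
                                "relativeTimeRange": {"from": 600, "to": 0},
                                "model": {"refId": "A", "query": query, "rawQuery": True},
                            },
                            {   # B: reduce each series to a single value
                                "refId": "B",
                                "datasourceUid": "__expr__",
                                "model": {"refId": "B", "type": "reduce",
                                          "expression": "A", "reducer": "mean"},
                            },
                            {   # C: compare the reduced value to the threshold
                                "refId": "C",
                                "datasourceUid": "__expr__",
                                "model": {
                                    "refId": "C",
                                    "type": "threshold",
                                    "expression": "B",
                                    "conditions": [
                                        {"evaluator": {"type": "gt", "params": [threshold]}}
                                    ],
                                },
                            },
                        ],
                        "labels": {"service": "webui"},
                        "annotations": {"summary": "High disk I/O time on the webui"},
                    }
                ],
            }
        ],
    }


rule_file = io_time_rule("PLACEHOLDER_DS_UID", "<per-disk I/O time query>", 90)
print(json.dumps(rule_file, indent=2))
```

Comparing output like this with what a test instance exports after a migration should show how far the new layout differs from the alert block in our current webui.dashboard.json.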