action #122845
closed
openQA Project (public) - coordination #109846: [epic] Ensure all our database tables accomodate enough data, e.g. bigint for ids
coordination #113674: [epic] Configure I/O alerts again for the webui after migrating to the "unified alerting" in grafana size:M
Migrate our Grafana setup to "unified alerting"
Added by livdywan over 2 years ago.
Updated about 2 years ago.
Description
Summary
With #112733 we got new I/O panels for the webui. Because these are repeating panels, we cannot add an alert for the I/O time with the alerting backend we currently use. This should be possible with unified alerting: https://grafana.com/blog/2021/06/14/the-new-unified-alerting-system-for-grafana-everything-you-need-to-know/
Acceptance criteria
- AC1: alerts work at least the way they did before
- AC2: the Grafana instance runs stable with unified alerting
Suggestions
- Find out how to migrate to the new system, automatically or manually
- Try out with an official test instance of Grafana available from their website
- Test with a container
- Confirm what we end up with, e.g. new JSON or a different layout
- Keep in mind this is the default for Grafana 10 and our current setup may not be supported long-term
- Blocks action #122842: Configure I/O alerts again for the webui after migrating to the "unified alerting" in grafana size:M added
- Blocks action #122848: Configure grouped alerts in Grafana correctly size:M added
- Target version set to Ready
- Assignee set to robert.richardson
- Status changed from Workable to In Progress
- Due date set to 2023-01-31
Setting due date based on mean cycle time of SUSE QE Tools
We talked about it in the unblock. Robert is trying to set up Grafana locally, since we don't want to just upgrade in production. We could have that setup on a staging machine, i.e. openqa-staging-1.qa.suse.de or openqa-staging-2.qa.suse.de, assuming the effort is basically the same and others can more easily check it and help out.
- Status changed from In Progress to Workable
- Assignee deleted (robert.richardson)
The Grafana instance on staging-1 is up. For the setup I copied the Grafana database /var/lib/grafana/grafana.db, the auth section of /etc/grafana/grafana.ini and the LDAP setup file /etc/grafana/ldap.toml.
I set '[unified_alerting] enabled = true' in the config file and restarted the grafana service. All alert rules were migrated successfully, and the respective contact points, notification policies and silences showed up in the UI.
You can check out the migrated alerts in the new layout/UI here:
http://openqa-staging-1.qa.suse.de:3000/alerting/list
(Login via NIS account)
I was also able to revert the migration without problems as described here: https://grafana.com/docs/grafana/latest/alerting/migrating-alerts/roll-back/
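As a rough illustration of the config toggle and its rollback, here is a minimal Python sketch. It works on an in-memory string rather than the real /etc/grafana/grafana.ini. The helper name set_unified_alerting is hypothetical. Flipping the legacy '[alerting] enabled' key the opposite way is an assumption based on the Grafana 9 documentation, not something stated in this ticket. Note that configparser drops comments when writing, so this is not meant as a drop-in editor for the production file.

```python
import configparser
import io


def set_unified_alerting(ini_text: str, enabled: bool) -> str:
    """Return ini_text with unified alerting switched on or off (hypothetical helper)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    for section in ("unified_alerting", "alerting"):
        if not parser.has_section(section):
            parser.add_section(section)
    # The setting mentioned in the ticket.
    parser.set("unified_alerting", "enabled", "true" if enabled else "false")
    # Assumption: legacy alerting is toggled the opposite way.
    parser.set("alerting", "enabled", "false" if enabled else "true")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Sample input standing in for a fragment of grafana.ini.
sample = "[auth.ldap]\nenabled = true\nconfig_file = /etc/grafana/ldap.toml\n"
migrated = set_unified_alerting(sample, enabled=True)
rolled_back = set_unified_alerting(migrated, enabled=False)
print(migrated)
```

After changing the file, the grafana service still has to be restarted for the setting to take effect, as described above.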
Setting this ticket back to Workable for now, as I'm in school next week.
- Due date deleted (2023-01-31)
Due date is not generally applicable to workable
When creating https://build.opensuse.org/request/show/1063633 I had to go through the lengthy list of changes to format them. I've noticed that there are several fixes and improvements regarding the migration of alerts to the new system. Maybe it is worthwhile to wait until the new version is deployed (currently my SR is even still pending) so we can benefit from these changes.
- Status changed from Workable to In Progress
- Assignee set to nicksinger
- Due date set to 2023-02-23
Setting due date based on mean cycle time of SUSE QE Tools
- Status changed from In Progress to Resolved
- Status changed from Resolved to Feedback
- Subject changed from Migrate our Grafana setup to "unified alerting" size:M to Migrate our Grafana setup to "unified alerting"
- Due date deleted (2023-02-23)
- Related to action #125303: prevent confusing "no data" alerts size:M added
- Priority changed from Normal to High
- Due date set to 2023-03-15
You may resolve this ticket as the migration is done. However, please create a follow-up for #120267#note-28 then (storing alert config in JSON).
- Status changed from Feedback to Resolved
- Due date deleted (2023-03-15)