action #135470
closed
Grafana: Average Ping time (ms) alert with unexpanded variable "${tag_url}", which machine is this about? size:M
Description
Observation
alertname Average Ping time (ms) alert
grafana_folder Salt
rule_uid Fm02cmf4z
The machine ${tag_url} was not pingable for several minutes.
See http://stats.openqa-monitor.qa.suse.de/alerting/grafana/Fm02cmf4z/view?orgId=1
Maybe the variable substitution for ${tag_url} was lost in the migration to Grafana 9.
Acceptance criteria
- AC1: It is understood what this alert means
- AC2: ${tag_url} is filled in
Suggestions
- Check whether InfluxDB tag variables now work differently with a newer Grafana version
- Look into the git log of changes made when migrating to unified alerting with Grafana 9 for "Average Ping time" and check whether something broke there
- Compare to other alert definitions to see if we do anything special here
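To compare alert definitions as suggested above, one option is to scan their message texts for placeholders of the `${tag_...}` form that may no longer be expanded. The following is only a minimal sketch: the function name `find_legacy_placeholders` and the regular expression are assumptions based on the single placeholder quoted in this ticket, not part of any Grafana tooling.

```python
import re

# Assumption: legacy placeholders look like ${tag_<name>}, as in the
# "${tag_url}" quoted in this ticket. Other forms are not covered.
LEGACY_TAG = re.compile(r"\$\{tag_([A-Za-z0-9_]+)\}")


def find_legacy_placeholders(text):
    """Return the tag names of all ${tag_<name>} placeholders found in text."""
    return LEGACY_TAG.findall(text)


# Message text taken from the alert in this ticket.
message = "The machine ${tag_url} was not pingable for several minutes."
print(find_legacy_placeholders(message))  # prints ['url']
```

Running this over the message of each alert rule would list the rules that still carry such placeholders and which tag each one refers to.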
Updated by livdywan about 1 year ago
- Copied from action #133130: Lots of alerts for a single cause. Can we group and de-duplicate? added
Updated by okurz about 1 year ago
- Subject changed from Grafana: Average Ping time (ms) alert to Grafana: Average Ping time (ms) alert with unexpanded variable "${tag_url}", which machine is this about? size:M
- Description updated (diff)
- Status changed from New to Workable
Updated by okurz about 1 year ago
- Target version changed from Ready to Tools - Next
Updated by openqa_review 12 months ago
- Due date set to 2023-12-13
Setting due date based on mean cycle time of SUSE QE Tools
Updated by mkittler 12 months ago
- Status changed from In Progress to Resolved
The MR has been merged and deployed (see https://stats.openqa-monitor.qa.suse.de/alerting/grafana/Fm02cmf4z/view). I'm not sure how to test this other than by trying with a similar non-provisioned alert, as I already did, so I'm resolving the issue for now.
Not sure what to do about the missing line breaks. That's a different issue, though.
I'm keeping my test alert https://stats.openqa-monitor.qa.suse.de/alerting/grafana/ab735516-b49e-4ce8-bee8-b2ebbdd1c6f5 in case further tinkering is required (it was not that easy to create). I paused the evaluation for the alert and put in a contact that doesn't match anything, so it shouldn't cause any problems.