action #135470

closed

Grafana: Average Ping time (ms) alert with unexpanded variable "${tag_url}", which machine is this about? size:M

Added by livdywan 8 months ago. Updated 5 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version:
Start date: 2023-07-20
Due date: 2023-12-13
% Done: 0%
Estimated time:

Description

Observation

alertname     Average Ping time (ms) alert
grafana_folder     Salt
rule_uid     Fm02cmf4z

The machine ${tag_url} was not pingable for several minutes.

See http://stats.openqa-monitor.qa.suse.de/alerting/grafana/Fm02cmf4z/view?orgId=1

Maybe the variable substitution for ${tag_url} was lost in the migration to Grafana 9.

Acceptance criteria

  • AC1: It is understood what this alert means
  • AC2: ${tag_url} is filled in

Suggestions

  • Check whether InfluxDB tag variables now work differently with a newer Grafana version
  • Look into the git log of changes made when migrating to unified alerting with Grafana 9 for "Average Ping time" and check whether something broke there
  • Compare to other alert definitions to see if we do anything special here
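
The comparison suggested above could be sketched as a small scan for legacy-style `${...}` placeholders in alert annotations. This is only a minimal sketch under assumptions: the rule layout (dicts with `title` and `annotations`), the sample data, and the helper names are hypothetical and not taken from the actual salt states. Grafana's unified alerting uses Go-template syntax such as `{{ $labels.url }}` in annotations; that `url` is the right label name here is also an assumption.

```python
import re

# Matches legacy-style placeholders such as ${tag_url}.
LEGACY_PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def find_legacy_placeholders(text):
    """Return the names of all ${...} placeholders in text, in order of appearance."""
    return LEGACY_PLACEHOLDER.findall(text)


def scan_annotations(rules):
    """Map each rule title to the legacy placeholders found in its annotations.

    Rules without any legacy placeholder are left out of the result.
    """
    hits = {}
    for rule in rules:
        found = []
        for value in rule.get("annotations", {}).values():
            found.extend(find_legacy_placeholders(value))
        if found:
            hits[rule["title"]] = found
    return hits


# Hypothetical sample data; real definitions would be loaded from the
# provisioned alert files instead.
sample_rules = [
    {
        "title": "Average Ping time (ms) alert",
        "annotations": {
            "message": "The machine ${tag_url} was not pingable for several minutes."
        },
    },
    {
        "title": "Hypothetical migrated alert",
        "annotations": {
            "message": "The machine {{ $labels.url }} was not pingable for several minutes."
        },
    },
]

if __name__ == "__main__":
    for title, names in scan_annotations(sample_rules).items():
        print(f"{title}: unexpanded placeholders {names}")
```

Any rule reported by such a scan would be a candidate for rewriting its message with the label-based template syntax.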

Related issues 1 (0 open, 1 closed)

Copied from openQA Infrastructure - action #133130: Lots of alerts for a single cause. Can we group and de-duplicate? | Resolved | nicksinger | 2023-07-20

Actions #1

Updated by livdywan 8 months ago

  • Copied from action #133130: Lots of alerts for a single cause. Can we group and de-duplicate? added
Actions #2

Updated by livdywan 8 months ago

  • Description updated (diff)
Actions #3

Updated by okurz 8 months ago

  • Subject changed from Grafana: Average Ping time (ms) alert to Grafana: Average Ping time (ms) alert with unexpanded variable "${tag_url}", which machine is this about? size:M
  • Description updated (diff)
  • Status changed from New to Workable
Actions #4

Updated by okurz 8 months ago

  • Target version changed from Ready to Tools - Next
Actions #5

Updated by okurz 6 months ago

  • Target version changed from Tools - Next to Ready
Actions #6

Updated by mkittler 5 months ago

  • Status changed from Workable to In Progress
  • Assignee set to mkittler
Actions #8

Updated by openqa_review 5 months ago

  • Due date set to 2023-12-13

Setting due date based on mean cycle time of SUSE QE Tools

Actions #9

Updated by mkittler 5 months ago

  • Status changed from In Progress to Resolved

The MR has been merged and deployed (see https://stats.openqa-monitor.qa.suse.de/alerting/grafana/Fm02cmf4z/view). I am not sure how to test this other than by trying with a similar non-provisioned alert, as I already did, so I am resolving the issue for now.

Not sure what to do about the missing line breaks. That's a different issue, though.

I am keeping my test alert https://stats.openqa-monitor.qa.suse.de/alerting/grafana/ab735516-b49e-4ce8-bee8-b2ebbdd1c6f5 in case further tinkering is required (because it was not that easy to create). I paused the evaluation for the alert and set a contact that doesn't match anything, so it shouldn't cause any problems.
