action #125303

closed

prevent confusing "no data" alerts size:M

Added by mkittler almost 2 years ago. Updated over 1 year ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Start date: 2023-03-02
Due date: 2023-04-07
% Done: 0%
Estimated time:
Tags:

Description

Observation

We used "no data" as the alert trigger for our "host up" alerts. This caused confusion after switching to the new unified alerting system in Grafana: we assumed that Telegraf was not providing any data, while in reality it was a valid alert.

Acceptance criteria

  • AC1: We don't rely on "no data" triggers for other purposes (e.g. "host up")

Suggestions

  • Wait for a Grafana 9.1 update so we can provision alerts from files
  • Change the "host up" alert to use "result_code" instead of "average_response_ms"
  • Cross-check whether we already have a solution for Telegraf not being able to push data to InfluxDB
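
To illustrate the second suggestion, here is a minimal Python sketch of the intended distinction, not the actual Grafana alert rule. It assumes that Telegraf's ping input reports "result_code" 0 on success and a non-zero value on failure, which should be verified against the Telegraf version in use. The helper name `classify_host` and the state names are hypothetical.

```python
from typing import Optional, Sequence

# Assumption: Telegraf's ping input reports result_code 0 on success and a
# non-zero value on failure; verify this against your Telegraf version.
PING_SUCCESS = 0


def classify_host(result_codes: Sequence[Optional[int]]) -> str:
    """Classify one evaluation window of ping samples (hypothetical helper)."""
    # Drop gaps so that a window containing only missing values counts as empty.
    samples = [code for code in result_codes if code is not None]
    if not samples:
        # Nothing arrived at all: a data pipeline problem, not a host verdict.
        return "no_data"
    if samples[-1] != PING_SUCCESS:
        # Data arrived and the latest ping failed: a genuine "host down" alert.
        return "host_down"
    return "host_up"
```

With an explicit failure value driving the alert, "no data" can be reserved for the case where Telegraf cannot deliver measurements, so the two situations are no longer confused.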

Related issues 2 (0 open, 2 closed)

Related to openQA Infrastructure (public) - action #122845: Migrate our Grafana setup to "unified alerting" (Resolved, nicksinger, 2023-01-09)

Blocked by openQA Infrastructure (public) - action #125642: Manage "unified alerting" via salt size:M (Resolved, mkittler, 2023-01-09)