action #99837

open

coordination #80142: [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containers on kubernetes

coordination #96263: [epic] Exclude certain Minion tasks from "Too many Minion job failures alert" alert

configurable exclusion rules for /influxdb/minion

Added by okurz about 3 years ago. Updated about 3 years ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: QA (public, currently private due to #173521) - future
Start date: 2021-10-06
Due date:
% Done: 0%
Estimated time:
Description

Motivation

Some Minion jobs fail occasionally, and as admins we are not interested in every type of Minion job failure. So far we exclude certain jobs in the OSD telegraf+grafana alerting, but we cannot do that for the openQA route /influxdb/minion. I therefore suggest adding configurable exclusion rules for the route /influxdb/minion so that it only reports failures that admins care about.

Acceptance criteria

  • AC1: Grafana panels monitoring Minion job failures do not include any failed Minion jobs that match a configured exclusion rule, e.g. "obs_rsync_run"
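
To illustrate the idea, here is a minimal sketch of how such exclusion rules could be applied before the failure count is reported. It is not the openQA implementation: the names EXCLUDED_TASKS, is_excluded and count_relevant_failures, the shell-style pattern matching and the sample job data are all assumptions made for illustration. Only the task name "obs_rsync_run" comes from this ticket.

```python
from fnmatch import fnmatchcase

# Hypothetical exclusion rules, e.g. as they might be read from a config file.
# Each entry is a shell-style pattern matched against the Minion task name.
EXCLUDED_TASKS = ["obs_rsync_run"]


def is_excluded(task, patterns):
    """Return True if the task name matches any exclusion pattern."""
    return any(fnmatchcase(task, pattern) for pattern in patterns)


def count_relevant_failures(failed_jobs, patterns):
    """Count failed jobs whose task is not covered by an exclusion rule."""
    return sum(1 for job in failed_jobs if not is_excluded(job["task"], patterns))


# Illustrative sample data; "example_task" is a made-up task name.
failed_jobs = [
    {"id": 1, "task": "obs_rsync_run"},
    {"id": 2, "task": "example_task"},
    {"id": 3, "task": "obs_rsync_run"},
]

print(count_relevant_failures(failed_jobs, EXCLUDED_TASKS))  # prints 1
```

With an empty rule list the count falls back to all failed jobs, so the current behaviour would stay the default unless exclusions are configured.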