action #99837

coordination #80142: [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containers on kubernetes

coordination #96263: [epic] Exclude certain Minion tasks from "Too many Minion job failures alert" alert

configurable exclusion rules for /influxdb/minion

Added by okurz about 2 months ago. Updated 3 days ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
Start date:
2021-10-06
Due date:
% Done:

0%

Estimated time:
Difficulty:

Description

Motivation

We have Minion jobs that sometimes fail, and as admins we are not always interested in every type of Minion job failure. So far we exclude certain jobs in the OSD telegraf+grafana alerting, but we cannot do that for the openQA route /influxdb/minion. So I suggest creating configurable exclusion rules for the route /influxdb/minion so that it only presents failures that admins care about.

Acceptance criteria

  • AC1: Grafana panels monitoring Minion job failures do not include any failed Minion jobs that match an exclusion rule, e.g. "obs_rsync_run"
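
One possible shape for such exclusion rules, as a rough sketch only: the route would drop failed jobs whose task name matches a configured pattern before counting them. openQA itself is written in Perl; Python is used here purely for illustration. The function names, the comma-separated setting format and the glob-style matching are all assumptions made for this example, not existing openQA code or configuration.

```python
from fnmatch import fnmatchcase


def parse_exclusion_rules(raw_value):
    """Split a hypothetical comma-separated config value into glob patterns."""
    return [part.strip() for part in raw_value.split(",") if part.strip()]


def is_excluded(task_name, patterns):
    """Return True if the task name matches any exclusion pattern."""
    return any(fnmatchcase(task_name, pattern) for pattern in patterns)


def count_reported_failures(failed_jobs, patterns):
    """Count failed jobs whose task is not covered by an exclusion rule.

    `failed_jobs` is assumed to be an iterable of dicts with a "task" key.
    """
    return sum(1 for job in failed_jobs if not is_excluded(job["task"], patterns))


if __name__ == "__main__":
    # Hypothetical setting value; the real option name and format are undecided.
    rules = parse_exclusion_rules("obs_rsync_run, obs_rsync_update_*")
    failed = [
        {"task": "obs_rsync_run"},
        {"task": "obs_rsync_update_builds_text"},
        {"task": "finalize_job_results"},
    ]
    # Only the failure that no rule excludes is counted.
    print(count_reported_failures(failed, rules))
```

With an empty rule list nothing is excluded, so the reported count would stay the same as today unless an admin configures a rule.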

History

#1 Updated by okurz about 2 months ago

  • Target version changed from future to Ready

#2 Updated by okurz about 2 months ago

  • Target version changed from Ready to future
