action #133397

closed

openQA Project (public) - coordination #110833: [saga][epic] Scale up: openQA can handle a schedule of 100k jobs with 1k worker instances

openQA Project (public) - coordination #108209: [epic] Reduce load on OSD

HTTP Response alert Salt alerting and autoresolving shortly size:M

Added by livdywan over 1 year ago. Updated over 1 year ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Start date: 2023-07-26
Due date:
% Done: 0%
Estimated time:

Description

Observation

From Grafana / osd-admins@suse.de

Values
B0=19.585438379 
Labels
alertname     HTTP Response alert
grafana_folder     Salt
rule_uid     tm0h5mf4k

see https://monitor.qa.suse.de/d/WebuiDb/webui-summary?viewPanel=78&orgId=1&from=1690139276867&to=1690449757191

Acceptance criteria

  • AC1: Overly strict HTTP response alerts are no longer observed

Steps to reproduce

  • Bump the sensitivity of the alert
  • Investigate what underlying problem, if any, exists

Suggestions

  • Do not stop at the conclusion that OSD is sometimes overloaded. We already know that, and it is exactly what our alerts need to account for.
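
One way to keep an alert from firing and autoresolving on a short spike is to require the threshold to be exceeded for several consecutive evaluations before alerting. The following is only a minimal sketch of that idea, not the actual Grafana rule: the function name, the threshold of 10.0, the window of 3 samples and the sample series are all hypothetical, and treating the reported B0 value as seconds is an assumption.

```python
from typing import Sequence


def alert_should_fire(samples: Sequence[float], threshold: float, min_consecutive: int) -> bool:
    """Return True only if the most recent `min_consecutive` samples all exceed `threshold`."""
    if min_consecutive <= 0:
        raise ValueError("min_consecutive must be positive")
    if len(samples) < min_consecutive:
        # Not enough data yet to call the condition sustained
        return False
    return all(value > threshold for value in samples[-min_consecutive:])


# Hypothetical response-time series; 19.59 echoes the B0 value from the alert mail
# (unit assumed to be seconds, which the ticket does not state)
spike = [0.8, 1.1, 19.59, 0.9, 1.0]
sustained = [0.9, 12.4, 15.2, 19.59, 18.7]

print(alert_should_fire(spike, threshold=10.0, min_consecutive=3))      # False
print(alert_should_fire(sustained, threshold=10.0, min_consecutive=3))  # True
```

With this kind of check a single outlier during a known load peak stays silent, while a sustained slowdown still raises the alert.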

Related issues 2 (0 open, 2 closed)

Related to openQA Infrastructure (public) - action #133325: osd http response alerts - bump threshold further up (Rejected, okurz, 2023-07-25)

Copied to openQA Infrastructure (public) - action #154426: HTTP Response alert Salt alerting and autoresolving shortly size:M (Resolved, jbaier_cz)
