action #181415
[osd][grafana] Add alert for inactive minion jobs
Status:
New
Priority:
Normal
Assignee:
-
Category:
Feature requests
Target version:
Start date:
2025-04-25
Due date:
% Done:
0%
Estimated time:
Description
Observation
As observed in #181400, a long queue of pending minion jobs can build up and lead to problematic situations that we do not notice immediately. We already have an alert for failed minion jobs. We should also have an alert for inactive minion jobs.
Acceptance criteria
- AC1: There is an alert, linked to https://monitor.qa.suse.de/d/WebuiDb/webui-summary?orgId=1&from=2025-04-24T10:39:57.865Z&to=2025-04-25T10:05:46.827Z&var-host_disks=$__all&refresh=15m&timezone=UTC&editPanel=19&tab=alert, that fires when there are too many inactive minion jobs
Suggestions
- Based on https://monitor.qa.suse.de/d/WebuiDb/webui-summary?orgId=1&from=2025-04-24T10:39:57.865Z&to=2025-04-25T10:05:46.827Z&var-host_disks=$__all&refresh=15m&timezone=UTC&editPanel=19&tab=alert and the existing alert for failed minion jobs, add an alert definition for inactive minion jobs with a sensible threshold, maybe 10k?
- Ensure that the alert is properly deployed and active, and that it is not firing under the current load
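
To make the intended condition concrete, here is a minimal sketch of the threshold check in Python. It is not the Grafana alert definition itself. The function name, the constant, and the strict "greater than" comparison are assumptions for illustration, and the 10k value is only the tentative threshold suggested above.

```python
# Hypothetical sketch of the alert condition; not the deployed Grafana rule.
# The threshold is the tentative "maybe 10k" from the suggestions.
INACTIVE_JOBS_THRESHOLD = 10_000


def inactive_jobs_alert(inactive_jobs: int,
                        threshold: int = INACTIVE_JOBS_THRESHOLD) -> bool:
    """Return True when the inactive minion job count exceeds the threshold."""
    return inactive_jobs > threshold


if __name__ == "__main__":
    # Sample counts chosen for illustration only.
    for count in (120, 10_000, 25_000):
        state = "alerting" if inactive_jobs_alert(count) else "ok"
        print(f"{count} inactive jobs: {state}")
```

With a strict comparison, a queue of exactly 10k jobs would not yet alert. Whether the real rule should use "greater than" or "greater than or equal" is left open here.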