action #95443
Variants of Job age (scheduled) alerts on Grafana on Sunday and Monday size:S
Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version:
Start date: 2021-07-13
Due date:
% Done: 0%
Estimated time:
Description
Observation
I observed several unhandled alerts on Grafana on Sunday and Monday.
[Alerting] Job age (scheduled) (max) alert
Jobs not scheduled for 4 days (345600s). Possible reasons:
* There are no online workers for selected scheduled jobs, misconfiguration on the side of tests likely
See https://progress.opensuse.org/issues/73174#note-2 for an explanation of the selection of the specific value
Metric name: 50% percentile (max)
Value: 501773.500
[Alerting] Job age (scheduled) (median) alert
Check for overall decrease of "time to start". Possible reasons for regression:
* Not enough resources
* Too many tests scheduled due to misconfiguration
2020-11-27: Alert limit set to 259200s = 3d, see https://progress.opensuse.org/issues/73174#note-2 about the decision
Related progress issue: https://progress.opensuse.org/issues/65975
Metric name: 50% percentile (median)
Value: 501113.500
[Alerting] Job age (scheduled) (max) alert
Jobs not scheduled for 4 days (345600s). Possible reasons:
* There are no online workers for selected scheduled jobs, misconfiguration on the side of tests likely
See https://progress.opensuse.org/issues/73174#note-2 for an explanation of the selection of the specific value
Metric name: 50% percentile (max)
Value: 954811.000
[No Data] Incomplete jobs (not restarted) of last 24h alert
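To put the reported values in context, here is a minimal sketch that converts them to days and compares them against the limits quoted in the alert descriptions. It assumes the metric values are in seconds, like the limits, and that the 345600s limit applies to the "max" alert and the 259200s limit to the "median" alert. The names `THRESHOLDS`, `OBSERVED`, `to_days` and `summarize` are made up for illustration and are not part of any Grafana or openQA API.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400

# Limits quoted in the alert descriptions (assumed to be in seconds)
THRESHOLDS = {"max": 345600, "median": 259200}

# (alert kind, reported value) in the order the alerts were observed
OBSERVED = [("max", 501773.5), ("median", 501113.5), ("max", 954811.0)]


def to_days(seconds):
    """Convert a duration in seconds to days."""
    return seconds / SECONDS_PER_DAY


def summarize(observed, thresholds):
    """Return one row per alert: value and limit in days, and whether the limit was exceeded."""
    rows = []
    for kind, value in observed:
        limit = thresholds[kind]
        rows.append({
            "kind": kind,
            "days": round(to_days(value), 2),
            "limit_days": round(to_days(limit), 2),
            "exceeded": value > limit,
        })
    return rows


if __name__ == "__main__":
    for row in summarize(OBSERVED, THRESHOLDS):
        print(row)
```

Under these assumptions, all three values exceed their limits: roughly 5.8 days against the 4 day and 3 day limits for the first two alerts, and about 11 days against the 4 day limit for the third.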
Acceptance criteria
- AC1: The cause of the alerts is clear or a follow-up ticket is filed with a feature request to have the necessary details next time
Suggestions