action #94237
No alert about too many scheduled tests size:S
Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version:
Start date: 2021-06-18
Due date: 2021-08-30
% Done: 0%
Estimated time:
Description
Observation
https://monitor.qa.suse.de/d/nRDab3Jiz/openqa-jobs-test?tab=alert&editPanel=9&viewPanel=9&orgId=1&from=1623621600000&to=1623967199000 shows that there were more than 3k scheduled tests for an extended period, but Grafana gives no indication that any alert messages were sent.
Acceptance criteria
- AC1: Alert messages are sent when scheduled jobs exceed a defined alert threshold
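As an illustration of AC1 only, here is a minimal sketch of a sustained-threshold check. It is not the Grafana alert rule from this ticket: the names `Sample`, `breach_start` and `should_alert` are hypothetical, and the threshold of 3000 and the durations in the usage note are assumptions loosely based on the "more than 3k" observation above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional


@dataclass(frozen=True)
class Sample:
    """One data point of the scheduled-jobs metric (hypothetical type)."""
    timestamp: datetime
    scheduled_jobs: int


def breach_start(samples: Iterable[Sample], threshold: int) -> Optional[datetime]:
    """Return when the most recent uninterrupted run above the threshold began.

    Returns None if the latest sample is not above the threshold.
    """
    start = None
    for sample in sorted(samples, key=lambda s: s.timestamp):
        if sample.scheduled_jobs > threshold:
            if start is None:
                start = sample.timestamp
        else:
            # Any sample at or below the threshold resets the run.
            start = None
    return start


def should_alert(samples: Iterable[Sample], threshold: int,
                 min_duration: timedelta) -> bool:
    """True if the metric has stayed above the threshold for min_duration."""
    samples = list(samples)
    if not samples:
        return False
    start = breach_start(samples, threshold)
    if start is None:
        return False
    latest = max(s.timestamp for s in samples)
    return latest - start >= min_duration
```

For example, with samples of 1000, 3500, 9000 and 8800 scheduled jobs taken ten minutes apart, `should_alert(samples, 3000, timedelta(minutes=20))` is true, while requiring 30 minutes above the threshold is not yet satisfied.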
Suggestions
- Investigate why there was a sudden surge to nearly 9k blocked jobs on 2021-06-14. Was that because openQABot was offline in the days before that? -> Yes, we assume that was the case.
- Investigate whether an alert message was sent but simply does not show up in Grafana -> according to http://mailman.suse.de/mlarch/SuSE/osd-admins/2021/osd-admins.2021.06/maillist.html no such email was received by osd-admins@suse.de
- Crosscheck system logs on the grafana instance
- Ensure that alert messages are sent
Updated by jbaier_cz over 3 years ago
The openQABot had been offline since June 10th without anyone noticing. It was re-enabled on Monday and all the missing jobs were scheduled during the afternoon.
Updated by okurz over 3 years ago
- Subject changed from No alert about too many scheduled tests to No alert about too many scheduled tests size:S
- Description updated (diff)
Updated by okurz over 3 years ago
- Due date set to 2021-08-30
- Status changed from Workable to Feedback
- Assignee set to okurz
- Target version changed from future to Ready
Just stumbled over the reason for the non-working alert. I have a fix ready: https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/553
Updated by okurz over 3 years ago
- Status changed from Feedback to Resolved
MR merged. The alert triggered as we are above the threshold for scheduled jobs. Continuing in #97043
Updated by okurz over 3 years ago
- Related to action #97043: job queue hitting new record 14k jobs added