action #94237
Updated by okurz almost 3 years ago
## Observation

https://monitor.qa.suse.de/d/nRDab3Jiz/openqa-jobs-test?tab=alert&editPanel=9&viewPanel=9&orgId=1&from=1623621600000&to=1623967199000 shows that more than 3k tests were scheduled for an extended period, but Grafana gives no indication that it sent any alert messages.

## Acceptance criteria

* **AC1:** Alert messages are sent when scheduled jobs exceed a defined alert threshold

## Suggestions

* Investigate why there was a sudden surge to nearly 9k blocked jobs on 2021-06-14. Was that because openQABot was offline in the days before that? -> yes, we assume that was the case
* Investigate whether an alert message was sent but simply does not show up in Grafana -> according to http://mailman.suse.de/mlarch/SuSE/osd-admins/2021/osd-admins.2021.06/maillist.html no such email was received by osd-admins@suse.de
* Crosscheck system logs on the Grafana instance
* Ensure that alert messages are sent
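
As a rough illustration of what AC1 expects, the condition can be expressed as "the number of scheduled jobs stays above a threshold for several consecutive samples". The sketch below is only an assumption-laden model of that condition: the threshold of 3000, the sample count, and the function name `should_alert` are hypothetical and are not taken from the actual Grafana alert rule, which may be configured differently.

```python
from typing import Sequence

# Hypothetical values for illustration only; the real threshold and
# evaluation window are whatever the Grafana alert rule defines.
ALERT_THRESHOLD = 3000
MIN_CONSECUTIVE_SAMPLES = 3


def should_alert(scheduled_counts: Sequence[int],
                 threshold: int = ALERT_THRESHOLD,
                 min_consecutive: int = MIN_CONSECUTIVE_SAMPLES) -> bool:
    """Return True if the most recent `min_consecutive` samples all exceed `threshold`."""
    if min_consecutive < 1:
        raise ValueError("min_consecutive must be at least 1")
    if len(scheduled_counts) < min_consecutive:
        # Not enough data yet to decide that the condition is sustained.
        return False
    recent = scheduled_counts[-min_consecutive:]
    return all(count > threshold for count in recent)
```

Requiring several consecutive samples avoids alerting on a single short spike, at the cost of a slightly delayed alert. A sustained backlog like the one in the observation would satisfy this condition, so an alert message would be expected.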