alert about too many failed minion jobs but https://openqa.suse.de/minion/jobs?state=failed shows none
I see the alert but https://openqa.suse.de/minion/jobs?state=failed shows 0
The graph shows the number of failed jobs jumping from a high value to 0 every minute.
Received an alert email notification on 2020-06-14 04:46: "Too many failed Minion jobs", Value Failed 26.797
- Status changed from New to Resolved
- Assignee set to okurz
Somehow we overlooked this. With the help of the team we went through the issue and found out why the alert and the Minion jobs page disagree: the workers also publish their Minion job status, and all of that data was intermingled in Grafana. Fixed in https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/315
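To illustrate the effect, here is a minimal sketch of how data points from several hosts, written into one series without a host filter, can look like a value jumping between a high number and 0. The `host` tag, the host names `worker1` and `webui`, and all sample values are hypothetical; this is not the actual Grafana query or the change from the merge request.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    host: str    # hypothetical tag naming the machine that reported the value
    minute: int  # time of the data point, in minutes
    failed: int  # number of failed Minion jobs reported by that host


def series(samples, host=None):
    """Return the failed-job values ordered by time, optionally for one host only."""
    picked = [s for s in samples if host is None or s.host == host]
    return [s.failed for s in sorted(picked, key=lambda s: s.minute)]


# Made-up data: a worker reports its own failed jobs, the web UI host reports none.
samples = [
    Sample("worker1", 0, 27),
    Sample("webui", 1, 0),
    Sample("worker1", 2, 27),
    Sample("webui", 3, 0),
]

mixed = series(samples)                      # no host filter: values intermingled
webui_only = series(samples, host="webui")   # restricted to the web UI host

print(mixed)
print(webui_only)
```

Without the filter the series alternates between the worker's count and 0, while the series restricted to the web UI host stays at 0, which matches what the Minion jobs page shows.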