action #68077

closed

alert about too many failed minion jobs but https://openqa.suse.de/minion/jobs?state=failed shows none

Added by okurz over 4 years ago. Updated over 4 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Start date: 2020-06-15
Due date:
% Done: 0%
Estimated time:
Description

Observation

I see the alert, but https://openqa.suse.de/minion/jobs?state=failed shows 0 failed jobs.
The panel at https://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?orgId=1&refresh=30s&fullscreen&panelId=19&from=now-1h&to=now shows the numbers jumping from a high value to 0 every minute.
I received an alert email notification on 2020-06-14 04:46: "Too many failed Minion jobs", Value Failed 26.797.

Actions #1

Updated by okurz over 4 years ago

  • Target version set to Ready
Actions #2

Updated by okurz over 4 years ago

  • Status changed from New to Resolved
  • Assignee set to okurz

Somehow we overlooked this. With the help of the team we looked into the issue and found that the discrepancy comes from the workers also publishing their Minion job status. All of that data was intermingled in Grafana, so the panel mixed worker values with the web UI values. Fixed in https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/315
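
To illustrate the effect, here is a minimal Python sketch of how samples from several hosts that land in one series can produce a graph jumping between a high value and 0, and how restricting the series to a single host avoids that. The host names, sample values, and threshold are hypothetical and chosen only for illustration; this is not the content of the merge request above, only a guess at the general idea of separating the data per host.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Sample:
    minute: int   # time of the measurement
    host: str     # which machine reported it (hypothetical names)
    failed: int   # number of failed Minion jobs reported by that host


# Hypothetical measurements: a worker reports failed jobs, the web UI reports none.
SAMPLES = [
    Sample(0, "worker1", 54),
    Sample(0, "webui", 0),
    Sample(1, "worker1", 53),
    Sample(1, "webui", 0),
]


def unfiltered(samples):
    """Return all values in arrival order, regardless of the reporting host."""
    return [s.failed for s in samples]


def for_host(samples, host):
    """Return only the values reported by the given host."""
    return [s.failed for s in samples if s.host == host]


def alert_fires(values, threshold):
    """Fire when the mean of the values exceeds the threshold (assumed rule)."""
    return mean(values) > threshold


if __name__ == "__main__":
    threshold = 20  # hypothetical alert threshold
    print("mixed series:", unfiltered(SAMPLES))
    print("webui series:", for_host(SAMPLES, "webui"))
    print("alert on mixed:", alert_fires(unfiltered(SAMPLES), threshold))
    print("alert on webui:", alert_fires(for_host(SAMPLES, "webui"), threshold))
```

With the mixed series the values alternate between the worker's count and the web UI's 0, so an averaged alert can fire even though the web UI itself has no failed jobs.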
