action #133889
Updated by tinita over 1 year ago
## Observation
```
Firing [stats.openqa-monitor.qa.suse.de]
web UI: Minion jobs failed hook alert
View alert [stats.openqa-monitor.qa.suse.de]
Values
A0=21
Labels
alertname
web UI: Minion jobs failed hook alert
grafana_folder
Salt
rule_uid
minion_jobs_failed_hook_alert
Annotations
message
Too many minion jobs with failed hook scripts.
```
https://stats.openqa-monitor.qa.suse.de/alerting/grafana/e06a9f3f-205f-4733-b63b-4a84dfea1535/view?orgId=1
https://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?viewPanel=201&orgId=1&from=1691201258115&to=1691238718573
According to the Grafana panel there were 21 failures between 08:00 and 08:10; however, I don't see matching entries in the gru journal (`journalctl -u openqa-gru --since "2023-08-05 00:00:00"`) around that time.
There are many more failures around 06:00.
Also, looking at other occurrences of failures (below the alert threshold), the times in Grafana don't match the journal either.
The journal timestamps are in CEST. Am I missing something?
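To rule out a timezone mix-up, the `from`/`to` values in the panel URL above (epoch milliseconds) can be converted to both UTC and CEST and compared with the journal timestamps. A minimal sketch in Python, assuming CEST can be modelled as a fixed UTC+2 offset; the helper name `epoch_ms_to_times` is made up for illustration:

```python
from datetime import datetime, timedelta, timezone

# Assumption: CEST is modelled as a fixed UTC+2 offset (no DST rules applied).
CEST = timezone(timedelta(hours=2), name="CEST")


def epoch_ms_to_times(epoch_ms):
    """Return (utc, cest) datetimes for an epoch-millisecond value.

    Hypothetical helper; the millisecond part is dropped on purpose.
    """
    utc = datetime.fromtimestamp(epoch_ms // 1000, tz=timezone.utc)
    return utc, utc.astimezone(CEST)


# `from` and `to` query parameters copied from the Grafana panel URL
for label, value in (("from", 1691201258115), ("to", 1691238718573)):
    utc, cest = epoch_ms_to_times(value)
    print(f"{label}: {utc:%Y-%m-%d %H:%M:%S} UTC = {cest:%Y-%m-%d %H:%M:%S} CEST")
```

This only shows which wall-clock window the panel link covers in each timezone; it does not by itself explain the mismatch. For the other side of the comparison, `journalctl` accepts `--utc` to print timestamps in UTC instead of local time.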