action #108743
openQA Project (public) - coordination #80142: [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containers on kubernetes
qa-power8-5-kvm minions alert is heart-broken
0%
Description
Observations
worker-dashboard-qa-power8-5-kvm shows a broken heart for Minion Jobs.
Rollback steps
- Un-pause alert
Updated by okurz almost 3 years ago
- Copied from action #108740: qa-power8-5-kvm minions alert is heart-broken 💔️ added
Updated by nicksinger almost 3 years ago
- Status changed from New to In Progress
- Assignee set to nicksinger
Apparently we accumulated 107 failed minion jobs. According to the minion dashboard, most of them were older than 1 year. I cleaned them up now as they are too old to act on anyway. We now have 7 failed jobs left, occurring every other day, each failing to lock the database. I remember there was some work done on this, but it might be alright as it is.
Updated by nicksinger almost 3 years ago
- Status changed from In Progress to Resolved
I activated the alert again. The dashboard is no longer heart-broken: https://stats.openqa-monitor.qa.suse.de/d/WDQA-Power8-5-kvm/worker-dashboard-qa-power8-5-kvm?viewPanel=65104&orgId=1&refresh=1m