coordination #112718
Updated by okurz almost 3 years ago
# Observation
Over the weekend we received many alerts about failed minion jobs, among other things. According to Grafana, the problem started on Saturday, 18th of June 2022, around 13:00 CET: https://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?orgId=1&from=1655549105000&to=now
The number of rows returned by PostgreSQL looks very suspicious and is now five times as high as before: https://stats.openqa-monitor.qa.suse.de/d/WebuiDb/webui-summary?orgId=1&from=1655475539000&to=now&viewPanel=89
## Cleanup tasks
*
## Rollback and cleanup steps
* on osd `systemctl enable --now telegraf`
* on osd `systemctl enable --now openqa-scheduler`
* Retrigger all incomplete jobs since 2022-06-18 with https://github.com/os-autoinst/scripts/blob/master/openqa-advanced-retrigger-jobs
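
The two service steps above could be scripted roughly as follows. This is only a sketch: the `run` helper and the `DRY_RUN` variable are hypothetical names introduced here, not part of any openQA tooling, and the script defaults to printing the commands rather than executing them. The retrigger step is left out on purpose, because the invocation of `openqa-advanced-retrigger-jobs` should be taken from the script itself.

```shell
#!/bin/sh
# Sketch only: re-enable the services named in the rollback steps on osd.
# DRY_RUN and run() are hypothetical helpers, not part of openQA.

# Print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Units taken from the rollback steps in this ticket.
for unit in telegraf openqa-scheduler; do
    run systemctl enable --now "$unit"
done
```

Running it with `DRY_RUN=0` as root on osd would execute the commands for real; afterwards `systemctl is-active telegraf openqa-scheduler` can be used to confirm both units are running.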