action #94399
Updated by okurz almost 3 years ago
## Observation

On 2021-06-22, none of the arm workers (arm-1, arm-2, arm-3) could be reached via `ssh` or `ping`, yet https://stats.openqa-monitor.qa.suse.de/d/4KkGdvvZk/osd-status-overview?orgId=1 still showed all of them as `Online`.

## Acceptance criteria

* ~~**AC1:** We receive an alerting e-mail when arm workers are down~~
* **AC2:** https://stats.openqa-monitor.qa.suse.de/d/4KkGdvvZk/osd-status-overview?orgId=1 shows the correct state
* **AC3:** We receive alert notices for errors in telegraf on osd

## Suggestions

* We should look into feeding something into influxdb when the telegraf service, especially on OSD, shows errors, or into logging the error monitoring state.
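
One way the suggestion could be approached is sketched below: count error-level lines in telegraf's log output and format the count as an InfluxDB line-protocol point that an alert could then be built on. This is only an illustration under assumptions, not an existing implementation: the measurement name `telegraf_log_errors`, the `host` tag, and the sample log lines are made up, and it assumes telegraf marks error-level lines with `E!` and that host names contain no spaces or commas (which would need escaping in line protocol).

```python
import re

# Assumption: telegraf marks error-level log lines with "E!" after the timestamp.
ERROR_MARKER = re.compile(r"\sE!\s")


def count_error_lines(log_lines):
    """Return how many of the given log lines are error-level."""
    return sum(1 for line in log_lines if ERROR_MARKER.search(line))


def to_line_protocol(host, error_count, measurement="telegraf_log_errors"):
    """Format the count as one InfluxDB line-protocol point (no timestamp).

    The measurement name is hypothetical; the host is assumed to need no escaping.
    """
    return f"{measurement},host={host} count={error_count}i"


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = [
        "2021-06-22T10:00:00Z I! [agent] Starting telegraf",
        "2021-06-22T10:00:10Z E! [inputs.ping] host arm-1: timeout",
        "2021-06-22T10:00:20Z E! [inputs.ping] host arm-2: timeout",
    ]
    print(to_line_protocol("osd", count_error_lines(sample)))
    # prints: telegraf_log_errors,host=osd count=2i
```

How the log lines are collected (for example from the journal) and how the point is written to influxdb are left open here, since the ticket does not specify them.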