action #128561
Updated by okurz over 1 year ago
## Observation
In https://suse.slack.com/archives/C02CANHLANP/p1683041198918179 DimStar noted:
> (Dominique Leuenberger) @Oliver Kurz Hi; seems jenkins is no longer scheduling QA runs for GNOME:Next - https://openqa.opensuse.org/group_overview/35 lists last run 3 days ago
> (Fabian Vogt) Time to migrate to obs_rsync? Probably. The jenkins host is unreachable. Either down or overloaded.
There is a monitoring panel "host up" with a corresponding alert. Looking at
https://monitor.qa.suse.de/d/GDjenkins/dashboard-for-jenkins?orgId=1&viewPanel=65105&from=1682650896818&to=1683189154032&editPanel=65105
one can see that there was a long window with no response but no alert. Likely we went a bit too far in ignoring all "no data" conditions. The panel should not look at the response time but at the "result_code" of the ping, which always has a valid value and can be checked for host responses.
## Acceptance criteria
* **AC1:** There is an alert when the machine is down
* **AC2:** There is no alert for usual planned reboots
## Suggestions
* Change "host up" to look at "result_code" and pick a sensible alert linked to that
* Check whether a ticket for the same problem already exists
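
The first suggestion, together with AC1 and AC2, could be sketched roughly as follows. This is only an illustration of the intended alert logic in Python, not the actual Grafana alert rule. It assumes that "result_code" is 0 when the host answered the ping and non-zero otherwise, and that samples arrive as time-ordered `(timestamp_seconds, result_code)` pairs. The function names and the 15-minute grace period are hypothetical and would need to be tuned so that usual planned reboots stay below the threshold.

```python
from typing import Iterable, Tuple

Sample = Tuple[float, int]  # (timestamp in seconds, ping result_code)


def longest_outage_seconds(samples: Iterable[Sample]) -> float:
    """Return the longest continuous span with a non-zero result_code.

    Assumption: result_code == 0 means the host responded; any other
    value means it did not. Samples must be sorted by timestamp.
    """
    longest = 0.0
    outage_start = None
    for timestamp, result_code in samples:
        if result_code != 0:
            if outage_start is None:
                outage_start = timestamp  # first failed ping of this outage
            longest = max(longest, timestamp - outage_start)
        else:
            outage_start = None  # host responded, outage is over
    return longest


def should_alert(samples: Iterable[Sample], grace_seconds: float = 900) -> bool:
    """Alert only if the host was down for at least grace_seconds.

    The grace period (hypothetical default: 15 minutes) is meant to
    suppress alerts for short, planned reboots (AC2) while still
    catching real outages (AC1).
    """
    return longest_outage_seconds(samples) >= grace_seconds


if __name__ == "__main__":
    planned_reboot = [(0, 0), (60, 2), (120, 2), (180, 2), (240, 0)]
    long_outage = [(0, 0), (60, 2), (960, 2), (3600, 2)]
    print(should_alert(planned_reboot))
    print(should_alert(long_outage))
```

Because the check uses the result code instead of the response time, a host that does not respond still yields a valid value, so the alert does not depend on how "no data" conditions are handled.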