Better documentation on jenkins.qa.suse.de alerts and recovery
It seems the "packet loss" alert is not very clear, and when there are many alerts at once it may not be obvious how to address them.
- AC1: The alert is understood by the team
- AC2: There is documentation on how to recover Jenkins when it is down
- Write some documentation, or dig up existing docs
- Consider a little mob session on alert handling and recovery of machines
- Look at https://stats.openqa-monitor.qa.suse.de/d/EML0bpuGk/monitoring?orgId=1