action #109253

Updated by mkittler over 2 years ago

## Motivation 
 During our investigation of #108845 we found that we were the ones who pointed EngInfra to network problems that were also affecting other teams and components within the SUSE Nue server rooms, yet nobody else had noticed them. 

 ## Acceptance criteria 
 * **AC1**: Alerting is defined for common SLE OSD test requirements regarding the network 

 ## Suggestions 
 * Add monitoring, e.g. ping checks in telegraf from each openQA worker (or monitor.qa as source) to qanet.qa, dist.suse.de, download.opensuse.org, scc.suse.com, proxy.scc.suse.de 
 * Optional: Ping between switches (see https://gitlab.suse.de/nicksinger/network-scripts/-/blob/main/find_mac.py for an example of how to execute commands on switches directly) 
 * Optional: Add more HTTP response checks
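
The ping and HTTP checks suggested above could be sketched as a telegraf configuration fragment along the following lines. This is only a rough starting point, not a tested configuration: the ping targets are the hosts listed above, while the probe count, the timeouts, the choice of the `native` ping method and the HTTPS URLs used for the HTTP checks are assumptions that would need to be adapted to our setup.

```toml
# Sketch only: ICMP reachability checks from a worker (or monitor.qa) to the
# hosts named in the suggestions. Count and timeout values are assumptions.
[[inputs.ping]]
  urls = [
    "qanet.qa",
    "dist.suse.de",
    "download.opensuse.org",
    "scc.suse.com",
    "proxy.scc.suse.de",
  ]
  # "native" uses telegraf's built-in pinger instead of the system ping binary
  method = "native"
  # number of echo requests sent per target in each collection interval
  count = 3
  # seconds to wait for each reply
  timeout = 2.0

# Sketch only: HTTP response checks. Which URLs to probe is an assumption;
# the ticket does not name specific HTTP endpoints.
[[inputs.http_response]]
  urls = [
    "https://download.opensuse.org",
    "https://scc.suse.com",
  ]
  method = "GET"
  response_timeout = "10s"
  follow_redirects = true
```

Alerting for AC1 could then be defined on the resulting `ping` metrics (e.g. `percent_packet_loss`) and `http_response` metrics (e.g. `result_code`), with thresholds still to be agreed on.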