action #138005

Updated by okurz about 1 year ago

## Observation 
 https://monitor.qa.suse.de/d/EML0bpuGk/monitoring?orgId=1&viewPanel=4 should show the ping results for packet loss to "other hosts", i.e. hosts that are not under our salt control, such as dist.suse.de. However, the panel shows a very long, slowly rendering list that includes multiple redundant entries already covered by other panels, e.g. "worker40 - openqa.suse.de" and "openqa - tumblesle.qe.nue2.suse.org", all of which are salt-controlled. 

 ## Acceptance criteria 
 * **AC1:** The panel neither shows nor alerts on packet loss to salt-controlled machines 
 * **AC2:** The panel still shows and alerts on packet loss to any "other host", i.e. any external host 


 ## Suggestions 
 * Look into references to "inputs.ping" in https://gitlab.suse.de/openqa/salt-states-openqa , in particular in monitoring/telegraf/telegraf-worker.conf . Maybe we should use a special "tag", as in 6c3e70e and bce9156 for #137522, to distinguish the generic ping from this ping to "external_hosts" 
 * Consider changing how data is pushed by telegraf into influxdb 
 * Consider changing the monitoring query in https://monitor.qa.suse.de/d/EML0bpuGk/monitoring?orgId=1&viewPanel=4&editPanel=4 
 * Consider adapting the corresponding alert as well 
 * Optional: Consider extending the panel to include data from more than just the openQA workers 
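
The tag idea from the first suggestion could look roughly like the following telegraf fragment. This is only a sketch under assumptions: the tag key `target_group` and its value are hypothetical names chosen for illustration, `dist.suse.de` is just the example host from the observation, and the real monitoring/telegraf/telegraf-worker.conf may generate its host list differently (e.g. from the pillar data).

```toml
# Separate ping input only for hosts outside of salt control.
[[inputs.ping]]
  # Assumption: in practice this list would come from
  # "required_external_networks" in the worker configuration.
  urls = ["dist.suse.de"]
  count = 1

  # Hypothetical tag to tell this ping apart from the generic one.
  # The tags sub-table has to come last in the plugin definition.
  [inputs.ping.tags]
    target_group = "external_hosts"
```

With such a tag in place, the panel query and the alert could filter on it (for example with a `WHERE` condition on the tag) instead of listing every ping target, which would likely also shorten the rendered list.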

 ## Further details 
 * "other host" means any entry of "host" in https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls?ref_type=heads#L15 in "required_external_networks"
