action #107437
Updated by okurz almost 3 years ago
## Observation
Since the QA labs move I have been receiving multiple "no data" alert emails that resolve themselves shortly afterwards. At first I suspected our maintenance work, e.g. when we were actually changing the cabling. By now I think there is another recurring problem, because I doubt that every time I saw the alert someone was working on the network, the switches or the configuration.
## Suggestions
* Crosscheck network bandwidth between different machines in different locations to find out whether monitor.qa.suse.de can receive data with sufficient bandwidth
* Crosscheck monitoring data from the switches for anything excessive
* Check the logs on monitor.qa for reported problems with receiving data, e.g. in influxdb
* Check the logs on osd or the workers for problems telegraf has writing to monitor.qa and influxdb
`journalctl -u telegraf` on osd lists:
```
Feb 24 11:45:15 openqa telegraf[13914]: 2022-02-24T10:45:15Z E! [outputs.influxdb] when writing to [http://openqa-monitor.qa.suse.de:8086]: Post "http://openqa-monitor.qa.suse.de:8086/write?db=telegraf": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Feb 24 11:45:15 openqa telegraf[13914]: 2022-02-24T10:45:15Z E! [agent] Error writing to outputs.influxdb: could not write any address
Feb 24 11:45:20 openqa telegraf[13914]: 2022-02-24T10:45:20Z W! [outputs.influxdb] Metric buffer overflow; 259 metrics have been dropped
Feb 24 11:45:25 openqa telegraf[13914]: 2022-02-24T10:45:25Z E! [outputs.influxdb] when writing to [http://openqa-monitor.qa.suse.de:8086]: Post "http://openqa-monitor.qa.suse.de:8086/write?db=telegraf": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Feb 24 11:45:25 openqa telegraf[13914]: 2022-02-24T10:45:25Z E! [agent] Error writing to outputs.influxdb: could not write any address
Feb 24 11:45:25 openqa telegraf[13914]: 2022-02-24T10:45:25Z W! [outputs.influxdb] Metric buffer overflow; 123 metrics have been dropped
Feb 24 11:45:30 openqa telegraf[13914]: 2022-02-24T10:45:30Z E! [outputs.influxdb] when writing to [http://openqa-monitor.qa.suse.de:8086]: Post "http://openqa-monitor.qa.suse.de:8086/write?db=telegraf": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Feb 24 11:45:30 openqa telegraf[13914]: 2022-02-24T10:45:30Z E! [agent] Error writing to outputs.influxdb: could not write any address
```
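
To see whether these write failures line up with the "no data" alerts, the journal output could be condensed per minute. The following is only a minimal sketch, assuming the message formats match the excerpt above; the function name `summarize` and the per-minute grouping are my own choices, not part of telegraf.

```python
import re
from collections import Counter

# Patterns are taken from the telegraf messages in the journal excerpt above.
DROPPED_RE = re.compile(r"Metric buffer overflow; (\d+) metrics have been dropped")
TIMEOUT_RE = re.compile(r"\[outputs\.influxdb\] when writing to .*context deadline exceeded")


def summarize(lines):
    """Count influxdb write timeouts and sum dropped metrics per minute.

    Hypothetical helper: expects journalctl lines that start with a
    timestamp such as "Feb 24 11:45:15" (local journal time).
    """
    timeouts = Counter()
    dropped = Counter()
    for line in lines:
        # The first 12 characters are "Mon DD HH:MM", used as the bucket key.
        minute = line[:12]
        if TIMEOUT_RE.search(line):
            timeouts[minute] += 1
        match = DROPPED_RE.search(line)
        if match:
            dropped[minute] += int(match.group(1))
    return timeouts, dropped
```

Fed with the saved output of `journalctl -u telegraf`, this returns two counters keyed by minute, which can then be compared against the timestamps of the alert emails.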