action #133385

Updated by okurz 10 months ago

## Observation 
 This is about o3, so zabbix, not grafana! 

 From zabbix@suse.de 

 ``` 
 Problem started at 11:05:17 on 2023.07.26 
 Problem name: Interface tun5: Link down 
 Host: ariel.suse-dmz.opensuse.org 
 Severity: Average 
 Operational data: Current state: down (2) 
 Original problem ID: 510998085 
 ``` 

 followed by 

 ``` 
 Problem has been resolved at 11:12:17 on 2023.07.26 
 Problem name: Interface tun5: Link down 
 Problem duration: 7m 0s 
 Host: ariel.suse-dmz.opensuse.org 
 Severity: Average 
 Original problem ID: 510998085 
 ``` 

 likely during the time when o3 was rebooting as planned 

 ## Acceptance criteria 
 * **AC1:** No more alerts for tun5 are observed if o3 or the tunnel is just down for some minutes 

 ## Steps to reproduce 
 * Temporarily shut down the autossh-old-ariel.service on new-ariel or try to trigger with reboots of o3 

 ## Suggestions 
 * Log in at https://zabbix.nue.suse.com/ and play around to find your way around. If in doubt, ask jbaier_cz 
 * Just bump the sensitivity of the alert or delay the actual notification 
 * Try the effect with multiple reboots of o3 
 * Investigate what, if any, underlying problem there is
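
 To illustrate the "delay the actual notification" idea, here is a minimal Python sketch of the intended behavior. It is not Zabbix configuration. The function name `outages_to_notify` and the 10-minute grace period are assumptions made for illustration only; the timestamps are taken from the alert mails quoted above.

 ```python
from datetime import datetime, timedelta


def outages_to_notify(events, grace):
    """Return (start, duration) for each outage that lasted longer than `grace`.

    `events` is a chronologically ordered list of (timestamp, state) tuples,
    where state is "down" or "up". An outage that is still open at the end
    of the list is ignored in this sketch.
    """
    notify = []
    down_since = None
    for timestamp, state in events:
        if state == "down" and down_since is None:
            # Link just went down: remember when the outage started.
            down_since = timestamp
        elif state == "up" and down_since is not None:
            # Link recovered: only report outages longer than the grace period.
            duration = timestamp - down_since
            if duration > grace:
                notify.append((down_since, duration))
            down_since = None
    return notify


# The tun5 outage from the observation: down at 11:05:17, resolved at 11:12:17.
events = [
    (datetime(2023, 7, 26, 11, 5, 17), "down"),
    (datetime(2023, 7, 26, 11, 12, 17), "up"),
]

# Hypothetical 10-minute grace period: the 7-minute outage is suppressed.
print(outages_to_notify(events, timedelta(minutes=10)))  # []
 ```

 With a grace period shorter than 7 minutes the same outage would still be reported, so whatever delay is chosen in Zabbix needs to cover a planned o3 reboot to satisfy AC1.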
