action #104350

Updated by okurz almost 3 years ago

## Observation 

 From `journalctl -b -u os-autoinst-openvswitch.service` on grenache-1: 

 ``` 
 -- Logs begin at Mon 2021-12-20 21:50:34 CET, end at Sun 2021-12-26 14:26:34 CET. -- 
 Dec 26 03:34:34 grenache-1 systemd[1]: Starting os-autoinst openvswitch helper... 
 Dec 26 03:35:34 grenache-1 wicked[3515]: device br1: unable to apply configuration to nanny 
 Dec 26 03:36:04 grenache-1 systemd[1]: os-autoinst-openvswitch.service: start-pre operation timed out. Terminating. 
 Dec 26 03:36:04 grenache-1 systemd[1]: os-autoinst-openvswitch.service: Control process exited, code=killed, status=15/TERM 
 Dec 26 03:36:04 grenache-1 systemd[1]: os-autoinst-openvswitch.service: Failed with result 'timeout'. 
 Dec 26 03:36:04 grenache-1 systemd[1]: Failed to start os-autoinst openvswitch helper. 
 Dec 26 04:38:08 grenache-1 systemd[1]: Starting os-autoinst openvswitch helper... 
 Dec 26 04:38:08 grenache-1 systemd[1]: Started os-autoinst openvswitch helper. 
 ``` 

 This briefly triggered an alert, as visible on https://stats.openqa-monitor.qa.suse.de/d/KToPYLEWz/failed-systemd-services?orgId=1&from=1640478716612&to=1640495459276, but the failure resolved itself roughly one hour later when the service started successfully.
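
 As a cross-check of the timing, the gap between "Starting" and "timed out" can be computed from the log timestamps. The following is a minimal sketch, assuming GNU `date`; `elapsed_seconds` is a hypothetical helper and the year is taken from the "Logs begin" header line.

```shell
# Sketch: seconds elapsed between two journal timestamps (assumes GNU date).
# elapsed_seconds is a hypothetical helper, not part of os-autoinst.
elapsed_seconds() {
    # TZ=UTC keeps the arithmetic independent of local DST rules.
    start=$(TZ=UTC date -d "$1" +%s) || return 1
    end=$(TZ=UTC date -d "$2" +%s) || return 1
    echo $((end - start))
}

# "Starting ..." and "start-pre operation timed out" from the log above.
elapsed_seconds "2021-12-26 03:34:34" "2021-12-26 03:36:04"   # prints 90
```

 The 90s result matches systemd's default start timeout, though whether that default is what actually applied here has not been confirmed.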

 ## Acceptance criteria 
 * **AC1:** The code flow in https://github.com/os-autoinst/os-autoinst/blob/master/os-autoinst-openvswitch and the corresponding systemd service have been reviewed at least once

 ## Suggestions 
 * Investigate why os-autoinst-openvswitch.service times out after about 90s (03:34:34 to 03:36:04 in the log above), when the config file in salt-states says 300s and https://github.com/os-autoinst/os-autoinst/blob/master/os-autoinst-openvswitch#L30 reads like there should be an indefinite waiting time
 * Check whether one of the related systemd units has a retry configured. If not, add one or extend the timeout

 * On grenache-1.qa there is already this override:

 ``` 
 # /etc/systemd/system/os-autoinst-openvswitch.service.d/override.conf 
 [Service] 
 ExecStartPre=/bin/sh -c 'command -v wicked >/dev/null && wicked ifstatus br1 | grep -q up || wicked ifup br1' 
 ``` 

 It is unclear whether this override is manually maintained or where it comes from.
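
 If a retry turns out to be missing, one option is a bounded retry around the bridge setup command. The following is only a sketch of that idea: `retry` is a hypothetical helper, and the attempt count and delay are illustrative values, not taken from salt-states or os-autoinst.

```shell
# Sketch: retry a command a bounded number of times with a pause in between.
# retry is a hypothetical helper; usage: retry ATTEMPTS DELAY_SECONDS COMMAND...
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    # Run the command until it succeeds or the attempt budget is used up.
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# Hypothetical usage, not executed here:
#   retry 5 10 wicked ifup br1
```

 Alternatively, systemd's own `Restart=` and `TimeoutStartSec=` settings could cover this at the unit level, which would avoid extra shell logic in `ExecStartPre`.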
