action #104350
closed
[alert] failed systemd service on grenache-1, os-autoinst-openvswitch, turned to "ok" automatically size:M
Description
Observation
From journalctl -b -u os-autoinst-openvswitch.service on grenache-1:
-- Logs begin at Mon 2021-12-20 21:50:34 CET, end at Sun 2021-12-26 14:26:34 CET. --
Dec 26 03:34:34 grenache-1 systemd[1]: Starting os-autoinst openvswitch helper...
Dec 26 03:35:34 grenache-1 wicked[3515]: device br1: unable to apply configuration to nanny
Dec 26 03:36:04 grenache-1 systemd[1]: os-autoinst-openvswitch.service: start-pre operation timed out. Terminating.
Dec 26 03:36:04 grenache-1 systemd[1]: os-autoinst-openvswitch.service: Control process exited, code=killed, status=15/TERM
Dec 26 03:36:04 grenache-1 systemd[1]: os-autoinst-openvswitch.service: Failed with result 'timeout'.
Dec 26 03:36:04 grenache-1 systemd[1]: Failed to start os-autoinst openvswitch helper.
Dec 26 04:38:08 grenache-1 systemd[1]: Starting os-autoinst openvswitch helper...
Dec 26 04:38:08 grenache-1 systemd[1]: Started os-autoinst openvswitch helper.
This triggered an alert, as visible on https://stats.openqa-monitor.qa.suse.de/d/KToPYLEWz/failed-systemd-services?orgId=1&from=1640478716612&to=1640495459276, but the alert resolved itself roughly one hour later, when the service started successfully at 04:38:08.
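To make the timing in the journal excerpt explicit, here is a minimal sketch that computes the interval between the "Starting" line and the "start-pre operation timed out" line. The helper name `elapsed_seconds` is hypothetical, and the sketch assumes GNU `date` (for the `-d` option).

```shell
#!/bin/sh
# Print the number of seconds between two timestamps.
# Assumes GNU date (the -d option); timestamps are parsed as UTC so that
# daylight-saving changes cannot skew the difference.
elapsed_seconds() {
    start=$(date -u -d "$1" +%s) || return 1
    end=$(date -u -d "$2" +%s) || return 1
    echo $((end - start))
}

# "Starting ..." and "start-pre operation timed out" lines from the journal excerpt
elapsed_seconds "2021-12-26 03:34:34" "2021-12-26 03:36:04"
```

For these two timestamps the result is 90 seconds. That would be consistent with systemd's usual default start timeout of 90s rather than a 300s setting, but this is an inference from the log and has not been verified on the host.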
Acceptance criteria
- AC1: At least the code flow in https://github.com/os-autoinst/os-autoinst/blob/master/os-autoinst-openvswitch and the corresponding systemd service has been reviewed once
Suggestions
- Investigate why os-autoinst-openvswitch.service times out after roughly 90s (03:34:34 to 03:36:04 in the log above), although the config file in salt-states says 300s and https://github.com/os-autoinst/os-autoinst/blob/master/os-autoinst-openvswitch#L30 reads as if it should wait indefinitely
- Check whether one of the related systemd units has a retry configured. If not, add one or extend the timeout
On grenache-1.qa there is already an override:
# /etc/systemd/system/os-autoinst-openvswitch.service.d/override.conf
[Service]
ExecStartPre=/bin/sh -c 'command -v wicked >/dev/null && wicked ifstatus br1 | grep -q up || wicked ifup br1'
It is unclear whether this file is maintained manually or where it comes from.
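As an illustration of the "add a retry or extend the timeout" suggestion, a drop-in along the following lines might work. This is only a sketch: the file name `timeout.conf` is hypothetical, the 300s value mirrors what the salt-states config is said to specify, and the restart settings are assumptions that have not been tested on grenache-1.

```ini
# /etc/systemd/system/os-autoinst-openvswitch.service.d/timeout.conf
# (hypothetical file name, not part of the existing setup)
[Service]
# Give the start job, including ExecStartPre, up to 300s before systemd
# terminates it (assumed value, taken from the salt-states setting).
TimeoutStartSec=300
# Retry automatically when the service fails, which includes start timeouts.
Restart=on-failure
# Wait between retries so that br1 has time to come up (assumed value).
RestartSec=30
```

After adding such a file, `systemctl daemon-reload` is needed for it to take effect, and `systemctl cat os-autoinst-openvswitch.service` shows the merged unit including all drop-ins.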