action #62567
openQA services can fail when network is not up (yet): "Can't create listen socket: Address family for hostname not supported"
Status: closed
Description
Observation
On a system where the network setup is not instantaneous, e.g. NetworkManager+DHCP, openQA systemd services that are enabled to start up automatically can fail with errors like:
Jan 22 21:42:29 falafel openqa-scheduler[1282]: Can't create listen socket: Address family for hostname not supported at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop.pm line 124.
Jan 22 21:42:29 falafel openqa-websockets[1283]: Can't create listen socket: Address family for hostname not supported at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop.pm line 124.
Jan 22 21:42:31 falafel openqa-livehandler[1248]: Can't create listen socket: Address family for hostname not supported at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop.pm line 124.
Jan 22 21:42:32 falafel.suse.cz openqa[1284]: Can't create listen socket: Address family for hostname not supported at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/IOLoop.pm line 124.
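To check which units were affected on a given boot, journal lines in the format above can be filtered for the error message. The following is a minimal sketch; the function name `units_with_listen_error` is hypothetical and introduced only for illustration, and it assumes the default journal line layout `<timestamp> <host> <unit>[<pid>]: <message>` seen in the excerpt.

```shell
# Sketch: read journal-style lines on stdin and print the name of every
# unit that logged the listen-socket error, one per line, without duplicates.
units_with_listen_error() {
    # Keep only lines containing the error text (fixed-string match).
    grep -F "Can't create listen socket" |
        # Extract the identifier that directly precedes "[pid]: ".
        sed -E 's/^.* ([A-Za-z0-9_-]+)\[[0-9]+\]: .*$/\1/' |
        sort -u
}

# Possible usage on the affected host (current boot only):
#   journalctl -b --no-pager | units_with_listen_error
```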
Reproducible
I think the issue is reproducible on any system; it is just more likely to be observed with slow DHCP. It can also be reproduced differently, e.g. on a system without any network.
Problem
Currently the systemd services do not depend on the network being up, only on the network controller stack being initialized.
Expected result: Programs should be designed to work regardless of whether the external network is ready.
Suggestions
- Check startup of services in an environment where the network is not up (yet), e.g. a container with the network removed
- Ensure all our network-related services start up fine regardless of network state
Workaround
As a workaround, the systemd services can wait for the network to be online, as described on https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/ :
# systemctl cat openqa-scheduler
# /usr/lib/systemd/system/openqa-scheduler.service
[Unit]
Description=The openQA Scheduler
After=postgresql.service openqa-setup-db.service
Wants=openqa-setup-db.service
[Service]
User=geekotest
ExecStart=/usr/share/openqa/script/openqa-scheduler daemon -m production
TimeoutStopSec=120
[Install]
WantedBy=multi-user.target
# /etc/systemd/system/openqa-scheduler.service.d/override.conf
[Unit]
After=network-online.target
Wants=network-online.target
The same override is necessary in /etc/systemd/system/openqa-livehandler.service.d/override.conf
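Creating the same drop-in for several units by hand is repetitive, so it could be scripted. The following is a minimal sketch, not a tested deployment procedure: the function name `write_network_online_override` and the `SYSTEMD_DIR` variable are hypothetical, introduced here so the script can be tried against a scratch directory before writing to /etc/systemd/system.

```shell
# Sketch: write the network-online drop-in shown above for each given unit.
# SYSTEMD_DIR (assumption, not part of systemd) selects the base directory
# and defaults to the usual location for administrator overrides.
write_network_online_override() {
    systemd_dir=${SYSTEMD_DIR:-/etc/systemd/system}
    for unit in "$@"; do
        dropin_dir="$systemd_dir/$unit.service.d"
        mkdir -p "$dropin_dir" || return 1
        # Same three lines as the override.conf in the workaround.
        printf '%s\n' \
            '[Unit]' \
            'After=network-online.target' \
            'Wants=network-online.target' \
            > "$dropin_dir/override.conf" || return 1
    done
}

# Possible usage as root, for the two units named in this ticket:
#   write_network_online_override openqa-scheduler openqa-livehandler
#   systemctl daemon-reload
```

Note that network-online.target only delays startup if a matching wait service is enabled, e.g. NetworkManager-wait-online.service on a NetworkManager setup. Whether the other units from the log (openqa-websockets, openqa) need the same drop-in is not confirmed here; they can be passed as extra arguments if so.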