action #75232
Updated by okurz almost 4 years ago
## Observation

Currently openqaworker8 has problems bringing the network up due to #73633. All workers seem to handle the slow startup gracefully, but there is an error (disguised as a debug message) in the output of `journalctl -b -u openqa-worker@1`:

```
-- Logs begin at Wed 2018-03-07 16:47:21 CET, end at Sat 2020-10-24 13:11:47 CEST. --
Oct 24 13:09:14 linux-fwcx systemd[1]: Starting openQA Worker #1...
Oct 24 13:09:15 linux-fwcx systemd[1]: Started openQA Worker #1.
Oct 24 13:09:16 linux-fwcx worker[3296]: [2020-10-24T13:09:16.359 CEST] [debug] Unable to serialize fatal error: Can't open file "base_state.json": Permission denied at /usr/lib/os-autoinst/bmwqemu.pm line 86.
Oct 24 13:09:16 linux-fwcx worker[3296]: [info] [pid:3296] worker 1:
Oct 24 13:09:16 linux-fwcx worker[3296]: - config file: /etc/openqa/workers.ini
Oct 24 13:09:16 linux-fwcx worker[3296]: - worker hostname: linux-fwcx
Oct 24 13:09:16 linux-fwcx worker[3296]: - isotovideo version: 20
Oct 24 13:09:16 linux-fwcx worker[3296]: - websocket API version: 1
Oct 24 13:09:16 linux-fwcx worker[3296]: - web UI hosts: openqa.suse.de
Oct 24 13:09:16 linux-fwcx worker[3296]: - class: caasp_x86_64,tap,qemu_x86_64,openqaworker8
Oct 24 13:09:16 linux-fwcx worker[3296]: - no cleanup: no
Oct 24 13:09:16 linux-fwcx worker[3296]: - pool directory: /var/lib/openqa/pool/1
Oct 24 13:09:16 linux-fwcx worker[3296]: [info] [pid:3296] CACHE: caching is enabled, setting up /var/lib/openqa/cache/openqa.suse.de
Oct 24 13:09:16 linux-fwcx worker[3296]: [info] [pid:3296] Project dir for host openqa.suse.de is /var/lib/openqa/share
Oct 24 13:09:16 linux-fwcx worker[3296]: [info] [pid:3296] Registering with openQA openqa.suse.de
Oct 24 13:09:16 linux-fwcx worker[3296]: [warn] [pid:3296] Failed to register at openqa.suse.de - connection error: Can't connect: Name or service not known - trying again in 10 seconds
Oct 24 13:09:26 openqaworker8 worker[3296]: [info] [pid:3296] Registering with openQA openqa.suse.de
Oct 24 13:09:26 openqaworker8 worker[3296]: [info] [pid:3296] Establishing ws connection via ws://openqa.suse.de/api/v1/ws/1358
Oct 24 13:09:26 openqaworker8 worker[3296]: [info] [pid:3296] Registered and connected via websockets with openQA host openqa.suse.de and worker ID 1358
```

See the message "Unable to serialize fatal error: Can't open file "base_state.json": Permission denied at /usr/lib/os-autoinst/bmwqemu.pm line 86."

## Acceptance criteria

* **AC1:** No error message about failing to open the file on startup, e.g. when there is no active network connection (yet)

## Suggestions

Either prevent the error condition after analyzing the code, or first try to reproduce the issue, e.g. in an environment with a simulated broken network connection. This could be done with the tool "unshare", as we do in some of our tests, or in a clean container environment. It might be that this has nothing to do with the network and simply happens on startup of a worker.

## Workaround

The error message can be ignored.
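
To verify AC1 after a fix, one option is to scan the captured worker journal for the message quoted in the observation. The following is only a minimal sketch: the function names `find_startup_errors` and `satisfies_ac1` are hypothetical helpers and not part of openQA or os-autoinst, and the pattern assumes the message wording stays as shown in the log excerpt.

```python
import re

# Wording taken from the log excerpt in this ticket. If the message text
# changes in os-autoinst, this pattern needs to be adjusted (assumption).
ERROR_PATTERN = re.compile(
    r"Unable to serialize fatal error: "
    r"Can't open file \"base_state\.json\": Permission denied"
)


def find_startup_errors(journal_lines):
    """Return every journal line that contains the base_state.json error."""
    return [line for line in journal_lines if ERROR_PATTERN.search(line)]


def satisfies_ac1(journal_lines):
    """AC1 holds when no line of the journal contains the error message."""
    return not find_startup_errors(journal_lines)
```

The input can be any iterable of lines, for example the split output of `journalctl -b -u openqa-worker@1` saved from a worker that was started without a working network connection.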