action #75055

Updated by nicksinger 11 months ago

Due to the ongoing IPv6 problems we realized that the grenache-1 workers disappear one by one when there is no working IPv6 connectivity. This currently results in many blocked jobs, since grenache-1 is our main jump host for the more exotic testing environments. This ticket mainly tracks what I did to make the workers appear in OSD again:

* 20.10.2020: Problem was noticed because workers were not connecting to baremetal-support.qa.suse.de
* 20.10.2020: IPv6 route was missing, created https://infra.nue.suse.com/SelfService/Display.html?id=178626
* 20.10.2020: IPv6 route was manually added with `ip -6 r a fe80::1`; after that the worker appeared on all web UIs again
* 21.10.2020: Due to severe performance problems with the workers we decided to remove the v6 route again (details: https://progress.opensuse.org/issues/73633?issue_count=67&issue_position=1&next_issue_id=73501#note-2)
* 22.10.2020: Several reports stated that grenache-1 workers are once again unavailable. Things I did:
  * Stopped all openqa-worker instances
  * `umount /var/lib/openqa/share` since it was connected over v6
  * Disabled v6 completely on the external interface with `echo 1 > /proc/sys/net/ipv6/conf/eth0/disable_ipv6`
  * `mount /var/lib/openqa/share && systemctl start openqa-worker@{1..40}`
  * ==> workers came back on OSD. First jobs are running. Reducing priority for now :)
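
For reference, the 22.10.2020 steps could be collected into a small script roughly like the sketch below. It is not what was actually run on grenache-1. The `run` and `recover_workers` helpers and the `DRY_RUN`, `IFACE`, `SHARE` and `WORKERS` variables are made up for illustration. The ticket does not say how the workers were stopped, so `systemctl stop openqa-worker@N` is an assumption. `sysctl -w` is used as an equivalent of the `echo 1 > /proc/sys/...` write above. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of the 22.10.2020 recovery steps. Dry-run by default: commands are
# printed, not executed. Set DRY_RUN=0 and run as root to actually apply them.
DRY_RUN="${DRY_RUN:-1}"
IFACE="${IFACE:-eth0}"                    # external interface, as in the ticket
SHARE="${SHARE:-/var/lib/openqa/share}"   # share that was mounted over v6
WORKERS="${WORKERS:-40}"                  # worker instances 1..40, as in the ticket

# Hypothetical helper: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover_workers() {
    # Build the list of unit names openqa-worker@1 .. openqa-worker@$WORKERS.
    units=""
    i=1
    while [ "$i" -le "$WORKERS" ]; do
        units="$units openqa-worker@$i"
        i=$((i + 1))
    done

    # $units is intentionally unquoted so it splits into one argument per unit.
    # Assumption: the workers were stopped via systemctl.
    run systemctl stop $units &&
    run umount "$SHARE" &&
    # Same effect as: echo 1 > /proc/sys/net/ipv6/conf/$IFACE/disable_ipv6
    run sysctl -w "net.ipv6.conf.$IFACE.disable_ipv6=1" &&
    run mount "$SHARE" &&
    run systemctl start $units
}

recover_workers
```

Note that a setting written this way does not survive a reboot, so the workers would likely disappear again after a restart unless the setting is made persistent.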
