action #180122
opencoordination #161414: [epic] Improved salt based infrastructure management
Run openqa-piworker as part of our infrastructure while still being CC compliant
0%
Description
Observation
In #178576 we realized that the MTU of the NUE2 FC-basement network had changed to 1360. According to tests on the running machines, this would require an MTU of 1260 for our wireguard tunnels, which is below the 1280-byte minimum link MTU that IPv6 requires and results in failed systemd services and other problems.
All other machines in FC-basement no longer use a wireguard tunnel because they can reach OSD directly over secure protocols. The only machine where this apparently does not work is openqa-piworker.qe.nue2.suse.org, because salt constantly produces timeouts. We need to find a solution to manage this machine reliably in our infrastructure again.
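To make the MTU constraint from the observation concrete, here is a minimal sketch that compares the two MTU values mentioned above against the IPv6 minimum link MTU of 1280 bytes (RFC 8200). The function name `supports_ipv6` and the variable names are hypothetical and only serve as illustration; the check does not model wireguard encapsulation overhead, it only restates the numbers from this ticket.

```python
# Minimum link MTU that every IPv6-capable link must support (RFC 8200).
IPV6_MIN_MTU = 1280


def supports_ipv6(mtu: int) -> bool:
    """Return True if a link with this MTU meets the IPv6 minimum."""
    return mtu >= IPV6_MIN_MTU


# Values taken from the observation in this ticket.
underlay_mtu = 1360  # NUE2 FC-basement network
tunnel_mtu = 1260    # value the tests suggested for the wireguard tunnels

for name, mtu in [("underlay", underlay_mtu), ("wireguard tunnel", tunnel_mtu)]:
    if supports_ipv6(mtu):
        status = "ok for IPv6"
    else:
        status = f"too small for IPv6 by {IPV6_MIN_MTU - mtu} bytes"
    print(f"{name}: MTU {mtu} -> {status}")
```

With the values from this ticket, the underlay passes the check while the tunnel MTU falls 20 bytes short, which matches the IPv6 problems described above.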
Suggestions
@nicksinger made several changes to the network config on the workers:
- https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/1413/diffs
- https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/1416/diffs
- https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/1420/diffs
- https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/1421/diffs
Rollback actions
- Add the machine back to salt and production:
salt-key -y -a openqa-piworker.qe.nue2.suse.org
Updated by nicksinger about 1 month ago
- Copied from action #178576: Workers unresponsive in salt pipelines including openqa-piworker, sapworker1 and monitor size:S added
Updated by okurz about 1 month ago
- Subject changed from Run openqa-piworker as part of our infrastructure while still being CC compliant size:S to Run openqa-piworker as part of our infrastructure while still being CC compliant
Updated by okurz about 1 month ago
- Target version changed from Ready to Tools - Next
Updated by nicksinger about 1 month ago
- Related to action #180857: [qe-tools][RPi] asset failure: Cannot find HDD_1 asset hdd/sle-15-SP7-aarch64-JeOS-for-RaspberryPi-2.109.qcow2 added