action #177459
closed
qemu_ppc64le worker fails to reach gitlab.suse.de
Description
Observation
A HanaSR test in a Multi-Machine configuration recently started to fail while trying to reach gitlab.suse.de: wmp_check_process
The Tools Team was contacted over Slack and a joint debugging session was held. Some details:
- SUTs (qemu_ppc64le VMs on diesel and petrol) running in a Multi-Machine setup fail to download assets from gitlab.suse.de.
- The command is:
curl -s -k https://gitlab.suse.de/qa-css/wmp_basic_tests/-/archive/master/wmp_basic_tests-master.tgz -o-
- The same command run from the worker itself (diesel or petrol) works.
- After taking control of the jobs, it was possible to see that the SUTs can resolve gitlab.suse.de to both its IPv4 and IPv6 addresses.
- ICMP from the SUTs to gitlab.suse.de over IPv4 fails with a timeout.
- ICMP from the SUTs to gitlab.suse.de over IPv6 fails with `Network is Unreachable`. (IPv6 is not configured in the SUTs.)
- The IP addresses of the SUTs are 10.0.2.15 and 10.0.2.16.
- The name server on both nodes is 10.0.2.1 (the IP address of the support server running in parallel to the jobs).
- Checking with `tcpdump` on the worker identified 2 extra addresses: 10.1.13.76 (related to the node with IP address 10.0.2.15) and 10.1.13.78 (related to the node with IP address 10.0.2.16).
- The same scenario works on qemu_x86_64.
- After adding routing policy rules on the worker with `ip rule add from 10.1.13.78 table nowg prio 1` and `ip rule add from 10.1.13.76 table nowg prio 1`, the SUTs were able to ping gitlab.suse.de and download assets from it. Tests passed: https://openqa.suse.de/tests/16816617 & https://openqa.suse.de/tests/16816618
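The checks and the workaround listed above can be sketched as a small shell helper. This is only an illustration: the function names `diagnose_sut` and `print_nowg_rules` and the `--max-time` limit are assumptions, not part of the original debug session. The commands, addresses, and the `nowg` table name are taken from the list above. `print_nowg_rules` only prints the rules, so the output can be reviewed before anything is changed on the worker.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; commands mirror those in the ticket.

TARGET=gitlab.suse.de
URL=https://gitlab.suse.de/qa-css/wmp_basic_tests/-/archive/master/wmp_basic_tests-master.tgz

# Run on a SUT: repeat the checks from the debug session and report each result.
# "download: ok" only means curl exited with status 0.
diagnose_sut() {
    getent ahosts "$TARGET" >/dev/null && echo "dns: ok" || echo "dns: FAILED"
    ping -4 -c 3 -W 5 "$TARGET" >/dev/null 2>&1 && echo "icmp4: ok" || echo "icmp4: FAILED"
    ping -6 -c 3 -W 5 "$TARGET" >/dev/null 2>&1 && echo "icmp6: ok" || echo "icmp6: FAILED"
    # --max-time is an added assumption so a blocked connection does not hang.
    curl -s -k --max-time 30 -o /dev/null "$URL" && echo "download: ok" || echo "download: FAILED"
}

# Run on the worker: print (do not execute) one routing policy rule
# per translated source address given as an argument.
print_nowg_rules() {
    for src in "$@"; do
        echo "ip rule add from $src table nowg prio 1"
    done
}
```

Calling `print_nowg_rules 10.1.13.78 10.1.13.76` prints the two commands used in the debug session; they still have to be run as root on the worker to take effect.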
Reproducible
Fails since (at least) Build 119.6
Expected result
Last good: 117.1 (or more recent)
Further details
Always latest result in this scenario: latest
Updated by nicksinger 12 days ago
- Status changed from New to Feedback
- Assignee set to nicksinger
https://github.com/os-autoinst/os-autoinst/blob/bb8218001f4ec0f21833424e5e1fcb9b8c7a5c2b/doc/openvswitch-init-example#L28 explains where these strange "10.1.x.x"-IPs come from. I extended our existing script with a more generic rule: https://gitlab.suse.de/openqa/salt-states-openqa/-/merge_requests/1381
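To illustrate why a more generic rule helps: a rule keyed on a prefix rather than on single addresses covers every translated SUT address at once, so new jobs do not need new rules. The sketch below checks whether an address falls inside a prefix. The prefix `10.1.0.0/16` and the helper names `ip_to_int` and `ip_in_cidr` are assumptions made for illustration; the actual rule is in the linked merge request and is not reproduced here.

```shell
#!/bin/sh
# Sketch only: helper names and the example prefix are hypothetical.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    # Split the address on dots into the positional parameters.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed (exit status 0) if address $1 lies inside CIDR prefix $2.
ip_in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    if [ "$bits" -eq 0 ]; then
        mask=0
    else
        mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    fi
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}
```

With the assumed prefix, `ip_in_cidr 10.1.13.76 10.1.0.0/16` and `ip_in_cidr 10.1.13.78 10.1.0.0/16` both succeed, while the SUT-internal address 10.0.2.15 does not match.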
Updated by nicksinger 10 days ago
- Status changed from Feedback to Resolved
Changes merged and rule manually added on relevant hosts (sapworker1, diesel, petrol) determined by `salt -C '*.nue2.suse.org and G@roles:worker' test.ping`. https://openqa.suse.de/tests/16844790#step/wmp_check_process/11 passed with the broader ip-rule so I think we're fine now. @acarvajal feel free to reopen if you find further problems.
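One way to confirm that a rule is present on a host is to scan the output of `ip rule show` for an entry that looks up the `nowg` table. The helper name `has_nowg_rule` is hypothetical, and the source address passed to it is just an example from this ticket; the broader rule will show a different selector after `from`.

```shell
#!/bin/sh
# Sketch only: the helper name is hypothetical.

# Read `ip rule show` output on stdin and succeed if it contains a rule
# with source selector $1 that looks up table nowg.
has_nowg_rule() {
    grep -Fq "from $1 lookup nowg"
}

# Example usage on a worker (not run here):
#   ip rule show | has_nowg_rule 10.1.13.78 && echo "rule present"
# To view the rules on all matching workers via salt:
#   salt -C '*.nue2.suse.org and G@roles:worker' cmd.run 'ip rule show'
```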