action #80128
openqaworker-arm-2 fails to download from openqa
0%
Description
https://openqa.suse.de/tests/5047755
Even with wget the download fails. I stopped the workers and the minion.
Updated by okurz about 4 years ago
- Related to action #73633: OSD partially unresponsive, triggering 500 responses, spotty response visible in monitoring panels but no alert triggered (yet) added
Updated by okurz about 4 years ago
ping -6 openqa.suse.de does not work here, although it works on most other machines. sysctl net.ipv6.conf.eth1.accept_ra is ok as well; ip r does not show a default route though.
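The default-route check described above can be sketched as a small filter. This is only an illustration: the function name has_default_route is hypothetical, and it assumes the usual iproute2 output format in which a default route line begins with the word "default". Note that plain ip r lists only the IPv4 table, so an IPv6 default route has to be looked for in the output of ip -6 route.

```shell
#!/bin/sh
# Report whether routing-table text on stdin contains a default route.
# Intended input: the output of `ip -6 route show` (or `ip -4 route show`).
# has_default_route is a hypothetical helper name, not an existing tool.
has_default_route() {
    # iproute2 prints default routes as lines starting with "default ".
    if grep -q '^default '; then
        echo "default route present"
    else
        echo "no default route"
    fi
}

# Example against the live system (commented out to avoid side effects):
# ip -6 route show | has_default_route
```

Reading from stdin keeps the check usable on saved output, for example text collected from several workers via salt.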
Updated by okurz about 4 years ago
- Status changed from New to In Progress
- Assignee set to okurz
- Target version set to Ready
ip r does not show a default IPv6 route, ip -6 r does though. After I ran systemctl stop firewalld, ping -c 1 -6 openqa.suse.de worked.
I could not find anything in the logs either, even after enabling logging of all "dropped" packets in /etc/firewalld. Later, ping did not work even with firewalld disabled. I ran systemctl restart network and my SSH connection never recovered. Over IPMI SOL I can ping openqaworker-arm-2's own IPv6 address but not osd:
# ping -c 1 -6 2620:113:80c0:8080:10:160:0:227
PING 2620:113:80c0:8080:10:160:0:227(2620:113:80c0:8080:10:160:0:227) 56 data bytes
64 bytes from 2620:113:80c0:8080:10:160:0:227: icmp_seq=1 ttl=64 time=0.061 ms
--- 2620:113:80c0:8080:10:160:0:227 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.061/0.061/0.061/0.000 ms
e246:~ # ping -c 1 -6 openqa.suse.de
PING openqa.suse.de(openqa.suse.de (2620:113:80c0:8080:10:160:0:207)) 56 data bytes
which resolves the right address for openqa.suse.de at least. After some minutes the system would not even react properly over SOL, so I triggered a power reset.
EDIT: Same symptoms after reboot. I then ran sudo systemctl disable --now openqa-worker.target openqa-worker-cacheservice openqa-worker-cacheservice-minion.service
salt -l error \* cmd.run 'dig openqa.suse.de AAAA'
is fine on all machines.
salt -l error \* cmd.run 'ping -c 1 -4 openqa.suse.de ; ping -c 1 -6 openqa.suse.de'
is not ok on QA-Power8-4-kvm.qa.suse.de, openqaworker-arm-1.suse.de, openqaworker-arm-2.suse.de
I have applied
salt -l error -L 'QA-Power8-4-kvm.qa.suse.de,openqaworker-arm-1.suse.de,openqaworker-arm-2.suse.de' cmd.run 'echo net.ipv6.conf.all.disable_ipv6 = 1 > /etc/sysctl.d/poo73633_poo80128_debugging.conf && sysctl --load /etc/sysctl.d/poo73633_poo80128_debugging.conf && systemctl restart openqa-worker@\* openqa-worker-cacheservice openqa-worker-cacheservice-minion.service os-autoinst-openvswitch.service && systemctl mask --now postfix'
Updated by okurz about 4 years ago
- Status changed from Feedback to Workable
- Assignee deleted (okurz)
- Priority changed from Normal to Low
- Target version changed from Ready to future
This seems to have worked. So far the machines seem to be ok with only IPv4. I don't know why IPv6 does not work, but I guess we can live with that for the time being.
Updated by nicksinger about 4 years ago
- Status changed from Workable to Resolved
- Assignee set to nicksinger
While working on grenache-1 I realized that there were some leftovers in sysctl on QA-Power8-4-kvm.qa.suse.de, resulting in IPv6 being disabled only on some interfaces (IIRC "lo" was one of them). This caused the strange errors you discovered in #3, where name resolution works but pings get stuck.
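A minimal sketch of how such partially disabled IPv6 settings might be spotted, assuming the usual "key = value" output of sysctl -a. The function name ipv6_disabled_keys is hypothetical and this is not necessarily how the leftovers were actually found.

```shell
#!/bin/sh
# Read `sysctl -a`-style lines on stdin and print every
# net.ipv6.conf.<iface>.disable_ipv6 key whose value is 1.
# ipv6_disabled_keys is a hypothetical helper name.
ipv6_disabled_keys() {
    # sysctl prints "key = value"; tolerate optional spaces around "=".
    sed -n 's/^\(net\.ipv6\.conf\.[^ =]*\.disable_ipv6\) *= *1$/\1/p'
}

# Example against the live kernel settings (commented out here):
# sysctl -a 2>/dev/null | ipv6_disabled_keys
```

An empty result means no interface has IPv6 disabled; a result that lists only some interfaces (such as lo) would match the mixed state described above.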
I have now removed your workaround file and made sure all disable_ipv6 entries are set to 0 (on QA-Power8-4-kvm.qa.suse.de, openqaworker-arm-1.suse.de and openqaworker-arm-2.suse.de). To validate I ran the following command on osd:
openqa:~ # salt -l error --no-color -C 'G@roles:worker' cmd.run 'curl -s -6 openqa.suse.de | grep changelog'
openqaworker2.suse.de:
<a href="/changelog">4.6.1607440298.36d0dfbf9</a>
openqaworker5.suse.de:
<a href="/changelog">4.6.1607440298.36d0dfbf9</a>
openqaworker6.suse.de:
<a href="/changelog">4.6.1607440298.36d0dfbf9</a>
openqaworker8.suse.de:
<a href="/changelog">4.6.1607440298.36d0dfbf9</a>
QA-Power8-4-kvm.qa.suse.de:
<a href="/changelog">4.6.1607440298.36d0dfbf9</a>
openqaworker9.suse.de:
<a href="/changelog">4.6.1607440298.36d0dfbf9</a>
QA-Power8-5-kvm.qa.suse.de:
<a href="/changelog">4.6.1607440298.36d0dfbf9</a>
openqaworker13.suse.de:
<a href="/changelog">4.6.1607440298.36d0dfbf9</a>
openqaworker10.suse.de:
<a href="/changelog">4.6.1607440298.36d0dfbf9</a>
openqaworker-arm-2.suse.de:
<a href="/changelog">4.6.1607440298.36d0dfbf9</a>
openqaworker-arm-1.suse.de:
<a href="/changelog">4.6.1607440298.36d0dfbf9</a>
grenache-1.qa.suse.de:
<a href="/changelog">4.6.1607440298.36d0dfbf9</a>
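With many workers, output like the above is easier to check mechanically than by eye. The following is a rough sketch, assuming salt's default text layout (an unindented line ending in ":" per minion, followed by its result lines). The function name minions_without_match is hypothetical.

```shell
#!/bin/sh
# Read salt cmd.run output on stdin and print each minion whose result
# lines do not contain the given string (default: "changelog").
# minions_without_match is a hypothetical helper name.
minions_without_match() {
    awk -v pat="${1:-changelog}" '
        # An unindented line ending in ":" starts a new minion section.
        /^[^ \t].*:$/ {
            if (host != "" && !ok) print host
            host = substr($0, 1, length($0) - 1); ok = 0; next
        }
        # Any other line containing the string marks the minion as ok.
        index($0, pat) { ok = 1 }
        # Flush the last minion section.
        END { if (host != "" && !ok) print host }
    '
}

# Example (run on the salt master; commented out here):
# salt -l error --no-color -C 'G@roles:worker' cmd.run \
#     'curl -s -6 openqa.suse.de | grep changelog' | minions_without_match
```

No output means every minion returned a line containing the string, i.e. every worker reached openqa.suse.de over IPv6.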
Please reopen if the problem persists.