action #81198

closed

[tracker-ticket] openqaworker-arm-{1..3} have network problems (cacheservice, OSD reachability). IPv6 disabled for now

Added by nicksinger over 3 years ago. Updated 3 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version:
Start date: 2020-12-18
Due date:
% Done: 0%
Estimated time:
Tags:

Description

As we face repeated network problems with our ARM workers (e.g. https://progress.opensuse.org/issues/81026), we decided to once again disable IPv6 completely on all of them.
This ticket tracks the change so that we can revisit it after the Christmas holidays.


Related issues 1 (0 open, 1 closed)

Related to openQA Infrastructure - action #81026: many jobs incomplete with auto_review:"(?s)Running on openqaworker-arm-2.*failed: 521 Connect timeout.*Result: setup failure":retry (Resolved, nicksinger, 2020-12-14)

Actions #1

Updated by nicksinger over 3 years ago

I disabled it on arm-2 for now with:

sysctl -a | grep disable_ipv6 | grep -v tap | cut -d= -f 1 | awk '{$1=$1;print}' | xargs -I{} echo {}=1 > /etc/sysctl.d/99-poo81198.conf

Rebooting now to verify; afterwards I will apply it to arm-1 and arm-3 too.
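For readers unfamiliar with the one-liner, here is a minimal sketch of what its stages do, run against made-up sample input instead of the live `sysctl -a` output. The function name `gen_disable_ipv6_conf` and the sample lines are assumptions for illustration only; nothing is written to /etc/sysctl.d/.

```shell
# Hypothetical helper: reads `sysctl -a`-style lines on stdin and prints
# "key=1" for every disable_ipv6 key, skipping tap devices.
gen_disable_ipv6_conf() {
    grep disable_ipv6 |          # keep only the IPv6 disable switches
        grep -v tap |            # leave tap devices untouched
        cut -d= -f1 |            # drop the current value, keep the key
        awk '{$1=$1;print}' |    # trim surrounding whitespace
        xargs -I{} echo {}=1     # emit "key=1" for each key
}

# Sample input (made up; real output differs per host)
sample='net.ipv6.conf.all.disable_ipv6 = 0
net.ipv6.conf.eth0.disable_ipv6 = 0
net.ipv6.conf.tap0.disable_ipv6 = 0
net.ipv4.ip_forward = 1'

printf '%s\n' "$sample" | gen_disable_ipv6_conf
```

On a real host the function would be fed from `sysctl -a` and its output redirected to the sysctl.d file, as in the command above.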

Actions #2

Updated by Xiaojing_liu over 3 years ago

On 2020-12-29, the network problem happened on arm-3. Here is an example: https://openqa.suse.de/tests/5227454
I disabled IPv6 on arm-3 following @nicksinger's comment and rebooted the machine.

Actions #3

Updated by okurz over 3 years ago

  • Target version set to future

Actions #4

Updated by mkittler over 3 years ago

Looks like IPv6 was not actually disabled on openqaworker-arm-1 today: ip addr showed IPv6 addresses. wget http://openqa.suse.de/… used IPv6, which did not work (only wget -4 … worked). After restarting wicked it worked again (including wget -6 …). The output of ip addr for eth0 looks the same as before, though; only the "sec" values differ slightly.
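A quick way to see which interfaces still have IPv6 enabled is to read the per-interface disable_ipv6 flags directly. The following is only a sketch: the function name `ipv6_still_enabled` is made up, and the optional directory argument exists purely so the check can be tried against a copy of the tree instead of the real procfs path.

```shell
# Hypothetical helper: print every interface whose disable_ipv6 flag is not 1.
# $1: directory holding the per-interface settings
#     (defaults to the real procfs location on Linux)
ipv6_still_enabled() {
    conf_dir=${1:-/proc/sys/net/ipv6/conf}
    for f in "$conf_dir"/*/disable_ipv6; do
        [ -r "$f" ] || continue              # skip unreadable or unmatched paths
        if [ "$(cat "$f")" != "1" ]; then
            basename "$(dirname "$f")"       # the interface name is the directory name
        fi
    done
}
```

Calling `ipv6_still_enabled` without arguments on a worker lists the interfaces where the sysctl setting did not take effect; `ip -6 addr show` can then confirm whether addresses are actually assigned.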

Actions #5

Updated by mkittler over 3 years ago

After restarting wicked it worked again (also wget -6 …).

And now it no longer works; the fix didn't last long. I applied the same command that @nicksinger used for arm-2 on arm-1 and will reboot the machine.

Actions #6

Updated by okurz about 3 years ago

  • Related to action #81026: many jobs incomplete with auto_review:"(?s)Running on openqaworker-arm-2.*failed: 521 Connect timeout.*Result: setup failure":retry added

Actions #7

Updated by okurz about 3 years ago

  • Status changed from Feedback to New

Actions #8

Updated by nicksinger about 3 years ago

nicksinger wrote:

I disabled it on arm-2 for now with:

sysctl -a | grep disable_ipv6 | grep -v tap | cut -d= -f 1 | awk '{$1=$1;print}' | xargs -I{} echo {}=1 > /etc/sysctl.d/99-poo81198.conf

Rebooting now to verify; afterwards I will apply it to arm-1 and arm-3 too.

I've now excluded the loopback interface from that file; see https://progress.opensuse.org/issues/88225 for the reasons.
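The ticket does not record how the loopback entry was removed. One possible way, shown here purely as an assumption, is an extra filter stage that drops the `lo` key before the file is generated. The function name and sample input are made up for illustration.

```shell
# Hypothetical variant of the original pipeline that also skips loopback.
# Assumption: filtering on ".lo." in the key name is enough to exclude it.
gen_disable_ipv6_conf_no_lo() {
    grep disable_ipv6 | grep -v tap | grep -v '\.lo\.' |
        cut -d= -f1 | awk '{$1=$1;print}' | xargs -I{} echo {}=1
}

# Sample input (made up; real output differs per host)
sample_lo='net.ipv6.conf.all.disable_ipv6 = 0
net.ipv6.conf.lo.disable_ipv6 = 0
net.ipv6.conf.eth0.disable_ipv6 = 0'

printf '%s\n' "$sample_lo" | gen_disable_ipv6_conf_no_lo
```

Whether the `all` and `default` keys also need adjusting to keep IPv6 on loopback is not covered by this sketch; the linked ticket has the details.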

Actions #9

Updated by okurz about 1 year ago

  • Tags set to infra

Actions #10

Updated by okurz 3 months ago

  • Target version changed from future to Ready

Actions #11

Updated by okurz 3 months ago

  • Status changed from New to Resolved
  • Assignee set to okurz

sudo salt -C 'G@roles:worker' cmd.run 'ls -l /etc/sysctl.d/' shows no occurrences of the workarounds. IPv6 works fine on all machines we currently have in production.
