action #120807

closed

QA (public) - coordination #121720: [saga][epic] Migration to QE setup in PRG2+NUE3 while ensuring availability

QA (public) - coordination #116623: [epic] Migration of SUSE Nbg based openQA+QA+QAM systems to new security zones

[alert] openqa.suse.de - worker12.oqa.suse.de 100% packet loss due to outdated AAAA record

Added by okurz about 2 years ago. Updated about 2 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Start date: 2022-11-17
Due date:
% Done: 0%
Estimated time:

Description

Observation

okurz@openqa:~> sudo mtr -Z 2 -r -c1 worker12.oqa.suse.de
Start: 2022-11-21T14:16:23+0100
HOST: openqa                      Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 2620:113:80c0:8080::4      0.0%     1    1.0   1.0   1.0   1.0   0.0
  2.|-- 2a07:de40:a201:2::7ff      0.0%     1    0.2   0.2   0.2   0.2   0.0
  3.|-- ???                       100.0     1    0.0   0.0   0.0   0.0   0.0
okurz@openqa:~> sudo mtr -Z 2 -r -c1 worker13.oqa.suse.de
Start: 2022-11-21T14:16:38+0100
HOST: openqa                      Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 2620:113:80c0:8080::4      0.0%     1    2.2   2.2   2.2   2.2   0.0
  2.|-- 2a07:de40:a201:2::7ff      0.0%     1    0.2   0.2   0.2   0.2   0.0
  3.|-- worker13.oqa.suse.de       0.0%     1    0.5   0.5   0.5   0.5   0.0
okurz@openqa:~> nslookup worker12.oqa.suse.de
Server:     10.160.0.1
Address:    10.160.0.1#53

Name:   worker12.oqa.suse.de
Address: 10.137.10.12
Name:   worker12.oqa.suse.de
Address: 2a07:de40:a203:12:31ed:aa97:d3ed:5454

okurz@openqa:~> sudo mtr -Z 2 -r -c1 2a07:de40:a203:12:31ed:aa97:d3ed:5454
Start: 2022-11-21T14:17:59+0100
HOST: openqa                      Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 2620:113:80c0:8080::4      0.0%     1    0.9   0.9   0.9   0.9   0.0
  2.|-- 2a07:de40:a201:2::7ff      0.0%     1    0.2   0.2   0.2   0.2   0.0
  3.|-- ???                       100.0     1    0.0   0.0   0.0   0.0   0.0

However, the IPv6 address that worker12 currently has is reachable:

okurz@openqa:~> sudo mtr -Z 2 -r -c1 2a07:de40:a203:12:ec4:7aff:fe7a:7736
Start: 2022-11-21T14:18:36+0100
HOST: openqa                      Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 2620:113:80c0:8080::5      0.0%     1    1.0   1.0   1.0   1.0   0.0
  2.|-- 2a07:de40:a201:2::7ff      0.0%     1    1.8   1.8   1.8   1.8   0.0
  3.|-- 2a07:de40:a203:12:ec4:7af  0.0%     1    0.7   0.7   0.7   0.7   0.0

while DNS still returns the outdated AAAA record:

okurz@openqa:~> nslookup worker12.oqa.suse.de
Server:     10.160.0.1
Address:    10.160.0.1#53

Name:   worker12.oqa.suse.de
Address: 10.137.10.12
Name:   worker12.oqa.suse.de
Address: 2a07:de40:a203:12:31ed:aa97:d3ed:5454
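
The manual comparison above (the AAAA answer from DNS versus the address the worker actually has) could be scripted roughly as follows. This is only a minimal sketch: the helper names `aaaa_from_nslookup` and `record_is_current` are hypothetical and not part of any existing tooling, and the comparison is purely textual, so it assumes both addresses are written in the same form.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any existing tooling.

# Print every IPv6 answer found in nslookup-style output read from stdin.
# The resolver's own "Address: <ip>#53" line is skipped via the '#' check.
aaaa_from_nslookup() {
    awk '$1 == "Address:" && $2 ~ /:/ && $2 !~ /#/ { print $2 }'
}

# Return 0 if the address given as $1 appears verbatim in the list on stdin.
# Assumption: both sides use the same textual form (no IPv6 normalization).
record_is_current() {
    grep -qxF -- "$1"
}

# Example use on the openQA host (needs network access, so left commented):
#   nslookup worker12.oqa.suse.de | aaaa_from_nslookup \
#     | record_is_current 2a07:de40:a203:12:ec4:7aff:fe7a:7736 \
#     || echo "AAAA record is stale"
```

With the nslookup output shown above, the extracted AAAA record is 2a07:de40:a203:12:31ed:aa97:d3ed:5454, which does not match the address worker12 answers on, so the check would report the record as stale.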

Rollback steps

  • Unpause alert "Packet loss between worker hosts and other hosts"

Related issues 1 (0 open, 1 closed)

Copied from QA (public) - action #119443: Conduct the migration of SUSE openQA systems from Nbg SRV1 to new security zones size:M (Resolved, okurz, 2022-11-17)

#1

Updated by okurz about 2 years ago

  • Copied from action #119443: Conduct the migration of SUSE openQA systems from Nbg SRV1 to new security zones size:M added
#2

Updated by okurz about 2 years ago

  • Description updated (diff)
  • Status changed from In Progress to Blocked
#3

Updated by okurz about 2 years ago

  • Project changed from 46 to openQA Infrastructure (public)
  • Status changed from Blocked to Resolved

The DNS issue is fixed. The packet loss alert could not be unpaused yet because #121282 is now the blocking problem. I extended that ticket accordingly so that the alert is unpaused as soon as storage.qa.suse.de is healthy again.
