action #164676

closed

[alert][o3] test fails in setup_multimachine to reach other machines/instances from w20 size:S

Added by okurz 5 months ago. Updated 5 months ago.

Status: Resolved
Priority: High
Assignee:
Category: Regressions/Crashes
Start date: 2024-07-30
Due date:
% Done: 0%
Estimated time:

Description

Observation

openQA test in scenario opensuse-Tumbleweed-DVD-x86_64-ping_client@64bit fails in
setup_multimachine

with

Test died: ping with packet size 100 failed, problems with MTU size are expected. If it is multi-machine job, it can be GRE tunnel setup issue. at opensuse/lib/utils.pm line 1978.
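The failing check can be approximated by hand from the affected machine. The following is a minimal sketch, assuming iputils ping on Linux; the helper name check_ping_sizes, the list of sizes and the target address are illustrative and not taken from utils.pm.

```shell
#!/bin/sh
# Sketch: probe a peer with increasing payload sizes while prohibiting fragmentation.
# check_ping_sizes is a hypothetical helper, not part of the openSUSE test library.
check_ping_sizes() {
    target=$1
    shift
    for size in "$@"; do
        # -M do: set the Don't Fragment bit, -s: payload size in bytes,
        # -c 1: send a single probe, -W 2: wait at most 2 seconds for a reply
        if ping -M do -s "$size" -c 1 -W 2 "$target" >/dev/null 2>&1; then
            echo "size $size: ok"
        else
            echo "size $size: FAILED"
            return 1
        fi
    done
}

# Example invocation (the address is a placeholder):
# check_ping_sizes 10.0.2.101 100 1000 1350 1400 1430 1440 1450 1500
```

If even the 100-byte probe fails, as in this job, a general connectivity problem between the machines is more likely than an MTU limit.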

Test suite description

Multimachine Ping client test. Maintainer: dheidler@suse.de

Reproducible

Fails since (at least) Build 2024-07-30T02:51+00:00 (current job)

Expected result

Last good: 20240729 (or more recent)

Suggestions

  • Confirm whether the issue is reproducible, e.g. by running 100 jobs on worker20, running 100 jobs on different workers, and checking whether any other jobs are failing in a similar way
  • Check whether the MM setup on o3 is still valid
  • Consider using the PARALLEL_ONE_HOST_ONLY=1 setting on o3 (as we already do on OSD)
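The first and third suggestions could be combined in one reproduction run. This is only a sketch, assuming openqa-clone-job from the openQA client is installed; repro_clone is a hypothetical wrapper, the job ID is the one from the failing scenario, and whether the extra setting also has to be applied to the parallel parent job is left open here.

```shell
#!/bin/sh
# Sketch: clone one job many times to see whether a failure reproduces.
# repro_clone is a hypothetical wrapper; it only prints the command unless RUN=1 is set.
repro_clone() {
    job_id=$1
    shift
    # Remaining arguments are passed through as additional job settings.
    set -- openqa-clone-job --skip-download --skip-chained-deps \
        --repeat=100 --within-instance https://openqa.opensuse.org \
        "$job_id" _GROUP=0 "BUILD+=poo#$job_id" "$@"
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Ask the scheduler to keep the parallel cluster on a single worker host:
repro_clone 4365956 PARALLEL_ONE_HOST_ONLY=1
```

Printing the command first makes it easier to review the settings before 100 jobs are scheduled on the production instance.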

Further details

Always latest result in this scenario: latest

Actions #1

Updated by livdywan 5 months ago

  • Status changed from New to In Progress
  • Assignee set to livdywan

Looking at Next & previous results I see no other occurrences 🤔

Actions #2

Updated by livdywan 5 months ago · Edited

livdywan wrote in #note-1:

Looking at Next & previous results I see no other occurrences 🤔

openqa-clone-job --skip-download --skip-chained-deps --repeat=100 --within-instance https://openqa.opensuse.org 4365956 WORKER_CLASS=openqaworker20 _GROUP=0 BUILD+=poo#4365956

https://openqa.opensuse.org/tests/overview?distri=opensuse&build=2024-07-30T02%3A51%2B00%3A00&version=Tumbleweed

Actions #3

Updated by mkittler 5 months ago

  • Subject changed from [alert][o3] test fails in setup_multimachine to reach other machines/instances from w20 to [alert][o3] test fails in setup_multimachine to reach other machines/instances from w20 size:S
  • Description updated (diff)

Actions #4

Updated by livdywan 5 months ago · Edited

openqa-clone-job --skip-download --skip-chained-deps --repeat=100 --within-instance https://openqa.opensuse.org 4368352 WORKER_CLASS:ping_client=openqaworker20 WORKER_CLASS:ping_server=openqaworker23 _GROUP=0 BUILD+=poo#4365956-client20server23

https://openqa.opensuse.org/tests/overview?distri=opensuse&build=2024-07-30T13%3A51%2B00%3A00poo%234365956-client20server23&version=Tumbleweed

Actions #5

Updated by livdywan 5 months ago

I scheduled the tests incorrectly twice without immediately realizing it. Apparently, openqaworker20 is not a valid worker class, so the first batch of jobs was never picked up, and the second time around the jobs got obsoleted.

Actions #6

Updated by livdywan 5 months ago

  • Priority changed from Urgent to High

I'm lowering priority since it seems clear it's not a more general issue.

Actions #7

Updated by openqa_review 5 months ago

  • Due date set to 2024-08-14

Setting due date based on mean cycle time of SUSE QE Tools

Actions #9

Updated by okurz 5 months ago

  • Due date deleted (2024-08-14)
  • Status changed from In Progress to Resolved

https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/19557 was merged. We will see its effect in the tests that we maintain and will be notified, so we don't need to wait for further confirmation in this ticket. The original issue seems to be a non-reproducible one-off, so no further action is planned besides the mentioned improvement.
