action #156067: [alert] test fails in setup_multimachine - openQA Tests (public) - openSUSE Project Management Tool

closed

Added by jbaier_cz about 1 year ago. Updated about 1 year ago.

Status: Resolved
Priority: Urgent
Assignee: mkittler
Category: Bugs in existing tests
Target version: openQA Project (public) - Ready
Start date: 2024-02-26
Due date: 2024-03-12

Tags: alert, reactive work

Description

Observation

This job is scheduled via openqa-schedule-mm-ping-test, see #156052.

openQA test in scenario sle-15-SP5-Server-DVD-Updates-x86_64-ping_client@64bit fails in setup_multimachine

Test died: ping with packet size 100 failed, problems with MTU size are expected. If it is multi-machine job, it can be GRE tunnel setup issue. at sle/lib/utils.pm line 1943.
	utils::script_retry("ping -M do -s 100 -c 1 server", "retry", 3, "delay", 5, "fail_message", "ping with packet size 100 failed, problems with MTU size are "...) called at sle/lib/utils.pm line 2910
	utils::ping_size_check("server") called at sle/tests/network/setup_multimachine.pm line 57
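The trace above shows a retrying ping check: three attempts, five seconds apart, with a 100-byte payload and fragmentation prohibited (`-M do`). The following Python sketch only illustrates that retry pattern under stated assumptions; it is not the Perl implementation in sle/lib/utils.pm. The function names mirror those in the trace, while the `run` and `sleep` parameters are hypothetical additions so the logic can be exercised without a network.

```python
import subprocess
import time


def script_retry(cmd, retry=3, delay=5, fail_message="command failed",
                 run=None, sleep=time.sleep):
    """Run `cmd` up to `retry` times, waiting `delay` seconds between attempts.

    Returns the number of the attempt that succeeded and raises RuntimeError
    with `fail_message` if every attempt fails.
    `run` and `sleep` are hypothetical hooks for testing, not part of the
    real helper.
    """
    if run is None:
        # Default runner: execute in a shell and report the exit code.
        run = lambda c: subprocess.run(c, shell=True).returncode
    for attempt in range(1, retry + 1):
        if run(cmd) == 0:
            return attempt
        if attempt < retry:
            sleep(delay)
    raise RuntimeError(fail_message)


def ping_size_check(target, size=100, **kwargs):
    """Ping `target` once with a `size`-byte payload and fragmentation prohibited."""
    cmd = f"ping -M do -s {size} -c 1 {target}"
    message = f"ping with packet size {size} failed"
    return script_retry(cmd, fail_message=message, **kwargs)
```

Calling `ping_size_check("server")` would issue the same command as in the trace. In the failure reported here every attempt fails, so a check of this shape raises after the third attempt.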

Reproducible

Fails since (at least) Build 2024-02-26T10:08+00:00

Expected result

Last good: 2024-02-26T09:11+00:00 (or more recent)

Further details

Always latest result in this scenario: latest

Related issues 2 (0 open — 2 closed)

Updated by jbaier_cz about 1 year ago

Related to action #156052: [alert] Scripts CI pipeline failing after logging multiple Job state of job ID 13603796: running, waiting size:S added

Updated by mkittler about 1 year ago

Status changed from New to In Progress
Assignee set to mkittler

Updated by mkittler about 1 year ago

Status changed from In Progress to Feedback

The network is completely unreachable from within the SUT; ping reports "ping: connect: Network is unreachable".

The most likely culprit is https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/18713, which was merged 6 hours ago. The first failure is also from 6 hours ago. The last good job still shows the root console, whereas the first bad job already uses the serial console.

I'll reopen #155170, as this should presumably be handled as part of the original ticket that introduced the change.

Updated by mkittler about 1 year ago

Related to action #155170: [openqa-in-openqa] [sporadic] test fails in test_running: parallel_failed size:M added

Updated by mkittler about 1 year ago

PR: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/18739

Updated by mkittler about 1 year ago

Status changed from Feedback to In Progress

Updated by openqa_review about 1 year ago

Due date set to 2024-03-12

Setting due date based on mean cycle time of SUSE QE Tools

Updated by mkittler about 1 year ago

Status changed from In Progress to Resolved

The scenario looks good again, and Antonios mentioned in the chat that he restarted the relevant jobs.

Updated by livdywan about 1 year ago

Status changed from Resolved to Workable

This looks like the same issue to me: https://openqa.opensuse.org/tests/3982221#step/setup_multimachine/140 from https://gitlab.suse.de/openqa/scripts-ci/-/jobs/2337490

Updated by mkittler about 1 year ago

Status changed from Workable to Resolved

This issue was about a concrete regression caused by https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/18713.

The new issue is something different. It seems sporadic, considering that all newer jobs are passing. Judging by the logs, it also looks like a sporadic networking issue (which is a known problem for our MM jobs). The fail ratio in that scenario doesn't seem high enough to warrant further investigation, but if you nevertheless want to investigate, I suggest we create a new ticket.
