action #161381

closed

multi-machine test network issues reported 2024-06-03 due to missing content in the salt mine size:S

Added by okurz 7 months ago. Updated 7 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Regressions/Crashes
Start date: 2024-06-03
Due date: 2024-06-18
% Done: 0%
Estimated time:

Description

Observation

Same problem as in #160646

From https://suse.slack.com/archives/C02CANHLANP/p1717381703517509

(Lili Zhao) Hi, multi machine issues found today, for example: https://openqa.suse.de/tests/14504387#step/iscsi_client/8 (ping with packet size 100 failed, problems with MTU size are expected) and https://openqa.suse.de/tests/14504397#step/suseconnect_scc/25 (curl: (7) Couldn't connect to server)

Possibly related: https://suse.slack.com/archives/C02CANHLANP/p1717400281975529

(Anton Smorodskyi) When I see an error such as https://openqa.suse.de/tests/14492957#step/prepare_instance/27 "No route to host at /usr/lib/perl5/vendor_perl/5.26.1/Mojo/Transaction.pm line 54.", I conclude that the worker's network is down. Is my assumption correct?

Also, https://monitor.qa.suse.de/d/nRDab3Jiz/openqa-jobs-test?orgId=1&from=1717347718902&to=1717408634010 shows a significantly higher ratio of multi-machine test failures.

Acceptance criteria

Suggestions

Out of scope

  • Fixing the false positive salt-lint #161393
  • Ensuring that we check YAML validity of the workerconf #161396
  • Fixing and preventing the actual issue

Related issues 2 (1 open, 1 closed)

Related to openQA Infrastructure (public) - action #160646: multiple multi-machine test failures, no GRE tunnels are setup between machines anymore at all size:M (Resolved, ybonatakis, 2024-05-21)

Copied to openQA Infrastructure (public) - coordination #161735: [epic] Better error detection on GRE tunnel misconfiguration (Blocked, okurz, 2024-06-21)
