action #180098

closed

Multi-Machine tests failing on network access

Added by dimstar 10 days ago. Updated 7 days ago.

Status: Resolved
Priority: High
Assignee:
Category: Regressions/Crashes
Target version:
Start date: 2025-04-07
Due date:
% Done: 0%
Estimated time:

Description

Observation

This has been observed for the last 2 days: all MM machines seem to fail to access network resources.

openQA test in scenario microos-Tumbleweed-DVD-x86_64-remote_ssh_target@64bit-2G fails in networking

Test suite description

Maintainer: jrivera

Boot with ssh=1 parameter and wait for parallel job (remote_ssh_controller) to install the system.

Reproducible

Fails since (at least) Build 20250404

Expected result

Last good: 20250403 (or more recent)

Further details

Always latest result in this scenario: latest



Actions #1

Updated by dimstar 10 days ago

  • Priority changed from Urgent to Immediate

Raising the priority: from what we could see, this blocks all MM tests completely.

Actions #2

Updated by livdywan 10 days ago

Hm. I just took a look and the error seems clear:

curl: (6) Could not resolve host: openqa.opensuse.org

However, as of just now there is a passing test: https://openqa.opensuse.org/tests/4976935
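For reference, curl exit code 6 means the host name could not be resolved, so the failure happens at the DNS lookup stage, before any HTTP request is made. A minimal sketch of an equivalent check in Python follows. The function name `can_resolve` and its `resolver` parameter are hypothetical and only there for illustration. The check is only meaningful when run in the same environment where curl failed.

```python
import socket


def can_resolve(host, resolver=socket.getaddrinfo):
    """Return True if `host` resolves to at least one address.

    `resolver` is a hypothetical hook (not part of any openQA tooling) that
    defaults to the standard library lookup and can be swapped out in tests.
    """
    try:
        return len(resolver(host, None)) > 0
    except socket.gaierror:
        # Raised when name resolution fails, which corresponds to
        # curl's "(6) Could not resolve host" error.
        return False


# Example call (needs a working resolver in the current environment):
# can_resolve("openqa.opensuse.org")
```

A `False` result points at name resolution or the network path to the DNS server rather than at the web UI itself.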

Actions #3

Updated by okurz 10 days ago

Actions #4

Updated by okurz 10 days ago

  • Tags set to reactive work
  • Project changed from openQA Tests (public) to openQA Project (public)
  • Category changed from Bugs in existing tests to Regressions/Crashes
  • Priority changed from Immediate to Urgent
  • Target version set to Ready

According to rfan1 and dzedro in https://suse.slack.com/archives/C02CANHLANP/p1744022074232239?thread_ts=1744022074.232239&cid=C02CANHLANP there were no relevant changes on the core or osado side, so likely something regressed in the o3 infra. Please also see the changes applied by @favogt as mitigation.

Actions #5

Updated by livdywan 9 days ago

  • Status changed from New to In Progress
  • Assignee set to livdywan

I'm not clear on what this is about, but I'm taking the ticket anyway since it's urgent to figure out what needs to happen.

Actions #6

Updated by mkittler 9 days ago

  • Assignee changed from livdywan to mkittler

I'm busy, but with almost the same type of work, so I'm having a look at this forwarding setting on the o3 workers.

Actions #7

Updated by mkittler 9 days ago · Edited

Note that we enable IP forwarding in os-autoinst/script/os-autoinst-setup-multi-machine. Nevertheless, the setting is only present on a few hosts, so I re-executed this line on all hosts mentioned on https://progress.opensuse.org/projects/openqav3/wiki/#Manual-command-execution-on-o3-workers, just to be sure. Forwarding should now be enabled persistently on all hosts. Since our setup script already does this, I don't see anything to improve upstream. I have no idea how this was working before.

Note that I haven't seen the "one host only" setting on o3 workers (except on the PowerPC worker). I do remember configuring this, so either I misremember or this configuration was lost. The latter would explain how this was working before, but it is probably the former.
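To verify the running state on a worker, here is a minimal sketch, assuming the forwarding setting in question is the standard Linux sysctl `net.ipv4.ip_forward`, which the kernel exposes at `/proc/sys/net/ipv4/ip_forward`. The function name `forwarding_enabled` is hypothetical and not part of os-autoinst.

```python
from pathlib import Path

# Kernel interface for the net.ipv4.ip_forward sysctl on Linux.
IPV4_FORWARD = Path("/proc/sys/net/ipv4/ip_forward")


def forwarding_enabled(path=IPV4_FORWARD):
    """Return True if the file at `path` holds a non-zero integer.

    Hypothetical helper for illustration. An unreadable or malformed file
    is treated as "not enabled".
    """
    try:
        return int(Path(path).read_text().strip()) != 0
    except (OSError, ValueError):
        return False


# Example call on a worker host:
# forwarding_enabled()
```

This only reflects the current runtime value. Whether the setting survives a reboot depends on the sysctl configuration files on the host, which this sketch does not inspect.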

Actions #8

Updated by mkittler 9 days ago

  • Status changed from In Progress to Feedback
  • Priority changed from Urgent to High

I'm lowering the prio due to what I wrote in my last comment. In my opinion we can also resolve this ticket.

Actions #9

Updated by mkittler 7 days ago

  • Status changed from Feedback to Resolved

No further problems with MM tests have been reported, so I'm considering this resolved.
