action #157834

closed

[openQA][ipmi] IPMI backend machines in NUE2 can not be reached auto_review:"Reason: backend died: ipmitool.*Address lookup for.*qe.nue2.suse.org":retry

Added by waynechen55 9 months ago. Updated 9 months ago.

Status: Resolved
Priority: High
Assignee:
Category: Regressions/Crashes
Start date: 2024-03-25
Due date:
% Done: 0%
Estimated time:

Description

Observation

None of the IPMI backend SUT machines in NUE2 can be reached:

localhost:~ # ping -c5 openqaipmi5-sp.qe.nue2.suse.org
PING openqaipmi5-sp.qe.nue2.suse.org (10.168.192.202) 56(84) bytes of data.
From 81.95.8.245 (81.95.8.245) icmp_seq=1 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=2 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=3 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=4 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=5 Destination Host Unreachable

--- openqaipmi5-sp.qe.nue2.suse.org ping statistics ---
5 packets transmitted, 0 received, +5 errors, 100% packet loss, time 4060ms
pipe 4
localhost:~ # ping -c5 scooter-sp.qe.nue2.suse.org
PING scooter-sp.qe.nue2.suse.org (10.168.192.86) 56(84) bytes of data.
From 81.95.8.245 (81.95.8.245) icmp_seq=1 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=2 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=3 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=4 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=5 Destination Host Unreachable

--- scooter-sp.qe.nue2.suse.org ping statistics ---
5 packets transmitted, 0 received, +5 errors, 100% packet loss, time 4054ms
pipe 4
localhost:~ # ping -c5 kermit-sp.qe.nue2.suse.org
PING kermit-sp.qe.nue2.suse.org (10.168.192.88) 56(84) bytes of data.
From 81.95.8.245 (81.95.8.245) icmp_seq=1 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=2 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=3 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=4 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=5 Destination Host Unreachable

--- kermit-sp.qe.nue2.suse.org ping statistics ---
5 packets transmitted, 0 received, +5 errors, 100% packet loss, time 4058ms
pipe 4
localhost:~ # ping -c5 ix64ph1075-sp.qe.nue2.suse.org
PING ix64ph1075-sp.qe.nue2.suse.org (10.168.192.204) 56(84) bytes of data.
From 81.95.8.245 (81.95.8.245) icmp_seq=1 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=2 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=3 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=4 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=5 Destination Host Unreachable

--- ix64ph1075-sp.qe.nue2.suse.org ping statistics ---
5 packets transmitted, 0 received, +5 errors, 100% packet loss, time 4061ms
pipe 4
localhost:~ # ping -c5 amd-zen3-gpu-sut1-sp.qe.nue2.suse.org
PING amd-zen3-gpu-sut1-sp.qe.nue2.suse.org (10.168.192.83) 56(84) bytes of data.
From 81.95.8.245 (81.95.8.245) icmp_seq=1 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=2 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=3 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=4 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=5 Destination Host Unreachable

--- amd-zen3-gpu-sut1-sp.qe.nue2.suse.org ping statistics ---
5 packets transmitted, 0 received, +5 errors, 100% packet loss, time 4078ms
pipe 4
localhost:~ # ping -c5 gonzo-sp.qe.nue2.suse.org
PING gonzo-sp.qe.nue2.suse.org (10.168.192.90) 56(84) bytes of data.
From 81.95.8.245 (81.95.8.245) icmp_seq=1 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=2 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=3 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=4 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=5 Destination Host Unreachable

--- gonzo-sp.qe.nue2.suse.org ping statistics ---
5 packets transmitted, 0 received, +5 errors, 100% packet loss, time 4079ms
pipe 4
localhost:~ # ping -c5 unreal2-sp.qe.nue2.suse.org
PING unreal2-sp.qe.nue2.suse.org (10.168.192.160) 56(84) bytes of data.
From 81.95.8.245 (81.95.8.245) icmp_seq=1 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=2 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=3 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=4 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=5 Destination Host Unreachable

--- unreal2-sp.qe.nue2.suse.org ping statistics ---
5 packets transmitted, 0 received, +5 errors, 100% packet loss, time 4066ms
pipe 4
localhost:~ # ping -c5 unreal3-sp.qe.nue2.suse.org
PING unreal3-sp.qe.nue2.suse.org (10.168.192.162) 56(84) bytes of data.
From 81.95.8.245 (81.95.8.245) icmp_seq=1 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=2 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=3 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=4 Destination Host Unreachable
From 81.95.8.245 (81.95.8.245) icmp_seq=5 Destination Host Unreachable

--- unreal3-sp.qe.nue2.suse.org ping statistics ---
5 packets transmitted, 0 received, +5 errors, 100% packet loss, time 4062ms
pipe 4

Test runs also failed, for example failure 1 and failure 2.

Steps to reproduce

  • Ping SUT BMC
  • Schedule ipmi backend test run
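
The first reproduction step can be scripted so that several BMCs are checked in one go. The following is a minimal sketch, assuming Linux iputils `ping` output in the format pasted above; the function names `parse_ping_summary` and `is_reachable` are made up for illustration and are not part of openQA.

```python
import re
import subprocess

# Matches the iputils summary line, e.g.
# "5 packets transmitted, 0 received, +5 errors, 100% packet loss, time 4060ms"
SUMMARY_RE = re.compile(
    r"(?P<sent>\d+) packets transmitted, (?P<received>\d+) received"
    r"(?:, \+(?P<errors>\d+) errors)?, (?P<loss>\d+(?:\.\d+)?)% packet loss"
)


def parse_ping_summary(output):
    """Extract the counters from the summary line of `ping` output.

    Returns None if no summary line is found.
    """
    match = SUMMARY_RE.search(output)
    if match is None:
        return None
    return {
        "sent": int(match["sent"]),
        "received": int(match["received"]),
        "errors": int(match["errors"] or 0),
        "loss_percent": float(match["loss"]),
    }


def is_reachable(host, count=5):
    """Ping `host` and report whether at least one reply came back."""
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True, text=True, check=False,
    )
    summary = parse_ping_summary(result.stdout)
    return summary is not None and summary["received"] > 0
```

Calling `is_reachable("openqaipmi5-sp.qe.nue2.suse.org")` for each host from the observation would then give a quick per-machine overview.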

Impact

Only two machines in PRG2 are available. That is not enough resources and makes debugging work and test runs inconvenient.

Problem

  • Network glitch?
  • Power outage?

Suggestions

  • Look into power supply
  • Look into network status

Workaround

n/a


Related issues 1 (0 open, 1 closed)

Related to openQA Infrastructure (public) - action #155659: [openQA][infra][sut] Failed to establish connnection to ix64ph1075-sp.qe.nue2.suse.org (Resolved, okurz, 2024-02-20)

Actions #2

Updated by okurz 9 months ago

  • Tags set to infra, nue2, network
  • Category set to Regressions/Crashes
  • Status changed from New to In Progress
  • Assignee set to okurz
  • Priority changed from Normal to High
  • Target version set to Ready
Actions #3

Updated by okurz 9 months ago

  • Subject changed from [openQA][ipmi] IPMI backend machines in NUE2 can not be reached to [openQA][ipmi] IPMI backend machines in NUE2 can not be reached auto_review:"Reason: backend died: ipmitool.*Address lookup for.*qe.nue2.suse.org":retry

The problem was the same as in #157816: an outage of DHCP/DNS services in NUE2. I checked most of the machines mentioned in this ticket and could successfully ping them over IPv4 and IPv6.
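
To tell a name-service outage apart from a host that resolves but does not answer, the check can be split into two stages: resolve first, then ping every resolved address. This is only a sketch under assumptions: `triage` and `ping_once` are hypothetical helpers, and `ping_once` assumes Linux iputils `ping` with the `-c` and `-W` options.

```python
import socket
import subprocess


def ping_once(address):
    """Return True if a single ICMP echo to `address` is answered."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", address],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False,
    )
    return result.returncode == 0


def triage(host, resolve=socket.getaddrinfo, ping=ping_once):
    """Classify `host` as 'dns-failure', 'unreachable' or 'ok'.

    `resolve` and `ping` are injectable so the logic can be exercised
    without network access.
    """
    try:
        infos = resolve(host, None)
    except socket.gaierror:
        # The name could not be resolved at all (DNS side of the outage).
        return "dns-failure"
    # getaddrinfo returns (family, type, proto, canonname, sockaddr) tuples;
    # sockaddr[0] is the address for both IPv4 and IPv6.
    addresses = sorted({info[4][0] for info in infos})
    if not any(ping(address) for address in addresses):
        return "unreachable"
    return "ok"
```

A result of `dns-failure` would point at the DNS service, while `unreachable` corresponds to the "Destination Host Unreachable" symptom from the original observation.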

Actions #4

Updated by okurz 9 months ago

  • Status changed from In Progress to Resolved

Retriggered originally mentioned scenarios and https://openqa.suse.de/tests/13857499 and https://openqa.suse.de/tests/13857542 have now sufficiently progressed. Resolving.

Actions #5

Updated by waynechen55 9 months ago

okurz wrote in #note-4:

Retriggered originally mentioned scenarios and https://openqa.suse.de/tests/13857499 and https://openqa.suse.de/tests/13857542 have now sufficiently progressed. Resolving.

There are still failures like this one:

[2024-03-25T08:39:55.624444Z] [debug] [pid:30100] Launching external video encoder: ffmpeg -y -hide_banner -nostats -r 24 -f image2pipe -vcodec ppm -i - -pix_fmt yuv420p -c:v libvpx-vp9 -crf 35 -b:v 1500k -cpu-used 1 'video.webm'
[2024-03-25T08:40:19.868625Z] [info] [pid:30100] ::: backend::baseclass::die_handler: Backend process died, backend errors are reported below in the following lines:
  ipmitool -I lanplus -H ix64ph1075-sp.qe.nue2.suse.org -U admin -P [masked] mc guid: Error: Unable to establish IPMI v2 / RMCP+ session at /usr/lib/os-autoinst/backend/ipmi.pm line 45.
[2024-03-25T08:40:19.870210Z] [debug] [pid:30100] Passing remaining frames to the video encoder
[image2pipe @ 0x55b0f1e5a480] Could not find codec parameters for stream 0 (Video: ppm, none): unspecified size
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
Input #0, image2pipe, from 'pipe:':
  Duration: N/A, bitrate: N/A
  Stream #0:0: Video: ppm, none, 24 fps, 24 tbr, 24 tbn, 24 tbc
Output #0, webm, to 'video.webm':
Output file #0 does not contain any stream
[2024-03-25T08:40:19.880736Z] [debug] [pid:30100] Waiting for video encoder to finalize the video
[2024-03-25T08:40:19.881115Z] [debug] [pid:30100] The external video encoder (pid 30162) terminated
[2024-03-25T08:40:19.881411Z] [debug] [pid:30100] The built-in video encoder (pid 30163) terminated
[2024-03-25T08:40:19.884182Z] [debug] [pid:30100] sending magic and exit
[2024-03-25T08:40:19.885195Z] [debug] [pid:29863] received magic close
[2024-03-25T08:40:19.903251Z] [debug] [pid:29863] backend process exited: 0
[2024-03-25T08:40:20.005178Z] [warn] [pid:29863] !!! main: failed to start VM at /usr/lib/os-autoinst/backend/driver.pm line 104.
    backend::driver::start_vm(backend::driver=HASH(0x55d7cdf1f0a0)) called at /usr/lib/os-autoinst/OpenQA/Isotovideo/Backend.pm line 18
    OpenQA::Isotovideo::Backend::new("OpenQA::Isotovideo::Backend") called at /usr/lib/os-autoinst/OpenQA/Isotovideo/Runner.pm line 100
    OpenQA::Isotovideo::Runner::create_backend(OpenQA::Isotovideo::Runner=HASH(0x55d7c81ce208)) called at /usr/bin/isotovideo line 193
    eval {...} called at /usr/bin/isotovideo line 181

https://openqa.suse.de/tests/13857590
https://openqa.suse.de/tests/13857685

Actions #6

Updated by okurz 9 months ago

ix64ph1075 should be handled in the separate ticket #157618, for which I recommend looking into what was done in #155659.
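
A quick way to see that the ix64ph1075 failure is a different symptom is to run this ticket's auto_review regex from the subject against both error texts. The sketch below makes two assumptions, both marked in the code: the address-lookup reason string is a made-up example of what a DNS failure might look like, and the `Reason: backend died:` prefix on the session error is assumed, since the log quoted in #note-5 shows only the ipmitool error itself.

```python
import re

# Regex copied from the auto_review part of this ticket's subject.
AUTO_REVIEW = re.compile(
    r"Reason: backend died: ipmitool.*Address lookup for.*qe.nue2.suse.org"
)

# ASSUMPTION: hypothetical reason text for a DNS lookup failure; not taken
# from an actual job.
DNS_SAMPLE = (
    "Reason: backend died: ipmitool -I lanplus -H scooter-sp.qe.nue2.suse.org "
    "-U admin -P [masked] mc guid: Address lookup for "
    "scooter-sp.qe.nue2.suse.org failed"
)

# Error text from the log in #note-5; the "Reason: backend died: " prefix is
# an ASSUMPTION about how it would appear in the job reason.
SESSION_SAMPLE = (
    "Reason: backend died: ipmitool -I lanplus -H ix64ph1075-sp.qe.nue2.suse.org "
    "-U admin -P [masked] mc guid: Error: Unable to establish IPMI v2 / RMCP+ session"
)


def matches_auto_review(reason):
    """Return True if `reason` would be picked up by this ticket's regex."""
    return AUTO_REVIEW.search(reason) is not None
```

Under these assumptions, `matches_auto_review(DNS_SAMPLE)` is true and `matches_auto_review(SESSION_SAMPLE)` is false, which is consistent with tracking the RMCP+ session error separately.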

Actions #8

Updated by okurz 9 months ago

  • Related to action #155659: [openQA][infra][sut] Failed to establish connnection to ix64ph1075-sp.qe.nue2.suse.org added