action #107917

closed

Recovery of imagetester via IPMI failed size:M

Added by mkittler about 2 years ago. Updated about 2 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version:
Start date: 2022-03-07
Due date: 2022-03-26
% Done: 0%
Estimated time:
Description

Observation

The corresponding GitLab pipeline failed: monitor-o3 | Failed pipeline for master | fee77e0e

$ ssh o3 'ping -q -c 1 imagetester >/dev/null' || ipmitool -I lanplus -C 3 -H 10.160.65.195 -U ADMIN -P $imagetester_ipmi_password power cycle
Error: Unable to establish IPMI v2 / RMCP+ session
Cleaning up project directory and file based variables 00:00
ERROR: Job failed: command terminated with exit code 1

I haven't restarted the job because imagetester appears to be online anyway. IPMI is also intermittently unavailable when I use it manually, so we could implement a retry.

Suggestions

  • Check if imagetester is actually online right now or needs recovery
  • Maybe the ping fails but the machine is online?
  • Crosscheck credentials and IPMI access
  • Retry the IPMI command if it fails
  • Check our wiki because we stated that imagetester does not have a working IPMI anyway
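
The retry suggestion could be sketched roughly as below. This is only an illustration, not what monitor-o3 actually does: the `retry` helper, the attempt count of 3, and the 10 second delay are assumptions, while the `ssh`/`ipmitool` invocation is the one from the failing job above. The helper deliberately does not echo the command it runs, so the IPMI password is not written to the job log.

```shell
#!/bin/sh
# Hypothetical helper (not part of the existing pipeline):
# run a command up to $1 times, sleeping $2 seconds between failed attempts.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            # Do not print "$@" here: it may contain credentials.
            echo "giving up after $n attempts" >&2
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# Possible use in the job (attempt count and delay are assumed values):
# ssh o3 'ping -q -c 1 imagetester >/dev/null' || \
#     retry 3 10 ipmitool -I lanplus -C 3 -H 10.160.65.195 -U ADMIN \
#         -P "$imagetester_ipmi_password" power cycle
```

With this shape the job still fails if every attempt fails, so a persistently unreachable IPMI interface would continue to show up as a failed pipeline.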

Further info


Related issues 2 (0 open, 2 closed)

Related to openQA Infrastructure - action #135137: Bring back imagetester size:M (Resolved, okurz, 2023-09-04)
Copied to openQA Infrastructure - action #108671: Resilient IPMI recovery of o3 machines in monitor-o3 size:M (Resolved, mkittler, 2022-03-07)
