action #129032

open

ipmitool monitoring

Added by okurz 7 months ago.

Status: New
Priority: Normal
Assignee: -
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:
Tags:

Description

Monitoring

As we saw in #128654, ipmitool itself can be very stable (30k runs without error, grenache-1->ix64ph1075) but can misbehave in weird ways. In the last case the cause was a switch, which we rebooted to fix the problem. As some openQA tests rely on IPMI, we should introduce a corresponding monitoring check that ensures IPMI connection stability. It could live in telegraf, a GitLab CI pipeline, a separate systemd service, in front of openQA jobs on the worker, etc.
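
To make the idea more concrete, here is a minimal sketch of such a check in Python. It is only one possible shape, not an agreed design. The function names, the choice of `chassis power status` as a harmless read-only probe, the number of runs, the timeout, and the `ipmi_check` measurement name are all assumptions for illustration. The password is expected in the `IPMI_PASSWORD` environment variable (ipmitool's `-E` option) so it does not show up in the process list.

```python
import subprocess
import sys
from typing import Callable, List, Sequence


def ipmi_command(host: str, user: str) -> List[str]:
    """Build a read-only ipmitool call; -E reads the password from IPMI_PASSWORD."""
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E",
            "chassis", "power", "status"]


def run_once(cmd: Sequence[str], timeout: float = 30.0) -> bool:
    """Return True if the command exits with status 0 within the timeout."""
    try:
        result = subprocess.run(cmd, capture_output=True, timeout=timeout)
    except (subprocess.TimeoutExpired, OSError):
        # A hang or a missing binary both count as a failed probe.
        return False
    return result.returncode == 0


def count_failures(host: str, user: str, runs: int = 10,
                   probe: Callable[[Sequence[str]], bool] = run_once) -> int:
    """Run the probe `runs` times and return how many attempts failed."""
    cmd = ipmi_command(host, user)
    return sum(1 for _ in range(runs) if not probe(cmd))


# Hypothetical invocation: IPMI_PASSWORD=... python3 ipmi_check.py HOST USER
if __name__ == "__main__" and len(sys.argv) == 3:
    host, user = sys.argv[1], sys.argv[2]
    failures = count_failures(host, user)
    # One line in InfluxDB line protocol, e.g. for a telegraf exec input.
    print(f"ipmi_check,ipmi_host={host} failures={failures}i")
    sys.exit(1 if failures else 0)
```

The non-zero exit code on any failure would also let the same script act as a GitLab CI job or a systemd service health check. Whether sporadic single failures should already alert, or only a failure ratio above some threshold, is left open here.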


Related issues 1 (0 open, 1 closed)

Copied from openQA Infrastructure - action #128654: [sporadic] Fail to create an ipmi session to worker grenache-1:16 (ix64ph1075) in its vlan (Resolved, okurz, 2023-05-04)
