action #68869

closed

automatic ARM recovery jobs fail because the caasp master running gitlab CI jobs has expired certificates

Added by okurz almost 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
Start date: 2020-07-12
Due date:
% Done: 0%
Estimated time:

Description

https://stats.openqa-monitor.qa.suse.de/d/1bNU0StZz/automatic-actions?orgId=1 shows that two ARM machines have been down for two days. The automatic recovery did not work. The good thing is that the long-time alerts did trigger. The problem is visible in https://gitlab.suse.de/openqa/grafana-webhook-actions/-/jobs/232053 and lies on the side of the gitlab runner machines.


Related issues: 1 (0 open, 1 closed)

Related to openQA Infrastructure - action #113561: failed pipelines for openQABot and bot-ng because of an expired cert (Resolved, okurz, 2022-07-13)

Actions #1

Updated by okurz almost 4 years ago

  • Status changed from New to Blocked

Reported https://infra.nue.suse.com/SelfService/Display.html?id=174378

I have now manually triggered a power cycle of both openqaworker-arm-1 and openqaworker-arm-2 with:

ipmitool -I lanplus -H openqaworker-arm-1-ipmi.suse.de -U ADMIN -P ADMIN power cycle
ipmitool -I lanplus -H openqaworker-arm-2-ipmi.suse.de -U ADMIN -P ADMIN power cycle
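
The two manual commands above could be wrapped in a small loop, for example when more workers need the same treatment. This is only a sketch: the `power_cycle` function and the `IPMI_USER`, `IPMI_PASS` and `DRY_RUN` variables are hypothetical names introduced here, and the `<worker>-ipmi.suse.de` host pattern is assumed from the two commands above. With the default `DRY_RUN=1` it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: power cycle workers via IPMI, mirroring the manual commands above.
# IPMI_USER, IPMI_PASS and DRY_RUN are hypothetical variables, not part of the ticket.
IPMI_USER="${IPMI_USER:-ADMIN}"
IPMI_PASS="${IPMI_PASS:-ADMIN}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the command, anything else = execute it

power_cycle() {
    host="$1"
    # Assumption: the IPMI interface is reachable as <worker>-ipmi.suse.de
    set -- ipmitool -I lanplus -H "${host}-ipmi.suse.de" \
        -U "$IPMI_USER" -P "$IPMI_PASS" power cycle
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

for worker in openqaworker-arm-1 openqaworker-arm-2; do
    power_cycle "$worker"
done
```

Running it with `DRY_RUN=0` would actually issue the power cycle, so the dry run is the default to make the destructive step an explicit choice.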
Actions #2

Updated by okurz over 3 years ago

  • Status changed from Blocked to Resolved

The infra ticket was resolved; everything is working fine again.

Actions #3

Updated by jbaier_cz almost 2 years ago

  • Related to action #113561: failed pipelines for openQABot and bot-ng because of an expired cert added