action #128273

closed

[alert] openqaworker-arm-1+2+ failed to recover, problem in name resolution, network connection? size:M

Added by okurz about 1 year ago. Updated about 1 year ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version:
Start date: 2023-04-25
Due date: 2023-05-12
% Done: 0%
Estimated time:

Description

Observation

We received multiple emails on 2023-04-23 around 1500Z about the attempted automatic recovery of openqaworker-arm-1+2+. It is unclear whether an SD ticket was automatically created for this.

Acceptance criteria

  • AC1: The root problem was addressed
  • AC2: The reason for the multi-level recovery attempt is understood

Suggestions

  • DONE: Ensure that all three openqaworker-arm-1+2+3 are up and running again -> They are up and running, no problem there
  • Check the chronological order of the execution steps, e.g. from https://gitlab.suse.de/openqa/grafana-webhook-actions/-/pipelines/660173 for arm-3 and the related jobs for arm-1+2
  • Understand the source of the error and address it; something may need to be fixed there