action #154624

closed

openQA Project - coordination #112862: [saga][epic] Future ideas for easy multi-machine handling: MM-tests as first-class citizens

openQA Project - coordination #111929: [epic] Stable multi-machine tests covering multiple physical workers

Periodically running simple ping-check multi-machine tests on x86_64 covering multiple physical hosts on OSD alerting tools team on failures size:M

Added by okurz 3 months ago. Updated 3 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version:
Start date: 2024-01-30
Due date:
% Done: 0%
Estimated time:

Description

Motivation

In cases like #154552, multi-machine issues (still) happen. While we monitor multi-machine test results, there are cases where users notify us about problems that we don't see in our monitoring. Because we now have a good, simple ping-check multi-machine test scenario created by dheidler (#138302), we can run that scenario periodically and very often, similar to the openQA-in-openQA tests. Because the scenario is so simple, a failure is likely caused by multi-machine infrastructure problems that we want to know about. Whenever the scenario fails, we should alert the tools team directly, e.g. via email to Slack #team-qa-tools or something similar, using openqa-label-known-issues.

Acceptance criteria

  • AC1: simple ping-check multi-machine tests executed on x86_64 on OSD periodically covering multiple physical hosts
  • AC2: The tools team is alerted directly if those tests fail

Suggestions
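
One way to picture the check behind AC1 and AC2 is the minimal Python sketch below. It is only an illustration under stated assumptions: the `JobResult` record, its field names, and the `evaluate_ping_check` helper are hypothetical and not part of openQA or openqa-label-known-issues. How job results are fetched from OSD and how the alert reaches the tools team is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobResult:
    """Hypothetical record for one job of the ping-check scenario."""
    job_id: int
    result: str       # openQA job result, e.g. "passed", "failed", "incomplete"
    worker_host: str  # physical host that executed the job (assumed field)


# Results treated as "no alert needed" in this sketch (an assumption).
OK_RESULTS = {"passed", "softfailed"}


def evaluate_ping_check(jobs, min_hosts=2):
    """Return a list of problems; an empty list means nothing to alert on."""
    problems = []

    # AC1: the jobs of one run should be spread over several physical hosts.
    hosts = {job.worker_host for job in jobs}
    if len(hosts) < min_hosts:
        problems.append(
            f"only {len(hosts)} physical host(s) covered, "
            f"expected at least {min_hosts}"
        )

    # AC2: any job that did not pass is worth an alert to the tools team.
    for job in jobs:
        if job.result not in OK_RESULTS:
            problems.append(
                f"job {job.job_id} on {job.worker_host}: {job.result}"
            )
    return problems


if __name__ == "__main__":
    run = [
        JobResult(1, "passed", "hostA"),
        JobResult(2, "failed", "hostB"),
    ]
    for problem in evaluate_ping_check(run):
        print("ALERT:", problem)
```

A periodic trigger would call such a check after each run and forward any non-empty problem list to whatever alert channel is chosen.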


Related issues 4 (1 open, 3 closed)

Related to openQA Project - action #154021: [alert] Ratio of not restarted multi-machine tests by result (Resolved, mkittler, 2024-01-22, 2024-02-12)

Related to openQA Project - action #155278: o3 aarch64 multi-machine tests on openqaworker-arm21 and 22 fail to resolve codecs.opensuse.org size:M (Resolved, dheidler, 2024-02-09)

Copied from openQA Project - action #154552: [ppc64le] test fails in iscsi_client - zypper reports Error Message: Could not resolve host: openqa.suse.de (Resolved, mkittler, 2024-01-30)

Copied to openQA Infrastructure - action #155200: Periodically running simple ping-check multi-machine tests on ppc64le covering multiple physical hosts on OSD alerting tools team on failures size:M (Workable, 2024-01-30)
