action #95783

closed

coordination #103962: [saga][epic] Easy multi-machine handling: MM-tests as first-class citizens

coordination #103971: [epic] Easy *re*-triggering and cloning of multi-machine tests

Provide support for multi-machine scenarios handled by openqa-investigate size:M

Added by okurz over 3 years ago. Updated over 2 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Feature requests
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:

Description

Motivation

See #81859#note-7

Acceptance criteria

  • AC1: "openqa-investigate" triggers complete sets of multi-machine scenarios for investigation
  • AC2: The investigation multi-machine scenarios are categorized like other investigation jobs, e.g. ":investigate:" in the name, outside any job group, no build, etc.

Suggestions

  • Have openqa-clone-job's --skip-chained-deps option only affect chained dependencies but not directly chained dependencies
  • Add --skip-directly-chained-deps
  • Add a hook in openQA to invoke a script once all jobs in a dependency tree are done.
  • Make openqa-investigate use that hook and investigate all jobs that weren't successful. It should use --max-depth 0 to ensure parallel clusters are always fully cloned (so it is not necessary to distinguish between parallel parents and children). It needs to keep track of handled job IDs to avoid investigating jobs multiple times, because openqa-clone-job already handles dependencies as needed and might therefore clone several of the jobs we need to investigate in one go. The tracking should be easy because openqa-clone-job --json-output has already been implemented.
  • Add an opt-out, e.g. configurable via a test variable, so users who consider these tests a waste of time won't complain.
    • Alternatively, we could make it opt-in: keep the current behavior of skipping the investigation of jobs with parallel and directly chained dependencies unless a user specifies some test variable.
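
The tracking of handled job IDs suggested above can be sketched as follows. This is only a minimal illustration in Python, not the actual openqa-investigate implementation. It assumes that cloning one job returns a mapping from original job ID to clone ID for every job cloned along with it (which is how the --json-output of openqa-clone-job is assumed to look here), and it treats "passed" and "softfailed" as the successful results. The names investigate_unsuccessful and clone_cluster are hypothetical; clone_cluster stands in for a call to openqa-clone-job.

```python
from typing import Callable, Dict, Iterable

# Assumption: these results count as "successful" and need no investigation.
OK_RESULTS = {"passed", "softfailed"}


def investigate_unsuccessful(
    jobs: Iterable[dict],
    clone_cluster: Callable[[int], Dict[int, int]],
) -> Dict[int, int]:
    """Clone every unsuccessful job at most once.

    jobs: all jobs of a finished dependency tree, each with "id" and "result".
    clone_cluster: hypothetical stand-in for openqa-clone-job; it clones the
        given job together with its dependencies and returns a mapping
        {original job ID: clone job ID} for everything it cloned.
    Returns the combined mapping of all handled original IDs to their clones.
    """
    handled: Dict[int, int] = {}
    for job in jobs:
        job_id = job["id"]
        if job["result"] in OK_RESULTS:
            continue  # nothing to investigate
        if job_id in handled:
            continue  # already cloned as part of an earlier cluster
        handled.update(clone_cluster(job_id))
    return handled


if __name__ == "__main__":
    # Jobs 1-3 form one parallel cluster, job 4 stands alone.
    clusters = {1: [1, 2, 3], 2: [1, 2, 3], 3: [1, 2, 3], 4: [4]}

    def fake_clone(job_id: int) -> Dict[int, int]:
        # Pretend the clone of job N gets the ID N + 100.
        return {member: member + 100 for member in clusters[job_id]}

    tree = [
        {"id": 1, "result": "passed"},
        {"id": 2, "result": "failed"},
        {"id": 3, "result": "failed"},
        {"id": 4, "result": "incomplete"},
    ]
    print(investigate_unsuccessful(tree, fake_clone))
```

In this sketch job 3 is skipped because it was already cloned together with job 2, so the cluster is only triggered once.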

Out of scope

  • Multiple root jobs. We can leave that for a future ticket.
  • Spawning too many investigation jobs under high load. We could consider such jobs as low priority and drop them (user story, not technical definition).

Related issues 8 (0 open, 8 closed)

  • Related to openQA Project (public) - action #103425: Ratio of multi-machine tests alerting with ratio_mm_failed 5.280 size:M (Resolved, mkittler)
  • Related to openQA Project (public) - action #71809: Enable multi-machine jobs trigger without "isos post" (Resolved, mkittler, 2020-09-24)
  • Related to openQA Project (public) - action #69976: Show dependency graph for cloned jobs (Resolved, mkittler, 2020-08-13)
  • Related to QA (public) - action #107014: trigger openqa-trigger-bisect-jobs from our automatic investigations whenever the cause is not already known size:M (Resolved, tinita, 2022-02-17)
  • Related to openQA Project (public) - action #110518: Call job_done_hooks if requested by test setting (not only openQA config as done so far) size:M (Resolved, mkittler, 2021-09-18)
  • Related to openQA Project (public) - action #110530: Do NOT call job_done_hooks if requested by test setting (Resolved, mkittler, 2021-09-18)
  • Related to QA (public) - action #110176: [spike solution] [timeboxed:10h] Restart hook script in delayed minion job based on exit code size:M (Resolved, kraih, 2022-06-15)
  • Copied from openQA Project (public) - action #81859: openqa-investigate triggers incomplete sets for multi-machine scenarios (Resolved, mkittler, 2021-01-07)
