action #95783

closed

coordination #103962: [saga][epic] Easy multi-machine handling: MM-tests as first-class citizens

coordination #103971: [epic] Easy *re*-triggering and cloning of multi-machine tests

Provide support for multi-machine scenarios handled by openqa-investigate size:M

Added by okurz over 3 years ago. Updated over 2 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Feature requests
Target version:
Start date:
Due date:
% Done: 0%
Estimated time:

Description

Motivation

See #81859#note-7

Acceptance criteria

  • AC1: "openqa-investigate" triggers complete sets of multi-machine scenarios for investigation
  • AC2: The investigation multi-machine scenarios are categorized like other investigation jobs, e.g. ":investigate:" in the name, outside any job group, no build, etc.
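The categorization in AC2 can be expressed as a small check. This is only a sketch: the job is modeled as a plain dictionary, and the keys `name`, `group_id` and `settings`/`BUILD` are assumptions made for illustration, not a confirmed description of what openqa-investigate or the openQA API provides.

```python
def is_investigation_job(job: dict) -> bool:
    """Return True if a job matches the categorization described in AC2.

    The dictionary layout (keys "name", "group_id", "settings") is a
    hypothetical model of a job, chosen for this example only.
    """
    name = job.get("name", "")
    # "outside any job group": no group ID is set
    in_group = job.get("group_id") is not None
    # "no build": the BUILD setting is missing or empty
    build = job.get("settings", {}).get("BUILD", "")
    return ":investigate:" in name and not in_group and not build
```

A check like this could be used in a test to verify that every job of a cloned multi-machine cluster, and not only the job that triggered the investigation, ends up categorized as an investigation job.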

Suggestions

  • Have openqa-clone-job's --skip-chained-deps option only affect chained dependencies but not directly chained dependencies
  • Add --skip-directly-chained-deps
  • Add a hook in openQA to invoke a script once all jobs in a dependency tree are done.
  • Make openqa-investigate use that hook and investigate all jobs that weren't successful. It should use --max-depth 0 to ensure parallel clusters are always fully cloned (so it is not necessary to distinguish between parallel parents and children). It needs to keep track of handled job IDs to avoid investigating jobs multiple times, because openqa-clone-job already handles dependencies as needed and might therefore already have cloned several of the jobs we need to investigate in one go. The tracking should be easy because openqa-clone-job --json-output has already been implemented.
  • Add an opt-out, e.g. configurable via a test variable, so users who consider these tests a waste of time won't complain
      • We could also make it opt-in. So we'd keep the current behavior of skipping the investigation of jobs with parallel and directly chained dependencies unless a user specifies some test variable.
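The tracking of handled job IDs suggested above could look roughly like the following sketch. It assumes that cloning one job returns a mapping from original job IDs to clone IDs for the whole cluster (which is how the output of openqa-clone-job --json-output is modeled here), and that "passed" and "softfailed" count as successful. The function `investigate_tree` and the `clone_job` callback are hypothetical names for illustration, not part of openqa-investigate.

```python
from typing import Callable, Dict, Iterable, Mapping

# Assumption: results that count as "successful" and need no investigation.
SUCCESSFUL_RESULTS = {"passed", "softfailed"}


def investigate_tree(
    jobs: Iterable[Mapping],
    clone_job: Callable[[int], Mapping],
) -> Dict[int, int]:
    """Clone every unsuccessful job of a dependency tree at most once.

    `jobs` are dictionaries with the keys "id" and "result" (assumed layout).
    `clone_job` is a hypothetical stand-in for a call to openqa-clone-job
    with --json-output; it is expected to return a mapping of original job
    ID to clone ID for all jobs it cloned, including dependencies.
    """
    handled: Dict[int, int] = {}
    for job in jobs:
        job_id = int(job["id"])
        if job["result"] in SUCCESSFUL_RESULTS:
            continue
        if job_id in handled:
            # Already cloned as a dependency of an earlier job
            continue
        cloned = clone_job(job_id)
        for original_id, clone_id in cloned.items():
            # JSON object keys are strings, so normalize to int
            handled[int(original_id)] = int(clone_id)
    return handled
```

With a parallel cluster of three jobs where two were unsuccessful, the first unsuccessful job triggers one clone of the whole cluster and the second one is skipped because its ID is already recorded.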

Out of scope

  • Multiple root jobs. We can leave that for a future ticket.
  • Spawning too many investigation jobs under high load. We could consider such jobs as low priority and drop them (user story, not technical definition).

Related issues 8 (0 open, 8 closed)

Related to openQA Project (public) - action #103425: Ratio of multi-machine tests alerting with ratio_mm_failed 5.280 size:M (Resolved, mkittler)
Related to openQA Project (public) - action #71809: Enable multi-machine jobs trigger without "isos post" (Resolved, mkittler, 2020-09-24)
Related to openQA Project (public) - action #69976: Show dependency graph for cloned jobs (Resolved, mkittler, 2020-08-13)
Related to QA (public) - action #107014: trigger openqa-trigger-bisect-jobs from our automatic investigations whenever the cause is not already known size:M (Resolved, tinita, 2022-02-17)
Related to openQA Project (public) - action #110518: Call job_done_hooks if requested by test setting (not only openQA config as done so far) size:M (Resolved, mkittler, 2021-09-18)
Related to openQA Project (public) - action #110530: Do NOT call job_done_hooks if requested by test setting (Resolved, mkittler, 2021-09-18)
Related to QA (public) - action #110176: [spike solution] [timeboxed:10h] Restart hook script in delayed minion job based on exit code size:M (Resolved, kraih, 2022-06-15)
Copied from openQA Project (public) - action #81859: openqa-investigate triggers incomplete sets for multi-machine scenarios (Resolved, mkittler, 2021-01-07)

#1

Updated by okurz over 3 years ago

  • Copied from action #81859: openqa-investigate triggers incomplete sets for multi-machine scenarios added
#3

Updated by okurz over 3 years ago

  • Related to action #103425: Ratio of multi-machine tests alerting with ratio_mm_failed 5.280 size:M added
#4

Updated by okurz over 3 years ago

  • Related to action #71809: Enable multi-machine jobs trigger without "isos post" added
#5

Updated by okurz over 3 years ago

  • Description updated (diff)
#6

Updated by okurz over 3 years ago

  • Parent task set to #103971
#7

Updated by okurz about 3 years ago

  • Target version changed from future to Ready
#8

Updated by mkittler about 3 years ago

  • Assignee set to mkittler
#10

Updated by mkittler about 3 years ago

  • Status changed from In Progress to Feedback
#13

Updated by okurz about 3 years ago

  • Status changed from Feedback to In Progress
#17

Updated by mkittler about 3 years ago

  • Status changed from In Progress to Feedback
#20

Updated by okurz about 3 years ago

  • Status changed from Feedback to In Progress
#23

Updated by okurz about 3 years ago

  • Priority changed from Low to Urgent
#27

Updated by openqa_review about 3 years ago

  • Due date set to 2022-02-26
#28

Updated by okurz about 3 years ago

  • Related to action #69976: Show dependency graph for cloned jobs added
#29

Updated by okurz about 3 years ago

  • Due date deleted (2022-02-26)
  • Status changed from In Progress to Blocked
  • Priority changed from Urgent to Normal
#30

Updated by mkittler about 3 years ago

  • Status changed from Blocked to Feedback
#38

Updated by okurz about 3 years ago

  • Due date set to 2022-03-25
#39

Updated by mkittler about 3 years ago

  • Due date changed from 2022-03-25 to 2022-04-25
#40

Updated by mkittler about 3 years ago

  • Related to action #107014: trigger openqa-trigger-bisect-jobs from our automatic investigations whenever the cause is not already known size:M added
#42

Updated by mkittler almost 3 years ago

  • Due date changed from 2022-04-25 to 2022-05-02
#44

Updated by okurz almost 3 years ago

  • Description updated (diff)
  • Due date changed from 2022-05-02 to 2022-05-09
#45

Updated by okurz almost 3 years ago

  • Related to action #110518: Call job_done_hooks if requested by test setting (not only openQA config as done so far) size:M added
#46

Updated by okurz almost 3 years ago

  • Related to action #110530: Do NOT call job_done_hooks if requested by test setting added
#47

Updated by okurz almost 3 years ago

  • Due date deleted (2022-05-09)
  • Status changed from Feedback to Blocked
#49

Updated by okurz almost 3 years ago

  • Subject changed from Provide support for multi-machine scenarios handled by openqa-investigate to Provide support for multi-machine scenarios handled by openqa-investigate size:M
#50

Updated by mkittler almost 3 years ago

  • Related to action #110176: [spike solution] [timeboxed:10h] Restart hook script in delayed minion job based on exit code size:M added
#51

Updated by okurz almost 3 years ago

  • Status changed from Blocked to Workable
#54

Updated by mkittler almost 3 years ago

  • Status changed from Workable to In Progress
#55

Updated by openqa_review almost 3 years ago

  • Due date set to 2022-07-19
#57

Updated by okurz almost 3 years ago

  • Description updated (diff)
#58

Updated by mkittler almost 3 years ago

  • Status changed from In Progress to Feedback
#62

Updated by mkittler over 2 years ago

  • Status changed from Feedback to In Progress
#66

Updated by mkittler over 2 years ago

  • Status changed from In Progress to Resolved
#67

Updated by okurz over 2 years ago

  • Due date deleted (2022-07-19)