action #112001

closed

coordination #112862: [saga][epic] Future ideas for easy multi-machine handling: MM-tests as first-class citizens

coordination #111929: [epic] Stable multi-machine tests covering multiple physical workers

action #111908: Multimachine failures between multiple physical workers

[timeboxed:20h][spike solution] Pin multi-machine cluster jobs to same openQA worker host based on configuration

Added by okurz over 2 years ago. Updated 5 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Feature requests
Target version:
Start date:
2022-06-03
Due date:
% Done:

0%

Estimated time:

Description

Motivation

In general, openQA supports multi-machine clusters with jobs running on different physical hosts, e.g. qemu VMs on two different physical hosts connected by GRE tunnels. For reasons that are often unknown, such combinations might not provide stable test results. To mitigate this, we should provide a means to ask the openQA scheduler to restrict jobs that are part of a multi-machine cluster to a single physical host if configured accordingly.

Acceptance criteria

  • AC1: Given multiple worker hosts matching the same worker class When multi-machine parallel cluster jobs are scheduled And configured to restrict to a single worker host Then parallel jobs are all assigned to the same worker host
  • AC2: By default multi-machine cluster jobs are still scheduled on multiple physical hosts
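The two acceptance criteria can be illustrated with a minimal sketch. It is not the openQA scheduler's implementation; the `Worker` type, the `assign_cluster` function and the `one_host_only` flag are hypothetical names chosen for illustration. The greedy slot selection assumes, as in AC1, that all jobs of the cluster match the same worker class.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Worker:
    """Hypothetical free worker slot: slot id, host name and worker classes."""
    slot: str
    host: str
    classes: frozenset


def assign_cluster(jobs, workers, one_host_only=False):
    """Assign each job (name -> required worker class) to a distinct free slot.

    Returns a dict job name -> Worker, or None if the cluster cannot be placed.
    """
    def greedy(candidates):
        free = list(candidates)
        result = {}
        for name, wclass in jobs.items():
            match = next((w for w in free if wclass in w.classes), None)
            if match is None:
                return None  # not enough matching slots for the whole cluster
            free.remove(match)
            result[name] = match
        return result

    if not one_host_only:
        # AC2: default behaviour, slots may come from any host
        return greedy(workers)

    # AC1: only accept an assignment where all slots belong to one host
    by_host = defaultdict(list)
    for worker in workers:
        by_host[worker.host].append(worker)
    for host in sorted(by_host):
        result = greedy(by_host[host])
        if result is not None:
            return result
    return None


tap = frozenset({"tap"})
workers = [Worker("A1", "hostA", tap), Worker("B1", "hostB", tap), Worker("B2", "hostB", tap)]
jobs = {"server": "tap", "client": "tap"}
default = assign_cluster(jobs, workers)
pinned = assign_cluster(jobs, workers, one_host_only=True)
```

In this example the default assignment spans `hostA` and `hostB`, while the pinned assignment uses only `hostB`, the one host with enough free slots for the whole cluster.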

Workaround

If you run into problems with parallel jobs running on multiple worker hosts, make sure to select a worker class that is provided by only a single worker host at a time.
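To apply this workaround, one needs to know which worker classes are offered by exactly one host. The following sketch shows one way to derive that from an inventory of hosts and their worker classes. The function name, the inventory layout, the host names and the class names such as `tap_hostA` are assumptions for illustration, not taken from an actual openQA setup.

```python
from collections import defaultdict


def single_host_classes(worker_classes_by_host):
    """Return a dict worker class -> host for classes offered by exactly one host.

    worker_classes_by_host: dict host name -> iterable of worker class names.
    """
    hosts_by_class = defaultdict(set)
    for host, classes in worker_classes_by_host.items():
        for wclass in classes:
            hosts_by_class[wclass].add(host)
    return {
        wclass: next(iter(hosts))
        for wclass, hosts in sorted(hosts_by_class.items())
        if len(hosts) == 1
    }


# Hypothetical inventory: both hosts share "tap", each also has a host-specific class
inventory = {
    "hostA": ["qemu_x86_64", "tap", "tap_hostA"],
    "hostB": ["qemu_x86_64", "tap", "tap_hostB"],
}
safe_classes = single_host_classes(inventory)
```

Here only `tap_hostA` and `tap_hostB` qualify, so a cluster that requests one of them cannot be spread across hosts.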


Related issues 3 (1 open, 2 closed)

Related to openQA Project (public) - action #135035: Optionally restrict multimachine jobs to a single worker (Resolved, mkittler, 2023-09-01)

Related to openQA Project (public) - action #158146: Prevent scheduling across-host multimachine clusters to hosts that are marked to exclude themselves size:M (Resolved, mkittler, 2024-03-27)

Related to openQA Project (public) - action #158143: Make workers unassign/reject/incomplete jobs when across-host multimachine setup is requested but not available (New)

Actions #1

Updated by okurz 11 months ago

  • Subject changed from [timeboxed][spike solution] Pin multi-machine cluster jobs to same openQA worker host based on configuration to [timeboxed:20h][spike solution] Pin multi-machine cluster jobs to same openQA worker host based on configuration
Actions #2

Updated by okurz 11 months ago

  • Category set to Feature requests
Actions #3

Updated by okurz 8 months ago

  • Related to action #135035: Optionally restrict multimachine jobs to a single worker added
Actions #4

Updated by okurz 7 months ago

  • Related to action #158146: Prevent scheduling across-host multimachine clusters to hosts that are marked to exclude themselves size:M added
Actions #5

Updated by okurz 7 months ago

  • Related to action #158143: Make workers unassign/reject/incomplete jobs when across-host multimachine setup is requested but not available added
Actions #6

Updated by okurz 7 months ago

  • Target version changed from future to Tools - Next

#160646 makes it necessary to give this a higher priority

Actions #7

Updated by okurz 6 months ago

  • Status changed from New to Blocked
  • Assignee set to okurz

#158146 first

Actions #8

Updated by okurz 5 months ago

  • Status changed from Blocked to Resolved
  • Target version changed from Tools - Next to Ready

#158146 by now covers that completely
