action #111908: Multimachine failures between multiple physical workers - openQA Project (public) - openSUSE Project Management Tool

action #111908

closed

coordination #112862: [saga][epic] Future ideas for easy multi-machine handling: MM-tests as first-class citizens

coordination #111929: [epic] Stable multi-machine tests covering multiple physical workers

Multimachine failures between multiple physical workers

Added by dzedro almost 3 years ago. Updated 11 months ago.

Status:

Resolved

Priority:

Normal

Assignee:

okurz

Category:

Feature requests

Target version:

Ready

Start date:

2022-06-03

Due date:

% Done:

100%

Estimated time:

(Total: 0.00 h)

Description

Observation

There are "random unexpected" multi-machine (MM) failures, apparently caused by some issue between multiple workers.
Below is a list of the support_server jobs of MM HA/SAP jobs that failed in the last two weeks.
I restarted these jobs on a single openQA worker and they did not fail.

I see the same on my local HA/SAP instance: with one worker there are nearly no "random unexpected" failures.
With two physical workers, the rate of "random unexpected" failures increases.

Steps to reproduce

The failures are random. I could reproduce them on a local instance with multiple physical workers.

Problem

I assume it is a network/openvswitch/GRE issue between the servers.

Workaround

Run all jobs of the MM cluster on one physical worker by setting WORKER_CLASS, e.g. WORKER_CLASS=qemu_x86_64,tap,openqaworker8
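
The workaround can be sketched as a small helper that builds the setting for a chosen host. This is only an illustration: `pin_worker_class` is a hypothetical name, not part of openQA, and it assumes that each worker host advertises its own host name as a worker class, as `openqaworker8` does in the example above.

```python
def pin_worker_class(base_classes, host):
    """Return a WORKER_CLASS setting restricted to one physical worker host.

    Hypothetical helper: it only assembles the string, it does not talk to openQA.
    """
    # Drop empty entries and surrounding whitespace from the base classes.
    classes = [c.strip() for c in base_classes if c.strip()]
    # Append the host name as an extra class so only that host matches.
    if host not in classes:
        classes.append(host)
    return "WORKER_CLASS=" + ",".join(classes)


if __name__ == "__main__":
    print(pin_worker_class(["qemu_x86_64", "tap"], "openqaworker8"))
```

Applying the same resulting setting to every job of the cluster, including the support_server, keeps them on one machine, so no traffic has to cross the GRE tunnel between servers.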

Subtasks 1 (0 open — 1 closed)

Related issues 1 (0 open — 1 closed)


Updated by dzedro almost 3 years ago

Updated by okurz almost 3 years ago

Updated by okurz almost 3 years ago

Updated by dzedro almost 3 years ago

Updated by okurz almost 3 years ago

Updated by livdywan almost 2 years ago

Updated by livdywan over 1 year ago

Updated by okurz 11 months ago