action #40415

closed

Concurrent jobs with dependencies don't work if they are on different machines.

Added by jlausuch over 6 years ago. Updated about 6 years ago.

Status: Resolved
Priority: High
Assignee: -
Category: Feature requests
Target version: -
Start date: 2018-08-29
Due date: -
% Done: 0%
Estimated time: -

Description

Reproducibility:

We have 2 jobs, let's say PARENT and CHILD, where CHILD has PARALLEL_WITH=PARENT.

I created two tests that run on the same machine, "64bit" (qemu with some additional options):
http://fromm.arch.suse.de/tests/1394
http://fromm.arch.suse.de/tests/1395

The parent job needs the child ID for the mutex command:

my $children       = get_children();
my $child_id       = (keys %$children)[0];
...
script_run("echo Waiting for child with child_id=$child_id");
mutex_wait("child_ready", $child_id);

This is one line of the parent's output:

Waiting for child with child_id=1399

Everything is OK so far: CHILD recognizes PARENT as its parent and the locking API works without problems.

Then I created another machine, "64bit-other", with exactly the same settings as "64bit" (see http://fromm.arch.suse.de/admin/machines), and assigned CHILD to "64bit-other" in the job group.

The result is that CHILD doesn't have the parent job in the settings panel any more, and the PARENT's output is now:

Waiting for child with child_id=

Therefore, the command

mutex_wait("child_ready", $child_id);

waits forever.
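
As a stopgap on the test side, the parent could fail fast instead of hanging when no child is registered. The following is only a minimal sketch: first_child_id is a hypothetical helper (not part of openQA), and the literal hashes stand in for the return value of get_children(), assumed here to be a hash reference keyed by child job ID. The 'running' state value is likewise an assumption for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of openQA): return the lowest child job ID,
# or die with a clear message when no child is registered.
sub first_child_id {
    my ($children) = @_;
    my @ids = sort { $a <=> $b } keys %{ $children // {} };
    die "No child jobs found; the PARALLEL_WITH dependency was probably not created\n"
      unless @ids;
    return $ids[0];
}

# In a real test module the hash would come from get_children();
# literal hashes are used here so the sketch runs on its own.
my $same_machine  = { 1399 => 'running' };    # child registered as expected
my $other_machine = {};                       # dependency missing

print "child_id=", first_child_id($same_machine), "\n";

# With an empty hash the helper dies instead of returning an empty ID.
eval { first_child_id($other_machine) };
print "error: $@" if $@;
```

In the parent test, the returned ID would then be passed to mutex_wait("child_ready", $child_id) as before. This does not fix the scheduling problem itself; it only turns the endless wait into an immediate, readable failure.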

Why use different machines? For virtual jobs it doesn't make sense, but for bare-metal jobs such as the NFV and InfiniBand tests we use different workers and machines: ipmi-sonic and ipmi-tails, with worker classes 64bit-mlx_con5_sonic and 64bit-mlx_con5_tails respectively.


Related issues 2 (0 open, 2 closed)

Related to openQA Project (public) - action #25892: Scheduling parallel jobs (Resolved, EDiGiacinto, 2017-10-10)
Related to openQA Tests (public) - action #42857: [qe-core][functional][s390x] Change structure of s390x KVM hosts on production (o.s.d) (Resolved, 2018-10-24)