coordination #103971

coordination #103962: [saga][epic] Easy multi-machine handling: MM-tests as first-class citizens

[epic] Easy *re*-triggering and cloning of multi-machine tests

Added by okurz 5 months ago. Updated 14 days ago.

Status:
Blocked
Priority:
Normal
Assignee:
Category:
Feature requests
Target version:
Start date:
2020-08-13
Due date:
% Done:

50%

Estimated time:
(Total: 0.00 h)
Difficulty:

Description

Acceptance criteria

  • AC1: DONE¹: Multi-machine tests can be retriggered over webUI similar to single-machine tests
  • AC2: DONE¹: Multi-machine tests can be retriggered over API similar to single-machine tests
  • AC3: DONE: Multi-machine tests can be cloned using CLI tools²
  • AC4: When cloning multi-machine tests settings can be partially overwritten and customized

¹ I assume "retriggered" means pressing the restart button or using the restart API. That already works for most cases. https://github.com/os-autoinst/openQA/pull/4498 provides a pending tweak for the only case I can currently think of where restarting wouldn't treat the dependency cluster as expected (and that case isn't strictly related to MM tests).

² I assume "CLI tools" refers to the openqa-clone-job script which allows cloning jobs from another openQA instance. That is now supported, see #103971#note-13.


Subtasks

action #69979: Advanced job restarting via the web UI (Resolved, okurz)

action #95783: Provide support for multi-machine scenarios handled by openqa-investigate size:M (Blocked, mkittler)


Related issues

Related to openQA Project - action #66071: TEST is overridden in parent job when doing `openqa-clone-custom-git-refspec` (Resolved, 2020-04-26)

History

#1 Updated by okurz 5 months ago

  • Subject changed from Easy *re*-triggering of multi-machine tests to [epic] Easy *re*-triggering of multi-machine tests

#2 Updated by okurz 5 months ago

  • Subject changed from [epic] Easy *re*-triggering of multi-machine tests to [epic] Easy *re*-triggering and cloning of multi-machine tests

#3 Updated by okurz 5 months ago

  • Description updated (diff)

#4 Updated by okurz 4 months ago

  • Target version changed from future to Ready

#5 Updated by mkittler 4 months ago

  • Assignee set to mkittler
  • Target version changed from Ready to future

#6 Updated by okurz 4 months ago

  • Target version changed from future to Ready

#7 Updated by mkittler 3 months ago

  • Description updated (diff)

I marked AC1 and 2 as done and added some remarks.

#8 Updated by mkittler 3 months ago

For AC4 we need to come up with a syntax to specify the job(s) to which settings should be applied. The syntax should be consistent between the clone-job script and the restart API.

Likely it makes sense to refer to the jobs by their TEST setting here, similar to how PARALLEL_WITH already uses TEST to refer to a job. Since we rarely use a : in job setting keys the following would make sense:

openqa-cli api -X POST jobs/16/restart SUPPORT_SERVER_ROLES:ha_delta_supportserver=ssh HA_CLUSTER_JOIN:ha_delta_node02=delta-node01override

The first setting would only be used for the job where TEST=ha_delta_supportserver and the second setting would only be used for the job where TEST=ha_delta_node02. Existing settings are overridden as usual. Existing settings in jobs not matching the specified TEST would be left unchanged.

This example uses the restart API but I assume the same should be made available for openqa-clone-job.
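To make the proposed semantics concrete, here is a minimal sketch in Python. It is not openQA code: the function names `parse_scoped_setting` and `apply_overrides` are hypothetical, and the rule that an unscoped `KEY=value` applies to every job in the cluster is an assumption for illustration.

```python
def parse_scoped_setting(arg):
    """Split 'KEY:TEST=value' into (key, test, value).

    For a plain 'KEY=value' argument, test is None.
    Only the part before the first '=' is searched for ':',
    so values may contain ':' or '=' themselves.
    """
    key_part, sep, value = arg.partition("=")
    if not sep:
        raise ValueError(f"missing '=' in {arg!r}")
    key, _, test = key_part.partition(":")
    return key, (test or None), value


def apply_overrides(jobs, args):
    """Return copies of the jobs' settings with the overrides applied.

    jobs: list of settings dicts, each carrying a TEST key.
    Scoped overrides only touch the job whose TEST matches.
    Unscoped overrides touch every job (assumption, see above).
    """
    result = [dict(job) for job in jobs]
    for arg in args:
        key, test, value = parse_scoped_setting(arg)
        for job in result:
            if test is None or job.get("TEST") == test:
                job[key] = value
    return result
```

With the two arguments from the example, only the job with TEST=ha_delta_supportserver would get SUPPORT_SERVER_ROLES=ssh and only the job with TEST=ha_delta_node02 would get HA_CLUSTER_JOIN=delta-node01override; all other settings stay as they were.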

#9 Updated by okurz 3 months ago

  • Status changed from New to Blocked

This has one blocked subtask, so keep it as blocked by subtasks for now.

#10 Updated by mkittler 3 months ago

Note that implementing AC3 and AC4 is not blocked by improving the investigate script. I assume you simply want to have the investigate script be tackled first (although it is likely not less complicated than implementing AC3 and AC4 as described in my previous comment).

#11 Updated by okurz 3 months ago

mkittler wrote:

Note that implementing AC3 and AC4 is not blocked by improving the investigate script. I assume you simply want to have the investigate script be tackled first (although it is likely not less complicated than implementing AC3 and AC4 as described in my previous comment).

Well, yes, you can address these points separately. At any time you or anyone else can suggest further subtasks for this or other epics. At the time we only had that one. Given that our backlog is already full of important tasks and cool ideas, I suggest we still follow up on the tasks we have already defined. The exception is trying something out and suddenly ending up with a feature developed and finished super quickly regardless ;) So in short: try it out; if it takes less than a day, no problem at all.

#12 Updated by mkittler 3 months ago

I suppose for posting new jobs the syntax would indeed look similar to the restarting example mentioned in #103971#note-8:

openqa-cli api -X POST jobs SUPPORT_SERVER_ROLES:ha_delta_supportserver=ssh HA_CLUSTER_JOIN:ha_delta_node02=delta-node01override _PARALLEL_JOBS:ha_delta_node02=ha_delta_supportserver

However, there would be a semantic difference: In this case the suffixes like ha_delta_supportserver are not matched with existing TEST settings. (Obviously in the context of creating new jobs there are no existing jobs anyways.)

Instead, the suffixes are only used to distinguish which setting should go into which job and could be freely chosen by the caller¹. So this example would spawn two new jobs (two because there are two different suffixes), one with the setting SUPPORT_SERVER_ROLES=ssh and one with the setting HA_CLUSTER_JOIN=delta-node01override². The suffixes can also be used as values for _PARALLEL_JOBS, _START_AFTER_JOBS and _START_DIRECTLY_AFTER_JOBS to refer to a job which is created within the same API call. (So far these variables only allow specifying job IDs of jobs created in previous API calls.) The suffixes themselves aren't stored anywhere; their only purpose is to make sense of the API call.

I suppose it would be actually quite easy to use that way (e.g. within the clone-job script).
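As a rough sketch of how such a call could be interpreted (Python, not openQA code): the names `group_by_suffix` and `resolve_dependencies` are hypothetical, as is the `new_ids` mapping from suffix to the ID of the freshly created job. Treating the dependency values as comma-separated lists is also an assumption.

```python
DEPENDENCY_KEYS = ("_PARALLEL_JOBS", "_START_AFTER_JOBS", "_START_DIRECTLY_AFTER_JOBS")


def group_by_suffix(args):
    """Turn ['KEY:suffix=value', ...] into {suffix: {KEY: value}}.

    Each distinct suffix stands for one job to be created.
    """
    jobs = {}
    for arg in args:
        key_part, sep, value = arg.partition("=")
        key, colon, suffix = key_part.partition(":")
        if not sep or not colon or not suffix:
            raise ValueError(f"expected KEY:suffix=value, got {arg!r}")
        jobs.setdefault(suffix, {})[key] = value
    return jobs


def resolve_dependencies(jobs, new_ids):
    """Replace suffixes in dependency settings by the IDs of the new jobs.

    new_ids maps suffix -> ID of the job created for it (hypothetical).
    Values that are not known suffixes are kept unchanged, e.g. IDs of
    jobs created in previous API calls. The suffixes are dropped afterwards.
    """
    resolved = {}
    for suffix, settings in jobs.items():
        settings = dict(settings)
        for key in DEPENDENCY_KEYS:
            if key in settings:
                refs = settings[key].split(",")
                settings[key] = ",".join(str(new_ids.get(ref, ref)) for ref in refs)
        resolved[new_ids[suffix]] = settings
    return resolved
```

For the three arguments of the example, `group_by_suffix` yields two settings dicts, and `resolve_dependencies` rewrites `_PARALLEL_JOBS=ha_delta_supportserver` in the second one to the ID assigned to the first.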


¹ For instance, the clone-job script could for simplicity use the ID of the original job here. Then filling in the parameters _PARALLEL_JOBS, _START_AFTER_JOBS and _START_DIRECTLY_AFTER_JOBS is also trivial because the script would simply use the original job IDs here (instead of the new job IDs).
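A small Python sketch of what the clone script's side of this could look like, under the assumption that it already holds the original jobs in a dict keyed by job ID. The function name `build_clone_args` and the input structure are hypothetical and not taken from openqa-clone-job.

```python
def build_clone_args(original_jobs):
    """Build 'KEY:suffix=value' arguments, using original job IDs as suffixes.

    original_jobs: {job_id: {"settings": {...}, "_PARALLEL_JOBS": [ids], ...}}
    (hypothetical structure). Dependencies are expressed with the original
    IDs, so no knowledge of the new job IDs is needed.
    """
    dependency_keys = ("_PARALLEL_JOBS", "_START_AFTER_JOBS", "_START_DIRECTLY_AFTER_JOBS")
    args = []
    for job_id, job in original_jobs.items():
        for key, value in job["settings"].items():
            args.append(f"{key}:{job_id}={value}")
        for dep_key in dependency_keys:
            parents = job.get(dep_key, [])
            if parents:
                joined = ",".join(str(parent) for parent in parents)
                args.append(f"{dep_key}:{job_id}={joined}")
    return args
```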

#13 Updated by mkittler 3 months ago


If both PRs are accepted, AC3 is covered. The partial override mentioned in AC4 is still missing, though.

#14 Updated by okurz 2 months ago

https://github.com/os-autoinst/openQA/pull/4535 is merged. Please make sure to work on subtasks, not the epic itself.

#15 Updated by mkittler 2 months ago

I've been implementing the ACs for #95783 (which seemed to turn into a bigger task than the ACs mentioned in this epic).

#16 Updated by mkittler about 2 months ago

  • Description updated (diff)

I just wanted to create a subtask for AC4, but I'm actually not quite sure what's expected there.

#17 Updated by okurz about 2 months ago

  • Related to action #66071: TEST is overridden in parent job when doing `openqa-clone-custom-git-refspec` added
