coordination #103971
coordination #103962 (closed): [saga][epic] Easy multi-machine handling: MM-tests as first-class citizens
[epic] Easy *re*-triggering and cloning of multi-machine tests
Description
Acceptance criteria
- AC1: DONE¹: Multi-machine tests can be retriggered over webUI similar to single-machine tests
- AC2: DONE¹: Multi-machine tests can be retriggered over API similar to single-machine tests
- AC3: DONE: Multi-machine tests can be cloned using CLI tools²
- AC4: DONE³: When cloning multi-machine tests settings can be partially overwritten and customized
¹ I assume "retriggered" means pressing the restart button or using the restart API. That already works for most cases. https://github.com/os-autoinst/openQA/pull/4498 provides a pending tweak for the only case I can currently think of where restarting wouldn't treat the dependency cluster as expected (and that case isn't strictly related to MM tests).
² I assume "CLI tools" refers to the `openqa-clone-job` script, which allows cloning jobs from another openQA instance. That is now supported, see #103971#note-13.
Updated by okurz about 3 years ago
- Subject changed from Easy *re*-triggering of multi-machine tests to [epic] Easy *re*-triggering of multi-machine tests
Updated by okurz about 3 years ago
- Subject changed from [epic] Easy *re*-triggering of multi-machine tests to [epic] Easy *re*-triggering and cloning of multi-machine tests
Updated by mkittler about 3 years ago
- Assignee set to mkittler
- Target version changed from Ready to future
Updated by mkittler almost 3 years ago
- Description updated (diff)
I marked AC1 and AC2 as done and added some remarks.
Updated by mkittler almost 3 years ago
For AC4 we need to come up with a syntax to specify the job(s) to which settings should be applied. The syntax should be consistent between the clone-job script and the restart API.
Likely it makes sense to refer to the jobs by their `TEST` setting here, similar to how `PARALLEL_WITH` already uses `TEST` to refer to a job. Since we rarely use a `:` in job setting keys the following would make sense:
openqa-cli api -X POST jobs/16/restart SUPPORT_SERVER_ROLES:ha_delta_supportserver=ssh HA_CLUSTER_JOIN:ha_delta_node02=delta-node01override
The first setting would only be used for the job where `TEST=ha_delta_supportserver` and the second setting would only be used for the job where `TEST=ha_delta_node02`. Existing settings are overridden as usual. Existing settings in jobs not matching the specified `TEST` would be left unchanged.
This example uses the restart API but I assume the same should be made available for `openqa-clone-job`.
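To make the proposed matching rule concrete, here is a minimal Python sketch of how `KEY:TEST=value` arguments could be split and applied to the jobs of a dependency cluster. It is only an illustration of the idea above, not the openQA implementation: the function names `split_scoped_settings` and `apply_overrides` are hypothetical, jobs are modelled as plain dictionaries, and applying unscoped `KEY=value` arguments to every job is an assumption.

```python
def split_scoped_settings(args):
    """Split 'KEY:TEST=value' and 'KEY=value' arguments.

    Returns (scoped, unscoped), where scoped maps a TEST name to the
    settings meant only for that job.
    """
    scoped = {}
    unscoped = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep:
            raise ValueError(f"expected KEY=value, got {arg!r}")
        # Only the key part is searched for ':' so values may contain it.
        name, colon, test = key.partition(":")
        if colon:
            scoped.setdefault(test, {})[name] = value
        else:
            unscoped[name] = value
    return scoped, unscoped


def apply_overrides(jobs, args):
    """Return copies of the job settings with the overrides applied.

    Assumption: unscoped settings go to every job; scoped settings only
    go to the job whose TEST matches the suffix.
    """
    scoped, unscoped = split_scoped_settings(args)
    result = []
    for job in jobs:
        settings = dict(job)
        settings.update(unscoped)
        settings.update(scoped.get(job.get("TEST"), {}))
        result.append(settings)
    return result
```

With the two arguments from the command above, only the jobs with `TEST=ha_delta_supportserver` and `TEST=ha_delta_node02` change; any other job in the cluster keeps its settings.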
Updated by okurz almost 3 years ago
- Status changed from New to Blocked
this has one blocked subtask so keep it as blocked by subtasks for now
Updated by mkittler almost 3 years ago
Note that implementing AC3 and AC4 is not blocked by improving the investigate script. I assume you simply want to have the investigate script be tackled first (although it is likely not less complicated than implementing AC3 and AC4 as described in my previous comment).
Updated by okurz almost 3 years ago
mkittler wrote:
Note that implementing AC3 and AC4 is not blocked by improving the investigate script. I assume you simply want to have the investigate script be tackled first (although it is likely not less complicated than implementing AC3 and AC4 as described in my previous comment).
Well, yes, you can address these points separately. At any time you or anyone else can suggest further subtasks for this or other epics. At the time we only had that one. And given that our backlog is currently already full of important tasks and cool ideas, I suggest we still try to follow up with the tasks we have already defined, with the exception of trying something out and suddenly ending up having a feature developed and finished super quick regardless ;) So in short: Try it out, if it takes less than a day, no problem at all.
Updated by mkittler almost 3 years ago
I suppose for posting new jobs the syntax would indeed look similar to the restarting example mentioned in #103971#note-8:
openqa-cli api -X POST jobs SUPPORT_SERVER_ROLES:ha_delta_supportserver=ssh HA_CLUSTER_JOIN:ha_delta_node02=delta-node01override _PARALLEL_JOBS:ha_delta_node02=ha_delta_supportserver
However, there would be a semantic difference: In this case the suffixes like `ha_delta_supportserver` are not matched with existing `TEST` settings. (Obviously in the context of creating new jobs there are no existing jobs anyways.)
Instead, the suffixes are only used to distinguish which setting should go into which job and could be freely chosen by the caller¹. So this example would spawn two new jobs (two because there are two different suffixes), one with the setting `SUPPORT_SERVER_ROLES=ssh` and one with the setting `HA_CLUSTER_JOIN=delta-node01override`². The suffixes can also be used as value for `_PARALLEL_JOBS`, `_START_AFTER_JOBS` and `_START_DIRECTLY_AFTER_JOBS` to refer to a job which is created within the same API call. (So far these variables only allow specifying job IDs of jobs created in previous API calls.) The suffixes themselves aren't stored anywhere; their only purpose is to make sense of the API call.
I suppose it would actually be quite easy to use that way (e.g. within the clone-job script).
¹ For instance, the clone-job script could for simplicity use the ID of the original job here. Then populating the parameters `_PARALLEL_JOBS`, `_START_AFTER_JOBS` and `_START_DIRECTLY_AFTER_JOBS` is also trivial because the script would simply use the original job IDs here (instead of the new job IDs).
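The job-creation semantics described in this comment could be sketched roughly as follows in Python. This is an assumption-laden illustration rather than what the API does internally: `group_by_suffix` and `resolve_dependencies` are hypothetical names, the mapping from suffix to newly assigned job ID (`new_ids`) is supplied by the caller, and treating dependency values as comma-separated lists is an assumption.

```python
DEPENDENCY_KEYS = (
    "_PARALLEL_JOBS",
    "_START_AFTER_JOBS",
    "_START_DIRECTLY_AFTER_JOBS",
)


def group_by_suffix(args):
    """Group 'KEY:SUFFIX=value' arguments into one settings dict per suffix.

    Each distinct suffix stands for one job to be created.
    """
    jobs = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        name, colon, suffix = key.partition(":")
        if not sep or not colon:
            raise ValueError(f"expected KEY:SUFFIX=value, got {arg!r}")
        jobs.setdefault(suffix, {})[name] = value
    return jobs


def resolve_dependencies(jobs, new_ids):
    """Replace suffixes in dependency settings with the new job IDs.

    References that are not known suffixes (e.g. IDs of jobs created in
    previous API calls) are kept as they are. The suffixes themselves do
    not appear in the result.
    """
    resolved = {}
    for suffix, settings in jobs.items():
        settings = dict(settings)
        for key in DEPENDENCY_KEYS:
            if key in settings:
                refs = settings[key].split(",")
                settings[key] = ",".join(str(new_ids.get(r, r)) for r in refs)
        resolved[new_ids[suffix]] = settings
    return resolved
```

Applied to the three arguments from the command above, this yields two jobs, and the `_PARALLEL_JOBS` value of the second one points at whatever ID the first one received.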
Updated by mkittler almost 3 years ago
- PR for extending the API as described in the previous comment: https://github.com/os-autoinst/openQA/pull/4535
- PR for extending the clone-job script to use that API: https://github.com/os-autoinst/openQA/pull/4537
If both PRs are accepted, AC3 is covered. The partial override mentioned in AC4 is still missing, though.
Updated by okurz almost 3 years ago
https://github.com/os-autoinst/openQA/pull/4535 is merged. Please make sure to work on subtasks, not the epic itself.
Updated by mkittler almost 3 years ago
I've been implementing the ACs for #95783 (which seemed to turn into a bigger task than the ACs mentioned in this epic).
Updated by mkittler almost 3 years ago
- Description updated (diff)
I just wanted to create a subtask for AC4 but I'm actually not quite sure what's expected there.
Updated by okurz almost 3 years ago
- Related to action #66071: TEST is overridden in parent job when doing `openqa-clone-custom-git-refspec` added