action #169510

open

coordination #58184: [saga][epic][use case] full version control awareness within openQA

coordination #152847: [epic] version control awareness within openQA for test distributions

Improve non-transactional creation of Minion jobs for Git updates when restarting jobs size:M

Added by mkittler about 1 month ago. Updated 14 days ago.

Status:
Workable
Priority:
Normal
Assignee:
-
Category:
Feature requests
Start date:
2024-11-07
Due date:
% Done:

0%

Estimated time:

Description

Observation

When restarting jobs, we invoke `OpenQA::App->singleton->gru->enqueue_git_clones(\%clones, \@clone_ids) if keys %clones;` outside of the transaction that creates the new jobs. This is problematic because, for a moment, the new openQA jobs look as if they are not blocked by any Minion jobs, so the scheduler might assign them before the Git update is done.

See #169342#note-16 and notes referenced from there for further context. The short summary is that this is affecting restarted/cloned jobs in production and should therefore be fixed.

Note that after https://github.com/os-autoinst/openQA/pull/6049 has been merged, the only remaining impact is that jobs are assigned before the related Git updates are done. There should no longer be any bad consequences for parallel jobs.
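The window described above can be illustrated with a small in-memory model. This is only a sketch under assumptions, not openQA code: the `Job` class, the `schedulable` helper and the Minion job id `101` are all hypothetical stand-ins for the real job table, scheduler query and Gru task.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical stand-in for an openQA job row."""
    id: int
    state: str = "scheduled"
    # ids of pending Minion jobs that still block this job
    blocked_by: set = field(default_factory=set)


def schedulable(jobs):
    """Ids a scheduler tick would pick: 'scheduled' and no pending blockers."""
    return [job.id for job in jobs.values()
            if job.state == "scheduled" and not job.blocked_by]


jobs = {}

# Step 1: the restart creates the new job (committed on its own in this model).
jobs[2] = Job(id=2)

# A scheduler tick landing here sees an apparently unblocked job.
picked_in_window = schedulable(jobs)

# Step 2: only now is the Git update task enqueued and linked to the job.
jobs[2].blocked_by.add(101)  # 101 is a made-up Minion job id

picked_after = schedulable(jobs)
print(picked_in_window, picked_after)
```

In this model the job is eligible between step 1 and step 2 and blocked afterwards, which is the premature assignment the ticket is about.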

Acceptance criteria

  • AC1: A restarted job is guaranteed to run on the latest changes that have just been pushed to the test distribution's version control repository.
  • AC2: The scheduler does not assign restarted jobs to workers prematurely while those jobs are still waiting on pending Minion jobs.
  • AC3: Automatically triggered jobs (e.g. jobs triggered for new builds, as opposed to restarts) still start without significant delay caused by redundant updating.

Suggestions

  • The simplest solution would be to make the enqueuing part of the transactions in which we create the new jobs. This has already been implemented (see https://github.com/os-autoinst/openQA/pull/6048), but the solution might not be ideal; see the comments on that PR.
  • Introduce a new initial job state that comes before "scheduled" (the current initial job state), e.g. "preparing" or simply "new". The scheduler would ignore it because it only looks for "scheduled" jobs. Transitioning to "scheduled" only after the Minion jobs have been created would therefore close the gap. We need to consider error cases so that jobs are never left in the new initial state forever. This would also require adjusting (or at least double-checking) all code that deals with job states.
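The second suggestion could look roughly like the following model. It is a minimal sketch under assumptions, not the openQA implementation: the state name "preparing" comes from the suggestion, while `restart_job`, `EnqueueError`, the enqueue callbacks and the choice of "cancelled" as the fallback state on failure are hypothetical.

```python
PREPARING, SCHEDULED, CANCELLED = "preparing", "scheduled", "cancelled"


class EnqueueError(Exception):
    """Raised by the hypothetical enqueue step when no Git task can be created."""


def schedulable(jobs, blockers):
    """Ids a scheduler tick would pick: 'scheduled' and no pending blockers."""
    return sorted(job_id for job_id, state in jobs.items()
                  if state == SCHEDULED and not blockers.get(job_id))


def restart_job(jobs, blockers, job_id, enqueue):
    """Create the job as 'preparing' and promote it only once it is blocked."""
    jobs[job_id] = PREPARING  # invisible to the scheduler
    try:
        blockers[job_id] = {enqueue(job_id)}
    except EnqueueError:
        # Error case: never leave the job in the initial state forever.
        jobs[job_id] = CANCELLED
        return False
    jobs[job_id] = SCHEDULED  # the blocker is already in place at this point
    return True


jobs, blockers = {}, {}
seen_during_enqueue = []


def enqueue_ok(job_id):
    # Record what a scheduler tick would see while the job is still preparing.
    seen_during_enqueue.extend(schedulable(jobs, blockers))
    return 101  # made-up Minion job id


def enqueue_failing(job_id):
    raise EnqueueError("could not create Git update task")


restart_job(jobs, blockers, 2, enqueue_ok)
restart_job(jobs, blockers, 3, enqueue_failing)
print(jobs, seen_during_enqueue, schedulable(jobs, blockers))
```

Job 2 only becomes eligible once its blocker is removed, and job 3 ends in a terminal state instead of staying in "preparing". Which state a failed job should really end up in is an open design question that this sketch does not answer.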

Related issues 3 (2 open, 1 closed)

Related to openQA Project (public) - action #169342: Fix scheduling parallel clusters with `PARALLEL_ONE_HOST_ONLY=1` when the openQA jobs depend on Minion jobs e.g. `git_clone` tasks started for the `git_auto_update` feature size:M (Resolved, mkittler, 2024-11-05)

Blocks openQA Project (public) - action #168379: Enable automatic openQA git clone by default size:S (Blocked, mkittler, 2024-10-17)

Blocks openQA Infrastructure (public) - action #168376: Enable automatic openQA git clone instead of fetchneedles on OSD size:S (Blocked, mkittler)

Actions #1

Updated by mkittler about 1 month ago

  • Related to action #169342: Fix scheduling parallel clusters with `PARALLEL_ONE_HOST_ONLY=1` when the openQA jobs depend on Minion jobs e.g. `git_clone` tasks started for the `git_auto_update` feature size:M added
Actions #2

Updated by mkittler about 1 month ago

  • Description updated (diff)
Actions #3

Updated by okurz about 1 month ago

  • Target version set to future
Actions #4

Updated by mkittler about 1 month ago

  • Blocks action #168379: Enable automatic openQA git clone by default size:S added
Actions #5

Updated by okurz about 1 month ago

  • Target version changed from future to Ready

Moved to backlog to unblock #168379

Actions #6

Updated by okurz 29 days ago

  • Target version changed from Ready to Tools - Next
Actions #7

Updated by okurz 21 days ago

  • Target version changed from Tools - Next to Ready
Actions #8

Updated by mkittler 20 days ago

  • Blocks action #168376: Enable automatic openQA git clone instead of fetchneedles on OSD size:S added
Actions #9

Updated by okurz 16 days ago

  • Target version changed from Ready to Tools - Next
Actions #10

Updated by okurz 14 days ago

  • Parent task set to #152847
Actions #11

Updated by mkittler 14 days ago

  • Subject changed from Improve non-transactional creation of Minion jobs for Git updates when restarting jobs to Improve non-transactional creation of Minion jobs for Git updates when restarting jobs size:M
  • Description updated (diff)
  • Status changed from New to Workable