coordination #80908

closed

coordination #80142: [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containers on kubernetes

[epic] Continuous deployment (package upgrade or config update) without interrupting currently running openQA jobs

Added by okurz over 3 years ago. Updated almost 2 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Feature requests
Target version:
Start date: 2020-12-09
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)

Description

Motivation

We want to upgrade more often without disrupting running openQA jobs during package upgrades. We also want workers to re-read their configuration whenever a job finishes.

Acceptance criteria

  • AC1: DONE openQA worker packages can be upgraded continuously without interrupting currently running openQA jobs
  • AC2: DONE openQA workers read updated configuration, e.g. WORKER_CLASS, whenever they are ready to pick up new jobs
  • AC3: Both o3 and osd deploy automatically after every change if all relevant checks have passed
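
A minimal sketch of the idea behind AC2, not the actual openQA worker implementation: the worker re-reads its configuration right before picking up each job, so a changed WORKER_CLASS takes effect without a restart. The INI layout with a [global] section, the file name workers.ini and the function names read_worker_class, run_jobs and demo are assumptions made for illustration.

```python
import configparser
import os
import tempfile


def read_worker_class(config_path):
    """Return WORKER_CLASS from the [global] section, or None if it is unset."""
    parser = configparser.ConfigParser()
    parser.read(config_path)  # missing files are silently ignored
    return parser.get("global", "WORKER_CLASS", fallback=None)


def run_jobs(config_path, jobs, execute):
    """Run jobs one by one, re-reading the configuration before each pickup."""
    used = []
    for job in jobs:
        worker_class = read_worker_class(config_path)  # fresh read, no restart
        execute(job, worker_class)
        used.append((job, worker_class))
    return used


def demo():
    """Change WORKER_CLASS while the first job runs; the second job sees it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "workers.ini")  # hypothetical config file

        def write_config(value):
            with open(path, "w") as handle:
                handle.write(f"[global]\nWORKER_CLASS = {value}\n")

        def execute(job, worker_class):
            if job == "job-1":
                write_config("qemu_aarch64")  # simulated config update mid-job

        write_config("qemu_x86_64")
        return run_jobs(path, ["job-1", "job-2"], execute)
```

In this sketch the running job keeps the value it started with, and only the next pickup sees the update, which matches the "whenever they are ready to pick up new jobs" wording of AC2.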

Ideas

  • Use different git branches, e.g. "dev" or "main" and then "stable" or "tested" or "release" and create automatic merges by bots based on checks
  • Switch o3 workers to either deploy from worker containers which we update continuously or change the worker to allow non-transactional updates

Further details

One could try what Apache does with apache2ctl graceful or systemctl reload apache2, e.g. see https://elearning.wsldp.com/pcmagazine/apache-graceful-restart-centos-7/

The restart of openQA workers could simply be prevented or delayed, e.g. with SendSIGKILL= in the openQA worker systemd service definitions, which every openQA user is free to set. But then we could potentially wait hours until the service restarts, if it ever does. Maybe we can still add a "graceful-stop" mode: wait a reasonable time for all jobs to finish and then restart (or even reboot the host).
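
The "graceful-stop" idea could look roughly like the following sketch. It is not how the openQA worker is implemented: the class name GracefulWorker, its methods and the choice of SIGTERM as the trigger are assumptions for illustration. The signal handler only sets a flag, the job that is currently running always completes, and the loop exits before the next job is picked up.

```python
import signal


class GracefulWorker:
    """Hypothetical worker loop that stops only between jobs."""

    def __init__(self):
        self.stop_requested = False
        self.completed = []

    def request_stop(self, signum=None, frame=None):
        # Signal-handler compatible: only record the request, never abort a job.
        self.stop_requested = True

    def install_handler(self, signum=signal.SIGTERM):
        # Assumption: the service manager sends this signal to ask for a stop.
        signal.signal(signum, self.request_stop)

    def run(self, jobs, execute):
        for job in jobs:
            if self.stop_requested:
                break  # do not pick up new work after a stop request
            execute(job)  # the running job is always allowed to finish
            self.completed.append(job)
        return self.completed
```

A service manager would still need a stop timeout long enough for the longest job, which is the trade-off described above: the restart is delayed until the current job is done.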


Subtasks 11 (0 open, 11 closed)

  • action #80910: openQA workers read updated configuration, e.g. WORKER_CLASS, whenever they are ready to pick up new jobs (Resolved, mkittler, 2020-12-09)
  • action #80986: terminate worker process after executing all currently assigned jobs based on config/env variable (Resolved, mkittler, 2020-12-11)
  • openQA Infrastructure - action #81884: openqa-webui should automatically restart on config updates (Resolved, okurz, 2021-01-08)
  • action #89200: Switch OSD deployment to two-daily deployment (Resolved, mkittler, 2021-02-26)
  • action #90152: module results missing on quick job (on auto-restarting worker) (Resolved, mkittler, 2021-03-16)
  • action #104178: Increase OSD deployment rate from every second day to daily (Resolved, okurz, 2021-12-20)
  • action #104841: Prevent empty changelog messages from osd-deployment when there are no changes size:M (Resolved, mkittler, 2022-01-12)
  • action #105379: Continuous deployment of o3 workers - one worker first size:M (Resolved, mkittler, 2022-01-24)
  • action #105885: Continuous deployment of o3 workers - all the other o3 workers size:M (Resolved, mkittler)
  • action #111028: Continuous update of o3 webUI (Resolved, okurz, 2022-05-12)
  • action #111377: Continuous deployment of osd workers - similar as on o3 size:M (Rejected, okurz, 2022-05-20)