coordination #98472


coordination #80142: [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containers on kubernetes

[epic] Scale out: Disaster recovery deployments of existing openQA infrastructures

Added by okurz almost 3 years ago. Updated over 1 year ago.

Feature requests

We investigated the general feasibility of running an OSD clone in the cloud in #88341. SUSE plans to improve our disaster recovery capabilities. To follow up on the initial work, we should demonstrate in a real-world setup how to create an openQA instance that can work in common SUSE workflows, e.g. SUSE SLE maintenance update testing.

Acceptance criteria

  • AC1: An openQA infrastructure exists providing near-complete replication of any existing instance
  • AC2: The infrastructure is able to execute scheduled openQA tests (with reduced performance)
  • AC3: Emergency recovery guidelines exist that describe how to recreate the setup


  • Set up a cloud instance of openQA, at best using
  • Load the latest database dump of an existing instance
  • Connect the instance to other infrastructure components, e.g. similar to what o3 currently does
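
The database step above could be sketched roughly as follows. This is only an illustration and not the procedure used in the subtasks. It assumes the dump is a PostgreSQL custom-format dump, that the database is called `openqa` and is owned by a role `geekotest`, and that the dump sits at the hypothetical path `/tmp/openqa.dump`. The helper `restore_commands` is made up for this example. It only builds and prints the commands so they can be reviewed before anyone runs them on the new instance.

```python
import shlex


def restore_commands(dump_path, db_name="openqa", owner="geekotest"):
    """Build the commands for restoring a database dump (dry run).

    Database name, owner role and dump format are assumptions,
    adjust them to the instance that is being replicated.
    """
    return [
        # Start from a clean database on the new instance
        ["dropdb", "--if-exists", db_name],
        ["createdb", "-O", owner, db_name],
        # Restore the dump, assigning all objects to the owner role
        ["pg_restore", "--no-owner", "--role", owner, "-d", db_name, dump_path],
    ]


if __name__ == "__main__":
    # Hypothetical dump location, print the commands instead of running them
    for cmd in restore_commands("/tmp/openqa.dump"):
        print(shlex.join(cmd))
```

Printing instead of executing keeps the sketch safe to try anywhere. On a real instance the commands would likely need to run as the PostgreSQL superuser, with the openQA services stopped.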

Out of scope

  • Recreating all former test results (subset ok)
  • "Exotic" worker infrastructure, e.g. powerVM, s390x z/VM, hyperv, vmware, is not needed


Related issues 1 (1 open, 0 closed)

Copied to openQA Project - coordination #127040: [epic] Scale out: Easier and automated disaster recovery deployments of openQA (New, 2022-12-01)

Actions #1

Updated by okurz almost 3 years ago

  • Status changed from New to Blocked
  • Assignee set to okurz

With multiple subtasks created in the weekly estimation meeting, I can track the epic as blocked by the specific subtasks.

Actions #2

Updated by okurz over 2 years ago

The result from #98469 is documented in . Every member of the team should have AWS credentials that we can use to create and configure instances. We have tried it out mostly together so I am confident that we can continue with the other stories in this epic.

Actions #4

Updated by okurz over 1 year ago

  • Copied to coordination #127040: [epic] Scale out: Easier and automated disaster recovery deployments of openQA added

Actions #5

Updated by okurz over 1 year ago

  • Status changed from Blocked to Resolved

All subtasks are resolved and, very strictly speaking, the ACs are fulfilled, also based on the docs in . The rest, e.g. terraform, has moved to a new future epic.

