coordination #121720

closed

[saga][epic] Migration to QE setup in PRG2+NUE3 while ensuring availability

Added by okurz almost 2 years ago. Updated 5 months ago.

Status: Resolved
Priority: Normal
Assignee:
Target version:
Start date: 2018-06-29
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Tags:

Description

Motivation

SUSE is deprecating NUE1 (Maxtorhof) and setting up a Prague co-location datacenter, "Prg CoLo" or "DC7", as the primary location, in particular for serving public services. This includes what we have so far served from VM clusters managed by EngInfra, in particular the openqa.opensuse.org infrastructure and likely also openqa.suse.de. Put differently: everything that is currently served from NUE1-SRV1. We must participate in the planning and setup, and accordingly in a migration, until we can provide our services from Prg CoLo and no longer rely on NUE1-SRV1, except for the purpose of an optional fail-over datacenter in Nbg.
SUSE is deprecating NUE1 (Maxtorhof) and setting up replacement data centers. Additionally, a new datacenter is planned as a fail-over location.

Acceptance criteria

  • AC1: SUSE QE Tools services are provided out of Prg CoLo #123800
  • AC2: NUE1 (Maxtorhof) is not relied upon by SUSE QE Tools anymore and has been evacuated by us #129280
  • AC3: Relevant SUSE QE Tools services are provided out of NUE3 #130955

Further details

Coordination chat room #dct-migration


Subtasks 158 (0 open, 158 closed)

coordination #116623: [epic] Migration of SUSE Nbg based openQA+QA+QAM systems to new security zones | Resolved | okurz | 2022-09-14
action #116626: Migration of SUSE QA systems to new security zones - QAM systems | Resolved | okurz | 2022-09-15
action #116629: Preparation planning for migration of SUSE openQA+QA systems to new security zones size:M | Resolved | okurz | 2022-09-15
openQA Infrastructure - action #116689: Do not rely on statically configured IPv4 addresses for the salt master in /etc/hosts size:S | Resolved | okurz | 2022-09-14
action #117043: Request DHCP+DNS services for new QE network zones, same as already provided for .qam.suse.de and .qa.suse.cz | Resolved | okurz
action #119443: Conduct the migration of SUSE openQA systems from Nbg SRV1 to new security zones size:M | Resolved | okurz | 2022-11-17
action #119446: Conduct the migration of SUSE openQA+QA systems from Nbg SRV2 to new security zones | Resolved | okurz | 2022-09-15
action #119449: Conduct the migration of SUSE openQA+QA systems from Nbg QA labs to new security zones | Resolved | okurz | 2022-09-15
action #119638: Ensure every physical machine within .qam.suse.de has an IPMI+eth L2 address entry in racktables size:M | Resolved | okurz
openQA Infrastructure - action #120025: [openQA][ipmi][worker] Worker host hostname changed and broken networking connection | Resolved | okurz | 2022-11-07
openQA Infrastructure - action #120163: Use salt grains instead of manually specifying IPs in "bridge_ip" size:M | Resolved | mkittler
action #120264: Conduct the migration of SUSE QA systems (non-tools-team maintained) from Nbg SRV1 to new security zones size:M | Resolved | okurz | 2022-09-15
action #120267: Conduct the migration of openqa-ses aka. "storage.qa.suse.de" size:M | Resolved | mkittler | 2022-09-15
openQA Infrastructure - action #120270: Conduct the migration of SUSE openQA systems IPMI from Nbg SRV1 to new security zones size:M | Resolved | mkittler
openQA Tests - action #120288: [tools] cloud based tests fail due to traffic to cloud blocked auto_review:"2022-11-0.*Test died: (Waiting for Godot.*ssh|Cannot find image after upload)":retry | Resolved | okurz | 2022-11-10
openQA Project - action #120333: [os-autoinst][ipmi] Add support for ssh jump host in IPMI backend | Rejected | okurz | 2022-11-11
openQA Infrastructure - action #120339: QEMU DNS fails to resolve openqa.suse.de via IP address | Resolved | okurz | 2022-11-11
openQA Infrastructure - action #120441: OSD parallel jobs failed with "get_job_autoinst_url: No worker info for job xxx available" size:meow | Resolved | okurz | 2022-11-15
openQA Tests - action #120789: [virtualization] tests fail to upload to qadb on dbproxy.suse.de with "Access denied, this account is locked" | Resolved
openQA Infrastructure - action #120807: [alert] openqa.suse.de - worker12.oqa.suse.de 100% packet loss due to outdated AAAA record | Resolved | okurz | 2022-11-17
openQA Project - coordination #122650: [epic] Fix firewall block and improve error reporting when test fails in curl log upload | Resolved | okurz | 2022-12-29
openQA Tests - action #122539: test fails in curl log from openqa and connect with FQDN worker2.oqa.suse.de always fails by time out size:M | Closed | 2022-12-29
openQA Project - action #122608: exit code of shell command not received by script_run | Resolved | okurz | 2023-01-02
openQA Infrastructure - action #122653: Ask SUSE-IT network admins to REJECT packets instead of DROP so that we get more clear results size:S | Rejected | okurz | 2023-01-03
openQA Infrastructure - action #122656: Ask SUSE-IT network admins to *not* block this traffic which we need for tests regarding s390x within SUSE network size:M | Resolved | okurz | 2023-01-03
openQA Project - action #122659: Improved error reporting in openQA tests when curl times out on connection attempts | Rejected | okurz | 2023-01-03
action #123697: Conduct the migration of SUSE QA systems s390x zVM instances to new security zones size:M | Resolved | okurz | 2022-09-15
openQA Infrastructure - action #124119: Conduct the migration of remaining SUSE openQA systems IPMI to new security zones | Resolved | okurz | 2023-02-08
openQA Infrastructure - action #124715: Failing pipelines because of unreachable machine openqaworker-arm-1 | Rejected | 2023-02-08
coordination #124721: [epic] Ensure proper QE maintainership of Nbg QAM machines | Resolved | okurz | 2023-02-17
action #124724: Ensure Nbg QAM machines have a current maintainer as "contact person" size:S | Resolved | okurz | 2023-02-17
action #125144: Give members of SUSE QE Tools team a chance to get familiar with Nbg QAM machines size:M | Resolved | okurz | 2023-02-17
action #125234: Decommission obsolete machines in qam.suse.de size:M | Resolved | okurz | 2023-03-01
openQA Infrastructure - action #124877: Failing pipelines because of unreachable machine openqaworker-arm-1 | Resolved | mkittler | 2023-02-08
action #107731: Salt all SUSE QA machines, at least passwords and ssh keys and automatic upgrading size:M | Resolved | okurz | 2022-03-01
openQA Infrastructure - action #125534: Consolidate the installation of openqaw5-xen with SUSE QE Tools maintained machines size:M | Resolved | okurz | 2023-03-07
openQA Infrastructure - action #125750: In salt-states-openqa support machines requiring ssh password login for root user size:M | Resolved | osukup
openQA Infrastructure - action #151390: Brute-force salt osiris so that we enable self-management of VMs for users size:M | Resolved | mkittler | 2023-11-24
openQA Infrastructure - action #151396: After osiris is now in salt decide about the fate of seth | Resolved | okurz
coordination #123800: [epic] Provide SUSE QE Tools services running in PRG2 aka. Prg CoLo | Resolved | okurz | 2021-10-06
openQA Project - action #117553: multiple people can not access openqa.suse.de but can access openqa.nue.suse.com, we should clarify the difference and maybe change our wording | Resolved | okurz | 2022-10-04
openQA Infrastructure - action #132134: Setup new PRG2 multi-machine openQA worker for o3 size:M | Resolved | dheidler | 2023-06-29
openQA Infrastructure - action #132137: Setup new PRG2 openQA worker for osd size:M | Resolved | mkittler | 2023-06-29
action #132140: Support move of PowerPC machines to PRG2 size:M | Resolved | okurz | 2023-06-29
openQA Infrastructure - action #132143: Migration of o3 VM to PRG2 - 2023-07-19 size:M | Resolved | nicksinger | 2023-06-29
action #132146: Support migration of osd VM to PRG2 - 2023-08-29 size:M | Resolved | mkittler | 2023-06-29
action #132158: Ensure that osd can work without relying on any physical machine in NUE1 size:M | Resolved | okurz | 2023-06-29
openQA Infrastructure - action #132461: manage tls certificates on o3/ariel directly with dehydrated size:M | Resolved | nicksinger | 2023-07-07
openQA Infrastructure - action #132647: Migration of o3 VM to PRG2 - bare-metal tests size:M | Resolved | okurz
openQA Infrastructure - action #133160: Setup a modern UEFI httpboot setup on o3 with dnsmasq size:M | Resolved | dheidler | 2023-07-21
openQA Infrastructure - action #133181: Migration of o3 VM to PRG2 - Fix https://openqa.opensuse.org/snapshot-changes/opensuse/Tumbleweed/ | Resolved | okurz
openQA Infrastructure - action #133358: Migration of o3 VM to PRG2 - Ensure IPv6 is fully working | Resolved | okurz
openQA Infrastructure - action #133364: Migration of o3 VM to PRG2 - Decommission old-ariel in NUE1 as soon as we do not need it anymore | Resolved | okurz
openQA Infrastructure - action #133475: Migration of o3 VM to PRG2 - connection to rabbit.opensuse.org | Resolved | mkittler
openQA Infrastructure - action #133490: Migration of o3 VM to PRG2 - Fix o3 bare metal hosts iPXE booting size:M | Resolved | dheidler
openQA Infrastructure - action #134123: Setup new PRG2 openQA worker for o3 - two new arm workers size:M | Resolved | nicksinger
openQA Infrastructure - action #134822: Migration of osd VM to PRG2 - Decommission old-osd in NUE1 as soon as we do not need it anymore size:M | Resolved | okurz | 2023-08-30
openQA Project - action #134837: SLE test repo not updated on OSD, cron service was not running since 2023-08-29, fetchneedles not called size:M | Resolved | livdywan
openQA Infrastructure - action #134879: reverse DNS resolution PTR for openqa.oqa.prg2.suse.org. yields "3(NXDOMAIN)" for PRG1 workers (NUE1+PRG2 are fine) size:M | Resolved | okurz | 2023-08-31
openQA Infrastructure - action #134900: salt states fail to apply due to "Pillar openqa.oqa.prg2.suse.org.key does not exist" | Resolved | nicksinger | 2023-08-31
openQA Infrastructure - action #134912: Gradually phase out NUE1 based openQA workers size:M | Resolved | okurz
openQA Infrastructure - action #135191: Migration of o3 VM to PRG2 - Use direct zabbix connection size:M | Resolved | dheidler
openQA Infrastructure - action #137408: Support move of s390x mainframe(s) to PRG2 - o3 size:M | Resolved | mgriessmeier | 2023-06-29
coordination #137630: [epic] QE (non-openQA) setup in PRG2 | Resolved | okurz | 2023-09-20
action #138356: Migration of qam.suse.de to PRG2 size:M | Resolved | okurz | 2023-10-23
action #139130: Migration of openqa-service to PRG2 size:M | Resolved | okurz
action #153721: Move of LSG QE non-openQA PowerPC machine NUE1 to PRG2 - legolas | Resolved | okurz | 2024-01-16
action #153724: Move of LSG QE non-openQA PowerPC machine NUE1 to PRG2 - blackcurrant | Resolved | okurz | 2024-01-16
action #153730: Move of LSG QE non-openQA PowerPC machine NUE1 to PRG2 - huckleberry | Resolved | okurz | 2024-01-16
action #153733: Move of LSG QE non-openQA PowerPC machine NUE1 to PRG2 - soapberry size:S | Resolved | okurz | 2024-01-16
action #153739: Move of openqa.opensuse.org machine NUE1 to PRG2 - blackbauhinia | Resolved | okurz | 2024-01-16
action #153742: Move of OSD machine NUE1 to PRG2 - storage.qe.prg2.suse.org | Resolved | okurz | 2024-01-16
action #153796: Prepare DHCP/DNS for qe.prg2.suse.org based on former qa.suse.de entries size:M | Resolved | nicksinger | 2024-01-17
action #153799: Prepare DHCP/DNS for machines coming to qe.prg2.suse.org based on former qam.suse.de entries size:M | Resolved | mkittler | 2024-01-17
action #153802: Obsolete/remove former qam.suse.de DHCP/DNS davinci configuration or references size:M | Resolved | ybonatakis | 2024-01-17
action #154447: Move of LSG QE non-openQA PowerPC machine NUE1 to PRG2 - gollum | Resolved | okurz
action #159306: Fix AAAA records in qe.prg2.suse.org size:S | Resolved | okurz | 2024-01-16
openQA Infrastructure - action #161756: IPMI access over IPv6 doesn't work on blackbauhinia size:S | Resolved | mkittler | 2024-04-24
action #139109: Support move of non-openQA PowerPC machines to PRG2, i.e. haldir, legolas, whale, blackcurrant, cloudberry, huckleberry, soapberry, nessberry | Resolved | okurz | 2023-06-29
action #139112: Ensure OSD openQA PowerPC machine grenache is operational from PRG2 | Resolved | nicksinger | 2023-06-29
action #139115: Ensure o3 openQA PowerPC machine qa-power8-3 is operational from PRG2 size:M | Resolved | nicksinger | 2023-06-29
action #139199: Ensure OSD openQA PowerPC machine redcurrant is operational from PRG2 size:M | Resolved | nicksinger | 2023-06-29
openQA Infrastructure - action #150815: unable to login over ssh to o3 (gate.opensuse.org:2214) size:M | Rejected | okurz | 2023-11-13
openQA Infrastructure - action #150956: o3 cannot send e-mails via smtp relay size:M | Resolved | okurz | 2023-11-16
openQA Infrastructure - action #157243: Update HMC with vMF68994 | Resolved | okurz | 2024-03-14
action #159231: Bring back worker class "hmc_ppc64le-4disk" on redcurrant or another machine size:M | Resolved | nicksinger
openQA Infrastructure - action #161318: Ensure we have a consistent racktables entry for OSD | Resolved | okurz | 2024-05-31
coordination #129280: [epic] Move from SUSE NUE1 (Maxtorhof) to new NBG Datacenters | Resolved | okurz | 2018-06-29
coordination #37910: [tools][epic] Migration of or away from qanet.qa.suse.de | Resolved | okurz | 2018-06-29
action #38012: [tools][labs][medium] Setup DHCPv6 and DNS AAAA records for VLAN12 | Rejected | okurz | 2018-06-30
action #38018: [labs][tools] Setup new qanet | Rejected | okurz | 2018-06-29
action #81192: [tools] Migrate (upgrade or replace) qanet.qa.suse.de to a supported, current OS size:M | Resolved | okurz | 2020-12-18
action #81200: [tools][labs] some partitions on qanet are 100% full, seems like /data/backups has no new archives since 20201009 due to that | Resolved | okurz | 2020-12-18
openQA Infrastructure - action #134051: Eng-Infra maintained DNS server for .qa.suse.de taking over from qanet size:M | Resolved | dheidler | 2023-08-09
action #124221: Repurpose quake.qe.nue2.suse.org (formerly known as cloud4) as employee-workstation replacement size:M | Resolved | okurz | 2023-02-09
action #129283: [tools] Help Needed: Active Inventory of Maxtorhof SRV1/SRV2/SRV2X | Resolved | okurz | 2023-05-15
action #130796: Use free blades on quake.qe.nue2.suse.org and unreal.qe.nue2.suse.org as openQA OSD bare-metal test machines | Resolved | okurz | 2023-02-09
openQA Infrastructure - coordination #131519: [epic] Additional redundancy for OSD virtualization testing | Resolved | okurz | 2023-02-09
openQA Infrastructure - action #131549: [spike][timeboxed:20h] Additional redundancy for OSD virtualization testing - Hyperv 2016 worker host size:M | Resolved | nanzhang | 2023-06-28
openQA Infrastructure - action #133247: Additional redundancy for OSD virtualization testing - Hyperv 2019 and 2022 (or 2012r2) worker host size:M | Resolved | rcai | 2023-07-25
openQA Infrastructure - action #133367: Evaluate if we have hardware alternatives for Windows Server 2016+ testing | Resolved | okurz | 2023-07-26
openQA Infrastructure - action #137306: Check unreal6 cabling, SP and system not reachable over network size:M | Resolved | okurz | 2023-10-02
openQA Infrastructure - action #138350: worker31 and likely more OSD machines get stuck on boot in grub command line | Resolved | dheidler | 2023-06-28
action #133706: Setup of former QAM machines from NUE1-SRV2 in FC Basement | Resolved | okurz | 2023-08-02
action #133748: Move of openqaworker-arm-1 to FC Basement size:M | Resolved | ybonatakis
openQA Infrastructure - action #134087: Fix ix64ph1075 bare metal openQA test size:M | Resolved | dheidler | 2023-08-10
openQA Infrastructure - action #134132: Bare-metal control openQA worker in NUE2 size:M | Resolved | okurz
openQA Infrastructure - action #136133: Migrate aarch64.openqanet.opensuse.org to FC Basement size:M | Resolved | dheidler
action #137144: Ensure that we have less or no workstation left clogging our FC Basement space size:M | Resolved | okurz
action #156250: Ensure IPv6 works in qe.nue2.suse.org | Resolved | okurz | 2024-02-28
action #157753: Bring back automatic recovery for openqaworker-arm-1 size:M | Resolved | ybonatakis
action #157819: Can't login to walter1 and walter2 offline | Resolved | okurz | 2024-03-24
openQA Infrastructure - action #158020: salt-states-openqa pipeline times out | Resolved | okurz | 2024-03-26
openQA Infrastructure - action #159270: openqaworker-arm-1 is Unreachable size:S | Resolved | ybonatakis | 2024-04-19
openQA Infrastructure - action #159555: IPMI access over IPv6 doesn't work on imagetester - try to update BIOS with physical access size:S | Resolved | okurz | 2024-04-24
coordination #130955: [epic] Migration out of SUSE NUE1 - QE setup in NUE3 | Resolved | okurz | 2021-06-28
openQA Infrastructure - action #94765: Bring openqaworker12 into production (w/o multi-machine test support) size:M | Rejected | okurz
openQA Infrastructure - action #94783: Bring openqaworker11 into production including multi-machine test support (same as w12) size:M | Rejected | 2021-06-28
openQA Infrastructure - action #95167: Bring openqaworker12 into production including multi-machine test support | Rejected | okurz | 2021-07-07
action #131144: Decide about all LSG QE machines in NUE1 size:M | Resolved | okurz | 2023-06-20
action #132620: Move of selected LSG QE machines NUE1 to NUE3 size:M | Resolved | okurz | 2023-07-12
action #132623: Decommissioning of selected selected LSQ QE machines from NUE1-SRV2 | Resolved | okurz | 2023-07-12
openQA Infrastructure - action #132671: Ensure everybody in SUSE QE Tools knows how to access netbox size:M | Resolved | livdywan | 2023-07-07
coordination #131525: [epic] Up-to-date and usable LSG QE NUE1 machines | Resolved | okurz | 2023-06-28
action #131528: Bring backup.qam.suse.de up-to-date size:M | Resolved | okurz | 2023-06-28
action #132320: Bring styx.qam.suse.de up-to-date | Resolved | okurz | 2023-06-28
action #132323: Bring arm4.qe.suse.de up-to-date | Resolved | okurz | 2023-06-28
action #132347: Bring borg.qam.suse.de up-to-date | Resolved | okurz | 2023-06-28
action #132353: Bring enterprise-nx02.qam.suse.de up-to-date size:M | Resolved | okurz | 2023-06-28
action #132356: Bring fibonacci.qam.suse.de up-to-date | Resolved | okurz | 2023-06-28
action #132359: Bring galileo.qam.suse.de up-to-date size:M | Resolved | okurz | 2023-06-28
action #132362: Bring openqa-service.qe.suse.de up-to-date | Resolved | okurz | 2023-06-28
action #132452: Bring seth+osiris up-to-date | Resolved | okurz | 2023-06-28
openQA Infrastructure - action #134453: backup.qam.suse.de is Failed according to netbox and not creating backups size:M | Resolved | mkittler
openQA Infrastructure - action #134489: backup.qa.suse.de does not create backups | Resolved | tinita | 2023-08-22
openQA Infrastructure - action #134519: We were not notified that backup.qa.suse.de did not create backups size:M | Resolved | livdywan | 2023-08-23
coordination #153685: [epic] Move from SUSE NUE1 (Maxtorhof) to PRG2e | Resolved | okurz | 2024-01-16
action #132617: Move of selected LSG QE machines NUE1 to PRG2e size:M | Resolved | okurz
action #153670: Move of selected LSG QE machines NUE1 to PRG2e - fozzie size:M | Resolved | dheidler
action #153673: Move of selected LSG QE machines NUE1 to PRG2e - orion | Resolved | okurz | 2024-01-16
action #153679: Move of selected LSG QE machines NUE1 to PRG2e - andromeda | Resolved | okurz | 2024-01-16
action #153682: Move of selected LSG QE machines NUE1 to PRG2e - quinn size:M | Resolved | dheidler | 2024-01-16
action #153688: Move of selected LSG QE machines NUE1 to PRG2e - openqaw9-hyperv | Resolved | okurz | 2024-01-16
action #153691: Move of selected LSG QE machines NUE1 to PRG2e - openqaw5-xen | Resolved | okurz | 2024-01-16
action #153694: Move of selected LSG QE machines NUE1 to PRG2e - fibonacci | Resolved | okurz | 2024-01-16
action #153697: Move of selected LSG QE machines NUE1 to PRG2e - sauron | Resolved | okurz | 2024-01-16
action #153700: Move of selected LSG QE machines NUE1 to PRG2e - arm4 | Resolved | okurz | 2024-01-16
action #153703: Move of selected LSG QE machines NUE1 to PRG2e - voyager | Resolved | okurz | 2024-01-16
action #153706: Move of selected LSG QE machines NUE1 to PRG2 - amd-zen2-gpu-sut1 size:M | Resolved | nicksinger | 2024-01-16
action #153709: Move of selected LSG QE machines NUE1 to PRG2e - ada size:M | Resolved | okurz | 2024-01-16
action #153715: Move of LSG QE non-openQA PowerPC machine NUE1 to PRG2 - whale | Resolved | okurz | 2024-01-16
action #153784: Move of selected LSG QE machines NUE1 to PRG2e - openqaworker19 | Resolved | okurz | 2024-01-16
action #153787: Move of selected LSG QE machines NUE1 to PRG2e - openqaworker20 size:M | Resolved | okurz | 2024-01-16
action #154450: Move of selected LSG QE machines NUE1 to PRG2e - openqaw7-hyperv | Resolved | okurz | 2024-01-16
action #154453: Move of selected LSG QE machines NUE1 to PRG2e - openqaw8-vmware | Resolved | okurz | 2024-01-16
openQA Infrastructure - action #155824: Support IPv6 SLAAC in our infrastructure size:M | Resolved | nicksinger | 2024-02-22
coordination #154438: [epic] Move from SUSE NUE1 (Maxtorhof) to PRG+PRG2e - handling of overlooked machines | Resolved | okurz | 2024-01-29
coordination #154444: Ensure SAP QA machines in PRG2 J06 are usable | Resolved | okurz | 2024-01-29

Actions #2

Updated by okurz almost 2 years ago

  • Priority changed from Normal to Low
  • Target version changed from Ready to future

Had a meeting with EngInfra TL mflores on 2022-12-07. Prg CoLo will start migrating services in 2023-03: bugzilla, gitlab, virtualization clusters. s390 and PowerPC will be moved as well, likely in 2023-05. They should be offline for some days and then usable again after setup in Prg CoLo. x86_64+aarch64 is ordered as new. The new Nbg DC will also be set up in that time. 40 racks for everything from NUE1 that does not fit into or move to FC labs. Monitoring: Prg CoLo will have switches and firewalls. They shall be configured as IaC, maybe with salt or terraform. Monitoring is planned after that, but I consider it doubtful that this will work out.

2022-12-08: Decided with mgriessmeier, nsinger, mflores to order 4x ARM machines for Prg CoLo to have redundancy for each o3+osd, i.e. 2xARM@o3, 2xARM@osd

Right now waiting for DC being ready for us to use or waiting for any pending questions

Actions #3

Updated by okurz almost 2 years ago

  • Description updated (diff)
  • Status changed from New to Blocked
  • Target version changed from future to Ready

-> subtasks

Actions #4

Updated by okurz almost 2 years ago

  • Target version changed from Ready to future

I would like to track this outside our current backlog as we don't need to conduct that much work now.

Actions #5

Updated by okurz over 1 year ago

  • Tags set to infra
Actions #6

Updated by okurz over 1 year ago

Actions #7

Updated by okurz over 1 year ago

  • Target version changed from future to Ready
Actions #8

Updated by okurz over 1 year ago

  • Description updated (diff)
Actions #9

Updated by okurz over 1 year ago

  • Subject changed from [saga][epic] QE setup in Prg CoLo to [saga][epic] QE setup in PRG2 aka. Prg CoLo
Actions #10

Updated by okurz over 1 year ago

  • Subject changed from [saga][epic] QE setup in PRG2 aka. Prg CoLo to [saga][epic] QE setup in PRG2+NUE3
  • Description updated (diff)

Combining #121720 and #130955 as there is too much overlap

I wrote a message in https://mailman.suse.de/mailman/private/qa-team/2023-June/005988.html

Hi all,
Be advised that there are plans to fully empty the old Nuremberg
datacenter at the old office location "Maxtorhof" aka. NUE1 until end of
this year. This means moving services and machines to other locations or
decommissioning services or machines that are not needed anymore.
The SUSE QE Tools team will organize, execute and lead any necessary
actions concerning LSG QE services and machines as far as we know of.

How will you be impacted by this? In the best case you will only see
short outages of services during the actual migrations. Maybe you will
need to reach specific machines by new domains (FQDNs). Likely over the
next weeks and months individual services will have outages and
performance degradations. In the worst case critical machines that no
one considered will be lost and services need to be recovered with
careful and tricky reverse engineering. Good planning and reviews of
plans can mitigate that risk :)

According to current plans we want to set up new openQA workers in the
following weeks and the service openqa.suse.de and the according virtual
machine will move to PRG2 (the new Prague datacenter) on 2023-07-17.
Expect an outage on https://openqa.nue.suse.com and
https://openqa.suse.de on that day.

The equivalent migration will be conducted for
https://openqa.opensuse.org at beginning of 2023-08.

Find more details in
https://progress.opensuse.org/issues/121720

Have fun,
Oliver

and an according copy in https://suse.slack.com/archives/C02CANHLANP/p1687787065732719

Actions #12

Updated by okurz over 1 year ago

  • Project changed from 46 to QA
  • Category deleted (Infrastructure)
Actions #13

Updated by okurz about 1 year ago

  • Description updated (diff)
Actions #14

Updated by okurz about 1 year ago

  • Subtask #129280 added
Actions #15

Updated by okurz about 1 year ago

  • Subtask #116623 added
Actions #16

Updated by okurz about 1 year ago

  • Subtask #138476 added
Actions #17

Updated by okurz 12 months ago

  • Subtask #118636 added
Actions #19

Updated by okurz 10 months ago

  • Subtask deleted (#95167)
Actions #20

Updated by okurz 10 months ago

  • Subtask deleted (#94783)
Actions #21

Updated by okurz 10 months ago

  • Subtask deleted (#94765)
Actions #22

Updated by okurz 10 months ago

  • Subtask #153685 added
Actions #23

Updated by okurz 10 months ago

  • Subtask #154438 added
Actions #24

Updated by okurz 10 months ago

  • Subtask deleted (#121726)
Actions #25

Updated by okurz 9 months ago

  • Subject changed from [saga][epic] QE setup in PRG2+NUE3 to [saga][epic] Migration to QE setup in PRG2+NUE3 while ensuring high availability
Actions #26

Updated by okurz 9 months ago

  • Subject changed from [saga][epic] Migration to QE setup in PRG2+NUE3 while ensuring high availability to [saga][epic] Migration to QE setup in PRG2+NUE3 while ensuring availability
Actions #27

Updated by okurz 6 months ago

  • Subtask deleted (#138476)
Actions #28

Updated by okurz 5 months ago · Edited

  • Status changed from Blocked to Resolved

All remaining tasks done \o/

After more than 2.5 years starting with #100455 we concluded the work for the big multi-datacenter migration. With the great work of the QE Tools team members and with all your enduring patience we came to that achievement :)

We already have new ongoing tasks based on new hardware as well as older hardware that can be repurposed for better use. So back to work everyone! :)

The saga included 158 subtasks, pre-planning tasks, planning tasks, coordination tasks as well as actual hands-on work. Overall we migrated/decommissioned/reinstalled/repurposed 100+ physical machines plus many more virtual ones. All of this covered much more than just openQA, namely a variety of QE machines as well as old QA SLE and QAM machines. We helped to clean out and fully decommission 7 server rooms or labs, all while ensuring operability as much as possible. For some hosts, like PowerPC, the downtime was actually longer than a year for a variety of reasons, but for the most critical services the accumulated downtime was in the range of hours.
