action #168901

closed

coordination #168895: [saga][epic][infra] Support SUSE PRG office move while ensuring business continuity

coordination #168898: [epic][infra] Support SUSE PRG office datacenter "PRG1" move while ensuring business continuity

Support SUSE PRG office datacenter "PRG1" move to a new location "PRG3" while ensuring business continuity - pre-planning size:M

Added by okurz 4 months ago. Updated about 1 month ago.

Status: Resolved
Priority: High
Category: Feature requests
Start date: 2024-10-25
Due date:
% Done: 0%
Estimated time:

Description

Motivation

Vit Pelcak informed me that he is invited "to discuss requirements for server room for the new Prague office." We will support the planning process.

Vit wrote:

So far I got these requirements:
Move all the machines
Working PXE, IPMI...
Move in time not colliding with the product testing/release (@Jan Stehlík suggested Feb/Mar)
Move in phases so that we don't have all the machines offline at the same time

My addition: Do we actually need machines in a new office server room, or could we just move them to PRG2e? Nobody from the QE tools team commonly visits the server rooms in either PRG1 or PRG2, so it wouldn't make a difference. There should be IPv6.

Note: void.qam.suse.cz (aka openqa.qam.suse.cz) with its workers is currently in PRG1.

Acceptance Criteria

  • AC1: Assets of QE Tools inside PRG1 are known
  • AC2: Racktable entries for the identified assets are up-to-date
  • AC3: We know if we want/need machines in the new PRG site "PRG3"
  • AC4: For every identified asset, the future of the asset is decided (i.e. to be discarded, moved, or replaced with a new one)

Suggestions

  • Be aware of #170458
  • Review the current state in racktables and update entries with obviously outdated or incomplete information, e.g. contact persons who have left the company or missing MAC addresses for machines
  • For unclear racktables entries, talk to contact persons / machine owners / loaners
  • Identify groups of machines and, for every identified asset, decide about the future of the asset, i.e. to be discarded, moved, or replaced with a new one
  • Look at https://confluence.suse.com/display/~ewalker/Prague+%28in+office%29+Server+Room+requirements+for+Prague+teams
Actions #1

Updated by okurz 4 months ago

From Vit:

JFYI here is the document to gather our requirements for Prg server room
https://confluence.suse.com/display/~ewalker/Prague+%28in+office%29+Server+Room+requirements+for+Prague+teams

Actions #2

Updated by okurz 4 months ago

  • Status changed from Feedback to New
  • Assignee deleted (okurz)
  • Target version changed from future to Ready
Actions #3

Updated by jbaier_cz 3 months ago

  • Assignee set to jbaier_cz

As discussed I will take care of this one.

Actions #4

Updated by jbaier_cz 3 months ago

  • Description updated (diff)

At least from what I know, this won't be an issue until next year. We should be ready nevertheless.

Actions #5

Updated by jbaier_cz 3 months ago

And just to make sure it will not fall into oblivion, void.qam.suse.cz (aka openqa.qam.suse.cz) with its workers is currently in PRG1.

Actions #6

Updated by okurz 3 months ago

  • Subject changed from Support SUSE PRG office datacenter "PRG1" move while ensuring business continuity - pre-planning to Support SUSE PRG office datacenter "PRG1" move to a new location "PRG3" while ensuring business continuity - pre-planning size:M
  • Description updated (diff)
  • Status changed from New to Workable
Actions #8

Updated by jbaier_cz 3 months ago

  • Status changed from Workable to Blocked

Block this one on #170458 as that will help with AC1/AC2.

Actions #9

Updated by okurz 3 months ago

  • Description updated (diff)

Actually this shouldn't wait for #154042, which would be about a DHCP VM with unclear ownership. This ticket here should focus on the physical machines within PRG1. With an evacuation of PRG1 possible, even qanet.qa.suse.cz would become obsolete. Instead I added a relation to the sibling task #170458. @jbaier_cz the situation is unchanged, wait for the blocker you mentioned.

Actions #10

Updated by jbaier_cz 3 months ago

  • Status changed from Blocked to Workable
Actions #11

Updated by jbaier_cz about 2 months ago

  • Assignee deleted (jbaier_cz)
Actions #12

Updated by okurz about 2 months ago

  • Priority changed from Normal to High
Actions #13

Updated by robert.richardson about 2 months ago

  • Status changed from Workable to In Progress
  • Assignee set to robert.richardson
Actions #14

Updated by openqa_review about 2 months ago

  • Due date set to 2025-01-28

Setting due date based on mean cycle time of SUSE QE Tools

Actions #16

Updated by jbaier_cz about 1 month ago

robert.richardson wrote in #note-15:

Regarding AC3, since all the machines are in use they shouldn't be discarded, but rather moved or replaced.

I guess that might depend on the destination; for some of them it might be cheaper to just buy new hardware at the new location if those hosts are just normal workers without any special features.

Actions #17

Updated by okurz about 1 month ago

What about all the other machines in the domains .qa.suse.cz or qam.suse.cz, e.g. on https://racktables.nue.suse.com/index.php?page=rack&rack_id=2552 ? Please crosscheck against the list https://racktables.nue.suse.com/index.php?page=depot&tab=default&andor=and&cfe=%7B%24typeid_4%7D&cft[]=449 which lists all machines with tag "QE LSG" for all locations.
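
Such a crosscheck could be sketched roughly as follows. This is only an illustration under assumptions: both lists are assumed to have been copied by hand from racktables into plain hostname lists, the function name `untagged_hosts` is hypothetical, and the hostnames in the usage example are made up.

```python
def untagged_hosts(rack_hosts, tagged_hosts, domains=(".qa.suse.cz", ".qam.suse.cz")):
    """Return hosts from the rack list that are in one of the given
    domains but missing from the tagged list (e.g. tag "QE LSG")."""
    def normalize(name):
        # Hostnames are case-insensitive; also drop stray whitespace.
        return name.strip().lower()

    rack = {normalize(h) for h in rack_hosts if h.strip()}
    tagged = {normalize(h) for h in tagged_hosts if h.strip()}
    # str.endswith accepts a tuple of suffixes.
    in_domain = {h for h in rack if h.endswith(domains)}
    return sorted(in_domain - tagged)


# Made-up example data, not real racktables content.
rack_list = ["example-a.qa.suse.cz", "Example-B.qam.suse.cz ", "example-c.example.com", ""]
tagged_list = ["example-a.qa.suse.cz"]
print(untagged_hosts(rack_list, tagged_list))
```

Any host reported this way would still need a manual look at its racktables entry and contact person before deciding whether the tag is missing or the machine belongs to another team.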

Actions #18

Updated by jbaier_cz about 1 month ago

  • Description updated (diff)

This reminds me, there is also https://confluence.suse.com/display/~ewalker/Prague+%28in+office%29+Server+Room+requirements+for+Prague+teams which includes some of the servers from the qa/qam domain.

Actions #19

Updated by robert.richardson about 1 month ago

Working on adding tags in racktables in order to create a query listing all machines in question. (WIP)

Actions #21

Updated by okurz about 1 month ago

When in doubt, ask the contact persons listed in racktables.

Actions #22

Updated by okurz about 1 month ago · Edited

As discussed next steps:

  1. Report an extra ticket about all machines having the non-employee contact person lee.martin. One example machine: https://racktables.nue.suse.com/index.php?page=object&tab=edit&object_id=19416 -> goal: updated contact person and updated state of machine. All machines are in https://racktables.nue.suse.com/index.php?page=rack&rack_id=2560 . If nobody else wants to use the hardware then we are happy to adopt them :)
  2. Report extra ticket about all machines having non-QE employee Paul Gonin without QE LSG tag. One example machine https://racktables.nue.suse.com/index.php?page=object&tab=edit&object_id=2988
  3. Update status about machine assigned to former QE employee thehehjik https://racktables.nue.suse.com/index.php?page=object&object_id=10236
  4. Update status about machine assigned to non-QE employee zkubala https://racktables.nue.suse.com/index.php?page=object&object_id=1210 . Machine shouldn't list QE LSG if machine is not assigned to QE LSG
  5. Add row to https://confluence.suse.com/display/~ewalker/Prague+%28in+office%29+Server+Room+requirements+for+Prague+teams listing "QE Tools" and query pointing to "QE LSG"+"PRG1"+"openQA" tags
Actions #23

Updated by robert.richardson about 1 month ago

  1. Done

  2. You mean the ticket should be about QE LSG tag removal here? Because all machines with paul.gonin as contact already have both the QAM and QE LSG tags.

  3. The example you provided does not list thehehjik as contact, and I also can't find any machine(s) with that contact using the search function. Am I missing something here, or did you maybe already address this and the previous point?

  4. Removed QE LSG, what about the QA tag?

  5. Done, although I ended up not using the query after all, as that included machines managed by vpelcak (and already mentioned in the thread).

Actions #24

Updated by okurz about 1 month ago

robert.richardson wrote in #note-23:

  2. You mean the ticket should be about QE LSG tag removal here? Because all machines with paul.gonin as contact already have both the QAM and QE LSG tags.

Yes, QE LSG should be removed as pgonin and his team are not in QE LSG

  3. The example you provided does not list thehehjik as contact, and I also can't find any machine(s) with that contact using the search function. Am I missing something here, or did you maybe already address this and the previous point?

  4. Removed QE LSG, what about the QA tag?

Who is the owner then?

Actions #26

Updated by robert.richardson about 1 month ago

okurz wrote in #note-24:

robert.richardson wrote in #note-23:

  2. You mean the ticket should be about QE LSG tag removal here? Because all machines with paul.gonin as contact already have both the QAM and QE LSG tags.

Yes, QE LSG should be removed as pgonin and his team are not in QE LSG

Ok, I've simply removed those "QE LSG" tags, as they were wrongly added by me yesterday going off of the FQDN and QAM tag.

  3. The example you provided does not list thehehjik as contact, and I also can't find any machine(s) with that contact using the search function. Am I missing something here, or did you maybe already address this and the previous point?

  4. Removed QE LSG, what about the QA tag?

Who is the owner then?

For me it shows Jan Kohoutek

Actions #27

Updated by robert.richardson about 1 month ago

  • Status changed from In Progress to Resolved

I've added the QE Tools section to the corresponding planning page, which anyone may extend or edit in case I missed any special requirements.
Resolving the ticket as discussed on Slack.

Actions #28

Updated by okurz about 1 month ago

  • Due date deleted (2025-01-28)