action #129283

closed

coordination #121720: [saga][epic] Migration to QE setup in PRG2+NUE3 while ensuring availability

coordination #129280: [epic] Move from SUSE NUE1 (Maxtorhof) to new NBG Datacenters

[tools] Help Needed: Active Inventory of Maxtorhof SRV1/SRV2/SRV2X

Added by okurz over 1 year ago. Updated over 1 year ago.

Status: Resolved
Priority: Normal
Assignee:
Start date: 2023-05-15
Due date:
% Done: 0%
Estimated time:
Tags:

Description

Motivation

From email "Re: Help Needed: Active Inventory of Maxtorhof SRV1/SRV2/SRV2X"

On 5/12/23 04:08, Oliver Kurz wrote:

Vojtech, Dietrich, and I had a call with IT this afternoon (morning US) to discuss what the future of systems located in the Maxtorhof server rooms looks like. One thing that is uncertain is what systems need to be kept, which will be decommissioned, and hw that has been purchased and is being stored in Maxtorhof. In sum, how much space we'll need in a new data center. There is a lot of hardware in those server rooms that is no longer actively used but hasn't been removed. I know some systems have been forgotten/abandoned (e.g. the old HA hex bladecenter is still racked). It doesn't make sense to move systems that should be scrapped and it would be super helpful to get these systems off the board.

Can I ask you to tag systems in Netbox[1] with one of the following tags for systems that are still actively used?

  • BCL-LSG-Needed
  • BCL-LSG-Needed-CC (for systems subject to common criteria)
  • BCL-MGR-Needed
  • BCL-SAP-Needed
  • BCL-CSM-Needed

Do you want us to tag all systems, that is, servers, switches, power rails, etc., in all of SRV1/SRV2/SRV2X? If yes, is there a more efficient way to do that than walking through every single entry by hand?

Servers and switches, yes.  I don't think PDUs need to be documented explicitly: That'll be a function of how many systems are in each destination rack.

There is an API that can be used to query and update systems. What parameters would you use to determine systems that need to be tagged? I have some code that could be leveraged to help automate that.

For bonus points, we have a To-be-decommissioned tag to mark a system affirmatively as to be scrapped. Otherwise, as we wind through the planning stages, we'll assume systems that aren't claimed aren't being used.  If we need additional tags, please let me know.
[1] https://netbox.dyn.cloud.suse.de

Can you please clarify the situation regarding
https://racktables.nue.suse.com/ vs. https://netbox.dyn.cloud.suse.de ?
To my knowledge https://racktables.nue.suse.com/ is still the reference and we are actively working with that system. It was never announced that Netbox should be used over Racktables. If a full sync between both systems is ensured then of course we can add data to Netbox, but I would like to do that only after full confirmation.

Netbox is being periodically synced from Racktables.  Hannes is putting together the HLD for replacing Racktables with Netbox.  For now, continue tracking systems as you had in Racktables, but we're using Netbox for exercises like this because it has an API instead of requiring direct database access to do bulk operations.

-Jeff

--
Jeff Mahoney
VP Engineering, LSG Systems

Actions #1

Updated by okurz over 1 year ago

  • Tags set to infra

Actions #2

Updated by okurz over 1 year ago

Sent an email to jmahoney:

I would apply BCL-LSG-Needed to all systems that currently have one of the tags "QA", "QAM", "openQA".
There are some systems that are "needed" but might be better moved to the FC Basement. For now I will still tag them with BCL-LSG-Needed. To decide where they should be moved I would need more information about the characteristics of any new NBG datacenter. So far I have received only some incomplete verbal information. Can you help me find more official information?

In the meantime I logged in to
https://netbox.dyn.cloud.suse.de/ using the openSUSE SSO. I applied the label "BCL-LSG-Needed" manually on QA-Power8-2.qa.suse.de for testing purposes.

Via https://netbox.dyn.cloud.suse.de/user/api-tokens/ I created an API token. I am trying it out following https://demo.netbox.dev/static/docs/rest-api/overview/.

I could call

curl -s -H "Authorization: Token $TOKEN" -H "Content-Type: application/json" https://netbox.dyn.cloud.suse.de/api/dcim/devices/3598/

and the result looks reasonable.

https://netbox.dyn.cloud.suse.de/dcim/devices/?tag=nuremberg&tag=qa&location_id=107&location_id=108&location_id=113 is a manually constructed filter for all "QA" devices in "Location: NUE Server Room 1, NUE Server Room 2, NUE Server Room 2 Extension".

A corresponding pretty-printed output over API can be retrieved with

curl -s -H "Authorization: Token $TOKEN" -H "Content-Type: application/json" "https://netbox.dyn.cloud.suse.de/api/dcim/devices/?tag=nuremberg&tag=qa&location_id=107&location_id=108&location_id=113" | jq .
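The manually constructed filter above could also be generated instead of typed by hand. This is only a minimal sketch: the function name `device_query` is made up for illustration, and the tag slugs and location IDs are the ones from the URL above.

```python
from urllib.parse import urlencode


def device_query(tags, location_ids, role_ids=(), status=None):
    """Build the query string for /api/dcim/devices/ from lists of filter values.

    Repeated keys are emitted once per value, the same shape as the
    hand-written filter URL in this ticket.
    """
    params = [("tag", t) for t in tags]
    params += [("location_id", i) for i in location_ids]
    if status:
        params.append(("status", status))
    params += [("role_id", i) for i in role_ids]
    return urlencode(params)


if __name__ == "__main__":
    # Values taken from the filter URL in this ticket.
    query = device_query(["nuremberg", "qa"], [107, 108, 113])
    print("https://netbox.dyn.cloud.suse.de/api/dcim/devices/?" + query)
```

To my understanding NetBox combines repeated `tag` parameters with AND (a device needs all of them), while repeated `location_id` or `role_id` values are alternatives. That is worth keeping in mind when a device unexpectedly drops out of the result.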

Next step: add the tag to each device over the API.
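A rough sketch of how that step could look with only the Python standard library. Several things here are assumptions and not verified against this instance: that the slug of the tag is `bcl-lsg-needed`, that the API accepts tags referenced by slug, and that a PATCH replaces the whole tag list (which is why the existing tags are sent along). The helper names are made up for illustration. It defaults to a dry run.

```python
import json
import urllib.request

BASE_URL = "https://netbox.dyn.cloud.suse.de"  # instance named in this ticket
TAG_SLUG = "bcl-lsg-needed"  # assumption: slug of the "BCL-LSG-Needed" tag


def tag_patch_body(device, slug):
    """Return a PATCH body that adds `slug` to the device's existing tags.

    Returns None if the device already carries the tag.
    """
    slugs = [t["slug"] for t in device.get("tags", [])]
    if slug in slugs:
        return None
    return {"tags": [{"slug": s} for s in slugs + [slug]]}


def api_call(url, token, method="GET", body=None):
    """Send one authenticated JSON request and return the decoded response."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(url, data=data, method=method, headers={
        "Authorization": f"Token {token}",
        "Content-Type": "application/json",
        "Accept": "application/json",
    })
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def tag_devices(query, token, slug=TAG_SLUG, dry_run=True):
    """Walk all result pages for `query` and add `slug` to each device."""
    url = f"{BASE_URL}/api/dcim/devices/?{query}"
    while url:
        page = api_call(url, token)
        for device in page["results"]:
            body = tag_patch_body(device, slug)
            if body is None:
                continue  # already tagged
            print("would tag" if dry_run else "tagging", device["name"])
            if not dry_run:
                api_call(f"{BASE_URL}/api/dcim/devices/{device['id']}/",
                         token, "PATCH", body)
        url = page.get("next")  # None on the last page
```

Usage would be something like `tag_devices("tag=nuremberg&tag=qa&location_id=107", os.environ["TOKEN"])`, first with the default `dry_run=True` to review the list, then with `dry_run=False`.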

I narrowed down the filter to
https://netbox.dyn.cloud.suse.de/dcim/devices/?tag=nuremberg&tag=qa&location_id=107&location_id=108&location_id=113&status=active&role_id=5&role_id=6&role_id=7&role_id=8&role_id=10&role_id=42&role_id=15&role_id=17&role_id=19&role_id=28&role_id=29&role_id=32
which yields 126 devices. Using the batch edit functionality of the GUI I added the label, or rather "tag", "BCL-LSG-Needed" to all 126 entries. I found that this missed some devices, e.g. unarmed.qa.suse.de and amd-zen2-gpu-sut1.qa.suse.de, but I am not sure why.
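One way to find out why a device is missed would be to fetch its record over the API and compare it against each filter criterion. The following is a sketch under assumptions: the function name is made up, and the field names (`tags[].slug`, `location.id`, `role.id` or `device_role.id`, `status.value`) are what I would expect in the device JSON, which may differ depending on the NetBox version. The sample record is invented and not real data from this instance.

```python
def failed_criteria(device, tags, location_ids, role_ids, status="active"):
    """List the filter criteria that `device` (one API result dict) does not meet."""
    have = {t["slug"] for t in device.get("tags", [])}
    # Assumption: newer NetBox versions call the field "role", older ones "device_role".
    role_id = (device.get("role") or device.get("device_role") or {}).get("id")
    location_id = (device.get("location") or {}).get("id")
    state = (device.get("status") or {}).get("value")

    failed = []
    missing = [t for t in tags if t not in have]
    if missing:
        failed.append("missing tags: " + ", ".join(missing))
    if location_id not in location_ids:
        failed.append(f"location_id={location_id}")
    if role_id not in role_ids:
        failed.append(f"role_id={role_id}")
    if state != status:
        failed.append(f"status={state}")
    return failed


if __name__ == "__main__":
    # Hypothetical record for illustration only.
    sample = {
        "name": "example.qa.suse.de",
        "tags": [{"slug": "qa"}],
        "location": {"id": 107},
        "device_role": {"id": 99},
        "status": {"value": "active"},
    }
    roles = {5, 6, 7, 8, 10, 42, 15, 17, 19, 28, 29, 32}  # role IDs from the filter
    for reason in failed_criteria(sample, ["nuremberg", "qa"], {107, 108, 113}, roles):
        print(reason)
```

An empty result would mean the device should have matched, so the cause would have to be elsewhere.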

https://netbox.dyn.cloud.suse.de/dcim/devices/?tag=nuremberg&tag=qa&location_id=107&location_id=108&location_id=113&status=active&role_id=5&role_id=6&role_id=7&role_id=8&role_id=10&role_id=42&role_id=15&role_id=17&role_id=19&role_id=28&role_id=29&role_id=32&sort=device_type&page=1 looks OK now. I should check the state in Racktables over the following days and cross-check.

Actions #3

Updated by okurz over 1 year ago

  • Due date set to 2023-06-16
  • Status changed from New to Feedback

I have not seen BCL-LSG-Needed showing up in Racktables. I did the same exercise with the QAM machines, adding the tag in Netbox.

Actions #4

Updated by okurz over 1 year ago

  • Due date deleted (2023-06-16)
  • Status changed from Feedback to Resolved

So there is no sync from Netbox back to Racktables. The tag is in Netbox as requested, and we should definitely prefer Racktables over Netbox as long as Racktables is still the ultimate source of truth.
