action #129283
coordination #121720 (closed): [saga][epic] Migration to QE setup in PRG2+NUE3 while ensuring availability
coordination #129280: [epic] Move from SUSE NUE1 (Maxtorhof) to new NBG Datacenters
[tools] Help Needed: Active Inventory of Maxtorhof SRV1/SRV2/SRV2X
Description
Motivation
From email "Re: Help Needed: Active Inventory of Maxtorhof SRV1/SRV2/SRV2X"
On 5/12/23 04:08, Oliver Kurz wrote:
Vojtech, Dietrich, and I had a call with IT this afternoon (morning US) to discuss what the future of systems located in the Maxtorhof server rooms looks like. One thing that is uncertain is what systems need to be kept, which will be decommissioned, and hw that has been purchased and is being stored in Maxtorhof. In sum, how much space we'll need in a new data center. There is a lot of hardware in those server rooms that is no longer actively used but hasn't been removed. I know some systems have been forgotten/abandoned (e.g. the old HA hex bladecenter is still racked). It doesn't make sense to move systems that should be scrapped and it would be super helpful to get these systems off the board.
Can I ask you to tag systems in Netbox[1] with one of the following tags for systems that are still actively used?
- BCL-LSG-Needed
- BCL-LSG-Needed-CC (for systems subject to common criteria)
- BCL-MGR-Needed
- BCL-SAP-Needed
- BCL-CSM-Needed
you want us to tag all systems, that is servers, switches, power rails, etc. in all SRV1/SRV2/SRV2X? If yes, what is the efficient way to do that better than walking over every single entry by hand?
Servers and switches, yes. I don't think PDUs need to be documented explicitly: That'll be a function of how many systems are in each destination rack.
There is an API that can be used to query and update systems. What parameters would you use to determine systems that need to be tagged? I have some code that could be leveraged to help automate that
For bonus points, we have a To-be-decommissioned tag to mark a system affirmatively as to be scrapped. Otherwise, as we wind through the planning stages, we'll assume systems that aren't claimed aren't being used. If we need additional tags, please let me know.
[1] https://netbox.dyn.cloud.suse.de
Can you please clarify the situation regarding https://racktables.nue.suse.com/ vs. https://netbox.dyn.cloud.suse.de ?
To my knowledge https://racktables.nue.suse.com/ is still the reference and we are actively working with that system. It was never announced that netbox should be used over racktables. If a full sync between both systems is ensured then of course we can add data to netbox, but I would like to do that only after full confirmation.
Netbox is being periodically synced from Racktables. Hannes is putting together the HLD for replacing Racktables with Netbox. For now, continue tracking systems as you had in Racktables, but we're using Netbox for exercises like this because it has an API instead of requiring direct database access to do bulk operations.
-Jeff
--
Jeff Mahoney
VP Engineering, LSG Systems
Updated by okurz over 1 year ago
Sent an email to jmahoney:
I would apply BCL-LSG-Needed to all systems that currently have one of the tags "QA", "QAM", "openQA".
There are some systems that are "needed" but might be better moved to FC Basement. For now I will still tag them with BCL-LSG-Needed. To decide where they should be moved I would need more information about the characteristics of any new NBG datacenter. So far I have received only some incomplete verbal information. Can you help me to find more official information?
In the meantime I logged in to https://netbox.dyn.cloud.suse.de/ using the openSUSE SSO. I applied the label "BCL-LSG-Needed" manually on QA-Power8-2.qa.suse.de for testing purposes.
Via https://netbox.dyn.cloud.suse.de/user/api-tokens/ I created an API token, and I am trying it out following https://demo.netbox.dev/static/docs/rest-api/overview/.
I could call
curl -s -H "Authorization: Token $TOKEN" -H "Content-Type: application/json" https://netbox.dyn.cloud.suse.de/api/dcim/devices/3598/
and the result looks reasonable.
https://netbox.dyn.cloud.suse.de/dcim/devices/?tag=nuremberg&tag=qa&location_id=107&location_id=108&location_id=113 is a manually constructed filter for all "QA" devices in "Location: NUE Server Room 1, NUE Server Room 2, NUE Server Room 2 Extension".
A corresponding pretty-printed output over API can be retrieved with
curl -s -H "Authorization: Token $TOKEN" -H "Content-Type: application/json" "https://netbox.dyn.cloud.suse.de/api/dcim/devices/?tag=nuremberg&tag=qa&location_id=107&location_id=108&location_id=113" | jq .
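NetBox list endpoints return results in pages: a JSON object with `count`, `next`, `previous` and `results`. A single curl call like the one above may therefore not contain every matching device, depending on the page size configured on the instance. The following is a minimal Python sketch of collecting all pages for the same filter. The helper names `build_devices_url`, `fetch_json` and `iter_devices` are hypothetical, and the token is assumed to be in the `TOKEN` environment variable as in the curl calls.

```python
import json
import os
import urllib.parse
import urllib.request

BASE = "https://netbox.dyn.cloud.suse.de"

# Same filter as the manually constructed GUI URL: QA devices in SRV1/SRV2/SRV2X.
QA_FILTER = [
    ("tag", "nuremberg"),
    ("tag", "qa"),
    ("location_id", 107),
    ("location_id", 108),
    ("location_id", 113),
]


def build_devices_url(filters, base=BASE):
    """Build a device list URL from (key, value) pairs; keys may repeat."""
    return f"{base}/api/dcim/devices/?{urllib.parse.urlencode(filters)}"


def fetch_json(url, token=None):
    """GET one page from the API (performs a real HTTP request)."""
    token = token or os.environ["TOKEN"]
    req = urllib.request.Request(
        url,
        headers={"Authorization": f"Token {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def iter_devices(url, fetch=fetch_json):
    """Yield every device, following the 'next' link until it is null."""
    while url:
        page = fetch(url)
        yield from page["results"]
        url = page.get("next")


# Example use (needs network access and a valid token):
#   for device in iter_devices(build_devices_url(QA_FILTER)):
#       print(device["id"], device["name"])
```

Passing the fetch function as a parameter keeps the paging logic testable without touching the real instance.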
Next step: add the label to each device over the API.
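A hedged Python sketch of how that step could look: build one bulk PATCH payload for the device list endpoint from device records already fetched from the API. It rests on several assumptions that should be verified against the NetBox version in use: that the list endpoint accepts a PATCH with a list of objects carrying their `id`, that writing `tags` replaces the existing tag list (so the current tags are sent along), and that tags can be written as objects identified by `name`. The function names `build_tag_patch` and `send_patch` are hypothetical.

```python
import json
import os
import urllib.request

API_DEVICES = "https://netbox.dyn.cloud.suse.de/api/dcim/devices/"


def build_tag_patch(devices, tag_name):
    """Return a bulk PATCH payload adding tag_name to devices that lack it.

    The full tag list is sent per device, on the assumption that writing
    'tags' replaces the existing list rather than appending to it.
    """
    payload = []
    for device in devices:
        names = [tag["name"] for tag in device.get("tags", [])]
        if tag_name in names:
            continue  # already tagged, nothing to change
        payload.append(
            {"id": device["id"], "tags": [{"name": n} for n in names + [tag_name]]}
        )
    return payload


def send_patch(payload, token=None, url=API_DEVICES):
    """PATCH the list endpoint with the payload (performs a real request)."""
    token = token or os.environ["TOKEN"]
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        method="PATCH",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Printing the result of `build_tag_patch(devices, "BCL-LSG-Needed")` before calling `send_patch` gives a dry run of what would change.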
I narrowed down the filter to
https://netbox.dyn.cloud.suse.de/dcim/devices/?tag=nuremberg&tag=qa&location_id=107&location_id=108&location_id=113&status=active&role_id=5&role_id=6&role_id=7&role_id=8&role_id=10&role_id=42&role_id=15&role_id=17&role_id=19&role_id=28&role_id=29&role_id=32
which yields 126 devices. Using the batch edit functionality of the GUI I added the label, or rather "tag", "BCL-LSG-Needed" to all 126 entries. I found that this missed some devices, e.g. unarmed.qa.suse.de and amd-zen2-gpu-sut1.qa.suse.de, but I am not sure why.
https://netbox.dyn.cloud.suse.de/dcim/devices/?tag=nuremberg&tag=qa&location_id=107&location_id=108&location_id=113&status=active&role_id=5&role_id=6&role_id=7&role_id=8&role_id=10&role_id=42&role_id=15&role_id=17&role_id=19&role_id=28&role_id=29&role_id=32&sort=device_type&page=1 looks ok now. I should check the state in racktables over the following days and cross-check.
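One possible way to see why a narrowed filter drops devices such as unarmed.qa.suse.de is to compare the API results of the broad filter with those of the narrowed one and list the status and role of every device that fell out. The sketch below is only an illustration. The function name `explain_missing` is hypothetical, and the field layout (`status.value`, and the role under `device_role` or `role` depending on the NetBox version) is an assumption about the device JSON.

```python
def explain_missing(broad, narrow):
    """Map each device name in broad but not in narrow to its (status, role)."""
    kept = {device["id"] for device in narrow}
    report = {}
    for device in broad:
        if device["id"] in kept:
            continue
        # Assumed layout: status is an object with a 'value' key; the role
        # is nested under 'device_role' or 'role' with a 'name' key.
        status = (device.get("status") or {}).get("value")
        role = (device.get("device_role") or device.get("role") or {}).get("name")
        report[device["name"]] = (status, role)
    return report
```

A device that shows up with a status other than `active`, or with a role not among the selected `role_id` values, would explain its absence from the narrowed list.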
Updated by okurz over 1 year ago
- Due date set to 2023-06-16
- Status changed from New to Feedback
I have not seen BCL-LSG-Needed show up in racktables. I did the same exercise for the QAM machines, adding the tag in netbox.
Updated by okurz over 1 year ago
- Due date deleted (2023-06-16)
- Status changed from Feedback to Resolved
So there is no sync from netbox back to racktables. The requested tag is now set in netbox, but we should definitely keep preferring racktables over netbox as long as racktables is still the ultimate source of truth.