action #110521
Updated by okurz over 2 years ago
Improve QA related server room management, network topology and configuration

## Motivation
Different ideas to improve, from #102650#note-25

## Acceptance criteria
* **AC1:** All bare-metal testing machines within NUE Server Room 2 : NUE-SRV2-B : Rack 1-4 have at least one port with a MAC address configured (which should then be BMC and machine-specific interface)
* **AC2:** We have a best practice documented for adding new machines / keeping existing entries up-to-date

## Suggestions
1. The "common name" of any server or active component should include the suffix .qa.suse.de, which is what most machines already have. We have already added it for some machines in rack 1.
1. There are some differing "tags", e.g. the "Usage Type", which can be "Testing"/"Development"/"Production". We suggest avoiding "Development" and using "Production" for machines that we expect to be "mostly up", e.g. qanet, qamaster, etc., or anything that is maintained by SUSE QE Tools, and "Testing" for the rest.
1. We wonder whether grenache-4 and grenache-5 even still exist. Also, what about grenache-2 and grenache-3? We tried to access an HMC, but neither powerhmc1.arch.suse.de nor powerhmc2.arch.suse.de is reachable. We then tried over novalink on grenache: with the command `pvmctl lpar list` we found grenache-1 through grenache-8. We found some other machines where only "free slots" are used. -> Based on https://suse.slack.com/archives/C029APBKLGK/p1649246680601499?thread_ts=1649241786.564629&cid=C029APBKLGK I could make that look proper by deleting all objects except for the first within the server chassis, and now we have 3 free slots showing up.
1. Having at least the MAC address for each machine is helpful for debugging. We checked holmes.qa.suse.de as the first non-production machine in NUE-SRV2-B-Rack-1. Using IPMI credentials from https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls we could log in to https://sp.holmes.qa.suse.de/ (equivalent to https://holmes-sp.qa.suse.de/ , CNAME entry in DNS).
As a nice surprise, we found that the HMC of holmes knows (likely from SNMP) which switch port and switch MAC address it is connected to. We cross-checked that by looking into the SSH configuration interface of the switch. -> Do the same exercise manually, semi-automated or fully scripted for all machines and update racktables accordingly.
1. https://wiki.racktables.org/index.php/RackTablesUserGuide#SNMP_Sync says that racktables can get information from network switches automatically, which sounds nice and worthwhile to explore deeper. We found that SNMP lookup in racktables works nicely, e.g. on qanet15nue.qa.suse.de, v1, public. With that we can update the configuration in racktables. Maybe we can allow hosts to set more SNMP data on the switch, like the port description, and then read that port description into racktables over the SNMP sync function?
1. Evaluate whether https://github.com/rvojcik/rt-server-client can be used to keep Racktables networking information up to date automatically
1. Use https://gitlab.suse.de/nicksinger/network-scripts/-/blob/main/find_mac.py to find switch ports by MAC address
1. Research what the best approach is based on the above tooling and document it, e.g. on https://wiki.suse.net/index.php/SUSE-Quality_Assurance/Labs
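
The MAC-to-switch-port lookup mentioned in the suggestions above could be sketched roughly as follows. This is only an illustration and not the content of find_mac.py, which is not reproduced here. It assumes that the switch exposes the standard BRIDGE-MIB forwarding table (`dot1dTpFdbPort`, OID 1.3.6.1.2.1.17.4.3.1.2) and that the table was dumped with Net-SNMP's `snmpwalk` in numeric output mode (`-On`). The function names and the sample data are hypothetical.

```python
import re

# BRIDGE-MIB dot1dTpFdbPort: the table index is the MAC address written as
# six decimal octets, the value is the bridge port number.
FDB_PORT_OID = ".1.3.6.1.2.1.17.4.3.1.2"

LINE_RE = re.compile(
    r"^" + re.escape(FDB_PORT_OID) + r"\.((?:\d+\.){5}\d+) = INTEGER: (\d+)$"
)


def normalize_mac(mac):
    """Return a MAC address as lowercase, colon-separated hex octets."""
    digits = re.sub(r"[^0-9a-fA-F]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def parse_fdb(snmpwalk_output):
    """Map MAC address -> bridge port from numeric snmpwalk output."""
    table = {}
    for line in snmpwalk_output.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that are not FDB port entries
        octets = [int(part) for part in match.group(1).split(".")]
        if any(octet > 255 for octet in octets):
            continue  # not a valid MAC octet, ignore the line
        mac = ":".join(f"{octet:02x}" for octet in octets)
        table[mac] = int(match.group(2))
    return table


def find_port(snmpwalk_output, mac):
    """Return the bridge port a MAC was learned on, or None if unknown."""
    return parse_fdb(snmpwalk_output).get(normalize_mac(mac))


# Made-up sample data, not taken from any real switch.
SAMPLE = """\
.1.3.6.1.2.1.17.4.3.1.2.0.37.144.10.11.12 = INTEGER: 5
.1.3.6.1.2.1.17.4.3.1.2.172.31.107.0.1.2 = INTEGER: 17
"""

if __name__ == "__main__":
    print(find_port(SAMPLE, "00:25:90:0A:0B:0C"))
```

The input for such a script could come from a call like `snmpwalk -v1 -c public -On qanet15nue.qa.suse.de 1.3.6.1.2.1.17.4.3.1.2`, using the host, SNMP version and community named above. Note that the returned value is a bridge port number, which is not necessarily the interface index; mapping it to an interface name would need a further lookup (e.g. `dot1dBasePortIfIndex`), and VLAN-aware switches may keep their forwarding table in a different MIB.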