action #166280
closed
Setup of Nvidia Orin for bare-metal tests in openQA
0%
Description
Motivation
There is a new machine which we should set up in FC Basement as an openQA bare-metal test host, similar to what was done in #150830. We should take over this machine, mount it in FC Basement, bring it into OSD production as a bare-metal test machine, and ensure that testing-related squads follow up with specific testing, e.g. by running the default scenario(s) on this host.
Email from Andreas
NVIDIA is now finally shipping to me another NVIDIA IGX Orin Developer Kit (workstation form-factor, with BMC and optional ConnectX-7 NIC with QSFPs) for QE.
https://www.nvidia.com/en-us/edge-computing/products/igx/
The purpose would be two-fold: bare-metal testing of SLES and SL Micro, and installation/functional testing of SUSE SolidDriver KMPs plus NVIDIA proprietary libraries/containers (unfinished docs: https://github.com/SUSE/doc-modular/pull/339)
Acceptance criteria
- AC1: One new ARM server, NVIDIA IGX Orin, is used in production in openqa.suse.de as bare-metal test host
- AC2: Our inventory management system is up-to-date
Suggestions
- Read what we did in #150830 because we will do something very similar here just for another machine
- Pick up the machine from Ralf Unger's office at Frankencampus and bring it to FC Basement
- Name it nvidia-ixg-orin01
- Mount and connect the machine
- Include the configuration, in particular network, in our inventory management system, e.g. in https://racktables.nue.suse.com/index.php?page=rack&rack_id=19186 on top of nvidia-ixg-orin01
- Read out MAC addresses and add details in https://gitlab.suse.de/OPS-Service/salt/ for DHCP/DNS
- Add the machine as a bare-metal test machine in OSD, i.e. include it in https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls, e.g. first with experimental worker classes, then test, then switch to production worker classes
- Talk with testing squads about extending test scope covering this machine
- Ensure testing implementation is planned or completed accordingly
- Ensure our inventory management system is up-to-date
- Add a special worker class "nvidia_orin" and ensure that testing-related squads run tests on that worker class, e.g. default@nvidia_orin
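The staged rollout suggested above (experimental worker class first, production class after testing) could be sketched roughly as follows. This is only an illustration: the `_experimental` suffix and both helper functions are hypothetical and not taken from workerconf.sls. Only the class name `nvidia_orin`, the host name `nvidia-ixg-orin01` and the `default@nvidia_orin` notation come from this ticket.

```python
# Hypothetical helpers illustrating the "experimental first, then production" idea.
# The "_experimental" suffix is an assumption, not an established OSD convention.

BASE_WORKER_CLASS = "nvidia_orin"  # worker class name proposed in this ticket


def worker_class_for(host: str, production: bool) -> str:
    """Return a comma-separated worker class value for the given host.

    While the machine is still being evaluated, the class carries an
    "_experimental" suffix so that production jobs do not pick it up.
    """
    suffix = "" if production else "_experimental"
    return ",".join([f"{BASE_WORKER_CLASS}{suffix}", host])


def scenario_on(machine: str, test_suite: str = "default") -> str:
    """Build the "<test suite>@<machine>" notation used in the ticket."""
    return f"{test_suite}@{machine}"


if __name__ == "__main__":
    host = "nvidia-ixg-orin01"
    print(worker_class_for(host, production=False))
    print(worker_class_for(host, production=True))
    print(scenario_on(BASE_WORKER_CLASS))
```

Keeping the host name as an additional class would allow pinning a job to this exact machine during the experimental phase; whether that is wanted depends on how the squads schedule their tests.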
Updated by szarate 5 months ago
- Copied from action #152887: Setup of Ampere Altra Q32-17 for bare-metal tests in openQA size:M added
Updated by szarate 5 months ago · Edited
Manuals for the system are at:
- https://docs.nvidia.com/igx-orin/user-guide/latest/index.html
- https://docs.nvidia.com/igx-orin/bmc/latest/quickstart.html
Power button is very small, top right corner.
The system is left in Ralf's office, Big Black Box. Seems to require a DisplayPort cable to be connected.
Updated by okurz 5 months ago
- Description updated (diff)
- Category set to Feature requests
- Status changed from Workable to Blocked
- Assignee set to okurz
Now that #166598 has hit us, we tried to estimate this ticket in the tools team and could not, as we don't know how bare-metal test machines could be put to use in the near future in non-PRG2 locations at all. One option I see is to integrate the machine with o3 instead of OSD; otherwise this ticket blocks on #166598.
Updated by okurz 5 months ago
- Status changed from Blocked to Feedback
- Priority changed from Normal to Low
- Target version changed from Ready to future
Alright. I would then keep this ticket in Feedback but outside our backlog, eagerly awaiting a picture of how you set up the machine in your lab location :)
Updated by okurz 4 months ago
- Status changed from Feedback to Resolved
- Target version changed from future to Ready
I extended https://racktables.nue.suse.com/index.php?page=object&tab=edit&object_id=28278 a bit with more details and tags. This should suffice.