action #150830
closed
Two new ARM servers 2023-11 for openqa.suse.de bare-metal testing size:M
Description
Motivation
afaerber, as coordinator for ARM SUSE development and testing, has two new ARM machines ready to be integrated as bare-metal test hosts. We should take over those machines, mount them in FC Basement, bring them into OSD production as bare-metal test machines, and ensure that testing-related squads follow up with specific testing, e.g. by running the default scenario(s) on each specific host.
Acceptance criteria
- AC1: Two new ARM servers from 2023-11 are used in production in openqa.suse.de as bare-metal test hosts
- AC2: Our inventory management system is up-to-date
Suggestions
- Coordinate the pickup with afaerber
- Pick up the machines, bring them to FC Basement
- Mount and connect the machines
- Include the configuration, in particular network, in our inventory management system, e.g. in https://racktables.nue.suse.com/index.php?page=rack&rack_id=19182
- Add the machines as bare-metal test machines in OSD, i.e. include them in https://gitlab.suse.de/openqa/salt-pillars-openqa/-/blob/master/openqa/workerconf.sls, e.g. first with experimental worker classes, then test, then switch to production worker classes
- Talk with testing squads about extending test scope covering those machines
- Ensure testing implementation is planned or completed accordingly
- Ensure our inventory management system is up-to-date
Updated by okurz about 1 year ago
- Subject changed from Two new ARM servers 2023-11 for openqa.suse.de bare-metal testing to Two new ARM servers 2023-11 for openqa.suse.de bare-metal testing size:M
Updated by okurz about 1 year ago
- Tags changed from infra, arm, fc-basement, next-frankencampus-visit to infra, arm, fc-basement
- Due date set to 2023-12-07
- Status changed from New to Feedback
Picked up two new ARM server packages. Access control to FC Basement is defective, which is blocking us; the thread to follow about that is https://suse.slack.com/archives/C029ANHBQ5R/p1700052451248939
Updated by okurz about 1 year ago
- Status changed from Feedback to In Progress
Moved both machines to FC Basement with help from mgriessmeier. Machines are in rack but not yet connected.
Updated by okurz about 1 year ago
- Status changed from In Progress to Workable
Setup racktables entries for squidward https://racktables.nue.suse.com/index.php?page=object&tab=default&object_id=26276 and squidbilly https://racktables.nue.suse.com/index.php?object_id=26271&page=object&tab=default
More is planned for when I am in FC Basement again next week.
Updated by okurz about 1 year ago · Edited
- Status changed from Workable to In Progress
Connected power and IPMI. squidbilly has Fedora with root/root and IPMI at 10.168.195.218;
ipmitool -Ilanplus -H 10.168.195.218 -U admin -P admin
works fine. Connected FCs for both, but it seems the switch does not have those ports activated yet.
Same for squidward: it has IPMI at 10.168.194.235, but SOL does not show anything.
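A minimal sketch of how the IPMI checks above could be scripted. The helper name ipmi_cmd and the IPMI_USER variable are hypothetical, and the subcommands shown (chassis power status, sol activate) are assumptions about what one would check next, since the command in the comment does not name one. The helper only prints the commands as a dry run; -E makes ipmitool read the password from the IPMI_PASSWORD environment variable instead of the command line.

```shell
# Hypothetical dry-run helper: print (not execute) an ipmitool call for a BMC.
# -E: ipmitool reads the password from the IPMI_PASSWORD environment variable.
ipmi_cmd() {
    host="$1"
    shift
    echo "ipmitool -I lanplus -H $host -U ${IPMI_USER:-admin} -E $*"
}

ipmi_cmd 10.168.195.218 chassis power status   # squidbilly
ipmi_cmd 10.168.194.235 sol activate           # squidward
```

To actually run a command, the printed line can be executed with IPMI_PASSWORD exported in the environment.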
Updated by okurz about 1 year ago
- Due date deleted (2023-12-07)
- Status changed from In Progress to Blocked
Updated by okurz 12 months ago
- Status changed from Blocked to In Progress
https://gitlab.suse.de/OPS-Service/salt/-/merge_requests/4447 merged.
Changed the IPMI password for user ADMIN to the same one we use for other bare-metal machines and included the machines in the openQA salt pillar worker config:
https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/699 (merged).
Informed afaerber and szarate to check.
Updated by okurz 12 months ago
- Due date set to 2024-01-12
- Status changed from In Progress to Feedback
https://suse.slack.com/archives/C02CCN59E94/p1702552841229849
@Andreas Faerber @Santiago Zarate as we discussed about the two new ARM servers for QE, the corresponding ticket with progress is https://progress.opensuse.org/issues/150830 . The two machines are called squidward and squidbilly, racktables entries https://racktables.nue.suse.com/index.php?page=object&tab=default&object_id=26276 and https://racktables.nue.suse.com/index.php?page=object&object_id=26271 respectively. Both machines are in openQA with temporary non-production worker classes and they can be addressed by hostname directly, e.g. openQA tests scheduled with WORKER_CLASS=squidward. I assume that the fibrechannel connection is not yet enabled for them on the switches but you can also check that yourself :)
Updated by okurz 12 months ago
I have updated all switches in B:1 through B:5 to use the proper port names, e.g. 5/0/1, in https://racktables.nue.suse.com/index.php?page=row&row_id=19134. mkittler has enabled the fibrechannel ports.
Updated by mkittler 12 months ago
qa> configure
Entering configuration mode
{master:5}[edit]
qa# set interfaces xe-4/0/1 unit 0 family ethernet-switching interface-mode access
{master:5}[edit]
qa# set interfaces xe-4/0/1 unit 0 family ethernet-switching vlan members VL192
{master:5}[edit]
qa# set interfaces xe-4/0/0 unit 0 family ethernet-switching interface-mode access
{master:5}[edit]
qa# set interfaces xe-4/0/0 unit 0 family ethernet-switching vlan members VL192
{master:5}[edit]
qa# commit
configuration check succeeds
fpc1:
commit complete
fpc2:
commit complete
fpc3:
commit complete
fpc4:
commit complete
commit complete
{master:5}[edit]
Updated by mkittler 12 months ago · Edited
martchus@openqa:~> for w in squidward squidbilly ; do sudo openqa-clone-job --skip-download --parental-inheritance --within-instance https://openqa.suse.de/tests/13004244 _GROUP=0 WORKER_CLASS="$w" {BUILD,TEST}+=-$w-poo150830 ; done
- sle-15-SP6-Online-aarch64-Build44.1-prepare_baremetal@ipmi-64bit-unarmed -> https://openqa.suse.de/tests/13079356
- sle-15-SP6-Online-aarch64-Build44.1-install_ltp_baremetal@ipmi-64bit-unarmed -> https://openqa.suse.de/tests/13079355
- sle-15-SP6-Online-aarch64-Build44.1-ltp_net_ipv6_lib_baremetal@ipmi-64bit-unarmed -> https://openqa.suse.de/tests/13079354
- sle-15-SP6-Online-aarch64-Build44.1-prepare_baremetal@ipmi-64bit-unarmed -> https://openqa.suse.de/tests/13079359
- sle-15-SP6-Online-aarch64-Build44.1-install_ltp_baremetal@ipmi-64bit-unarmed -> https://openqa.suse.de/tests/13079357
- sle-15-SP6-Online-aarch64-Build44.1-ltp_net_ipv6_lib_baremetal@ipmi-64bit-unarmed -> https://openqa.suse.de/tests/13079358
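The {BUILD,TEST}+=-$w-poo150830 argument in the command above relies on bash brace expansion: it turns into two separate settings, BUILD+=-$w-poo150830 and TEST+=-$w-poo150830, so each cloned job gets a suffix on both its build and test name. A small sketch that spells out the resulting arguments per host without brace expansion (the function name clone_args is hypothetical and nothing is actually cloned):

```shell
# Hypothetical helper: print the per-host settings that the brace expansion
# in the openqa-clone-job call produces, written out explicitly.
clone_args() {
    w="$1"
    printf '_GROUP=0 WORKER_CLASS=%s BUILD+=-%s-poo150830 TEST+=-%s-poo150830\n' "$w" "$w" "$w"
}

for host in squidward squidbilly; do
    clone_args "$host"
done
```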
Updated by okurz 12 months ago
- Due date changed from 2024-01-12 to 2024-01-19
Both squidbilly+squidward failed to boot over network. I guess the machines need to be configured to use network boot. In addition, I created https://gitlab.suse.de/OPS-Service/salt/-/merge_requests/4542 to enable use of the iPXE server config, same as for unarmed+monkey3.
Updated by okurz 12 months ago
- Related to action #152887: Setup of Ampere Altra Q32-17 for bare-metal tests in openQA size:M added
Updated by okurz 11 months ago
- Status changed from Feedback to In Progress
https://gitlab.suse.de/OPS-Service/salt/-/merge_requests/4542 merged and deployed.
Triggered new jobs to test:
- sle-15-SP6-Online-aarch64-Build44.1-prepare_baremetal@ipmi-64bit-unarmed -> https://openqa.suse.de/tests/13187308
- sle-15-SP6-Online-aarch64-Build44.1-install_ltp_baremetal@ipmi-64bit-unarmed -> https://openqa.suse.de/tests/13187309
- sle-15-SP6-Online-aarch64-Build44.1-ltp_net_ipv6_lib_baremetal@ipmi-64bit-unarmed -> https://openqa.suse.de/tests/13187307
- sle-15-SP6-Online-aarch64-Build44.1-prepare_baremetal@ipmi-64bit-unarmed -> https://openqa.suse.de/tests/13187311
- sle-15-SP6-Online-aarch64-Build44.1-install_ltp_baremetal@ipmi-64bit-unarmed -> https://openqa.suse.de/tests/13187312
- sle-15-SP6-Online-aarch64-Build44.1-ltp_net_ipv6_lib_baremetal@ipmi-64bit-unarmed -> https://openqa.suse.de/tests/13187310
Updated by okurz 11 months ago
- Tags changed from infra, arm, fc-basement to infra, arm, fc-basement, next-frankencampus-visit
Nope, failed the same. Seems like no network connection available. I booted both squidward+squidbilly over the web remote control interface, selected to boot into the UEFI menu over the pre-installed GRUB and in there configured the boot order to try to boot over network before trying other storage devices. Took me some time to also find that there is an additional network setting to disable/enable the fibre network interfaces which I did on squidward. But regardless the booted Fedora systems don't show a carrier on the fibre network devices. Guess I need to check again in person.
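A minimal sketch of the carrier check on a booted Linux system, assuming the standard sysfs layout under /sys/class/net. The function name has_carrier and the SYSFS_NET override are hypothetical; the override exists only so the sketch can be tried against a fake directory tree. Interfaces that are administratively down cannot report a carrier, so they show up as "no carrier" here.

```shell
# Succeed if the interface reports link (carrier = 1) in sysfs.
# SYSFS_NET is a hypothetical override, defaulting to the real sysfs path.
has_carrier() {
    iface="$1"
    [ "$(cat "${SYSFS_NET:-/sys/class/net}/$iface/carrier" 2>/dev/null)" = "1" ]
}

# List every network interface with its link state.
for dev in "${SYSFS_NET:-/sys/class/net}"/*; do
    [ -e "$dev" ] || continue
    name=$(basename "$dev")
    if has_carrier "$name"; then
        echo "$name: carrier"
    else
        echo "$name: no carrier"
    fi
done
```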
Updated by okurz 11 months ago
I checked the connections physically with the help of dheidler. The SFP+ modules were upside down and not properly seated, on both cables and at both ends. We turned them around and ensured that the cables are properly seated.
In addition to the network switch configuration by mkittler above, we also applied
set protocols rstp interface xe-3/2/1 edge
for the corresponding switch ports, and we ensured that the network cables get a proper network setup using squiddlydiddly, see #152887. Then, over the remote control interface, I could at least initially get a successful iPXE boot on squidbilly, but only the success message showed up, not an interactive menu. Will crosscheck with squidward.
Updated by okurz 11 months ago
- Due date deleted (2024-01-19)
- Status changed from In Progress to Resolved
squidward looks good now, successfully booting SLE installation media, see https://openqa.suse.de/tests/13214720 . squidbilly somehow always reverts to booting from storage but I am sure with the right settings in the UEFI menu this can be fixed.
Handed over to kernel squad: #153277