action #128498
closed
ARM server for UV squad (was: Requesting a quote for two Ampere Altra Servers to be used for various testing efforts inside the department) size:M
Added by mgriessmeier over 1 year ago. Updated over 1 year ago.
0%
Description
We see an upcoming need for stable aarch64 hardware inside our department.
According to successful tests in #121261, Ampere Altra machines seem to be the most reliable for that.
We decided that, in addition to the 4 newly ordered Ampere Altra machines for DC7 in Prague to be run within openQA production, we want 2 more of those machines located in Nuremberg for development and redundancy reasons.
Could you please help to get a quote for 2 ARM Ampere Altra machines with roughly the following specs:
- Rough budget around 3000-5000 USD per machine
- Ampere Altra q80-30 (or q80-33) if available
- 512GB RAM (8x64GB)
- dedicated IPMI/BMC, 2x 10G copper
- NVMe disk
If you need further requirements, please coordinate with @szarate.
Suggestion
- Find vendor and get a quote
- If the decision is made to order machines, make sure there is an open ticket including ordering, mounting, installation, etc
Updated by okurz over 1 year ago
- Description updated (diff)
- Assignee deleted (nicksinger)
- Target version set to Ready
Shouldn't we wait for results from DC7 before continuing down the same route? Getting quotes would be ok as that's not necessarily the decision to buy, but we should keep in mind that we currently have openqaworker-arm-4 and openqaworker-arm-5, which are completely unused as of now. I know, Marvell ThunderX2, not Ampere Altra, but still I'd be careful.
Updated by nicksinger over 1 year ago
- Subject changed from Requesting a quote for two Ampere Altra Servers to be used for various testing efforts inside the department to Requesting a quote for two Ampere Altra Servers to be used for various testing efforts inside the department size:M
- Description updated (diff)
- Status changed from New to Workable
Updated by szarate over 1 year ago
okurz wrote:
Getting quotes would be ok as that's not necessarily the decision to buy, but we should keep in mind that we currently have openqaworker-arm-4 and openqaworker-arm-5, which are completely unused as of now.
I know, Marvell ThunderX2, not Ampere Altra, but still I'd be careful.
Ampere is the only serious player now. openqaworker-arm-4 can be taken by us for a while or for random tests, but those machines are paperweights at this point :).
Also, the 64K pages testing needs to be done on Ampere machines.
Updated by nicksinger over 1 year ago
Where does the 5k€ price estimate come from? Looking at https://www.deltacomputer.com/d10a-m1-aa.html I see the lowest entry (with the components we require) at 5.5k€ for a Q32-17 CPU. Q80-30 starts at 9.1k€. I attached a configuration which should fit our needs.
Updated by okurz over 1 year ago
- Due date set to 2023-06-27
- Status changed from Workable to Feedback
- Assignee set to okurz
Your configuration looks reasonable to me, thank you.
@nicksinger picking up the ticket to await feedback from stakeholders. Feel free to take the ticket back from me again.
@mgriessmeier @szarate I strongly recommend refraining from ordering new hardware until the PRG2+NUE3 situation has settled down a bit. What do you think about the pricing and how do you want to proceed?
Updated by mgriessmeier over 1 year ago
okurz wrote:
@mgriessmeier [...] What do you think about the pricing and how do you want to proceed?
I will evaluate the ARM situation again and check the necessity. I will get back to you after my vacation.
Updated by szarate over 1 year ago
In terms
okurz wrote:
Your configuration looks reasonable to me, thank you.
@nicksinger picking up the ticket to await feedback from stakeholders. Feel free to take the ticket back from me again.
@mgriessmeier @szarate I strongly recommend refraining from ordering new hardware until the PRG2+NUE3 situation has settled down a bit.
Eventually we'll still need them
What do you think about the pricing and how do you want to proceed?
I'll wait for Matthi to come back. Pricing-wise, we might have to look again; in the meantime, let's wait. Also, with ARM sponsoring only 1 machine instead of 2, we might need to reevaluate the machine's configuration.
Updated by okurz over 1 year ago
- Due date changed from 2023-06-27 to 2023-07-04
I asked mgriessmeier for an update on the above
Updated by mgriessmeier over 1 year ago
Situation as of now:
- We will get 1 Ampere Altra sponsored by ARM. According to jstehlik, procurement will be managed by afaerber; the target site will be Frankencampus for now.
- Whether we'll need an additional one (ordered by QE LSG) is still under evaluation, but the tendency is towards 'no' for this fiscal year.
Updated by okurz over 1 year ago
Ok, good. Do you have a ticket or something similar where we can track the ordering/shipping/delivery/setup of the machine?
Updated by okurz over 1 year ago
- Due date changed from 2023-07-04 to 2023-07-18
no response. I addressed afaerber directly in https://suse.slack.com/archives/C02CCN59E94/p1688389334419489
hi, according to @Matthias Griessmeier and @Jan Stehlík there is one "Ampere Altra" server sponsored by ARM to be provided to LSG QE with target site Frankencampus. Do you have any ticket or something to track where we can track the ordering/shipping/delivery/setup of the machine?
Updated by okurz over 1 year ago
- Due date changed from 2023-07-18 to 2023-12-31
- Priority changed from Normal to Low
afaerber will eventually contact me if he knows about any shipping status of new machines that we should then set up.
Updated by mgriessmeier over 1 year ago
- Priority changed from Low to High
Hi,
I don't have any update about the one ARM machine that is being sponsored by ARM; @runger is chasing that.
However, there is a need for an ARM server with very limited specs for the Update Validation squad. For budgeting reasons, we need to order this ASAP, so could you please get a quote from delta for the following specs:
Essentially, any small-spec configuration would be sufficient (minimal amount of RAM (64 GiB), 1 CPU socket, small storage (M.2 500 GiB), 1 GBit NIC), but full IPMI management is a must.
Please reach out to @hrommel or me if you have any questions.
Updated by okurz over 1 year ago
- Due date deleted (2023-12-31)
- Status changed from Feedback to New
- Assignee deleted (okurz)
- Priority changed from High to Low
- Target version changed from Ready to future
I am sorry. We do not have the capacity to do that right now. We should not endanger any efforts regarding datacenter migration.
Updated by okurz over 1 year ago
- Status changed from New to In Progress
- Assignee set to okurz
- Target version changed from future to Ready
Wait. Actually we have three arm machines in FC Basement. Maybe they meet those specs. I will get in contact with hrommel.
https://suse.slack.com/archives/C02CANHLANP/p1691497630169919
(Oliver Kurz) @Heiko Rommel We have openqaworker-arm-4.qe.nue2.suse.org, openqaworker-arm-5.qe.nue2.suse.org, arm3.qe.nue2.suse.org, thunderx21 with varying specs. Please check which of those machines meet your requirements and we can reserve those machines for your testing demands. Given that a lot of hardware is unused and was abandoned quickly in particular when it is about QAM and manual testing related efforts I strongly suggest to check available hardware before ordering any new hardware.
Updated by okurz over 1 year ago
- Due date set to 2023-08-31
- Status changed from In Progress to Feedback
Updated by okurz over 1 year ago
- Due date deleted (2023-08-31)
- Status changed from Feedback to Rejected
- Target version changed from Ready to future
Last message in the linked thread is https://suse.slack.com/archives/C02CANHLANP/p1692704238666599?thread_ts=1691497630.169919&cid=C02CANHLANP
(Matthias Griessmeier) @Heiko Rommel what you mean with "reliable"? as far as I am aware of, the only issue that our arm machines have, is that they tend to "hang" until power cycled - which indeed is not optimal, but can be safely worked around since all of them are connected with a managed PDU to power cycle them.
With this, back to the original plan: afaerber might eventually contact us regarding a new machine. Regardless of whether he does, we don't have the capacity to track this further right now. So for now setting to "Rejected" accordingly.
Updated by okurz over 1 year ago
- Status changed from Rejected to Feedback
- Target version changed from future to Ready
Working with mpluskal to set up the machines.
Updated by okurz over 1 year ago
- Subject changed from Requesting a quote for two Ampere Altra Servers to be used for various testing efforts inside the department size:M to ARM server for UV squad (was: Requesting a quote for two Ampere Altra Servers to be used for various testing efforts inside the department) size:M
- Due date set to 2023-09-20
I provided mpluskal the IPMI credentials for both openqaworker-arm-4+5 and updated racktables. Comment addition "okurz: 2023-09-06: Updated description "Used by LSG QE UV squad as hypervisor for manual validation. contact mpluskal@suse.cz for more information" and loan expiration. See https://progress.opensuse.org/issues/128498 for details"
https://gitlab.suse.de/OPS-Service/salt/-/merge_requests/3953 for DHCP/DNS updates.
Updated by okurz over 1 year ago
- Tags changed from infra to infra, next-frankencampus-visit
gitlab unusable, created https://sd.suse.com/servicedesk/customer/portal/1/SD-132244
EDIT: It's back now.
created https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/610 to update name references in openQA workerconf. I also need to update the physical labels and label references in racktables.
Updated by okurz over 1 year ago
- Tags changed from infra, next-frankencampus-visit to infra
Labels applied, racktables updated
Waiting for https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/610
Updated by okurz over 1 year ago
- Due date deleted (2023-09-20)
- Status changed from Feedback to Resolved
merged and effective. All tasks resolved.
Updated by okurz about 23 hours ago
- Copied to action #174547: Ensure wasserstoff+atomkraft are properly used or designated as unused added