action #103736

Make aarch64 machine chan-1 up and running after it is broken size:M

Added by waynechen55 7 months ago. Updated 3 days ago.

Status:
Workable
Priority:
High
Assignee:
Target version:
Start date:
2021-12-09
Due date:
% Done:

0%

Estimated time:

Description

Motivation

Jira service desk tickets tracking this issue: https://sd.suse.com/servicedesk/customer/portal/1/SD-69653 and https://sd.suse.com/servicedesk/customer/portal/1/SD-70018

We will follow the tickets and also use automation runs to verify the status of chan-1.

Acceptance criteria

  • AC1: Machine is usable again in SRV2 (or reimbursed)

Suggestions

  • Get access to the invoice of the machine, e.g. by contacting the people from the SD tickets above
  • Contact the hardware vendor by phone to ask how to continue, as they do not seem to react to tickets
  • Get the machine replaced by the vendor
  • Have the new machine put back into Nbg Maxtorhof SRV2 and provide remote control options in the ticket

Further details

Current machine details: https://racktables.nue.suse.com/index.php?page=object&object_id=13554

History

#1 Updated by waynechen55 7 months ago

  • Assignee set to waynechen55

#2 Updated by waynechen55 7 months ago

  • Target version changed from QE-VT Sprint 86 to QE-VT Sprint 87

#4 Updated by waynechen55 6 months ago

I suggest that the infra team contact the vendor for further support.

#5 Updated by waynechen55 6 months ago

  • Target version changed from QE-VT Sprint 87 to QE-VT Sprint 88

#6 Updated by waynechen55 6 months ago

This issue has an associated Bugzilla entry:
https://bugzilla.suse.com/show_bug.cgi?id=1194105

#7 Updated by waynechen55 6 months ago

  • Status changed from In Progress to Blocked
  • Target version changed from QE-VT Sprint 88 to QE-VT Sprint 89

No further update.

#8 Updated by waynechen55 5 months ago

  • Target version changed from QE-VT Sprint 89 to QE-VT Sprint 90

#9 Updated by waynechen55 4 months ago

  • Target version changed from QE-VT Sprint 90 to QE-VT Sprint 91

#10 Updated by waynechen55 4 months ago

  • Target version changed from QE-VT Sprint 91 to QE-VT Sprint 92

#11 Updated by waynechen55 3 months ago

  • Target version changed from QE-VT Sprint 92 to QE-VT Sprint 93

#12 Updated by waynechen55 3 months ago

  • Target version changed from QE-VT Sprint 93 to QE-VT Sprint 94

#13 Updated by waynechen55 2 months ago

  • Target version changed from QE-VT Sprint 94 to QE-VT Sprint 95

#14 Updated by waynechen55 about 2 months ago

  • Target version changed from QE-VT Sprint 95 to QE-VT Sprint 96

#15 Updated by waynechen55 about 2 months ago

  • Target version changed from QE-VT Sprint 96 to QE-VT Sprint 97

#16 Updated by nicksinger about 1 month ago

  • Tags set to next-office-day
  • Project changed from QE-Virtualization to openQA Infrastructure
  • Assignee changed from waynechen55 to nicksinger
  • Target version deleted (QE-VT Sprint 97)

#17 Updated by nicksinger about 1 month ago

  • Description updated (diff)

waynechen55, since I'm mainly working on the debugging and the communication with the vendor, I am taking this ticket over into our backlog. I hope you're fine with this.

#18 Updated by okurz about 1 month ago

  • Status changed from Blocked to New
  • Priority changed from Normal to Low
  • Target version set to Ready

the ticket was blocked on https://bugzilla.suse.com/show_bug.cgi?id=1194105 which is VERIFIED FIXED so we can move (back) to "New" and clarify what needs to be done next.

nicksinger https://racktables.nue.suse.com/?page=search&last_page=index&last_tab=default&q=chan-1 does not resolve anything. What is this machine?

#19 Updated by nicksinger about 1 month ago

okurz wrote:

the ticket was blocked on https://bugzilla.suse.com/show_bug.cgi?id=1194105 which is VERIFIED FIXED so we can move (back) to "New" and clarify what needs to be done next.

nicksinger https://racktables.nue.suse.com/?page=search&last_page=index&last_tab=default&q=chan-1 does not resolve anything. What is this machine?

I previously updated the description to include the most recent and relevant ticket. It also describes what was done and what we are waiting for (feedback from the vendor). The machine in question is https://racktables.nue.suse.com/index.php?page=object&object_id=13554

#20 Updated by okurz about 1 month ago

nicksinger and I checked the machine together. After connecting with ssh qanet14nue.qa.suse.de and disconnecting and reconnecting the BMC LAN cable, we could verify that the machine's BMC is connected to gi17 on qanet14nue.qa.suse.de. show mac address-table interface GigabitEthernet 17 showed the MAC address to be e0:d5:5e:a7:e8:34. Running ssh qanet and then tail -n 10000 /var/log/dhcpd.log | grep -i e0:d5:5e:a7:e8:34 revealed the IPv4 address of the BMC, 10.162.3.68.

In a browser we could connect to the BMC, which reported the machine as generally OK. It reported the presence of two CPUs. Surprisingly, "Hardware > Memory" reported DIMM slots as occupied even though we had removed all DIMMs. Maybe the BMC needs at least one successful POST of the system to update this information. We then tried all kinds of memory configurations. In no case did we hear any tones from the on-board buzzer, and nothing showed on the VGA output. The BMC itself seems to be fully operational.

We again unscrewed the heatsink on CPU1 (secondary) and could confirm that the CPU is soldered into the socket and cannot be unplugged. Support by the manufacturer is advised.
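For anyone repeating the MAC-to-IP lookup described in this comment, here is a minimal Python sketch of the tail/grep step. It assumes the log uses the usual ISC dhcpd line shape ("DHCPACK on <ip> to <mac> ... via <iface>"). The function name latest_ip_for_mac is hypothetical, and the sample log lines (timestamps, interface name, second client) are made up for illustration. Only the MAC and IP of the BMC come from the comment above.

```python
import re
from typing import Iterable, Optional

# Assumption: ISC dhcpd syslog lines look like
#   "... dhcpd: DHCPACK on <ip> to <mac> ... via <iface>"
_LEASE_RE = re.compile(
    r"DHCP(?:ACK|OFFER) on (?P<ip>\d{1,3}(?:\.\d{1,3}){3}) "
    r"to (?P<mac>[0-9a-f]{2}(?::[0-9a-f]{2}){5})",
    re.IGNORECASE,
)


def latest_ip_for_mac(lines: Iterable[str], mac: str) -> Optional[str]:
    """Return the IPv4 address from the last lease line for `mac`, or None."""
    wanted = mac.lower()
    found = None
    for line in lines:
        match = _LEASE_RE.search(line)
        # Compare case-insensitively, like `grep -i` in the manual lookup.
        if match and match.group("mac").lower() == wanted:
            found = match.group("ip")
    return found


# Made-up sample lines; only the BMC MAC and IP are taken from the ticket.
sample_log = [
    "Jun  1 10:00:00 qanet dhcpd: DHCPDISCOVER from e0:d5:5e:a7:e8:34 via eth0",
    "Jun  1 10:00:01 qanet dhcpd: DHCPOFFER on 10.162.3.68 to e0:d5:5e:a7:e8:34 via eth0",
    "Jun  1 10:00:01 qanet dhcpd: DHCPACK on 10.162.3.68 to E0:D5:5E:A7:E8:34 via eth0",
    "Jun  1 10:00:02 qanet dhcpd: DHCPACK on 10.162.0.99 to 00:11:22:33:44:55 via eth0",
]

bmc_ip = latest_ip_for_mac(sample_log, "e0:d5:5e:a7:e8:34")
print(bmc_ip)
```

Against the real log, the same function could be fed an open file handle for /var/log/dhcpd.log instead of the sample list. If the local log format differs from the assumed one, the regular expression needs adjusting.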

#21 Updated by okurz 5 days ago

  • Tags deleted (next-office-day)

#22 Updated by okurz 5 days ago

  • Priority changed from Low to High

Setting to "High" as this is blocking #111578. nicksinger will ask for a fuze VoIP account to contact the hardware supplier.

#23 Updated by cdywan 4 days ago

  • Subject changed from Make aarch64 machine chan-1 up and running after it is broken to Make aarch64 machine chan-1 up and running after it is broken size:M
  • Description updated (diff)
  • Status changed from New to Workable

#24 Updated by okurz 4 days ago

  • Description updated (diff)

#26 Updated by okurz 4 days ago

I asked for further information in racktables by email, see email on osd-admins@suse.de

#27 Updated by nanzhang 3 days ago

Server information from the invoice:
PH-R150-T62
Phoenics 1U 8-bay Server
S/N :GIG3N7612A0015
A/T :SU10468

Detailed specifications from order:
Motherboard
System on Chip (SoC)
Cavium® ThunderX™ ARM processor
64bit ARMv8 architecture, BGA 2601, 28nm
8 DIMM Slots (max. 1024GB DDR4)
1 x 40GbE QSFP+ LAN Port and
4 x 10GbE SFP+ LAN Ports
4x SATA3 (6Gb/s) Ports (RAID 0,1,5,10)

PCI Slot:
1x PCI-E 3.0 x8 Slot (FHHL)
Integrated IPMI 2.0 with Dedicated LAN
Integrated Aspeed AST2400 BMC
2x USB 3.0

CPU
1 x Cavium ARM ThunderX
48 cores per CPU, 2.0 GHz

RAM
4 x Micron MTA9ASF1G72PZ-2G9
= 64 GB, 8 x 8 GB

Disk
2 x Intel S4610 SSDSC2KG480G801
480 GB, SSD, 3.42 DWPD

Official website for this model:
https://www.gigabyte.com/Enterprise/ARM-Server/R150-T62-rev-100
