action #168556

closed

Support setup of bare-metal5+6 size:S

Added by okurz about 2 months ago. Updated 20 days ago.

Status:
Resolved
Priority:
Low
Assignee:
Category:
Feature requests
Start date:
2024-10-21
Due date:
% Done:

0%

Estimated time:
Tags:

Description

Motivation

Apparently QE Virt has ordered new hardware for PRG2e and needs it set up. References:

Acceptance criteria

  • AC1: Both bare-metal5+6 are usable as OSD bare-metal test machines

Suggestions


Files

Actions #1

Updated by okurz about 2 months ago

Commented in https://sd.suse.com/servicedesk/customer/portal/1/SD-169101

Hi Oliver, I noticed that tools team used to help with some new machine setup. Is it possible that tools team can help somehow on this ticket too?
yes, we will help. I now created https://progress.opensuse.org/issues/168556 for team internal tracking. I suggest for the next time to please create tickets involving our team “QE Tools” from the beginning before even ordering as our resources are also limited and we need to find out if we can administrate more machines or if there are better alternatives.

Regarding the setup of bare-metal5+bare-metal6 I originally saw changes in https://gitlab.suse.de/OPS-Service/salt/-/merge_requests/5559/diffs#note_679344 adding network configuration for both machines but the configuration was incomplete and had errors hence I created https://sd.suse.com/servicedesk/customer/portal/1/SD-170584 (for IT reference, not shared) based on https://sd.suse.com/servicedesk/customer/portal/1/SD-170438 (I don’t even have access myself). By now the racktable entries look rather complete:
bare-metal5: https://racktables.nue.suse.com/index.php?page=object&tab=default&object_id=28304
bare-metal6: https://racktables.nue.suse.com/index.php?page=object&tab=default&object_id=28308

Let’s follow up in https://progress.opensuse.org/issues/168556 and report back here once the machines are fully usable.

Actions #3

Updated by okurz about 2 months ago

  • Status changed from In Progress to Blocked
Actions #5

Updated by okurz about 2 months ago

Update in https://sd.suse.com/servicedesk/customer/portal/1/SD-169101. We received initial BMC passwords.

  1. I failed to log in on https://bare-metal5-ipmi.qe.prg2.suse.org/ and other colleagues in the team also tried with no success until I tried the password “ADMIN”. Maybe somebody already changed the password? Anyway, we can take it from here for bare-metal5 until we hit other problems.
  2. https://bare-metal6-ipmi.qe.prg2.suse.org/ is not reachable at all.
  3. Also the entry “BMC: 7CC2556047EA” means that the MAC address of the BMC interface of bare-metal5 should be 7C:C2:55:60:47:EA however https://racktables.nue.suse.com/index.php?page=object&tab=default&object_id=28304 states that it would be 7C:C2:55:5F:8B:1A, so a different one. Please crosscheck. The sticker entry and racktables entry for bare-metal6 however seem to be fine.

Regarding IPMI: over the BMC web UI I created an openQA account according to https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/921, but for both ADMIN and the openQA IPMI account, ipmitool -I lanplus -H bare-metal5-ipmi.qe.prg2.suse.org … returns

Error in open session response message : invalid role

Any ideas?

https://gitlab.suse.de/OPS-Service/salt/-/merge_requests/5605 is still open so blocked on this.

Actions #6

Updated by okurz about 2 months ago

  • Status changed from Blocked to New

After updates in https://sd.suse.com/servicedesk/customer/portal/1/SD-169101 I could connect to bare-metal6 and also log in to the BMC. Unlike on bare-metal5, IPMI also works with username ADMIN. I added an account "openqa" and that also worked. I will update the password for admin in https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/921/diffs

Actions #7

Updated by okurz about 2 months ago

  • Status changed from New to Blocked
Actions #8

Updated by okurz about 1 month ago

  • Status changed from Blocked to Feedback

https://gitlab.suse.de/OPS-Service/salt/-/merge_requests/5605 was deployed

@xlai can you test if bare-metal5+6 are usable for you?

Actions #9

Updated by xlai about 1 month ago

okurz wrote in #note-8:

https://gitlab.suse.de/OPS-Service/salt/-/merge_requests/5605 was deployed

@xlai can you test if bare-metal5+6 are usable for you?

Thank you @okurz for the support! @xguo is working on #167890 for the work on our side. Leon, would you please give some feedback here after you try the two new machines?

Actions #10

Updated by xguo about 1 month ago · Edited

okurz wrote in #note-8:

https://gitlab.suse.de/OPS-Service/salt/-/merge_requests/5605 was deployed

@xlai can you test if bare-metal5+6 are usable for you?

@okurz
Thanks so much for your great help and support.

Actions #11

Updated by xguo about 1 month ago

@xlai
Sure, I will post an update here after I do more validation runs on both bare-metal5+6 machines. Thanks

Actions #12

Updated by xguo about 1 month ago

@okurz
A quick request for help: I tried to perform the latest SLE15SP7 build39.2 host installation on bare-metal6-ipmi.qe.prg2.suse.org via iPXE boot.

This hit a new problem: "No Media Present" on bare-metal6-ipmi.qe.prg2.suse.org.
Refer to https://openqa.suse.de/tests/15912574/video?filename=video.webm for more details.

Actions #13

Updated by Julie_CAO about 1 month ago · Edited

xguo wrote in #note-12:

Refer to https://openqa.suse.de/tests/15912574/video?filename=video.webm for more details.

I got the screenshot for you.

Actions #14

Updated by xguo about 1 month ago · Edited

It seems that "No Media Present" is caused by a timeout when connecting to the iPXE server.

Actions #15

Updated by okurz about 1 month ago · Edited

xguo wrote in #note-14:

It seems that "No Media Present" is caused by a timeout when connecting to the iPXE server.

that's related to #168106

I can confirm the problems reported in https://sd.suse.com/servicedesk/customer/portal/1/SD-169101 that bare-metal5 does not properly respond on ipmitool -I lanplus … unlike bare-metal6. The behaviour is the same for IPMI accounts "qadmin" and "openqa". ipmitool -I lanplus -H bare-metal5-ipmi.qe.prg2.suse.org -U qadmin -P … yields

Error in open session response message : invalid role

Error: Unable to establish IPMI v2 / RMCP+ session

Leaving out -I lanplus works fine. On bare-metal6, however, both variants work, with and without -I lanplus. I will ask others from the team for ideas.

EDIT: Created https://github.com/os-autoinst/os-autoinst/pull/2570 to allow configuring IPMI options
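
The idea of making the IPMI options configurable can be sketched in shell as follows. This is not the os-autoinst implementation (which is Perl); build_ipmi_cmd is a hypothetical helper, and the sketch assumes that a set IPMI_OPTIONS value replaces the default -I lanplus completely rather than being appended to it.

```shell
#!/bin/bash
# Print an ipmitool command line whose interface options are configurable.
# Assumption: IPMI_OPTIONS, when set, replaces the default "-I lanplus".
# build_ipmi_cmd is a hypothetical helper name used only for illustration.
build_ipmi_cmd() {
    local host="$1" user="$2"
    shift 2
    local opts="${IPMI_OPTIONS:--I lanplus}"
    # $opts is left unquoted on purpose so "-I lanplus -C 3" splits into words
    echo ipmitool $opts -H "$host" -U "$user" "$@"
}

# Example: show the command that would be used for a power status query
build_ipmi_cmd bare-metal5-ipmi.qe.prg2.suse.org openqa chassis power status
```

Password handling is left out here on purpose; the function only prints the command and does not execute it.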

Actions #16

Updated by okurz about 1 month ago

  • Subject changed from Support setup of bare-metal5+6 to Support setup of bare-metal5+6 size:S
  • Description updated (diff)
Actions #17

Updated by xguo about 1 month ago · Edited

These are new bare-metal test machines, one of which is based on the latest IPMI protocol v2.0.

We had better continue to use -I lanplus (the IPMI v2.0 RMCP+ LAN interface) with them.

Meanwhile, I found another solution: with -I lanplus and -C 3 together I can access bare-metal5-ipmi with the openqa account successfully:

# ipmitool -I lanplus -C 3 -H bare-metal5-ipmi.qe.prg2.suse.org -U openqa
Password: 
No command provided!
Commands:
    raw           Send a RAW IPMI request and print response
    i2c           Send an I2C Master Write-Read command and print response
    spd           Print SPD info from remote I2C device
    lan           Configure LAN Channels
    chassis       Get chassis status and set power state
    power         Shortcut to chassis power commands
    event         Send pre-defined events to MC
    mc            Management Controller status and global enables
    sdr           Print Sensor Data Repository entries and readings
    sensor        Print detailed sensor information
    fru           Print built-in FRU and scan SDR for FRU locators
    gendev        Read/Write Device associated with Generic Device locators sdr
    sel           Print System Event Log (SEL)
    pef           Configure Platform Event Filtering (PEF)
    sol           Configure and connect IPMIv2.0 Serial-over-LAN
    tsol          Configure and connect with Tyan IPMIv1.5 Serial-over-LAN
    isol          Configure IPMIv1.5 Serial-over-LAN
    user          Configure Management Controller users
    channel       Configure Management Controller channels
    session       Print session information
    dcmi          Data Center Management Interface
    nm            Node Manager Interface
    sunoem        OEM Commands for Sun servers
    kontronoem    OEM Commands for Kontron devices
    picmg         Run a PICMG/ATCA extended cmd
    fwum          Update IPMC using Kontron OEM Firmware Update Manager
    firewall      Configure Firmware Firewall
    delloem       OEM Commands for Dell systems
    shell         Launch interactive IPMI shell
    exec          Run list of commands from file
    set           Set runtime variable for shell and exec
    hpm           Update HPM components using PICMG HPM.1 file
    ekanalyzer    run FRU-Ekeying analyzer using FRU files
    ime           Update Intel Manageability Engine Firmware
    vita          Run a VITA 46.11 extended cmd
    lan6          Configure IPv6 LAN Channels

Debug info (with -I lanplus only, without -C 3):

# ipmitool -vv -I lanplus -H bare-metal5-ipmi.qe.prg2.suse.org -U openqa
ipmitool version 1.8.18

Password: 
Loading IANA PEN Registry...
Testing br0 interface address: 2a07:de40:b230:1:7d04:2ce3:1fba:771c scope=0
Successful connected on br0 interface with scope id 0

>> Sending IPMI command payload
>>    netfn   : 0x06
>>    command : 0x38
>>    data    : 0x8e 0x04 


>> Sending IPMI command payload
>>    netfn   : 0x06
>>    command : 0x54
>>    data    : 0x0e 0x00 0x80 

Local RqAddr 0x20 transit 0:0 target 0x20:0 bridgePossible 0

>> Sending IPMI command payload
>>    netfn   : 0x06
>>    command : 0x54
>>    data    : 0x0e 0x00 0x81 

Local RqAddr 0x20 transit 0:0 target 0x20:0 bridgePossible 0

>> Sending IPMI command payload
>>    netfn   : 0x06
>>    command : 0x54
>>    data    : 0x0e 0x00 0x82 

Local RqAddr 0x20 transit 0:0 target 0x20:0 bridgePossible 0

>> Sending IPMI command payload
>>    netfn   : 0x06
>>    command : 0x54
>>    data    : 0x0e 0x00 0x83 

Local RqAddr 0x20 transit 0:0 target 0x20:0 bridgePossible 0

>> Sending IPMI command payload
>>    netfn   : 0x06
>>    command : 0x54
>>    data    : 0x0e 0x00 0x84 

Local RqAddr 0x20 transit 0:0 target 0x20:0 bridgePossible 0

>> Sending IPMI command payload
>>    netfn   : 0x06
>>    command : 0x54
>>    data    : 0x0e 0x00 0x85 

Local RqAddr 0x20 transit 0:0 target 0x20:0 bridgePossible 0

>> Sending IPMI command payload
>>    netfn   : 0x06
>>    command : 0x54
>>    data    : 0x0e 0x00 0x86 

Local RqAddr 0x20 transit 0:0 target 0x20:0 bridgePossible 0
Using best available cipher suite 17

>> SENDING AN OPEN SESSION REQUEST

<<OPEN SESSION RESPONSE
<<  Message tag                        : 0x00
<<  RMCP+ status                       : invalid role
<<  Maximum privilege level            : Unknown (0x00)
<<  Console Session ID                 : 0x00a0a2a3
Error in open session response message : invalid role

Error: Unable to establish IPMI v2 / RMCP+ session

It works with -I lanplus and -C 3:

# ipmitool -vv -I lanplus -C 3 -H bare-metal5-ipmi.qe.prg2.suse.org -U openqa
ipmitool version 1.8.18

Password: 
Loading IANA PEN Registry...
Testing br0 interface address: 2a07:de40:b230:1:7d04:2ce3:1fba:771c scope=0
Successful connected on br0 interface with scope id 0

>> Sending IPMI command payload
>>    netfn   : 0x06
>>    command : 0x38
>>    data    : 0x8e 0x04 

>> SENDING AN OPEN SESSION REQUEST

<<OPEN SESSION RESPONSE
<<  Message tag                        : 0x00
<<  RMCP+ status                       : no errors
<<  Maximum privilege level            : admin
<<  Console Session ID                 : 0xa0a2a3a4
<<  BMC Session ID                     : 0x00000081
<<  Negotiated authenticatin algorithm : hmac_sha1
<<  Negotiated integrity algorithm     : hmac_sha1_96
<<  Negotiated encryption algorithm    : aes_cbc_128

>> Console generated random number (16 bytes)
 5a 6d fe 17 7f c8 22 93 48 80 af 4e 6b 00 08 64
>> SENDING A RAKP 1 MESSAGE

<<RAKP 2 MESSAGE
<<  Message tag                   : 0x00
<<  RMCP+ status                  : no errors
<<  Console Session ID            : 0xa0a2a3a4
<<  BMC random number             : 0x31d5cd381b7c4d1560ef954ce4cb3a27
<<  BMC GUID                      : 0x00000000000000000000000000000000
<<  Key exchange auth code [sha1] : 0x938cffbe6a4c3189169b548874b63220f9e3a4e4

session integrity key input (40 bytes)
 5a 6d fe 17 7f c8 22 93 48 80 af 4e 6b 00 08 64
 31 d5 cd 38 1b 7c 4d 15 60 ef 95 4c e4 cb 3a 27
 14 06 6f 70 65 6e 71 61
Generated session integrity key (20 bytes)
 3f 2b af 07 2f fb 0a 92 9a 8e 22 94 ae 05 d8 72
 e8 9f 18 bd
Generated K1 (20 bytes)
 ea 0e 8b 65 01 5f 3f 3a 05 64 cf 5b be c6 b1 43
 58 8a ad ff
Generated K2 (20 bytes)
 c0 ad 1c b1 19 5b 2d aa 0e d7 d0 d6 75 93 c5 9c
 4f 9d df f4
>> SENDING A RAKP 3 MESSAGE

<<RAKP 4 MESSAGE
<<  Message tag                   : 0x00
<<  RMCP+ status                  : no errors
<<  Console Session ID            : 0xa0a2a3a4
<<  Key exchange auth code [sha1] : 0xeb1c14fa750fb956f76ff455

IPMIv2 / RMCP+ SESSION OPENED SUCCESSFULLY


>> Sending IPMI command payload
>>    netfn   : 0x06
>>    command : 0x3b
>>    data    : 0x04 

Local RqAddr 0x20 transit 0:0 target 0x20:0 bridgePossible 0
Set Session Privilege Level to ADMINISTRATOR


>> Sending IPMI command payload
>>    netfn   : 0x2c
>>    command : 0x3e
>>    data    : 0x00 0x02 

Local RqAddr 0x20 transit 0:0 target 0x20:0 bridgePossible 0
IPM Controller is not HPM.2 compatible

>> Sending IPMI command payload
>>    netfn   : 0x06
>>    command : 0x01
>>    data    : 

Local RqAddr 0x20 transit 0:0 target 0x20:0 bridgePossible 1
Iana: 10876
Running Get PICMG Properties my_addr 0x20, transit 0, target 0x20

>> Sending IPMI command payload
>>    netfn   : 0x2c
>>    command : 0x00
>>    data    : 0x00 

Local RqAddr 0x20 transit 0:0 target 0x20:0 bridgePossible 1
Error response 0xc1 from Get PICMG Properties
Running Get VSO Capabilities my_addr 0x20, transit 0, target 0x20

>> Sending IPMI command payload
>>    netfn   : 0x2c
>>    command : 0x00
>>    data    : 0x03 

Local RqAddr 0x20 transit 0:0 target 0x20:0 bridgePossible 1
Invalid completion code received: Invalid command
Acquire IPMB address
Discovered IPMB address 0x0
Interface address: my_addr 0x20 transit 0:0 target 0x20:0 ipmb_target 0

No command provided!
Commands:
    raw           Send a RAW IPMI request and print response
    i2c           Send an I2C Master Write-Read command and print response
    spd           Print SPD info from remote I2C device
    lan           Configure LAN Channels
    chassis       Get chassis status and set power state
    power         Shortcut to chassis power commands
    event         Send pre-defined events to MC
    mc            Management Controller status and global enables
    sdr           Print Sensor Data Repository entries and readings
    sensor        Print detailed sensor information
    fru           Print built-in FRU and scan SDR for FRU locators
    gendev        Read/Write Device associated with Generic Device locators sdr
    sel           Print System Event Log (SEL)
    pef           Configure Platform Event Filtering (PEF)
    sol           Configure and connect IPMIv2.0 Serial-over-LAN
    tsol          Configure and connect with Tyan IPMIv1.5 Serial-over-LAN
    isol          Configure IPMIv1.5 Serial-over-LAN
    user          Configure Management Controller users
    channel       Configure Management Controller channels
    session       Print session information
    dcmi          Data Center Management Interface
    nm            Node Manager Interface
    sunoem        OEM Commands for Sun servers
    kontronoem    OEM Commands for Kontron devices
    picmg         Run a PICMG/ATCA extended cmd
    fwum          Update IPMC using Kontron OEM Firmware Update Manager
    firewall      Configure Firmware Firewall
    delloem       OEM Commands for Dell systems
    shell         Launch interactive IPMI shell
    exec          Run list of commands from file
    set           Set runtime variable for shell and exec
    hpm           Update HPM components using PICMG HPM.1 file
    ekanalyzer    run FRU-Ekeying analyzer using FRU files
    ime           Update Intel Manageability Engine Firmware
    vita          Run a VITA 46.11 extended cmd
    lan6          Configure IPv6 LAN Channels


>> Sending IPMI command payload
>>    netfn   : 0x06
>>    command : 0x3c
>>    data    : 0x81 0x00 0x00 0x00 

Local RqAddr 0x20 transit 0:0 target 0x20:0 bridgePossible 0
Closed Session 00000081
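
The two traces above differ at the open session response: without -C 3 the BMC answers "invalid role" after ipmitool selected cipher suite 17, while with -C 3 it answers "no errors". A small filter like the following could pull that status out of saved ipmitool -vv output. rmcp_open_session_status is a hypothetical name, and the sketch assumes the output layout shown in the traces above.

```shell
#!/bin/bash
# Print the RMCP+ status of the first open session response found in
# `ipmitool -vv` output read from stdin.
# rmcp_open_session_status is a hypothetical helper name.
rmcp_open_session_status() {
    awk '
        /OPEN SESSION RESPONSE/ { in_resp = 1; next }
        in_resp && /RMCP\+ status/ {
            sub(/^[^:]*:[ \t]*/, "")   # drop everything up to and including the first colon
            print
            exit                       # later RAKP messages also carry a status, ignore them
        }
    '
}
```

Usage could look like `rmcp_open_session_status < saved-trace.txt`, where saved-trace.txt is a placeholder for a file holding the verbose output.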
Actions #19

Updated by okurz about 1 month ago

  • Status changed from Feedback to In Progress

Merged and deployed. Let's see if we can run one of the latest tests from bare-metal6 on bare-metal5:

openqa-clone-job --skip-chained-deps --within-instance https://openqa.suse.de/tests/15920961 _GROUP=0 {BUILD,TEST}+=-bare-metal5-okurz-poo168556 WORKER_CLASS=bare-metal5

1 job has been created:

  • sle-15-SP7-Online-x86_64-Build39.2-gi-guest_oraclelinux-on-host_developing-kvm@guoxuguang_os-autoinst-distri-opensuse_leon_poo167890@64bit-ipmi-large-mem -> https://openqa.suse.de/tests/15938886
Actions #20

Updated by xguo about 1 month ago

Refer to https://openqa.suse.de/tests/15932884/logfile?filename=autoinst-log.txt:

[2024-11-14T04:50:42.295184Z] [info] [pid:78014] ::: backend::baseclass::die_handler: Backend process died, backend errors are reported below in the following lines:
  ipmitool -C3 -H bare-metal5-ipmi.qe.prg2.suse.org -U openqa -P [masked] chassis power status: Error: Unable to establish LAN session
  Error: Unable to establish IPMI v1.5 / RMCP session at /usr/lib/os-autoinst/backend/ipmi.pm line 46.
[2024-11-14T04:50:42.295710Z] [debug] [pid:78014] Passing remaining frames to the video encoder
[image2pipe @ 0x55f06b9145c0] Could not find codec parameters for stream 0 (Video: ppm, none): unspecified size
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
Input #0, image2pipe, from 'pipe:':
  Duration: N/A, bitrate: N/A
  Stream #0:0: Video: ppm, none, 24 fps, 24 tbr, 24 tbn, 24 tbc
Output #0, webm, to 'video.webm':
Output file #0 does not contain any stream
[2024-11-14T04:50:42.302932Z] [debug] [pid:78014] Waiting for video encoder to finalize the video
[2024-11-14T04:50:42.303043Z] [debug] [pid:78014] The built-in video encoder (pid 78445) terminated
[2024-11-14T04:50:42.303133Z] [debug] [pid:78014] The external video encoder (pid 78444) terminated
[2024-11-14T04:50:42.304160Z] [debug] [pid:78014] sending magic and exit
[2024-11-14T04:50:42.304590Z] [debug] [pid:75969] received magic close
[2024-11-14T04:50:42.316261Z] [debug] [pid:75969] backend process exited: 0
[2024-11-14T04:50:42.417249Z] [warn] [pid:75969] !!! main: failed to start VM at /usr/lib/os-autoinst/backend/driver.pm line 104.
    backend::driver::start_vm(backend::driver=HASH(0x55e44aa2f478)) called at /usr/lib/os-autoinst/OpenQA/Isotovideo/Backend.pm line 18
    OpenQA::Isotovideo::Backend::new("OpenQA::Isotovideo::Backend") called at /usr/lib/os-autoinst/OpenQA/Isotovideo/Runner.pm line 109
    OpenQA::Isotovideo::Runner::create_backend(OpenQA::Isotovideo::Runner=HASH(0x55e441f572f0)) called at /usr/lib/os-autoinst/OpenQA/Isotovideo/Runner.pm line 251
    OpenQA::Isotovideo::Runner::init(OpenQA::Isotovideo::Runner=HASH(0x55e441f572f0)) called at /usr/bin/isotovideo line 182
    eval {...} called at /usr/bin/isotovideo line 177

This confirms that IPMI_OPTIONS: -C3 alone does not work with bare-metal5.

We had better use IPMI_OPTIONS: -I lanplus -C 3 for bare-metal5.
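
The different option combinations tried in this ticket could be re-checked against a BMC with a small loop like the one below. This is only a sketch: probe_ipmi and the IPMITOOL override variable are made up for illustration, and the password is expected in the IPMI_PASSWORD environment variable, which ipmitool reads when called with -E.

```shell
#!/bin/bash
# Probe which ipmitool option sets can query a BMC's power status.
# IPMITOOL is a hypothetical override so the loop can be tried against a stub.
IPMITOOL="${IPMITOOL:-ipmitool}"

probe_ipmi() {
    local host="$1" user="$2" opts
    # Candidates: default interface, cipher suite only, RMCP+ only, RMCP+ with cipher suite 3
    for opts in "" "-C 3" "-I lanplus" "-I lanplus -C 3"; do
        # -E: take the password from the IPMI_PASSWORD environment variable
        # $opts is unquoted on purpose so it splits into separate arguments
        if $IPMITOOL $opts -H "$host" -U "$user" -E chassis power status >/dev/null 2>&1; then
            echo "ok:   ${opts:-(no extra options)}"
        else
            echo "fail: ${opts:-(no extra options)}"
        fi
    done
}
```

For example, after `export IPMI_PASSWORD=…`, calling `probe_ipmi bare-metal5-ipmi.qe.prg2.suse.org openqa` prints one ok/fail line per option set.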

Actions #21

Updated by openqa_review about 1 month ago

  • Due date set to 2024-11-29

Setting due date based on mean cycle time of SUSE QE Tools

Actions #22

Updated by okurz about 1 month ago

  • Status changed from In Progress to Feedback

https://openqa.suse.de/tests/15938886/logfile?filename=autoinst-log.txt shows that we can connect to IPMI and power the machine off and on, but I don't know what is going on in a later step:

[2024-11-14T09:19:40.854954Z] [debug] [pid:31461] ||| starting ipxe_install tests/installation/ipxe_install.pm
[2024-11-14T09:19:40.912400Z] [debug] [pid:31461] IPMI: 
[2024-11-14T09:19:43.969878Z] [debug] [pid:31461] IPMI: 
[2024-11-14T09:19:44.024084Z] [debug] [pid:31461] IPMI: 
[2024-11-14T09:19:47.080409Z] [debug] [pid:31461] IPMI: 
[2024-11-14T09:19:47.137938Z] [debug] [pid:31461] IPMI: 
[2024-11-14T09:19:50.197815Z] [debug] [pid:31461] IPMI: 
[2024-11-14T09:19:50.252072Z] [debug] [pid:31461] IPMI: 
[2024-11-14T09:19:53.310686Z] [debug] [pid:31461] IPMI: 
[2024-11-14T09:19:53.368075Z] [debug] [pid:31461] IPMI: 
[2024-11-14T09:19:56.427906Z] [debug] [pid:31461] IPMI: 
[2024-11-14T09:19:56.483100Z] [debug] [pid:31461] IPMI: 
[2024-11-14T09:19:59.544216Z] [debug] [pid:31461] IPMI: 
[2024-11-14T09:19:59.599971Z] [debug] [pid:31461] IPMI: 
[2024-11-14T09:20:02.659123Z] [debug] [pid:31461] IPMI: 
[2024-11-14T09:20:02.718210Z] [debug] [pid:31461] IPMI: 

Trying with -I lanplus -C 3:

openqa-clone-job --skip-chained-deps --within-instance https://openqa.suse.de/tests/15920961 _GROUP=0 {BUILD,TEST}+=-bare-metal5-okurz-poo168556-lanplus WORKER_CLASS=bare-metal5 IPMI_OPTIONS='-I lanplus -C 3'

1 job has been created:

  • sle-15-SP7-Online-x86_64-Build39.2-gi-guest_oraclelinux-on-host_developing-kvm@guoxuguang_os-autoinst-distri-opensuse_leon_poo167890@64bit-ipmi-large-mem -> https://openqa.suse.de/tests/15939909

This shows the same symptoms as above.

@xguo do you have an idea here? Or should we consider a firmware upgrade for the machine?

Actions #23

Updated by xguo about 1 month ago

@okurz

There is no need to update the firmware on bare-metal5.

I think I have figured out a solution to the problem you mentioned and drafted a new PR, https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/20613, to track it.

ROOT CAUSE:
Regarding https://github.com/os-autoinst/os-autoinst/pull/2570: it is not enough to only modify ipmi_cmdline() to support $ipmi_options for bare-metal5.
We also need to modify ipmitool() in lib/ipmi_backend_utils.pm.

For now, I can confirm that the 15SP7 build40.1 iPXE host installation on bare-metal5 succeeds: https://openqa.suse.de/tests/15947566#step/installation/8
Hopefully the next validation runs for bare-metal5 go well. I will keep an eye on bare-metal5.

FYI
The merged MR!935 https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/935 works very well for me now.
Thanks so much for your great help and support.

Actions #24

Updated by okurz about 1 month ago

  • Due date changed from 2024-11-29 to 2024-12-13
  • Priority changed from Normal to Low
  • Target version changed from Ready to Tools - Next
Actions #25

Updated by okurz 23 days ago

  • Status changed from Feedback to Blocked
  • Target version changed from Tools - Next to Ready
Actions #26

Updated by okurz 20 days ago

  • Due date deleted (2024-12-13)
  • Status changed from Blocked to Resolved

https://gitlab.suse.de/openqa/salt-pillars-openqa/-/merge_requests/936 merged. With that and verification from #167890-27 both machines can be used in OSD production jobs. Verification jobs from production
