action #132773

closed

fail to set iPXE boot for a baremetal machine in FC basement

Added by Julie_CAO 10 months ago. Updated 7 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version:
Start date: 2023-07-14
Due date:
% Done: 0%
Estimated time:
Tags:

Description

Hi,

I am switching one baremetal machine in the FC basement from PXE to iPXE with legacy BIOS (non-UEFI). My MR was merged: https://gitlab.suse.de/OPS-Service/salt/-/merge_requests/3737 . The machine is expected to fetch its boot script from the kernel QA team's HTTP server at http://baremetal-support.qa.suse.de:8080/v1/bootscript/script.ipxe/10.168.192.87 , but it still contacts its original PXE server, QA-PXEBOOT on qa-jump.qe.nue2.suse.org.

I confirmed that the machine is set to boot from PXE and that the boot script is present on the HTTP server:

wget -qO- http://baremetal-support.qa.suse.de:8080/v1/bootscript/script.ipxe/10.168.192.87
#!ipxe
echo ++++++++++++++++++++++++++++++++++++++++++
echo ++++++++++++ openQA ipxe boot ++++++++++++
echo +    Host: scooter-1.qe.nue2.suse.org
echo ++++++++++++++++++++++++++++++++++++++++++

kernel http://openqa.suse.de/assets/repo/SLE-15-SP5-Online-x86_64-Build102.1-Media1/boot/x86_64/loader/linux install=http://openqa.suse.de/assets/repo/SLE-15-SP5-Online-x86_64-Build102.1-Media1  regurl=http://all-102.1.proxy.scc.suse.de ssh=1 sshpassword=xxx plymouth.enable=0 Y2DEBUG=1 vga=791 video=1024x768 console=ttyS1,115200 linuxrc.log=/dev/ttyS1 linuxrc.core=/dev/ttyS1 linuxrc.debug=4,trace reboot_timeout=0 kernel.softlockup_panic=1 vt.color=0x07 
initrd http://openqa.suse.de/assets/repo/SLE-15-SP5-Online-x86_64-Build102.1-Media1/boot/x86_64/loader/initrd
boot
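
A quick sanity check of such a boot script can be sketched as follows. `check_ipxe_script` is a hypothetical helper, not part of any existing tooling, and it only checks the shape of scripts like the one above (a `#!ipxe` first line followed by `kernel`, `initrd` and a final `boot`), not general iPXE validity.

```python
def check_ipxe_script(text):
    """Return a list of problems found in an iPXE boot script (empty if none).

    Hypothetical helper: it only knows the kernel/initrd/boot layout
    used by the script shown in this ticket.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    problems = []
    has_shebang = bool(lines) and lines[0] == "#!ipxe"
    if not has_shebang:
        problems.append("missing '#!ipxe' on the first line")
    # Skip the shebang line, if present, before looking at commands.
    body = lines[1:] if has_shebang else lines
    commands = [ln.split()[0] for ln in body]
    for required in ("kernel", "initrd", "boot"):
        if required not in commands:
            problems.append("missing '{}' command".format(required))
    if "boot" in commands and commands[-1] != "boot":
        problems.append("'boot' is not the last command")
    return problems


# Shortened copy of the script from this ticket (kernel arguments trimmed).
sample = """#!ipxe
echo ++++++++++++ openQA ipxe boot ++++++++++++
kernel http://openqa.suse.de/assets/repo/SLE-15-SP5-Online-x86_64-Build102.1-Media1/boot/x86_64/loader/linux ssh=1
initrd http://openqa.suse.de/assets/repo/SLE-15-SP5-Online-x86_64-Build102.1-Media1/boot/x86_64/loader/initrd
boot
"""
print(check_ipxe_script(sample))
```

An empty list here only says the script is well-formed; it says nothing about whether the machine is ever pointed at it, which depends on the DHCP configuration.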

Both the openQA run and a manual boot had the same result: the machine booted from QA-PXEBOOT rather than from the boot script on baremetal-support.qa.suse.de. The boot process can be seen in the openQA test https://openqa.suse.de/tests/11574059 (video in the download tab).

Is there anything wrong with the server configuration files, or could you tell me what other configuration I need to do? I followed the ticket https://progress.opensuse.org/issues/130501 but have not found a clue.

Actions #1

Updated by okurz 10 months ago

  • Tags set to infra
  • Due date set to 2023-07-28
  • Status changed from New to Feedback
  • Assignee set to okurz
  • Target version set to Ready
Actions #2

Updated by Julie_CAO 10 months ago

I have already set it in the merged MR above.

https://gitlab.suse.de/OPS-Service/salt/-/merge_requests/3737/diffs
https://gitlab.suse.de/OPS-Service/salt/-/blob/production/pillar/domain/qe_nue2_suse_org/hosts.yaml#L112

scooter-1:
  mac: 'ac:1f:6b:47:73:38'
  ip4: 10.168.192.87
  hostname: scooter-1
  dhcp_filename: 'kernelqa/undionly.kpxe'
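
As a rough illustration of what this pillar entry is expected to produce on the DHCP server, the sketch below renders an ISC dhcpd host stanza from the same fields. The function name `render_host` and the exact stanza layout are assumptions for illustration; the real Salt template in the OPS-Service repository may differ.

```python
def render_host(name, entry):
    """Render an ISC dhcpd host stanza from a pillar-style dict.

    Hypothetical sketch: the field names follow the hosts.yaml entry
    above, the output layout is assumed, not taken from the Salt state.
    """
    lines = [
        "host {} {{".format(name),
        "  hardware ethernet {};".format(entry["mac"]),
        "  fixed-address {};".format(entry["ip4"]),
        '  option host-name "{}";'.format(entry["hostname"]),
    ]
    # Only hosts that set dhcp_filename get a boot file directive.
    if "dhcp_filename" in entry:
        lines.append('  filename "{}";'.format(entry["dhcp_filename"]))
    lines.append("}")
    return "\n".join(lines)


scooter_1 = {
    "mac": "ac:1f:6b:47:73:38",
    "ip4": "10.168.192.87",
    "hostname": "scooter-1",
    "dhcp_filename": "kernelqa/undionly.kpxe",
}
print(render_host("scooter-1", scooter_1))
```

If the deployed host block lacks the `filename` line, the machine keeps receiving the default boot file and never chainloads iPXE.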
Actions #3

Updated by okurz 10 months ago

  • Assignee changed from okurz to MMoese
  • Target version changed from Ready to future

@MMoese do you have an idea?

Actions #5

Updated by xlai 10 months ago

@okurz @MMoese Hello, this issue is blocking a high-priority task for VT, poo#110133, which is a medium-sized task (3~4 weeks) that we plan to finish before SLE 15 SP6 starts in mid-August. Is there any chance to fix it urgently? Thanks in advance!

Actions #6

Updated by MMoese 10 months ago

The iPXE script looks fine to me. The right file was also used for legacy boot, so this looks good.

But if the old boot configuration is still in effect, maybe the DHCP configuration hasn't been deployed yet? Unfortunately I don't have access to the DHCP server. Any chance we can verify that dhcpd returns the right boot file?

Actions #7

Updated by MMoese 10 months ago

  • Assignee changed from MMoese to okurz
Actions #8

Updated by okurz 10 months ago

  • Subject changed from fail to set iPXE boot for a berametal machine in FC basement to fail to set iPXE boot for a baremetal machine in FC basement
  • Due date deleted (2023-07-28)
  • Status changed from Feedback to Blocked
  • Target version changed from future to Ready

The DHCP server shows that the config was not deployed; the scooter-1 host block has no filename entry:

$ ssh walter1.qe.nue2.suse.org "grep -C 3 scooter /etc/dhcpd.conf"
Welcome to walter1.qe.nue2.suse.org!
************************
* Managed by SaltStack *
************************
Make sure changes get pushed into the state repo!
  option host-name "amd-zen3-gpu-sut1-2";
}

host scooter-sp {
  hardware ethernet ac:1f:6b:4b:a7:d7;
  fixed-address 10.168.192.86;
  option host-name "scooter-sp";
}

host scooter-1 {
  hardware ethernet ac:1f:6b:47:73:38;
  fixed-address 10.168.192.87;
  option host-name "scooter-1";
}

host kermit-sp {
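
The same check can be scripted. This is a minimal sketch, assuming host blocks contain no nested braces; `host_filename` is a hypothetical helper, not part of any existing tooling.

```python
import re


def host_filename(conf_text, host):
    """Return the `filename` value of a dhcpd host block, or None if absent.

    Raises KeyError if no block for the host is found.
    """
    block = re.search(
        r"\bhost\s+" + re.escape(host) + r"\s*\{(.*?)\}",
        conf_text,
        re.DOTALL,
    )
    if block is None:
        raise KeyError(host)
    match = re.search(
        r'^\s*filename\s+"([^"]*)"\s*;', block.group(1), re.MULTILINE
    )
    return match.group(1) if match else None


# Host block as currently deployed on walter1 (copied from the output above).
deployed = """
host scooter-1 {
  hardware ethernet ac:1f:6b:47:73:38;
  fixed-address 10.168.192.87;
  option host-name "scooter-1";
}
"""
# None: no filename directive, so the merged change has not reached dhcpd.conf
print(host_filename(deployed, "scooter-1"))
```

Once the change is deployed, the function should return `kernelqa/undionly.kpxe` for scooter-1.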

Created https://sd.suse.com/servicedesk/customer/portal/1/SD-127307

Actions #9

Updated by okurz 10 months ago

The ticket was resolved, so the changes are likely effective now. I triggered https://openqa.suse.de/tests/11617390 to check.
Unfortunately the second point in the SD ticket was not resolved, so I opened another one:
https://sd.suse.com/servicedesk/customer/portal/1/SD-127847

Actions #10

Updated by okurz 10 months ago

https://openqa.suse.de/tests/11617390 confirms that the system boots over iPXE now

Actions #11

Updated by xlai 10 months ago

okurz wrote:

https://openqa.suse.de/tests/11617390 confirms that the system boots over iPXE now

Thanks a lot, @okurz @MMoese!

Actions #12

Updated by okurz 7 months ago

  • Status changed from Blocked to Resolved

The referenced SD ticket and https://sd.suse.com/servicedesk/customer/portal/1/SD-133348 are both resolved now. Result: DNS is now served by upstream servers rather than by walter1/2 and suttner1/2, which are now DHCP servers only. We have root SSH access to all four, so we can follow how DHCP behaves. This is also documented on https://wiki.suse.net/index.php/SUSE-Quality_Assurance/Labs
