action #124661

closed

[qe-tools] tftp server and directory mount issue on qanet.qa

Added by zluo about 1 year ago. Updated about 1 year ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
-
Target version:
Start date:
2023-02-16
Due date:
% Done:

0%

Estimated time:
Tags:

Description

  1. For installation on a Power LPAR, the tftp server doesn't send grub.cfg to the host at the SMS stage, so installation is not possible.

  2. The directory /mounts is required for /srv/tftp/ (linked to mounts), but it doesn't work at the moment. Running ls /mounts or ls /srv/tftp hangs forever.


Related issues: 1 (0 open, 1 closed)

Related to openQA Infrastructure - action #124655: [openQA][infra][pxe] Physical SUT machine can not boot from pxe and mismatch hostname (Resolved, okurz, 2023-02-16)

Actions #1

Updated by okurz about 1 year ago

  • Tags set to infra
  • Project changed from openQA Tests to openQA Infrastructure
  • Category deleted (Infrastructure)
  • Assignee set to okurz
  • Priority changed from Normal to High
  • Target version set to Ready
Actions #2

Updated by jbaier_cz about 1 year ago

  • Related to action #124655: [openQA][infra][pxe] Physical SUT machine can not boot from pxe and mismatch hostname added
Actions #3

Updated by okurz about 1 year ago

  • Status changed from New to In Progress

I called

umount -f -l /srv/tftp/mounts
umount -f -l /mounts

Then I restarted the service nfs-kernel.service and, back on qanet, called mount /mounts.
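
The hang from the description (ls /mounts never returning) can be probed without blocking the shell. A minimal sketch, assuming GNU coreutils timeout is available; the helper name path_responds and the 5-second default are illustrative and not part of the original report. A process stuck in uninterruptible I/O may ignore even SIGKILL, so treat this as a best-effort probe only:

```shell
#!/bin/sh
# path_responds PATH [SECONDS]
# Returns 0 if listing PATH finishes within the time limit, non-zero otherwise
# (this includes the case where PATH does not exist).
# Hypothetical helper; the default limit of 5 seconds is an arbitrary choice.
path_responds() {
    path=$1
    limit=${2:-5}
    # -k 1: send SIGKILL one second after the initial SIGTERM if ls is still alive.
    timeout -k 1 "$limit" ls -- "$path" >/dev/null 2>&1
}

# Probe the two directories named in the ticket description.
for p in /mounts /srv/tftp; do
    if path_responds "$p"; then
        echo "$p: ok"
    else
        echo "$p: not responding or missing"
    fi
done
```

If a path is reported as not responding, the umount and mount sequence from this comment is the step that was actually applied here.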

Actions #4

Updated by okurz about 1 year ago

  • Due date set to 2023-03-02
  • Status changed from In Progress to Feedback

@zluo I can list the directories again. Can you check whether what you wanted to achieve with the manual installation works now?

Actions #5

Updated by zluo about 1 year ago

I can access /mounts and /srv/tftp now,
but the tftp server still doesn't send grub.cfg to the LPAR. Installation is not possible.

Actions #6

Updated by okurz about 1 year ago

I restarted atftpd. You can crosscheck yourself whether that works, or provide steps to reproduce.

Actions #7

Updated by nicksinger about 1 year ago

/srv/tftp/mounts (bind to /mounts) was not mounted correctly because autofs hung. I had to run killall -9 automount and restart the autofs service. Before:

qanet:/srv/tftp # ls /mounts/install/SLP/
SLE-15-SP3-Full-Beta2  wgao

Now:

qanet:/srv/tftp # ls /mounts/install/SLP
AAA_README_first.txt                           SLE-15-SP1-Module-Development-Tools-GM                           SLE-15-SP2-Product-SUSE-Manager-Proxy-4.1-LATEST                 SLE-15-SP4-Module-SAP-Applications-GMC-202205           SLE-15-SP5-Product-SUSE-Manager-Proxy-4.4-LATEST
openSUSE-Leap-15.4                             SLE-15-SP1-Module-Development-Tools-LATEST                       SLE-15-SP2-Product-SUSE-Manager-Retail-Branch-Server-4.1-GM      SLE-15-SP4-Module-SAP-Applications-LATEST               SLE-15-SP5-Product-SUSE-Manager-Proxy-4.4-TEST
openSUSE-Tumbleweed                            SLE-15-SP1-Module-HPC-GM                                         SLE-15-SP2-Product-SUSE-Manager-Retail-Branch-Server-4.1-LATEST  SLE-15-SP4-Module-Server-Applications-GMC-202205        SLE-15-SP5-Product-SUSE-Manager-Retail-Branch-Server-4.4-Beta3-202301

@zluo please try again
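
The recovery described in this comment, written out as a sketch. By default the commands are only printed; they run for real when DRY_RUN=0 is set. The run wrapper, the DRY_RUN variable and the function name recover_autofs are illustrative assumptions, and the unit name autofs is taken from the wording "the autofs service" above:

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to actually execute them (needs root).
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

recover_autofs() {
    # Force-kill the hung automount daemon, as done in this comment.
    run killall -9 automount
    # Start a fresh automounter.
    run systemctl restart autofs
    # Verify that the install tree is populated again.
    run ls /mounts/install/SLP
}

recover_autofs
```

Killing automount with -9 is a last resort; it was used here because the daemon no longer responded.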

Actions #8

Updated by zluo about 1 year ago

IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM

TFTP BOOT ---------------------------------------------------
Server IP.....................10.162.0.1
Client IP.....................10.162.8.10
Gateway IP....................10.162.63.254
Subnet Mask...................255.255.192.0
( 1 ) Filename.................ppc64le/grub2
TFTP Retries..................5
Block Size....................512

It hangs for a while, then it shows:
...

Block Size....................512 !BA017021 !

                  .----------------------------------.
                  |  No Operating Systems Installed  |
                  `----------------------------------'

I specifically checked grub.cfg and cannot find an error in it. For another test I also used an old grub.cfg from last year.

You can find it at:
qanet:/srv/tftp/boot/ppc64le/grub2-ieee1275

Actions #9

Updated by okurz about 1 year ago

The file is there but atftpd couldn't serve any files. Apparently atftpd as a process was also stuck and we couldn't terminate it, so any attempts to restart it also failed. We have now rebooted qanet and /var/log/atftpd/atftp.log shows success:

Feb 16 12:49:08 qanet atftpd[2526.140445177575168]: Serving /mnt/openqa/repo/SLE-15-SP4-Online-x86_64-Build176.1-Media1/boot/x86_64/loader/linux to 10.162.2.75:49159
Feb 16 12:49:08 qanet atftpd[2526.140445177575168]: tsize option -> 11419424
Feb 16 12:49:08 qanet atftpd[2526.140445177575168]: blksize option -> 1408
Feb 16 12:49:10 qanet atftpd[2526.140445177575168]: End of transfer
Feb 16 12:49:10 qanet atftpd[2526.140445177575168]: Server thread exiting
Feb 16 12:49:10 qanet atftpd[2526.140445095032576]: Creating new socket: 10.162.0.1:40602
Feb 16 12:49:10 qanet atftpd[2526.140445095032576]: Serving /mnt/openqa/repo/SLE-15-SP4-Online-x86_64-Build176.1-Media1/boot/x86_64/loader/initrd to 10.162.2.75:49160
Feb 16 12:49:10 qanet atftpd[2526.140445095032576]: tsize option -> 140964092
Feb 16 12:49:10 qanet atftpd[2526.140445095032576]: blksize option -> 1408
Feb 16 12:49:35 qanet atftpd[2526.140445095032576]: End of transfer
Feb 16 12:49:35 qanet atftpd[2526.140445095032576]: Server thread exiting
Actions #10

Updated by zluo about 1 year ago

yes, it is working now, thanks oli!

Actions #11

Updated by okurz about 1 year ago

  • Due date deleted (2023-03-02)
  • Status changed from Feedback to Resolved

OK, good. The problem was likely caused by servers from BuildOps changing IP addresses. This won't happen often; the last time was likely 6 years ago, so we should be OK without any further improvements.
