action #124661
closed
[qe-tools] tftp server and directory mount issue on qanet.qa
Added by zluo almost 2 years ago. Updated almost 2 years ago.
0%
Description
For installation on a Power LPAR, the tftp server doesn't send grub.cfg to the host at the SMS stage, so installation is not possible.
The directory /mounts is required for /srv/tftp/ (linked to mounts), but it doesn't work at the moment: running ls /mounts or ls /srv/tftp hangs forever.
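To check for this kind of hang without blocking the shell, the listing can be run with a timeout. This is a minimal sketch, assuming Python 3 is available on the host; the function name `probe` and the 5-second default are hypothetical choices, not part of any existing tooling.

```python
import subprocess


def probe(cmd, timeout_s=5.0):
    """Run cmd and classify the outcome as 'ok', 'error' or 'hung'."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Best effort only: a process blocked on a dead mount may ignore
        # the signal, so we deliberately do not wait for it again.
        proc.kill()
        return "hung"
    return "ok" if proc.returncode == 0 else "error"
```

For example, `probe(["ls", "/mounts"])` and `probe(["ls", "/srv/tftp"])` should both return "ok" on a healthy host; "hung" would match the symptom described above.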
Updated by okurz almost 2 years ago
- Tags set to infra
- Project changed from openQA Tests (public) to openQA Infrastructure (public)
- Category deleted (Infrastructure)
- Assignee set to okurz
- Priority changed from Normal to High
- Target version set to Ready
Updated by jbaier_cz almost 2 years ago
- Related to action #124655: [openQA][infra][pxe] Physical SUT machine can not boot from pxe and mismatch hostname added
Updated by okurz almost 2 years ago
- Status changed from New to In Progress
I called
umount -f -l /srv/tftp/mounts
umount -f -l /mounts
and then I restarted the service nfs-kernel.service
and then back on qanet I called mount /mounts
Updated by okurz almost 2 years ago
- Due date set to 2023-03-02
- Status changed from In Progress to Feedback
@zluo I can list the directories again. Can you check whether what you wanted to achieve with the manual installation works now?
Updated by zluo almost 2 years ago
I can access /mounts and /srv/tftp now,
but the tftp server still doesn't send grub.cfg to the LPAR. Installation is not possible.
Updated by okurz almost 2 years ago
I restarted atftpd. You can crosscheck yourself whether that works, or provide steps to reproduce.
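For a crosscheck that does not need an LPAR boot, a single TFTP read request can be sent to the server by hand. This is a rough sketch, assuming Python 3 and the plain RFC 1350 packet layout (opcode 1 for a read request, 3 for data, 5 for error); the function names are hypothetical, and the probe stops after the first reply instead of completing the transfer.

```python
import socket
import struct

OP_RRQ, OP_DATA, OP_ERROR = 1, 3, 5


def build_rrq(filename, mode="octet"):
    """Build a TFTP read request: opcode, filename, NUL, mode, NUL."""
    return (
        struct.pack("!H", OP_RRQ)
        + filename.encode("ascii") + b"\0"
        + mode.encode("ascii") + b"\0"
    )


def describe_reply(packet):
    """Summarize the first packet the server sends back."""
    if len(packet) < 4:
        return "malformed"
    opcode, value = struct.unpack("!HH", packet[:4])
    if opcode == OP_DATA:
        return f"data block {value}, {len(packet) - 4} bytes"
    if opcode == OP_ERROR:
        message = packet[4:].split(b"\0", 1)[0].decode("ascii", "replace")
        return f"error {value}: {message}"
    return f"unexpected opcode {opcode}"


def probe_tftp(server, filename, timeout_s=5.0):
    """Send one read request to UDP port 69 and report the first reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout_s)
        sock.sendto(build_rrq(filename), (server, 69))
        try:
            packet, _addr = sock.recvfrom(65536)
        except socket.timeout:
            return "no reply (timeout)"
    return describe_reply(packet)
```

Calling `probe_tftp("10.162.0.1", <path of the file below the tftp root>)` from a machine in the same network should report a data block if atftpd serves the file. A timeout would point to a stuck server rather than a broken grub.cfg. Since the first block is never acknowledged, the server will retransmit a few times and then give up, which is harmless for a one-off check.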
Updated by nicksinger almost 2 years ago
/srv/tftp/mounts (bind to /mounts) was not mounted correctly because autofs hung. I had to do a killall -9 automount
and restart the autofs service. Before:
qanet:/srv/tftp # ls /mounts/install/SLP/
SLE-15-SP3-Full-Beta2 wgao
Now:
qanet:/srv/tftp # ls /mounts/install/SLP
AAA_README_first.txt SLE-15-SP1-Module-Development-Tools-GM SLE-15-SP2-Product-SUSE-Manager-Proxy-4.1-LATEST SLE-15-SP4-Module-SAP-Applications-GMC-202205 SLE-15-SP5-Product-SUSE-Manager-Proxy-4.4-LATEST
openSUSE-Leap-15.4 SLE-15-SP1-Module-Development-Tools-LATEST SLE-15-SP2-Product-SUSE-Manager-Retail-Branch-Server-4.1-GM SLE-15-SP4-Module-SAP-Applications-LATEST SLE-15-SP5-Product-SUSE-Manager-Proxy-4.4-TEST
openSUSE-Tumbleweed SLE-15-SP1-Module-HPC-GM SLE-15-SP2-Product-SUSE-Manager-Retail-Branch-Server-4.1-LATEST SLE-15-SP4-Module-Server-Applications-GMC-202205 SLE-15-SP5-Product-SUSE-Manager-Retail-Branch-Server-4.4-Beta3-202301
@zluo please try again
Updated by zluo almost 2 years ago
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
TFTP BOOT ---------------------------------------------------
Server IP.....................10.162.0.1
Client IP.....................10.162.8.10
Gateway IP....................10.162.63.254
Subnet Mask...................255.255.192.0
( 1 ) Filename.................ppc64le/grub2
TFTP Retries..................5
Block Size....................512
It hangs for a while, then it shows:
...
Block Size....................512 !BA017021 !
.----------------------------------.
| No Operating Systems Installed |
`----------------------------------'
I specifically checked grub.cfg and cannot find an error in it, and for another test I also used an old grub.cfg from last year.
You can find it at:
qanet:/srv/tftp/boot/ppc64le/grub2-ieee1275
Updated by okurz almost 2 years ago
The file is there, but atftpd couldn't serve any files. Apparently atftpd as a process was also stuck and we couldn't terminate it, so any attempts to restart it also failed. We have now rebooted qanet, and /var/log/atftpd/atftp.log shows success:
Feb 16 12:49:08 qanet atftpd[2526.140445177575168]: Serving /mnt/openqa/repo/SLE-15-SP4-Online-x86_64-Build176.1-Media1/boot/x86_64/loader/linux to 10.162.2.75:49159
Feb 16 12:49:08 qanet atftpd[2526.140445177575168]: tsize option -> 11419424
Feb 16 12:49:08 qanet atftpd[2526.140445177575168]: blksize option -> 1408
Feb 16 12:49:10 qanet atftpd[2526.140445177575168]: End of transfer
Feb 16 12:49:10 qanet atftpd[2526.140445177575168]: Server thread exiting
Feb 16 12:49:10 qanet atftpd[2526.140445095032576]: Creating new socket: 10.162.0.1:40602
Feb 16 12:49:10 qanet atftpd[2526.140445095032576]: Serving /mnt/openqa/repo/SLE-15-SP4-Online-x86_64-Build176.1-Media1/boot/x86_64/loader/initrd to 10.162.2.75:49160
Feb 16 12:49:10 qanet atftpd[2526.140445095032576]: tsize option -> 140964092
Feb 16 12:49:10 qanet atftpd[2526.140445095032576]: blksize option -> 1408
Feb 16 12:49:35 qanet atftpd[2526.140445095032576]: End of transfer
Feb 16 12:49:35 qanet atftpd[2526.140445095032576]: Server thread exiting
Updated by okurz almost 2 years ago
- Due date deleted (2023-03-02)
- Status changed from Feedback to Resolved
OK, good. The problem was likely caused by servers from BuildOps changing IP addresses. This won't happen often; the last time was likely 6 years ago, so we should be OK without any further improvements.