Hi guys,
let me explain the root cause of this issue.
As you might be aware, to spawn installations on z/VM guests there is a script called ftpboot which specifies the installation medium to be loaded. You might be more familiar with the name "qaboot", which is basically a fork of ftpboot that I wrote some years ago to adapt it to our use cases, because ftpboot has some functionality we didn't really need.
Let me briefly explain how qaboot works. Basically you can compare it to a PXE setup where you specify which kernel and initrd the system is going to load and boot from.
qaboot only does a few things. As an example, let's take this call: qaboot 10.160.0.207 SLE-15-SP4-Full-s390x-Build61.1-Media1
- it looks on the FTP server 10.160.0.207 (which is ftp://openqa.suse.de/) for a folder "SLE-15-SP4-Full-s390x-Build61.1-Media1"
- it creates a temporary disk T on the z/VM guest with a fixed size, which so far was always big enough to hold all the data (you see where this is going). Any existing disk T is destroyed before the new one is created, to ensure that no old data is left behind.
- it downloads the initrd, kernel and default parmfile from that directory (ftp://openqa.suse.de/SLE-15-SP4-Full-s390x-Build61.1-Media1)
- it loads initrd and kernel from this temporary disk T and boots from it.
So here was the issue: I took the initial disk size from ftpboot, which was 200KB, usually way more than what is needed to store the initrd, kernel and parmfile.
Apparently, with build 61.1 the overall size of these 3 files exceeded this. I didn't quite figure out why, because compared to previous builds the size only increased slightly and my math never came to 200KB.
However, while investigating this issue I stumbled across this 200KB value and simply increased it to 500KB within the qaboot script. This already fixed it. I still cannot say why the data from the new kernel didn't fit, but probably the FTP download creates some swap files or similar.
So, long story short, increasing the size of this disk by 150% (from 200KB to 500KB) should solve it for good.
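For reference, the arithmetic behind the resize can be sketched in a few lines of Python. This is only an illustration: the function names and the example file sizes are made up, and only the two disk sizes (200 and 500) come from this comment.

```python
def percent_increase(old_size, new_size):
    """Relative growth from old_size to new_size, in percent."""
    return (new_size - old_size) / old_size * 100


def free_fraction(file_sizes, disk_size):
    """Fraction of the disk left unused after storing all files."""
    return 1 - sum(file_sizes) / disk_size


# Disk sizes as quoted in this comment: 200 before, 500 after.
print(percent_increase(200, 500))  # 150.0

# Hypothetical sizes for initrd, kernel and parmfile, in the same unit
# as the disk size. These are NOT measured values from build 61.1.
example_sizes = [120, 60, 1]
print(free_fraction(example_sizes, 200))  # free share on the old disk
print(free_fraction(example_sizes, 500))  # free share on the new disk
```

Note that this only compares raw file sizes with the disk size. It does not model any overhead on disk T, which might explain why summing the three files stayed below the limit.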
Here are the steps I took to change the qaboot script (please be cautious, because this can break our workflows):
- I logged on as the linuxmnt user over x3270 to s390zp11 (our z/VM hypervisor). linuxmnt is a kind of administrator user who has the privileges to modify files such as qaboot and to distribute them to all the remaining guests.
- I edited the qaboot script to increase the size of the disk.
- Afterwards I logged off all QAZVM### guests so that they pick up the change in the script.
This was quite a corner case in the z/VM administration area, which imo makes it hard to implement proper monitoring for it, but I'd be happy to support here if you have any good ideas on how this could be done.
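One possible direction, purely as a sketch: check the combined size of the three files on the FTP server for each new build and warn before it gets close to the disk size configured in qaboot. The Python below is an assumption-heavy illustration, not something that exists today. The file names in FILES, the warn_ratio threshold and all function names are hypothetical, and the real file paths inside the media directory would have to be filled in. It uses only ftplib from the standard library.

```python
from ftplib import FTP

# Hypothetical file names; the real paths inside the media directory
# are not given in this comment and must be adjusted.
FILES = ("initrd", "kernel", "parmfile")


def total_size(get_size, directory, names=FILES):
    """Sum the sizes of the boot files using a size lookup function."""
    return sum(get_size(f"{directory}/{name}") for name in names)


def check_build(get_size, directory, disk_size, warn_ratio=0.8):
    """Return OK, WARNING or CRITICAL for one build directory.

    disk_size must be in the same unit that get_size returns.
    warn_ratio is an arbitrary example threshold.
    """
    ratio = total_size(get_size, directory) / disk_size
    if ratio >= 1:
        return "CRITICAL"
    if ratio >= warn_ratio:
        return "WARNING"
    return "OK"


def ftp_size_getter(host):
    """Build a size lookup backed by an anonymous FTP connection."""
    ftp = FTP(host)
    ftp.login()             # anonymous login (assumption)
    ftp.voidcmd("TYPE I")   # many servers only answer SIZE in binary mode

    def get_size(path):
        size = ftp.size(path)  # size in bytes, or None if unavailable
        if size is None:
            raise RuntimeError(f"server did not report a size for {path}")
        return size

    return get_size


# Usage against a fake lookup, so nothing touches the network here.
fake_sizes = {"build/initrd": 100, "build/kernel": 60, "build/parmfile": 1}
print(check_build(fake_sizes.__getitem__, "build", disk_size=200))
```

With a real server one would pass ftp_size_getter("openqa.suse.de") instead of the fake lookup and give disk_size in bytes. As above, this only looks at raw file sizes, so the threshold should leave generous headroom.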