action #166748
open [MinimalVM] VMware images not handling hdd subfolders
0%
Description
VMware jobs fail to pick up images from the hdd/fixed folder, although they pass in some cases when the assets are already present on the worker.
The main problem arises in https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/31dd5c1676685a016198cb1adaac265ef5b48be5/tests/installation/bootloader_svirt.pm#L143 and the lines that follow, where handling of subfolders is not properly implemented. This should be done in the backend.
Original ticket
This poo is a follow-up to issues noted in poo#162941: note-14, note-17, note-24.
In particular:
A) During the SL Micro 6.0 Product Increments VMware test runs, in the execution of bootloader_svirt.pm (osado), the full path of the VMware image is not passed to the routines managing that file. Only the basename is passed, and the path is statically recomposed assuming no subfolders under hdd, whereas those images are in hdd/fixed.
Therefore the bash snippet fails to find the source file when the basename is not already present in the expected destination folder of the SUT.
B) Moreover, in subsequent runs of those tests, bootloader_svirt.pm passed, as in 15400838, because the snippet found the VMware image already in the right place (transferred there by some unknown or manual operation) and skipped the copy command.
However, if that image stays in the expected place and is never cleaned up, updated images from new builds would never be transferred and therefore never tested.
A possible fix is to apply the following steps in sequence:
- Update the code in (A), ensuring that the correct full-path image is provided as the origin in _copy_image_vmware.
- Define a lock-file policy for the VMware image (using pre/post_run), to allow transferring that file while preventing it from being cleaned or overwritten by other similar tests running at the same time.
- Update the cleanup, as in item n.2 of note-14.
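Case (B) can be illustrated with a small check that decides whether the copy in the datastore needs refreshing. This is only a sketch under assumptions: the helper name image_needs_copy is hypothetical and not part of os-autoinst, and a byte-wise cmp on multi-gigabyte images would probably be too slow in practice, where a size or timestamp comparison might be preferred.

```shell
# Hypothetical helper (not part of os-autoinst): succeed (exit 0) when the
# destination image is missing or differs from the source, i.e. a copy is needed.
image_needs_copy() {
    src=$1
    dest=$2
    # Missing destination: a copy is needed.
    test -e "$dest" || return 0
    # cmp -s exits 0 when the files are identical, so identical means "no copy".
    if cmp -s "$src" "$dest"; then
        return 1
    fi
    return 0
}
```

With a check of this kind, an image left over from an older build would no longer silently shadow a newer one with the same file name.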
Updated by mdati 4 months ago
- Assignee set to mdati
With reference to case (A), the subroutine calls are:
bootloader_svirt.pm
-> add_disk($self, $args): @args also contains the original image full path, but it is not used;
-> _copy_image_to_vm_host($args,...): @args still contains the original image full path, but only the basename is passed on;
-> _copy_image_vmware(...,$file_basename,...): only the basename is received; the image path is hard-coded and partially recalculated, without any subfolder handling.
Therefore images in subfolders of hdd/ (or iso/) are not correctly handled by the copy command.
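The loss of the subfolder along this call chain can be sketched in shell. The helper name rel_asset_path and the asset root used in the example are assumptions for illustration only, not the actual os-autoinst code.

```shell
# Hypothetical helper: return the image path relative to an assumed asset root,
# keeping any subfolder (e.g. fixed/) instead of reducing it to the basename.
rel_asset_path() {
    root=${1%/}   # drop a trailing slash from the root, if any
    full=$2
    case "$full" in
        "$root"/*) printf '%s\n' "${full#"$root"/}" ;;
        *)         printf '%s\n' "${full##*/}" ;;   # outside the root: fall back to the basename
    esac
}

# Assumed example paths, for illustration only.
img=/var/lib/openqa/share/factory/hdd/fixed/SL-Micro.x86_64-6.0-Base-VMware-GM.vmdk
echo "basename only: ${img##*/}"
echo "relative path: $(rel_asset_path /var/lib/openqa/share/factory/hdd "$img")"
```

The first echo shows what the final routine currently receives; the second keeps the fixed/ component that the copy command would need.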
A fix for item (1) has been proposed in PR https://github.com/os-autoinst/os-autoinst/pull/2542.
Updated by ph03nix 4 months ago
Notes for myself to help me understand what's going on here:
- https://openqa.suse.de/tests/15040545/#step/bootloader_svirt/16 - Can't copy VMware image SL-Micro.x86_64-6.0-Base-VMware-GM.vmdk.
Assets are being copied to VMWARE_DATASTORE. The failure looks to me like a race condition, either on the host itself or when multiple workers are involved.
Updated by ph03nix 4 months ago
- Related to action #162941: Add job group definitions for SLEM 6.0 to QAC-yaml added
Updated by mdati 4 months ago · Edited
A note adding some more detail on the issue, beyond the description in the ticket header:
in os-autoinst consoles/sshVirtsh.pm, the _copy_image_vmware subroutine runs the following shell script, built from input parameters that contain only the file basename. When the VMware image file already exists in the datastore, the (1) if-block is triggered:
...
my $cmd =
"$ds_debug if test -e $vmware_openqa_datastore$file_basename; then " .
"while lsof | grep 'cp.*$file_basename'; do " .
"echo File $file_basename is being copied by other process, sleeping for 60 seconds; sleep 60;" .
'done;' .
which copies nothing (it only waits while another process is still copying the file); otherwise the (2) else-block runs:
'else ' .
"cp /vmfs/volumes/$vmware_nfs_datastore/$nfs_dir/$file_basename $vmware_openqa_datastore;" .
'fi;';
...
which should transfer the image file from the source to the datastore.
In only one old test was the file not found, so block (2) was triggered; but since the source file is in the hdd/fixed/ subfolder, the cp command failed because it expected the file in hdd/.
For some unclear reason, in all subsequent test runs the image file was, and still is, always present in the datastore (maybe copied manually), as confirmed by setting VMWARE_NFS_DATASTORE_DEBUG=1, which enables set -x. Therefore only (1) is executed, no matter where the source image is.
Also, judging by the structure of the cleanup phase, the file is never deleted.
This is because (A) the image-file parameter is passed to the final routine without the original full path, only the basename, and (B) the file is never cleaned up.
Hence the 3 steps proposed in the ticket header.
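As a rough illustration of how blocks (1) and (2) would behave if the subfolder were preserved, here is a minimal local sketch. It is not the script from sshVirtsh.pm: the function name copy_image_if_missing is hypothetical, plain directories stand in for the NFS share and the datastore, and the lsof wait loop is left out.

```shell
# Hypothetical sketch of the (1)/(2) logic with the subfolder preserved.
# $1: source root (stands in for the hdd directory on the NFS datastore)
# $2: image path relative to that root, e.g. fixed/image.vmdk
# $3: destination directory (stands in for the openQA datastore)
copy_image_if_missing() {
    src_root=$1
    rel_path=$2
    dest_dir=$3
    base=${rel_path##*/}
    if test -e "$dest_dir/$base"; then
        # (1) image already in the datastore: nothing is copied
        echo "$base already present in $dest_dir"
    else
        # (2) copy from the full source path, subfolder included
        cp "$src_root/$rel_path" "$dest_dir/"
    fi
}
```

Called with fixed/SL-Micro.x86_64-6.0-Base-VMware-GM.vmdk as the second argument, block (2) would look for the image under hdd/fixed/ rather than directly under hdd/.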
Updated by mdati 3 months ago · Edited
In a test cloned with only VMWARE_NFS_DATASTORE_DEBUG=1 added, to enable set -x, we can better see a fault like the one in the (2) else-block of note#7, caused by the image not being found:
https://openqa.suse.de/tests/15563772/logfile?filename=autoinst-log.txt#line-266
[2024-09-30T14:44:11.715655Z] [debug] [pid:124900] [run_ssh_cmd(set -x; if test -e /vmfs/volumes/datastore1/openQA/SL-Micro.x86_64-6.0-Base-VMware-GM.vmdk; then while lsof | grep 'cp.*SL-Micro.x86_64-6.0-Base-VMware-GM.vmdk'; do echo File SL-Micro.x86_64-6.0-Base-VMware-GM.vmdk is being copied by other process, sleeping for 60 seconds; sleep 60;done;else cp /vmfs/volumes/openqa/hdd/SL-Micro.x86_64-6.0-Base-VMware-GM.vmdk /vmfs/volumes/datastore1/openQA/;fi;)] stderr:
+ test -e /vmfs/volumes/datastore1/openQA/SL-Micro.x86_64-6.0-Base-VMware-GM.vmdk
+ cp /vmfs/volumes/openqa/hdd/SL-Micro.x86_64-6.0-Base-VMware-GM.vmdk /vmfs/volumes/datastore1/openQA/
cp: can't stat '/vmfs/volumes/openqa/hdd/SL-Micro.x86_64-6.0-Base-VMware-GM.vmdk': No such file or directory
It is the same fault as in https://openqa.suse.de/tests/15040545/logfile?filename=autoinst-log.txt#line-507, where debug was not enabled and we see only the last line, the cp error.
This shows why the PR https://github.com/os-autoinst/os-autoinst/pull/2542 is needed.
Updated by ph03nix 3 months ago
- Project changed from Containers and images to 208
- Subject changed from VMware images not handling hdd iso subfoldes to VMware images not handling hdd subfoldes
- Description updated (diff)
- Status changed from Blocked to Workable
- Assignee deleted (mdati)
- Priority changed from Normal to Low
Refining the ticket, lowering the priority, and moving it to the MinimalVM project.