action #112742
closed[tools] aarch64 - qemu-img: /var/lib/openqa/pool/14/raid/hd0-overlay0: Image is not in qcow2 format
Description
Many aarch64 jobs that are using qcow2 images are failing with the following error:
[2022-06-20T12:00:57.994995+02:00] [info] ::: backend::baseclass::die_handler: Backend process died, backend errors are reported below in the following lines:
runcmd '/usr/bin/qemu-img create -f qcow2 -F qcow2 -b /var/lib/openqa/pool/14/SLES-15-SP1-aarch64-mru-install-minimal-with-addons-Build20220619-1-Server-DVD-Updates-aarch64-virtio.qcow2 /var/lib/openqa/pool/14/raid/hd0-overlay0 1016659968' failed with exit code 1: 'qemu-img: /var/lib/openqa/pool/14/raid/hd0-overlay0: Image is not in qcow2 format
Could not open backing image.' at /usr/lib/os-autoinst/osutils.pm line 89.
https://openqa.suse.de/tests/8985743
https://openqa.suse.de/tests/8985737
https://openqa.suse.de/tests/8985736
https://openqa.suse.de/tests/8985742
https://openqa.suse.de/tests/8985741
https://openqa.suse.de/tests/8985740
https://openqa.suse.de/tests/8985739
Updated by szarate over 2 years ago
- Project changed from openQA Infrastructure to openQA Project
- Subject changed from aarch64 - qemu-img: /var/lib/openqa/pool/14/raid/hd0-overlay0: Image is not in qcow2 format to [tools] aarch64 - qemu-img: /var/lib/openqa/pool/14/raid/hd0-overlay0: Image is not in qcow2 format
It doesn't happen all the time, but images often get mangled during the upload phase. I can't recall how the worker code handles the image upload, but an extra integrity check on the webui side before marking the job as passed wouldn't hurt.
Updated by okurz over 2 years ago
- Related to action #109319: [qe-core] aarch64 tests failing in qemu-img due to broken image (was: "with cache error") size:S added
Updated by okurz over 2 years ago
- Description updated (diff)
- Category set to Regressions/Crashes
- Target version set to future
Might be related to #109319. With https://github.com/os-autoinst/openQA/pull/4597 we should already be able to prevent corrupted assets being uploaded.
I checked on openqaworker-arm-2 with
find /var/lib/openqa/cache -name 'SLES-15-SP1-aarch64-mru-install-minimal-with-addons-Build20220619-1-Server-DVD-Updates-aarch64-virtio.qcow2'
file /var/lib/openqa/cache/openqa.suse.de/SLES-15-SP1-aarch64-mru-install-minimal-with-addons-Build20220619-1-Server-DVD-Updates-aarch64-virtio.qcow2
sha256sum /var/lib/openqa/cache/openqa.suse.de/SLES-15-SP1-aarch64-mru-install-minimal-with-addons-Build20220619-1-Server-DVD-Updates-aarch64-virtio.qcow2
and got
/var/lib/openqa/cache/openqa.suse.de/SLES-15-SP1-aarch64-mru-install-minimal-with-addons-Build20220619-1-Server-DVD-Updates-aarch64-virtio.qcow2: QEMU QCOW Image (v3), 42949672960 bytes
67b3e8f3dafad6d5230160f5204afa4008ddf1914d337e73812efe2004477cd4 /var/lib/openqa/cache/openqa.suse.de/SLES-15-SP1-aarch64-mru-install-minimal-with-addons-Build20220619-1-Server-DVD-Updates-aarch64-virtio.qcow2
Note that the file is newer than the failed jobs: it is dated Jun 20 14:20, after the jobs ran.
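The `file` check above can also be scripted so that it is easy to run over many cached assets. The following is only a minimal sketch, not something openQA or os-autoinst provides: the function name `qcow2_version` is made up for illustration. It relies on the qcow2 header starting with the magic bytes `QFI\xfb` followed by a big-endian 32-bit version number (2 or 3 for qcow2).

```python
import struct

# Magic bytes at the start of a QCOW image header.
QCOW_MAGIC = b"QFI\xfb"


def qcow2_version(path):
    """Return the qcow2 header version (2 or 3), or None if the file
    does not look like a qcow2 image.

    Hypothetical helper for illustration; it only inspects the first
    8 bytes and does not validate the rest of the image.
    """
    with open(path, "rb") as fh:
        header = fh.read(8)
    if len(header) < 8 or header[:4] != QCOW_MAGIC:
        return None
    # The version field is a big-endian unsigned 32-bit integer
    # stored right after the magic bytes.
    (version,) = struct.unpack(">I", header[4:8])
    # Version 1 is the older qcow format, which qemu-img would not
    # accept as "-F qcow2" either.
    return version if version in (2, 3) else None
```

A `None` result for a cached image would point at a truncated or overwritten file. A valid header does not prove the rest of the image is intact, so this complements a checksum comparison rather than replacing it.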
I think we should have checksums uploaded as assets as well and use them to check integrity for each transfer process.
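To illustrate the idea, here is a rough sketch of what such a check could look like. It is not how openQA handles assets today: the `.sha256` sidecar file name and the function names are assumptions made up for this example. The sidecar uses the same `<hash>  <name>` layout that `sha256sum` prints, so it could also be checked with `sha256sum -c`.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks
    so that multi-gigabyte images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksum_file(asset):
    """Write a sidecar '<asset>.sha256' next to the asset (hypothetical
    naming convention) and return its path."""
    asset = Path(asset)
    sidecar = asset.with_name(asset.name + ".sha256")
    sidecar.write_text(f"{sha256_of(asset)}  {asset.name}\n")
    return sidecar


def verify_asset(asset):
    """Return True if the asset still matches its sidecar checksum."""
    asset = Path(asset)
    sidecar = asset.with_name(asset.name + ".sha256")
    expected = sidecar.read_text().split()[0]
    return sha256_of(asset) == expected
```

The producing job would call `write_checksum_file` once after creating the image, and each later step (upload to the webui, download into the worker cache) would call `verify_asset` on its copy. The first step where verification fails narrows down where the corruption happens.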
@punkioudi @szarate if you see more cases I suggest you check the checksums before anything or anyone overwrites the files so that we can narrow down where the problem happens.
Updated by szarate almost 2 years ago
- Description updated (diff)
- Status changed from New to Rejected
Rejecting for now; this has not been seen in a while (at least not by me).