action #99336
closed[qe-core][migration] test fails in bootloader_zkvm - no space on openqa s390x worker auto_review:"rsync: write failed on.*/var/lib/libvirt/images/.*s390x.*No space left on device":retry
0%
Description
Observation
Found by Matthias:
There are 4 .img files for each instance. This was not the case before and could be the culprit here. Why are 4 disks created for each job?
So we need to figure out why this happens.
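One way to check the observation is to count the .img files in the image pool. This is a minimal sketch, assuming the images are stored flat in the pool directory. The helper name count_img_files is hypothetical, and because the per-instance file naming is not stated in this ticket, the sketch only reports a total.

```shell
#!/bin/sh
# Hypothetical helper: count the *.img files directly inside a directory.
# Assumption: images are stored flat in the pool directory (no subdirectories).
count_img_files() {
    dir=${1:-/var/lib/libvirt/images}
    find "$dir" -maxdepth 1 -type f -name '*.img' | wc -l | tr -d ' '
}

# Example call on the worker: count_img_files /var/lib/libvirt/images
```

Dividing the total by the number of worker instances on the host would give a rough files-per-instance figure to compare against the 4 reported above.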
openQA test in scenario sle-15-SP4-Migration-from-SLE12-SPx-s390x-offline_sles12sp5_pscc_base_all_minimal@s390x-kvm-sle12 fails in
bootloader_zkvm
Test suite description
The base test suite is used for job templates defined in YAML documents. It has no settings of its own.
Reproducible
Fails since (at least) Build 38.1
Expected result
Last good: 36.1 (or more recent)
Further details
Always latest result in this scenario: latest
Updated by leli over 2 years ago
- Subject changed from [qe-core][migration] test fails in bootloader_zkvm - no space on openqa worker to [qe-core][migration] test fails in bootloader_zkvm - no space on openqa s390x worker
Updated by mgriessmeier over 2 years ago
First suggestion: increase the disk size to mitigate the current bottleneck. I'll take care of that on the mainframe side.
Second, we should find out where those .img files come from and whether they are really needed.
Updated by nicksinger over 2 years ago
- Status changed from New to In Progress
- Assignee set to nicksinger
Disks got resized on the Z side. /usr/bin/rescan-scsi-bus.sh -s
reports a resize from 200G to 400G.
Updated by vsvecova over 2 years ago
We are encountering failures in Maintenance jobs that seem very similar to the one you described, such as this one:
https://openqa.suse.de/tests/7239697#step/bootloader_start/19
Do you think it could be a similar issue?
Updated by nicksinger over 2 years ago
- Status changed from In Progress to Feedback
Steps to enlarge:
- stop workers on related jump-host (grenache-1)
umount /var/lib/libvirt/images
multipathd resize map 36005076307ffd3b30000000000000148
fdisk /dev/mapper/36005076307ffd3b30000000000000148
(delete partition, create new with max size)
partprobe
e2fsck -f /dev/mapper/36005076307ffd3b30000000000000148-part1
resize2fs /dev/mapper/36005076307ffd3b30000000000000148-part1
mount /var/lib/libvirt/images
- start workers on grenache-1 again
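The steps above can be sketched as a dry-run helper that only prints the command sequence for a given multipath WWID and mount point. This is an illustration, not the script that was actually run: the function name print_enlarge_plan is hypothetical, the resize step is written in its multipathd form, and the fdisk step stays manual because it is interactive.

```shell
#!/bin/sh
# Dry-run sketch: print the enlargement steps for a multipath WWID.
# Nothing is executed here; review the output before running anything by hand.
print_enlarge_plan() {
    wwid=$1
    mnt=${2:-/var/lib/libvirt/images}
    dev=/dev/mapper/$wwid
    cat <<EOF
# stop workers on the jump host first
umount $mnt
multipathd resize map $wwid
fdisk $dev   # manual: delete partition, create new with max size
partprobe
e2fsck -f ${dev}-part1
resize2fs ${dev}-part1
mount $mnt
# start workers again
EOF
}

print_enlarge_plan 36005076307ffd3b30000000000000148
```

Printing instead of executing keeps the destructive partition step under manual control.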
Updated by nicksinger over 2 years ago
vsvecova wrote:
We are encountering failures in Maintenance jobs that seem very similar to the one you described, such as this one:
https://openqa.suse.de/tests/7239697#step/bootloader_start/19
Do you think it could be a similar issue?
Likely. Both disks are enlarged now, so you could retrigger to see if the issue still persists.
Updated by okurz over 2 years ago
- Subject changed from [qe-core][migration] test fails in bootloader_zkvm - no space on openqa s390x worker to [qe-core][migration] test fails in bootloader_zkvm - no space on openqa s390x worker auto_review:"rsync: write failed on.*/var/lib/libvirt/images/.*s390x.*No space left on device":retry
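To see what the auto_review pattern in the new subject would catch, it can be tried against a log line with grep. This is a rough sketch: the sample line is made up for illustration, the helper name matches_auto_review is hypothetical, and grep -E is only an approximation of however the auto-review tooling evaluates the expression.

```shell
#!/bin/sh
# Pattern copied from the ticket subject.
pattern='rsync: write failed on.*/var/lib/libvirt/images/.*s390x.*No space left on device'

# Return 0 if the given log text matches the auto_review pattern.
matches_auto_review() {
    printf '%s\n' "$1" | grep -Eq "$pattern"
}

# Hypothetical log line, made up for illustration (not taken from a real job).
sample='rsync: write failed on "/var/lib/libvirt/images/example-s390x.qcow2": No space left on device (28)'

if matches_auto_review "$sample"; then
    echo "match"
else
    echo "no match"
fi
```

For the sample line this prints "match". A line without the s390x part or without the image path would not match.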
Updated by nicksinger over 2 years ago
- Status changed from Feedback to Resolved
Checked the most recent jobs on each worker instance manually. Workers seem to be able to complete jobs successfully, so we can assume the extension worked.
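As a complement to the manual check, the fill level of the image filesystem could be read from df. This is a minimal sketch: the helper names fs_use_percent and check_space are hypothetical, and the default threshold of 90 is an arbitrary example value.

```shell
#!/bin/sh
# Hypothetical helper: print the usage percentage of the filesystem holding a path.
# Uses the POSIX output format of df, where column 5 is the capacity in percent.
fs_use_percent() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Warn when usage reaches a threshold (90 is an arbitrary example value).
check_space() {
    used=$(fs_use_percent "$1")
    if [ "$used" -ge "${2:-90}" ]; then
        echo "WARNING: $1 is ${used}% full"
        return 1
    fi
    echo "OK: $1 is ${used}% full"
}

# Example call on the worker: check_space /var/lib/libvirt/images
```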