action #110725
coordination #80142: [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containers on kubernetes
coordination #81060: [epic] openQA web UI in kubernetes
Unexpected behavior for cache service under k3s when the CACHE_MIN_FREE_PERCENTAGE is set size:M
Start date: 2022-05-06
Due date:
% Done: 0%
Estimated time:
Description
Observation
If I set CACHE_MIN_FREE_PERCENTAGE = 10 in workers.ini and the cache service is (probably) unable to determine the remaining free space (which might be a bug in its own right), each downloaded asset is deleted before the next asset is downloaded, so only the most recently downloaded asset remains in the cache.
Note that this behavior was observed on a worker running inside a k3s cluster.
<5>[535] [i] Downloading: "openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2.sha256"
<5>[535] [i] Cache size 0 Byte + needed 0 Byte exceeds limit of 50 GiB, purging least used assets
<5>[535] [i] Cache size of "/var/lib/openqa/cache" is 0 Byte, with limit 50 GiB
<5>[535] [i] Downloading "openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2.sha256" from "http://polaris.suse.cz/tests/2424/asset/other/openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2.sha256"
<5>[535] [i] Cache size 0 Byte + needed 132 Byte exceeds limit of 50 GiB, purging least used assets
<5>[535] [i] Size of "/var/lib/openqa/cache/polaris.suse.cz/openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2.sha256" is 132 Byte, with ETag ""84-5de56a8b21945""
<5>[535] [i] Download of "/var/lib/openqa/cache/polaris.suse.cz/openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2.sha256" successful (4 KiB/s), new cache size is 132 Byte
<5>[535] [i] Finished download
<5>[536] [i] Downloading: "openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2"
<5>[536] [i] Cache size 0 Byte + needed 0 Byte exceeds limit of 50 GiB, purging least used assets
<5>[536] [i] Purging "/var/lib/openqa/cache/polaris.suse.cz/openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2.sha256" because we need space for new assets, reclaiming 132 Byte
<5>[536] [i] Cache size of "/var/lib/openqa/cache" is 0 Byte, with limit 50 GiB
<5>[536] [i] Downloading "openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2" from "http://polaris.suse.cz/tests/2424/asset/hdd/openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2"
<5>[536] [i] Cache size 0 Byte + needed 406 MiB exceeds limit of 50 GiB, purging least used assets
<5>[536] [i] Size of "/var/lib/openqa/cache/polaris.suse.cz/openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2" is 406 MiB, with ETag ""19640000-5de56a8af0c05""
<5>[536] [i] Download of "/var/lib/openqa/cache/polaris.suse.cz/openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2" successful (110 MiB/s), new cache size is 406 MiB
<5>[536] [i] Finished download
<5>[537] [i] Downloading: "ignition.qcow2"
<5>[537] [i] Cache size 406 MiB + needed 0 Byte exceeds limit of 50 GiB, purging least used assets
<5>[537] [i] Purging "/var/lib/openqa/cache/polaris.suse.cz/openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2" because we need space for new assets, reclaiming 406 MiB
<5>[537] [i] Cache size of "/var/lib/openqa/cache" is 0 Byte, with limit 50 GiB
<5>[537] [i] Downloading "ignition.qcow2" from "http://polaris.suse.cz/tests/2424/asset/hdd/ignition.qcow2"
<5>[537] [i] Cache size 0 Byte + needed 896 KiB exceeds limit of 50 GiB, purging least used assets
<5>[537] [i] Size of "/var/lib/openqa/cache/polaris.suse.cz/ignition.qcow2" is 896 KiB, with ETag ""e0000-5de56a7329d65""
<5>[537] [i] Download of "/var/lib/openqa/cache/polaris.suse.cz/ignition.qcow2" successful (21 MiB/s), new cache size is 896 KiB
<5>[537] [i] Finished download
# kubectl version
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.6+k3s1", GitCommit:"418c3fa858b69b12b9cefbcff0526f666a6236b9", GitTreeState:"clean", BuildDate:"2022-04-28T22:16:18Z", GoVersion:"go1.17.5", Compiler:"gc", Platform:"linux/amd64"}
# grep openqa /proc/mounts
/dev/sda2 /var/lib/openqa/cache btrfs rw,relatime,space_cache,subvolid=258,subvol=/@/var/lib/rancher/k3s/storage/pvc-05687443-98c4-451c-a404-74a10b6d2fa2_default_worker-cache 0 0
# df -h /var/lib/openqa/cache/
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 932G 14G 917G 2% /var/lib/openqa/cache
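One possible reading of the log above is that the purge check combines the size limit with the minimum-free-space requirement, and that the free-space lookup returns a value that always fails that requirement. The Python sketch below models only this reading. It is not the cache service's actual code (openQA is written in Perl); `needs_purge` and all of its parameters are hypothetical names, and treating an unknown free space as 0% is an assumption made to reproduce the symptom.

```python
from typing import Optional

GIB = 1024 ** 3


def needs_purge(cache_size: int, needed: int, limit: int,
                min_free_percentage: Optional[float],
                free_percentage: Optional[float]) -> bool:
    """Return True if assets would be purged before storing `needed` bytes."""
    # Ordinary size check: the cache must not grow beyond its limit.
    if cache_size + needed > limit:
        return True
    # Without a configured minimum the free-space feature is inactive.
    if min_free_percentage is None:
        return False
    # ASSUMPTION: free space that cannot be determined is treated as 0 %.
    effective_free = 0.0 if free_percentage is None else free_percentage
    return effective_free < min_free_percentage


# A 132 Byte checksum file and a 50 GiB limit, as in the log excerpt.
print(needs_purge(0, 132, 50 * GIB, 10, None))   # unknown free space
print(needs_purge(0, 132, 50 * GIB, 10, 98.0))   # roughly what df reports
print(needs_purge(0, 132, 50 * GIB, None, None)) # option not set
```

Under this model the check asks for a purge on every download while the free space is unknown, which would match the "exceeds limit" messages for a nearly empty cache. With the roughly 98% free space that `df` reports (917G of 932G available), the same check would not trigger.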
Acceptance criteria
- AC1: Handle the case gracefully in the cache service
Suggestion
- Wait for Jan to prepare a nice way to replicate the problem
- If the free disk space cannot be determined, disable the feature with a warning
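The second suggestion could look roughly like the following Python sketch. It only illustrates the idea and is not the cache service's implementation (which is Perl); `free_percentage`, `effective_min_free` and the logger name are hypothetical, and using `os.statvfs` (Unix only) to query the filesystem is an assumption.

```python
import logging
import os
from typing import Optional

log = logging.getLogger("cache-sketch")  # hypothetical logger name


def free_percentage(path: str) -> Optional[float]:
    """Return the free space of the filesystem holding `path` in percent,
    or None if it cannot be determined."""
    try:
        st = os.statvfs(path)
    except OSError:
        return None
    if st.f_blocks == 0:
        # Some filesystems report no block count; treat that as unknown.
        return None
    return 100.0 * st.f_bavail / st.f_blocks


def effective_min_free(path: str, configured: Optional[float]) -> Optional[float]:
    """Return the minimum free percentage to enforce, or None to disable it."""
    if configured is None:
        return None  # feature not enabled in the configuration
    if free_percentage(path) is None:
        log.warning("Cannot determine free disk space of %s, "
                    "ignoring CACHE_MIN_FREE_PERCENTAGE", path)
        return None
    return configured
```

With a helper like this, the purge logic would fall back to the plain size limit whenever the lookup fails, instead of evicting assets on every download.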
Workaround
- Don't use CACHE_MIN_FREE_PERCENTAGE (the feature is not enabled without it)