action #110725

coordination #80142: [saga][epic] Scale out: Redundant/load-balancing deployments of openQA, easy containers, containers on kubernetes

coordination #81060: [epic] openQA web UI in kubernetes

Unexpected behavior for cache service under k3s when the CACHE_MIN_FREE_PERCENTAGE is set size:M

Added by jbaier_cz about 2 months ago. Updated about 2 months ago.

Status: Workable
Priority: Low
Assignee: -
Category: Concrete Bugs
Target version:
Start date: 2022-05-06
Due date:
% Done: 0%
Estimated time:
Difficulty:

Description

Observation

If I set CACHE_MIN_FREE_PERCENTAGE = 10 in workers.ini and the cache service is (probably) unable to determine the remaining disk space (which might be a bug of its own), each downloaded asset is purged before the next one is downloaded, so effectively only the last asset remains in the cache. In the log below, purging is triggered even when the reported cache size plus the needed size is far below the 50 GiB limit.

Note that this behavior was observed on a worker running inside a k3s cluster.

<5>[535] [i] Downloading: "openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2.sha256"
<5>[535] [i] Cache size 0 Byte + needed 0 Byte exceeds limit of 50 GiB, purging least used assets
<5>[535] [i] Cache size of "/var/lib/openqa/cache" is 0 Byte, with limit 50 GiB
<5>[535] [i] Downloading "openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2.sha256" from "http://polaris.suse.cz/tests/2424/asset/other/openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2.sha256"
<5>[535] [i] Cache size 0 Byte + needed 132 Byte exceeds limit of 50 GiB, purging least used assets
<5>[535] [i] Size of "/var/lib/openqa/cache/polaris.suse.cz/openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2.sha256" is 132 Byte, with ETag ""84-5de56a8b21945""
<5>[535] [i] Download of "/var/lib/openqa/cache/polaris.suse.cz/openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2.sha256" successful (4 KiB/s), new cache size is 132 Byte
<5>[535] [i] Finished download
<5>[536] [i] Downloading: "openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2"
<5>[536] [i] Cache size 0 Byte + needed 0 Byte exceeds limit of 50 GiB, purging least used assets
<5>[536] [i] Purging "/var/lib/openqa/cache/polaris.suse.cz/openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2.sha256" because we need space for new assets, reclaiming 132 Byte
<5>[536] [i] Cache size of "/var/lib/openqa/cache" is 0 Byte, with limit 50 GiB
<5>[536] [i] Downloading "openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2" from "http://polaris.suse.cz/tests/2424/asset/hdd/openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2"
<5>[536] [i] Cache size 0 Byte + needed 406 MiB exceeds limit of 50 GiB, purging least used assets
<5>[536] [i] Size of "/var/lib/openqa/cache/polaris.suse.cz/openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2" is 406 MiB, with ETag ""19640000-5de56a8af0c05""
<5>[536] [i] Download of "/var/lib/openqa/cache/polaris.suse.cz/openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2" successful (110 MiB/s), new cache size is 406 MiB
<5>[536] [i] Finished download
<5>[537] [i] Downloading: "ignition.qcow2"
<5>[537] [i] Cache size 406 MiB + needed 0 Byte exceeds limit of 50 GiB, purging least used assets
<5>[537] [i] Purging "/var/lib/openqa/cache/polaris.suse.cz/openSUSE-MicroOS.x86_64-16.0.0-kvm-and-xen-Snapshot20220505.qcow2" because we need space for new assets, reclaiming 406 MiB
<5>[537] [i] Cache size of "/var/lib/openqa/cache" is 0 Byte, with limit 50 GiB
<5>[537] [i] Downloading "ignition.qcow2" from "http://polaris.suse.cz/tests/2424/asset/hdd/ignition.qcow2"
<5>[537] [i] Cache size 0 Byte + needed 896 KiB exceeds limit of 50 GiB, purging least used assets
<5>[537] [i] Size of "/var/lib/openqa/cache/polaris.suse.cz/ignition.qcow2" is 896 KiB, with ETag ""e0000-5de56a7329d65""
<5>[537] [i] Download of "/var/lib/openqa/cache/polaris.suse.cz/ignition.qcow2" successful (21 MiB/s), new cache size is 896 KiB
<5>[537] [i] Finished download
# kubectl version
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.6+k3s1", GitCommit:"418c3fa858b69b12b9cefbcff0526f666a6236b9", GitTreeState:"clean", BuildDate:"2022-04-28T22:16:18Z", GoVersion:"go1.17.5", Compiler:"gc", Platform:"linux/amd64"}

#  grep openqa /proc/mounts
/dev/sda2 /var/lib/openqa/cache btrfs rw,relatime,space_cache,subvolid=258,subvol=/@/var/lib/rancher/k3s/storage/pvc-05687443-98c4-451c-a404-74a10b6d2fa2_default_worker-cache 0 0

#  df -h /var/lib/openqa/cache/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2       932G   14G  917G   2% /var/lib/openqa/cache
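
For reference, the `df` figures above correspond to roughly 98% free space, far above the configured 10% threshold, so a correct free-space reading should not trigger purging here. A minimal Python sketch of that check follows. `free_percentage` and `probe_free_percentage` are hypothetical helper names used only for illustration; they are not part of openQA, and the cache service may determine free space differently.

```python
import shutil


def free_percentage(avail, total):
    """Return the share of free space in percent, or None if it cannot be determined."""
    if total <= 0:
        return None
    return 100.0 * avail / total


def probe_free_percentage(path):
    """Probe a mounted path; returns None when the filesystem cannot be queried."""
    try:
        usage = shutil.disk_usage(path)
    except OSError:
        return None
    return free_percentage(usage.free, usage.total)


# Figures taken from the `df -h` output above: 917G available out of 932G.
pct = free_percentage(917, 932)
print(f"{pct:.1f}% free")  # prints "98.4% free"
```

Running `probe_free_percentage("/var/lib/openqa/cache")` inside the worker container would show whether the mount reports a plausible value there.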

Acceptance criteria

  • AC1: Handle the case gracefully in the cache service

Suggestion

  • Wait for Jan to prepare a nice way to replicate the problem
  • If the free disk space cannot be determined, disable the feature with a warning
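
The second suggestion could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the cache service's actual code: the function name `exceeds_limit`, its parameters, and the convention that `free_percentage` is `None` when the free space cannot be determined are all hypothetical.

```python
import logging

log = logging.getLogger("cache-sketch")


def exceeds_limit(cache_size, needed, limit,
                  min_free_percentage=None, free_percentage=None):
    """Decide whether assets must be purged before storing `needed` more bytes.

    `free_percentage` is None when the free disk space could not be determined.
    """
    # The size limit always applies.
    if cache_size + needed > limit:
        return True
    # Without CACHE_MIN_FREE_PERCENTAGE the free-space check is not enabled.
    if min_free_percentage is None:
        return False
    # Unknown free space: skip the check with a warning instead of purging.
    if free_percentage is None:
        log.warning("Cannot determine free disk space, "
                    "ignoring CACHE_MIN_FREE_PERCENTAGE")
        return False
    return free_percentage < min_free_percentage


GIB = 1024 ** 3
# Values from the log above: empty cache, 132 Byte needed, 50 GiB limit.
print(exceeds_limit(0, 132, 50 * GIB, min_free_percentage=10))
```

With the free space unknown, the call above reports that no purge is needed, so previously downloaded assets would stay in the cache.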

Workaround

  • Don't use CACHE_MIN_FREE_PERCENTAGE (the feature is not enabled without it)

Related issues

Related to openQA Project - action #110524: [timeboxed:20h][spike] openQA proof-of-concept within kubernetes size:M (Resolved, 2022-05-02)

History

#1 Updated by jbaier_cz about 2 months ago

  • Related to action #110524: [timeboxed:20h][spike] openQA proof-of-concept within kubernetes size:M added

#2 Updated by okurz about 2 months ago

  • Parent task set to #81060

#3 Updated by cdywan about 2 months ago

  • Subject changed from Unexpected behavior for cache service under k3s when the CACHE_MIN_FREE_PERCENTAGE is set to Unexpected behavior for cache service under k3s when the CACHE_MIN_FREE_PERCENTAGE is set size:M
  • Description updated
  • Status changed from New to Workable

#4 Updated by okurz about 2 months ago

  • Target version changed from Ready to future
