action #121573

Updated by okurz over 1 year ago

## Observation 
 Martin (mdoucha) pointed out an interesting case ( https://suse.slack.com/archives/C02CANHLANP/p1670328701486539 ) where an HDD file was deleted while a job was running: https://openqa.suse.de/tests/10089958#step/oom03/5 
 I found a similar problem described in #64544 ( https://progress.opensuse.org/issues/64544 ), but that ticket is mainly about "pending" jobs. However, the related change in https://github.com/os-autoinst/openQA/pull/2918/files states that we "Consider all jobs which are not done or cancelled as pending", which would include running jobs. 

 With `journalctl --since=today -u openqa-worker-cacheservice-minion.service | grep -C 100 "14-819.1.g3e6aee2-Server"` I tried to find out when the asset was deleted (and why, most likely because the cache was full), but I only found the point at which openQA/cacheservice downloaded it: 

 ``` 
 Dec 06 07:58:11 powerqaworker-qam-1 openqa-worker-cacheservice-minion[101810]: [101810] [i] Downloading "sle-12-SP5-ppc64le-4.12.14-819.1.g3e6aee2-Server-DVD-Incidents-Kernel-KOTD@ppc64le-virtio-with-ltp.qcow2" from "http://openqa.suse.de/tests/10089934/asset/hdd/sle-12-SP5-ppc64le-4.12.14-819.1.g3e6aee2-Server-DVD-Incidents-Kernel-KOTD@ppc64le-virtio-with-ltp.qcow2" 
 ``` 

 Afterwards I only see `Downloading: "sle-12-SP5-ppc64le-4.12.14-819.1.g3e6aee2-Server-DVD-Incidents-Kernel-KOTD@ppc64le-virtio-with-ltp.qcow2"` without the HTTP link, so I **assume** this means the asset was still present in the cache (and was reused). 
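To make the distinction above easier to check across a whole journal, here is a minimal Python sketch that separates the two kinds of log lines. The two regular expressions are inferred only from the two samples quoted above, so other message variants from the cache service would go unnoticed. The function name `classify` and the labels `fetch`/`request` are made up for illustration.

```python
import re

# Assumption: these patterns mirror only the two log samples quoted above.
# A line with a source URL is treated as an actual download ("fetch"),
# a line without one as a plain asset request ("request").
FETCH_RE = re.compile(r'Downloading "(?P<asset>[^"]+)" from "(?P<url>[^"]+)"')
REQUEST_RE = re.compile(r'Downloading: "(?P<asset>[^"]+)"')


def classify(lines, needle):
    """Return (timestamp, kind, asset) tuples for log lines containing needle."""
    events = []
    for line in lines:
        if needle not in line:
            continue
        # journalctl's default short format starts with e.g. "Dec 06 07:58:11"
        timestamp = line.strip()[:15]
        match = FETCH_RE.search(line)
        if match:
            events.append((timestamp, "fetch", match.group("asset")))
            continue
        match = REQUEST_RE.search(line)
        if match:
            events.append((timestamp, "request", match.group("asset")))
    return events
```

Fed with the output of the `journalctl` command above (for example by passing `sys.stdin` as `lines`), two `fetch` events for the same asset would suggest that the asset was removed from the cache somewhere in between. That would at least narrow down the time window of the deletion.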

 ## Acceptance criteria 
 * **AC1:** required assets are not deleted while they are in use by running/pending openQA tests 

 ## Suggestions 
 * Take a look at our cacheservice-minion code to find out which conditions must be met for an asset to be deleted
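To illustrate what AC1 could mean for the cleanup logic, here is a minimal Python sketch of an oldest-first eviction that never selects assets still needed by running or pending jobs. This is not the actual cache service code. It assumes that deletion is triggered by a full cache, which is only a guess in this ticket, and `Asset`, `evict`, `limit`, `needed` and `in_use` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    size: int        # bytes
    last_use: float  # timestamp of the last access


def evict(assets, limit, needed, in_use):
    """Pick assets to delete so that `needed` more bytes fit under `limit`.

    Assets whose name is in `in_use` are never selected, even if that means
    the cache cannot be brought below its limit.
    Returns (names_to_delete, fits).
    """
    used = sum(asset.size for asset in assets)
    victims = []
    for asset in sorted(assets, key=lambda a: a.last_use):  # oldest first
        if used + needed <= limit:
            break
        if asset.name in in_use:
            continue  # hypothetical pin: required by a running/pending job
        victims.append(asset.name)
        used -= asset.size
    return victims, used + needed <= limit
```

With such a rule the caller has to handle the case where `fits` is false, for example by delaying the new download, instead of deleting an asset that a job still depends on.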
