action #106898
closed
Protection against asset clobbering
Description
QCOW images in openQA occasionally get corrupted because multiple jobs try to publish the same file at the same time, either due to misconfigured PUBLISH_* settings or duplicate install jobs scheduled in parallel. For example, this job failed to start:
https://openqa.suse.de/tests/8162749
because these three install jobs finished 20 minutes apart and tried to upload the same QCOW image:
https://openqa.suse.de/tests/8162347
https://openqa.suse.de/tests/8161501
https://openqa.suse.de/tests/8160547
Please add some sort of protection against asset clobbering via PUBLISH_* variables:
- two jobs must not publish the same file in parallel
- jobs must not publish a file while another job may be downloading the previous version
PUBLISH_* misconfiguration (e.g. copy-paste mistakes among multiple testsuites) should be detected and reported in the WebUI, for example as the reason why the install job was terminated.
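To illustrate the kind of detection requested above, here is a minimal sketch in Python. It is not openQA code: the job dictionaries, the function name `find_publish_conflicts` and the image file names are hypothetical, and the only assumption taken from this ticket is that published assets are named by settings whose keys start with `PUBLISH_`.

```python
from collections import defaultdict


def find_publish_conflicts(jobs):
    """Return {file name: sorted job ids} for files that more than one job would publish.

    `jobs` is a hypothetical list of dicts with an "id" and a "settings" mapping;
    it does not reflect the actual openQA data model.
    """
    publishers = defaultdict(set)
    for job in jobs:
        for key, value in job["settings"].items():
            # Only PUBLISH_* settings name files that the job uploads
            if key.startswith("PUBLISH_") and value:
                publishers[value].add(job["id"])
    # Keep only file names claimed by two or more distinct jobs
    return {name: sorted(ids) for name, ids in publishers.items() if len(ids) > 1}


# Hypothetical example: jobs 1 and 2 share a target, e.g. after a copy-paste mistake
jobs = [
    {"id": 1, "settings": {"PUBLISH_HDD_1": "textmode.qcow2"}},
    {"id": 2, "settings": {"PUBLISH_HDD_1": "textmode.qcow2"}},
    {"id": 3, "settings": {"PUBLISH_HDD_1": "gnome.qcow2", "HDD_1": "base.qcow2"}},
]
conflicts = find_publish_conflicts(jobs)
```

A check along these lines could run when jobs are scheduled, so that a conflict is reported before any upload starts. It does not cover the second requirement, a job publishing while another job is still downloading the previous version, which would need some form of locking or versioned file names.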
Updated by okurz over 2 years ago
- Related to action #109319: [qe-core] aarch64 tests failing in qemu-img due to broken image (was: "with cache error") size:S added
Updated by okurz over 2 years ago
- Status changed from New to Resolved
- Assignee set to okurz
- Target version changed from future to Ready
AFAICS https://github.com/os-autoinst/openQA/pull/4597 solves this together with pointing out the limitations and implications in https://open.qa/docs/#_specifying_assets_created_by_a_job