It is unlikely that we can find out what caused this. Looking in the database I can find:
openqa=> select jobs.id,t_finished,test from jobs,job_settings where (jobs.test ~ 'windows' and job_settings.job_id = jobs.id and key = 'PUBLISH_HDD_1' and value = 'windows-10-x86_64-1903@uefi_win.qcow2');
id | t_finished | test
---------+---------------------+------------
1036580 | 2019-09-20 10:57:15 | windows_10
(1 row)
So there is a single job, but it is much older, about the age of the actual fixed asset. Also, https://openqa.opensuse.org/tests/1036580/file/worker-log.txt shows what looks like a "longer" upload corresponding to a file that is way bigger than 100kb. So I guess someone made a mistake, triggered one job, maybe aborted it prematurely, etc. Maybe we can just regard it as unlucky timing that caused it to end up in a way that is not completely obvious :D
In hindsight the wrong permissions might also be a symptom of a "prematurely aborted upload", as it might be that in the correct case the file should change its ownership to geekotest. But it could also be someone doing stuff manually. Overall the story looks related to #67219.
So I think the immediate problem is fixed. I will take the ticket and try to use the opportunity for all of us involved to learn and see how we can improve in the future, maybe not to prevent cases like these but so that next time we spend less time and effort identifying the root cause.
I have one finding: https://openqa.opensuse.org/tests/1277483 is the first job in the series that failed. maxlin reviewed it and reported the bug on bugzilla. What could have helped is the initial investigation bisection step to distinguish "1. is it reproducible, 2. does the same test with test code of 'last good' still work, 3. does the same test with product state of 'last good' still work". https://gitlab.suse.de/openqa/auto-review/pipelines is set up for that by triggering automatic investigation jobs for every new failure that does not yet have a comment. There was however unfortunate timing, as the pipeline triggers every day at 08:19 CET and maxlin commented at 07:59 CET, so 20 minutes before :D The specific review job in question is https://gitlab.suse.de/openqa/auto-review/-/jobs/210675
Hence I have one simple suggestion: Use https://github.com/os-autoinst/scripts/blob/master/openqa-investigate for any new openQA test failures where the root cause is not immediately obvious.
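To illustrate how the three questions from the investigation step narrow down a root cause, here is a minimal sketch in Python. It is not how openqa-investigate is implemented; the class, the field names and the result labels are all hypothetical and only model the reasoning, assuming we already know whether each of the three investigation jobs passed.

```python
from dataclasses import dataclass


@dataclass
class InvestigationResult:
    """Outcome of the three investigation jobs (hypothetical model)."""
    retry_passed: bool            # 1. plain re-run of the failed job
    last_good_tests_passed: bool  # 2. same job with test code of "last good"
    last_good_build_passed: bool  # 3. same job with product state of "last good"


def classify(result: InvestigationResult) -> str:
    """Map the three outcomes to a likely (not certain) cause category."""
    if result.retry_passed:
        # The failure did not reproduce on a plain retry.
        return "sporadic"
    if result.last_good_tests_passed and result.last_good_build_passed:
        # Either revert helps, so the outcome does not single out one side.
        return "unclear"
    if result.last_good_tests_passed:
        return "test regression"
    if result.last_good_build_passed:
        return "product regression"
    # Reproducible, and neither old tests nor old product help:
    # possibly infrastructure or environment, e.g. a broken asset.
    return "other"


if __name__ == "__main__":
    # Example: reproducible, and reverting tests or product does not help.
    print(classify(InvestigationResult(False, False, False)))
```

In a case like this ticket, where a broken asset was involved, one would expect all three investigation jobs to fail as well, which points away from both test code and product changes.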
I am looking forward to more comments from all of you.