action #70768
closed
obs_rsync_run and obs_rsync_update_builds_text Minion tasks fail frequently
0%
Description
Observation
The obs_rsync_run and obs_rsync_update_builds_text Minion tasks fail frequently on OSD (not o3).
The failing obs_rsync_run jobs can be observed using the following query parameters: https://openqa.suse.de/minion/jobs?soffset=0&task=obs_rsync_run&state=failed
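The same query can be built for either task. This is a minimal sketch that only reuses the base URL and the query parameters from the link above; the helper name minion_jobs_url is made up for illustration.

```python
from urllib.parse import urlencode

# Base URL taken from the dashboard link quoted in this ticket.
MINION_JOBS_URL = "https://openqa.suse.de/minion/jobs"


def minion_jobs_url(task, state="failed", soffset=0):
    """Build a Minion dashboard URL filtered by task and state.

    The parameter names (soffset, task, state) are copied from the
    link in this ticket; the function itself is a hypothetical helper.
    """
    query = urlencode({"soffset": soffset, "task": task, "state": state})
    return f"{MINION_JOBS_URL}?{query}"


if __name__ == "__main__":
    for task in ("obs_rsync_run", "obs_rsync_update_builds_text"):
        print(minion_jobs_url(task))
```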
I'm going to remove most of these jobs to calm down the alert, but right now 43 jobs have piled up over 22 days. The problem has existed for longer than 22 days, though; I assume somebody cleaned up the Minion dashboard at some point.
The job arguments are always like this:
"args" => [
  {
    "project" => "SUSE:SLE-15-SP3:GA:Staging:E"
  }
],
The results always look like one of these:
"result" => {
  "code" => 256,
  "message" => "rsync: change_dir \"/SUSE:/SLE-15-SP3:/GA:/Staging:/S/images/iso\" (in repos) failed: No such file or directory (2)\nrsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1674) [Receiver=3.1.3]"
},
"result" => {
  "code" => 256,
  "message" => "No file found: {SUSE:SLE-15-SP3:GA:Staging:B/read_files.sh}"
},
So there are two different cases in which something cannot be found: a missing directory on the rsync side and a missing read_files.sh file.
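To see how the piled-up failures split between the two cases, the result messages could be classified with something like the sketch below. The patterns are derived only from the two messages quoted above, so other variants would fall through to "unknown". The names classify_failure, RSYNC_MISSING_DIR and MISSING_FILE are hypothetical.

```python
import re

# Patterns inferred from the two result messages quoted in this ticket.
RSYNC_MISSING_DIR = re.compile(
    r'rsync: change_dir "(?P<path>[^"]+)" \(in repos\) failed: '
    r"No such file or directory"
)
MISSING_FILE = re.compile(r"No file found: \{(?P<path>[^}]+)\}")


def classify_failure(message):
    """Return (kind, path) for a Minion job result message.

    kind is "missing_directory", "missing_file" or "unknown";
    path is None when the message matches neither known pattern.
    """
    match = RSYNC_MISSING_DIR.search(message)
    if match:
        return ("missing_directory", match.group("path"))
    match = MISSING_FILE.search(message)
    if match:
        return ("missing_file", match.group("path"))
    return ("unknown", None)
```

Counting the kinds over all failed jobs would show whether one of the two cases dominates.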
There are also some failing obs_rsync_update_builds_text jobs which look like this:
"result" => {
  "code" => 256,
  "message" => "rsync: change_dir \"/SUSE:/SLE-15-SP3:/GA:/Staging:/E/images/iso\" (in repos) failed: No such file or directory (2)\nrsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1674) [Receiver=3.1.3]\n"
},
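Comparing the job arguments with the paths in the rsync errors, the remote path appears to be the project name with every ":" replaced by ":/", followed by images/iso. This is inferred only from the messages quoted in this ticket, not from the obs_rsync code, and project_to_rsync_path and its subdir parameter are hypothetical names. A sketch of that mapping, which might help when checking manually whether a source folder exists:

```python
def project_to_rsync_path(project, subdir="images/iso"):
    """Map an OBS project name to the path seen in the rsync errors.

    Assumption: each ':' in the project name becomes ':/' and the
    subdirectory is appended, as the quoted error messages suggest.
    """
    return "/" + project.replace(":", ":/") + "/" + subdir


if __name__ == "__main__":
    print(project_to_rsync_path("SUSE:SLE-15-SP3:GA:Staging:E"))
```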
Suggestions
It looks like these failures are not a practical problem; at least we haven't received any negative feedback. Likely the sync works again on the next run, or the job was not needed anymore anyway. If that's true, these jobs should not end up as failures¹ or shouldn't have been created in the first place. Maybe there's also an actual bug we need to fix.
¹ In the Minion dashboard, a failure is a job that should be investigated manually, but I doubt these failing jobs need manual investigation when they occur.
Check low-level commands executed by obsrsync, potentially try to reproduce manually
Check if source project folders exist or not
Just adjust our monitoring to ignore obs_rsync failures. If test reviewers find missing assets in their tests, these tests will end up incomplete, and https://openqa.suse.de/minion/jobs?state=failed has additional information available on demand.
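For the monitoring suggestion, the filtering could be as simple as the sketch below. The ticket does not describe how the alert is actually computed, so the job dictionaries (with a "task" key), the constant IGNORED_TASK_PREFIXES and the function alertable_failures are all assumptions made for illustration.

```python
# Task name prefixes whose failures should not trigger the alert.
# Both tasks named in this ticket start with "obs_rsync_".
IGNORED_TASK_PREFIXES = ("obs_rsync_",)


def alertable_failures(failed_jobs):
    """Return the failed jobs that should still count towards the alert.

    failed_jobs is assumed to be a list of dicts with a "task" key;
    this shape is hypothetical and not taken from the real monitoring.
    """
    return [
        job for job in failed_jobs
        if not job["task"].startswith(IGNORED_TASK_PREFIXES)
    ]


if __name__ == "__main__":
    jobs = [
        {"id": 1, "task": "obs_rsync_run"},
        {"id": 2, "task": "obs_rsync_update_builds_text"},
        {"id": 3, "task": "some_other_task"},  # hypothetical task name
    ]
    print(alertable_failures(jobs))
```

The ignored jobs would still show up on the Minion dashboard, so the information stays available on demand.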