action #70768

Updated by mkittler over 3 years ago

### Observation 

 The `obs_rsync_run` and `obs_rsync_update_builds_text` Minion tasks fail frequently on OSD (not o3). 

 The failing `obs_rsync_run` jobs can be observed using the following query parameters: https://openqa.suse.de/minion/jobs?soffset=0&task=obs_rsync_run&state=failed   
 I'm going to remove most of these jobs to calm down the alert, but right now 43 jobs have piled up over 22 days. However, the problem has actually existed for longer than 22 days. I assume somebody cleaned up the Minion dashboard at some point. 

 The job arguments are always like this: 

 ``` 
   "args" => [ 
     { 
       "project" => "SUSE:SLE-15-SP3:GA:Staging:E" 
     } 
   ], 
 ``` 

 The results always look like one of these: 

 ``` 
   "result" => { 
     "code" => 256, 
     "message" => "rsync: change_dir \"/SUSE:/SLE-15-SP3:/GA:/Staging:/S/images/iso\" (in repos) failed: No such file or directory (2)\nrsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1674) [Receiver=3.1.3]" 
   }, 
 ``` 

 ``` 
   "result" => { 
     "code" => 256, 
     "message" => "No file found: {SUSE:SLE-15-SP3:GA:Staging:B/read_files.sh}" 
   }, 
 ``` 

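 So there are two different cases in which something cannot be found: either the directory rsync tries to change into does not exist, or the project's `read_files.sh` is missing. 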
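To get a rough count of how the failed jobs split between these two cases, the result messages could be bucketed with a small script. This is only a sketch: the regular expressions are derived from the two message shapes quoted in this ticket, and the function name and category labels are made up for illustration. They are not part of openQA or Minion.

```python
import re

# Patterns derived from the two example messages in this ticket.
# Other message shapes fall through to "other".
RSYNC_MISSING_DIR = re.compile(
    r'rsync: change_dir "(?P<path>[^"]+)" \(in (?P<module>\w+)\) failed: '
    r"No such file or directory"
)
MISSING_FILE = re.compile(r"No file found: \{(?P<file>[^}]+)\}")


def classify_failure(message):
    """Return a (category, detail) tuple for a Minion job result message.

    The category labels are hypothetical and only used for grouping.
    """
    match = RSYNC_MISSING_DIR.search(message)
    if match:
        return ("missing_source_dir", match.group("path"))
    match = MISSING_FILE.search(message)
    if match:
        return ("missing_file", match.group("file"))
    return ("other", None)
```

Feeding it the `message` field of each failed job and tallying the categories would show whether one of the two cases dominates.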

 There are also some failing `obs_rsync_update_builds_text` jobs which look like this: 

 ``` 
   "result" => { 
     "code" => 256, 
     "message" => "rsync: change_dir \"/SUSE:/SLE-15-SP3:/GA:/Staging:/E/images/iso\" (in repos) failed: No such file or directory (2)\nrsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1674) [Receiver=3.1.3]\n" 
   }, 
 ``` 

 ### Suggestions 

 It looks like these failures are not a practical problem - at least we haven't received any negative feedback. Likely it works again on the next run, or the job was not needed anymore anyway. If that's true, these jobs should not end up as failures¹ or shouldn't have been created in the first place. Maybe there's also an actual bug we need to fix. 

 ¹ Failures in the sense of the Minion dashboard are jobs which should be manually investigated, but I doubt these failing jobs warrant manual investigation when they occur. 

 * Check low-level commands executed by obsrsync, potentially try to reproduce manually 
 * Check if source project folders exist or not
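For the manual reproduction, the following sketch builds an rsync listing command for a project's source folder. It assumes the path mapping that can be read off the examples above (each `:` in the project name becomes `:/`, followed by `images/iso`) and the rsync module name `repos` from the error message. The host is a placeholder, both function names are hypothetical, and the actual obs_rsync code may construct its paths differently.

```python
def project_to_rsync_path(project, subdir="images/iso"):
    """Map an OBS project name to the remote path seen in the error messages.

    Assumption inferred from this ticket:
    "SUSE:SLE-15-SP3:GA:Staging:E" -> "/SUSE:/SLE-15-SP3:/GA:/Staging:/E/images/iso"
    """
    return "/" + project.replace(":", ":/") + "/" + subdir


def rsync_list_command(host, project, module="repos"):
    """Build (but do not run) an rsync command that only lists the folder.

    `host` is a placeholder for the real OBS rsync host; `module` is taken
    from the "(in repos)" part of the error message.
    """
    url = f"rsync://{host}/{module}{project_to_rsync_path(project)}/"
    return ["rsync", "--list-only", url]


if __name__ == "__main__":
    # Print the command so it can be pasted into a shell on the worker host.
    print(" ".join(rsync_list_command("obs.example.org", "SUSE:SLE-15-SP3:GA:Staging:E")))
```

If the listing fails with the same "No such file or directory" error, the source folder is really missing and the job failure is expected rather than a bug in the task itself.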
