action #137984

Updated by okurz 9 months ago

## Observation

The log of the "refresh" CI job shows a lot of errors, e.g.

         Traceback (most recent call last): 
           File "/usr/lib/python3.6/site-packages/salt/modules/", line 79, in _get_top_file_envs 
             return __context__["saltutil._top_file_envs"] 
           File "/usr/lib/python3.6/site-packages/salt/loader/", line 78, in __getitem__ 
             return self.value()[item] 
         KeyError: 'saltutil._top_file_envs' 
         During handling of the above exception, another exception occurred: 
         Traceback (most recent call last): 
           File "/usr/lib/python3.6/site-packages/salt/", line 2110, in _thread_multi_return 

but in the end the CI job passes instead of failing.

 ## Steps to reproduce 
I assume that, as long as the referenced fix is not yet fully effective, the problem can be reproduced by rerunning the CI job. The error message itself can be reproduced on osd with

    salt --no-color 's390zl12*' saltutil.sync_grains

 ## Acceptance criteria 
 * **AC1:** Obvious errors visible in the log of the "refresh" CI job should fail the CI job 
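One hedged way to satisfy AC1, independent of exit-code propagation, would be to scan the job log for obvious error markers. A minimal sketch (the log path and the `Traceback` pattern are illustrative assumptions, not taken from the actual CI instructions):

```shell
# Fail the CI step when the job log contains an obvious error marker,
# even if the command that produced the log exited with 0.
log=$(mktemp)
printf 'ok\nTraceback (most recent call last):\n' > "$log"   # simulated job log
if grep -q 'Traceback' "$log"; then
    echo "errors found in log, failing job"
fi
rm -f "$log"
```

A real check would likely want a broader pattern list (e.g. `ERROR`, `KeyError`) tuned to salt's output.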

 ## Suggestions 
* *DONE* Crosscheck whether the salt command itself returns a non-zero exit code when the problem reproduces -> on osd, `salt --no-color 's390zl12*' saltutil.sync_grains; echo $?` prints "1" as the exit code. So the problem is likely that the command executed over ssh in the CI instructions does not properly propagate the exit code of the inner command
* Ensure that the CI job honors the exit code or error condition accordingly
* Make sure the exit code is still evaluated regardless of any error messages shown
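The suspected ssh pitfall can be demonstrated without salt or ssh at all: `sh -c` stands in for the remote shell line and `false` for the failing salt call (all names here are illustrative):

```shell
# A trailing command in the remote shell line masks the salt exit code:
sh -c 'false; echo finished'
echo "masked status: $?"        # prints "masked status: 0" - CI would pass

# Without the trailing command, the failure propagates:
sh -c 'false'
echo "propagated status: $?"    # prints "propagated status: 1"
```

The same applies to `ssh host "salt ...; echo done"`: ssh returns the exit status of the last remote command, so anything appended after the salt call hides its failure.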

 ## Problem 
* The problem seems to be related to the compound statement: `salt \* saltutil.sync_grains,saltutil.refresh_grains ,` yields exit code 0, while `salt \* saltutil.sync_grains` yields 1
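If the compound statement is the culprit, one possible workaround is to drop it and run each `saltutil` function as a separate call, failing on the first non-zero status. A sketch, with `false`/`true` standing in for the two salt calls and `run_step` as a hypothetical helper:

```shell
# Run each step on its own so that no sub-command's failure can be
# hidden by the exit code of the last function in a compound call.
run_step() { "$@" || { echo "step failed: $*"; return 1; }; }

if run_step false && run_step true; then
    echo "all steps passed"
else
    echo "job should fail"
fi
```

This trades one salt round-trip for two, but makes the per-function exit codes visible to the CI job.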