action #132827

Updated by okurz 12 months ago

## Observation 
 I can see that some tests are failing due to DNS resolution issues on the "sapworker*" workers, especially multi-machine tests. Can someone help check?

 Some error messages as below: 

 ## Reproducible 

 [Failed test links]( 

 ## Expected result 

 I tried running the rsync tests on another worker and they passed without any issue:

 ## Rollback steps 

 * Add back production worker class on sapworker{1,2,3}, i.e. revert 
 * Add back "tap" worker class to openqaworker1 and sapworker{1,2,3} 

 ## Further details 

 There may be network problems on the "sapworker*" workers. Based on my tests (at least the rsync test results), the same test passes on "worker5" but fails on "sapworker1".

 ## Suggestions 
 - First ensure that all openQA workers have the salt state applied cleanly, e.g. `sudo salt --no-color -C 'G@roles:worker' state.apply` 
 - Maybe the failure handling can be improved on the os-autoinst side, e.g. with a better "die" message/reason
 - As a temporary measure, consider removing the "tap" class from affected workers, e.g. rename it to tap_pooXXX
 - Debug multi-machine capabilities according to 
 - Ensure that our salt states provide everything needed to run stable multi-machine tests
 - Add back production worker classes for all affected machines openqaworker1, sapworker{1-7}
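
 To narrow down the DNS symptom before and after applying the suggestions above, a small check like the following could be run on a suspect worker. This is only a sketch: `check_dns` and the `RESOLVER` variable are hypothetical names introduced here, and `getent hosts` is assumed to reflect the resolver path that the failing tests use, which may not hold inside the test VMs.

```shell
#!/bin/sh
# Hypothetical helper, not part of openQA or os-autoinst.
# RESOLVER is an assumed, overridable lookup command; it defaults to
# `getent hosts`, which goes through the system's NSS configuration.
RESOLVER="${RESOLVER:-getent hosts}"

# Print OK/FAIL for every hostname given as an argument.
# Returns 0 if all names resolved, 1 if at least one failed.
check_dns() {
    failed=0
    for host in "$@"; do
        # $RESOLVER is intentionally unquoted so "getent hosts" splits
        # into the command and its first argument.
        if $RESOLVER "$host" >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}
```

 For example, `check_dns example.org` (a placeholder hostname) on "sapworker1" and on "worker5" would show whether name resolution differs between the two hosts themselves or only inside the multi-machine test network.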