Project

General

Profile

action #95458

Updated by okurz over 2 years ago

## Observation 

 openQA tests in HA scenarios (2 or 3 node clusters) fail in different modules due to unexpected reboots 
 in one or more of the SUTs: 

 1. [QAM HA qdevice node 1, fails in ha_cluster_init module](https://openqa.suse.de/tests/6427527#step/ha_cluster_init/36) 
 2. [QAM HA rolling upgrade migration, node 2, fails in filesystem module](https://openqa.suse.de/tests/6410169#step/filesystem/31) 
 3. [QAM HA hawk/HAProxy node 1, fails in check_after_reboot module](https://openqa.suse.de/tests/6426340#step/check_after_reboot/5) 
 4. [QAM 2 nodes, node 1, fails in ha_cluster_init module](https://openqa.suse.de/tests/6426369#step/ha_cluster_init/15) 

 ## Test suite description 
 The base test suite is used for job templates defined in YAML documents. It has no settings of its own. 


 ## Reproducible 

 Issue is very sporadic, and reproducing it is not always possible. Usually, re-triggering the jobs lead to the tests passing. 

 For example, from the jobs above, re-triggered jobs succeeded: 

 1. https://openqa.suse.de/tests/6435165 
 2. https://openqa.suse.de/tests/6435380 
 3. https://openqa.suse.de/tests/6435384 
 4. https://openqa.suse.de/tests/6435389 

 Find jobs referencing this ticket with the help of 
 https://raw.githubusercontent.com/os-autoinst/scripts/master/openqa-query-for-job-label , 
 call `openqa-query-for-job-label poo#95458` 


 ## Expected result 

 1. Last good: [:20349:samba](https://openqa.suse.de/tests/6426521) (or more recent) 
 2. Last good: [:20121:crmsh](https://openqa.suse.de/tests/6429080) 
 3. Last good: [20321:kernel-ec2](https://openqa.suse.de/tests/6422264) 
 4. Last good: [MR:244261:crmsh](https://openqa.suse.de/tests/6389360) 

Back