action #96191

Updated by livdywan over 2 years ago

## Motivation 
The hypothesis was raised that "multimachine jobs have decreased reliability since ~2 weeks (2 nodes). More nodes are even worse." This may or may not be true. We should be able to calculate a fail ratio for different categories of openQA tests, e.g. in Grafana based on SQL queries. With this we would be able to support or reject the hypothesis.

## Suggestion
- See what Grafana data or SQL queries we already have, and extend them as needed
- Compare multimachine (mm) tests with "normal" tests
- Focus on failed jobs to start with; we already deal with incompletes
- Exclude retried jobs, since retries don't run for mm
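
The suggestion above could be prototyped roughly as follows. This is only a sketch under stated assumptions: the `jobs` table and its columns `result`, `is_multimachine` and `is_retry` are a hypothetical simplification, not the actual openQA schema, and it uses an in-memory SQLite database so it can be run standalone. Incompletes count towards the total but not as failures, which is also an assumption.

```python
import sqlite3


def demo_connection():
    """Build an in-memory database with a HYPOTHETICAL, simplified jobs table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (id INTEGER PRIMARY KEY, result TEXT, "
        "is_multimachine INTEGER, is_retry INTEGER)"
    )
    conn.executemany(
        "INSERT INTO jobs VALUES (?, ?, ?, ?)",
        [
            (1, "passed", 0, 0),
            (2, "failed", 0, 0),
            (3, "passed", 0, 0),
            (4, "passed", 0, 0),
            (5, "failed", 1, 0),
            (6, "passed", 1, 0),
            (7, "failed", 0, 1),  # retried job, excluded from the ratio
        ],
    )
    return conn


def fail_ratio_by_category(conn):
    """Return {category: failed / total} over non-retried jobs."""
    rows = conn.execute(
        """
        SELECT CASE WHEN is_multimachine THEN 'multimachine' ELSE 'normal' END
                   AS category,
               SUM(CASE WHEN result = 'failed' THEN 1 ELSE 0 END) AS failed,
               COUNT(*) AS total
        FROM jobs
        WHERE is_retry = 0
        GROUP BY category
        """
    ).fetchall()
    return {category: failed / total for category, failed, total in rows}


if __name__ == "__main__":
    print(fail_ratio_by_category(demo_connection()))
```

For a real panel the `SELECT` would need to be adapted to the actual database, with an added time filter so the ratio can be compared before and after the suspected regression.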
