action #94606

Updated by okurz 6 months ago

## Motivation 
 From a discussion between okurz and mgrifalconi: SLE maintenance aggregate tests are currently scheduled twice per day. Often only the first build of a day is interesting for reviewers, as it is likely more complete, while the second build likely only includes a smaller intra-day delta. However, aggregate tests are currently (to be confirmed) scheduled by obsoleting older builds, meaning that tests of the first build of a day that have not yet finished are aborted when the second build gets triggered. As openQA supports deprioritizing older builds instead of obsoleting them, using that could give the aggregate tests of the first build the possibility to finish. 

 ## Acceptance criteria 
 * **AC1:** SLE maintenance aggregate jobs from the first build of a day can (mostly) all finish, even if they have not finished by the time the second build of that day is scheduled 
 * **AC2:** OSD can still ensure a reasonable job age for all related architectures and worker classes 

 ## Suggestions 

 * As documented on http://open.qa/docs/#_spawning_multiple_jobs_based_on_templates_isos_post, use `_DEPRIORITIZEBUILD` instead of `_OBSOLETE`, e.g. in https://gitlab.suse.de/qa-maintenance/openQABot/-/blob/400f79aa9bb8283870aba16f8b6749f37400d454/openqabot/openqabot.py#L184 
 * Monitor the impact of `_DEPRIORITIZEBUILD` 
 * Tweak `_DEPRIORITIZE_LIMIT` based on monitoring data and observation over some days/weeks 
 * Consider setting the `_ONLY_OBSOLETE_SAME_BUILD` option 
 * Consider introducing the option to set scheduling flags in the metadata project e.g. by product/team/group 
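
 A minimal sketch of what the parameter swap could look like when preparing the settings for an `isos post` call. The helper name `with_deprioritization`, the example settings and the limit value are hypothetical and not taken from openQABot; only the scheduling parameter names come from the openQA documentation linked above.

```python
def with_deprioritization(settings, limit=None, only_same_build=False):
    """Return a copy of the scheduling settings that deprioritizes
    jobs of older builds instead of obsoleting them.

    Hypothetical helper for illustration, not part of openQABot.
    """
    params = dict(settings)
    # Drop the obsoletion flag so that older jobs are not cancelled.
    params.pop("_OBSOLETE", None)
    # Ask openQA to lower the priority of jobs from older builds instead.
    params["_DEPRIORITIZEBUILD"] = "1"
    if limit is not None:
        # Priority limit up to which older jobs get deprioritized;
        # the value to use is an assumption to be tuned by monitoring.
        params["_DEPRIORITIZE_LIMIT"] = str(limit)
    if only_same_build:
        params["_ONLY_OBSOLETE_SAME_BUILD"] = "1"
    return params


# Example settings with made-up values, as they might be passed today.
current = {
    "DISTRI": "sle",
    "VERSION": "15-SP3",
    "FLAVOR": "Server-DVD-Updates",
    "ARCH": "x86_64",
    "BUILD": "20210624-1",
    "_OBSOLETE": "1",
}
proposed = with_deprioritization(current, limit=120)
```

 The helper returns a new dict and leaves the input untouched, so the old and new behaviour could be compared side by side or toggled per product/team/group.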

 


 ## Challenges 
 * AFAIR originally there had been even more "aggregate tests". The next build is always scheduled with a constant time offset (unlike in product validation, where a rapid succession of builds can occur as an exception). If the first build of a day cannot even finish all tests by then, and this does not block the release of any updates, then I guess we won't significantly benefit from such a behaviour change. IMHO the criterion for releasability should not be "any failed test blocks the release" but "not fewer passed tests than on our reference". If we stuck to that, we would have a direct motivation to have efficient, fast, relevant tests.
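
 The "not fewer passed tests than on our reference" idea could be sketched roughly as follows. This is only an illustration: the function name `is_releasable` and the input format (test name mapped to a result string) are assumptions, not an existing openQA or openQABot interface.

```python
def count_passed(results):
    """Count the tests whose result is 'passed'."""
    return sum(1 for result in results.values() if result == "passed")


def is_releasable(candidate, reference):
    """Hypothetical criterion: releasable if the candidate build has
    at least as many passed tests as the reference build."""
    return count_passed(candidate) >= count_passed(reference)


# Made-up results for illustration.
reference = {"install": "passed", "update": "passed", "ha": "failed"}
candidate = {"install": "passed", "update": "passed", "ha": "failed"}
print(is_releasable(candidate, reference))
```

 With this criterion an already failing reference test would not block a release, while any additional regression would.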
