Actions
action #104007
closedcoordination #102906: [saga][epic] Increased stability of tests with less "known failures", known incompletes handled automatically within openQA
Support retry of openQA jobs based on test variables
Description
Motivation¶
Similar as in https://docs.microsoft.com/en-us/azure/devops/release-notes/2021/sprint-195-update#automatic-retries-for-a-task and https://docs.gitlab.com/ee/ci/yaml/#retry and https://github.com/marketplace/actions/retry-step
Acceptance criteria¶
- AC1: openQA jobs are automatically retriggered with a maximum amount specified in test variables
- AC2: By default openQA tests apply no automatic retry
Acceptance tests¶
- AT1-1: Given an unfinished openQA test with the special
RETRY
variable given with valueN
When the job fails And the current job is not already the N-th clone of an original job Then an automatic clone is created - AT1-2: Given an unfinished openQA test with the special
RETRY
variable given with valueN
When the job fails And the current job is the N-th clone of an original job Then no automatic clone is created - AT2-1: Given an unfinished openQA test with no special retry variable given When the job fails Then no automatic clone is created at this point (job done hooks or external events may still trigger clones as well as manual triggers)
Updated by okurz about 3 years ago
- Status changed from New to In Progress
- Assignee set to okurz
- Target version changed from future to Ready
also trying this one
Updated by okurz about 3 years ago
- Status changed from In Progress to Feedback
Updated by okurz about 3 years ago
- Status changed from Feedback to Resolved
merged. Deployed on o3. Tested:
$ openqa-cli api --o3 -X post jobs TEST=okurz_test_poo104007
{"id":2099581}
$ openqa-cli api --o3 -X post jobs TEST=okurz_test_poo104007 RETRY=3:bug#42
{"id":2099582}
$ openqa-cli api --o3 -X post jobs/2099582/set_done result=failed
{"reason":null,"result":null}
$ openqa-cli api --o3 -X post jobs/2099583/set_done result=failed
{"reason":null,"result":null}
$ openqa-cli api --o3 -X post jobs/2099584/set_done result=failed
{"reason":null,"result":null}
$ openqa-cli api --o3 -X post jobs/2099585/set_done result=failed
{"reason":null,"result":null}
$ openqa-cli api --o3 -X post jobs/2099586
404 Not Found
{"error_status":404}
so jobs where retried automatically on failing but aborted after 3 retries (original + 3 retries, 4 in total)
Updated by AdamWill almost 3 years ago
Oh this is great! Thanks! We've had a plugin in Fedora forever to retry certain tests on failure, can probably drop that and use this instead now. Yay.
Updated by okurz over 2 years ago
- Related to action #113758: Jobs restarted with `RETRY` are not shown as 'clones', so it is hard or impossible find the original job added
Actions