action #104007
closed
coordination #102906: [saga][epic] Increased stability of tests with less "known failures", known incompletes handled automatically within openQA
Support retry of openQA jobs based on test variables
Added by okurz about 3 years ago.
Updated about 3 years ago.
Category:
Feature requests
Description
Motivation¶
Similar as in https://docs.microsoft.com/en-us/azure/devops/release-notes/2021/sprint-195-update#automatic-retries-for-a-task and https://docs.gitlab.com/ee/ci/yaml/#retry and https://github.com/marketplace/actions/retry-step
Acceptance criteria¶
- AC1: openQA jobs are automatically retriggered with a maximum amount specified in test variables
- AC2: By default openQA tests apply no automatic retry
Acceptance tests¶
- AT1-1: Given an unfinished openQA test with the special
RETRY
variable given with value N
When the job fails And the current job is not already the N-th clone of an original job Then an automatic clone is created
- AT1-2: Given an unfinished openQA test with the special
RETRY
variable given with value N
When the job fails And the current job is the N-th clone of an original job Then no automatic clone is created
- AT2-1: Given an unfinished openQA test with no special retry variable given When the job fails Then no automatic clone is created at this point (job done hooks or external events may still trigger clones as well as manual triggers)
Related issues
1 (1 open — 0 closed)
- Description updated (diff)
- Description updated (diff)
Extended description with ATs
- Status changed from New to In Progress
- Assignee set to okurz
- Target version changed from future to Ready
- Status changed from In Progress to Feedback
- Status changed from Feedback to Resolved
merged. Deployed on o3. Tested:
$ openqa-cli api --o3 -X post jobs TEST=okurz_test_poo104007
{"id":2099581}
$ openqa-cli api --o3 -X post jobs TEST=okurz_test_poo104007 RETRY=3:bug#42
{"id":2099582}
$ openqa-cli api --o3 -X post jobs/2099582/set_done result=failed
{"reason":null,"result":null}
$ openqa-cli api --o3 -X post jobs/2099583/set_done result=failed
{"reason":null,"result":null}
$ openqa-cli api --o3 -X post jobs/2099584/set_done result=failed
{"reason":null,"result":null}
$ openqa-cli api --o3 -X post jobs/2099585/set_done result=failed
{"reason":null,"result":null}
$ openqa-cli api --o3 -X post jobs/2099586
404 Not Found
{"error_status":404}
so jobs where retried automatically on failing but aborted after 3 retries (original + 3 retries, 4 in total)
- Parent task set to #102906
Oh this is great! Thanks! We've had a plugin in Fedora forever to retry certain tests on failure, can probably drop that and use this instead now. Yay.
- Related to action #113758: Jobs restarted with `RETRY` are not shown as 'clones', so it is hard or impossible find the original job added
Also available in: Atom
PDF