coordination #94105
openQA Project - coordination #102915: [saga][epic] Automated classification of failures
[epic] Use feedback from openqa-investigate to automatically inform on github pull requests, open tickets, weed out automatically failed tests
Description
User stories
- US1: As a test writer introducing regressions in test code I want to be informed on my pull request in case it is unambiguously identified as the culprit for failing openQA tests so that I can quickly provide a regression fix
- US2: As a test writer I want to be informed on my pull request in case a failing openQA test is identified to be a clear test regression with multiple git commits as candidates that likely introduced a regression so that I can crosscheck my PR if it could have caused the failure
- US3: As a QE squad PO I would like to receive a ticket in the openQA tests issue tracker in case a failing openQA test is identified to be a clear test regression so that we can plan on fixing that issue
- US4: As an openQA infrastructure admin I would like to receive a ticket in the openQA infrastructure issue tracker in case a failing openQA test is identified to be a clear infrastructure regression so that we can plan on fixing that issue
- US5: As a maintenance coordination engineer creating maintenance (release) requests I want to receive a notification in case a failing openQA test is identified to be a clear product regression, i.e. if submitted changes trigger the problem, to be able to fix my submission
- US6: As an openQA test reviewer I want openqa-investigate to automatically create auto-review tickets that handle restarting tests that fail for the same reason so that I do not need to retrigger manually while test maintainers have time to fix the sporadic issue
Suggestions
- Incorporate https://progress.opensuse.org/projects/openqav3/wiki/#Categorization-scheme into an automatic decision tree with specific actions
- Consider the different state results from the combination of openqa-investigate jobs ("X" meaning failed, "V" passed, "O" other failure; 2^4=16 possible combinations of results):
- S0: retry X, last_good_test X, last_good_build X, last_good_test+build X -> infrastructure issue => report ticket in progress.opensuse.org/projects/openqa-infrastructure/issues/ with e.g. "Urgent" priority
- S1: retry X, last_good_test X, last_good_build X, last_good_test+build V -> sporadic issue => see S8-15
- S2: retry X, last_good_test X, last_good_build V, last_good_test+build X -> sporadic issue => see S8-15
- S3: retry X, last_good_test X, last_good_build V, last_good_test+build V -> reproducible product issue => if QAM test write comment on IBS/OBS or smelt, for non-QAM report product bug
- S4: retry X, last_good_test V, last_good_build X, last_good_test+build X -> sporadic issue => see S8-15
- S5: retry X, last_good_test V, last_good_build X, last_good_test+build V -> reproducible test regression => bisect git log, inform on pull request, report ticket in progress.opensuse.org/projects/openqatests/issues/
- S6: retry X, last_good_test V, last_good_build V, last_good_test+build X -> sporadic issue => see S8-15
- S7: retry X, last_good_test V, last_good_build V, last_good_test+build V -> sporadic issue => see S8-15
- S8-15: retry V (all 8 combinations) -> sporadic issue => automatically create auto-review tickets that handle restarting tests and retrigger original -> first step #94105
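The S0 to S15 table above can be read as a lookup on the four investigation results. The following is a minimal sketch of that reading, not an existing openqa-investigate feature: the names `Verdict` and `classify` and the action strings are hypothetical, and since the table only covers "X" and "V", the sketch assumes that any other result (such as "O") is rejected rather than classified.

```python
from typing import NamedTuple


class Verdict(NamedTuple):
    category: str
    action: str


# S1, S2, S4, S6, S7 and S8-15 all end up here.
SPORADIC = Verdict(
    "sporadic issue",
    "create auto-review ticket and retrigger original job",
)

# Non-sporadic rows of the table for retry == "X".
# Key order: (last_good_test, last_good_build, last_good_test+build)
_RETRY_FAILED = {
    ("X", "X", "X"): Verdict(  # S0
        "infrastructure issue",
        "report ticket in openqa-infrastructure",
    ),
    ("X", "V", "V"): Verdict(  # S3
        "reproducible product issue",
        "comment on IBS/OBS or smelt for QAM tests, else report product bug",
    ),
    ("V", "X", "V"): Verdict(  # S5
        "reproducible test regression",
        "bisect git log, inform on pull request, report ticket in openqatests",
    ),
}


def classify(retry, last_good_test, last_good_build, last_good_test_and_build):
    """Map four investigation results ("X" or "V") to a category and action."""
    results = (retry, last_good_test, last_good_build, last_good_test_and_build)
    if any(r not in ("X", "V") for r in results):
        # Assumption: "O" (other failure) is outside the 2^4 table.
        raise ValueError(f"unsupported result in {results}")
    if retry == "V":
        return SPORADIC  # S8-15
    return _RETRY_FAILED.get(results[1:], SPORADIC)
```

For example, `classify("X", "V", "X", "V")` yields the S5 verdict (reproducible test regression), while any combination with `retry == "V"` yields the sporadic verdict.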
- For sporadic issues, bisect on all worker class settings in the "last_good vs. first_bad" diff except the scheduled one. For example, for a job scheduled against qemu_x86_64 where the last good job ran on a worker with classes "worker1,foo,bar" and the first bad job on one with "worker2,foo,baz", retrigger against worker1 as well as bar and baz to check for the impact of those sub-classes
- For sporadic issues, calculate the fail ratio and run as many tests as needed to reach a statistically significant number
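One possible reading of "a significant statistical number" is: given an observed fail ratio, how many runs are needed so that at least one failure shows up with a chosen confidence. The sketch below assumes independent runs with a constant fail probability; the function names and the 95% default are illustrative assumptions, not something this ticket specifies.

```python
import math


def fail_ratio(failed: int, total: int) -> float:
    """Fraction of failed runs among all runs."""
    if total <= 0:
        raise ValueError("total must be positive")
    return failed / total


def runs_needed(ratio: float, confidence: float = 0.95) -> int:
    """Smallest n such that 1 - (1 - ratio)**n >= confidence.

    Assumes independent runs that each fail with probability `ratio`.
    """
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be in (0, 1)")
    if ratio == 1:
        return 1  # every run fails, a single run is enough
    return math.ceil(math.log(1 - confidence) / math.log(1 - ratio))
```

With a fail ratio of 0.1, `runs_needed(0.1)` gives 29: if 29 consecutive runs pass, an issue with that fail ratio would have shown up with about 95% probability.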
Intermediate steps from weekly 2021-07-16:
- we can always start with just writing yet another comment on openQA jobs
- Simplified approach: If last_good+build fails, a product regression is unlikely, so post a comment saying so
- If last_good_test+build fails, an infrastructure issue is likely
Updated by okurz over 3 years ago
- Tracker changed from action to coordination
- Subject changed from Use feedback from openqa-investigate to automatically inform on github pull requests, open tickets, weed out automatically failed tests to [epic] Use feedback from openqa-investigate to automatically inform on github pull requests, open tickets, weed out automatically failed tests
- Description updated (diff)
- Assignee set to okurz
- Parent task set to #39719
Updated by okurz over 3 years ago
- Related to action #91773: Automatic replacement of openQA job URLs preview of openQA size:M added
Updated by okurz over 3 years ago
- Description updated (diff)
- Assignee deleted (okurz)
Subtasks provided; more can be created based on the provided examples.
Updated by okurz over 3 years ago
- Status changed from New to Workable
In the weekly estimation meeting we decided that for an epic it's actually "Workable" because we don't need to estimate the epic itself and the next task is simply to refine and create subtasks
Updated by okurz almost 3 years ago
- Related to action #107014: trigger openqa-trigger-bisect-jobs from our automatic investigations whenever the cause is not already known size:M added
Updated by livdywan over 2 years ago
Redmine won't mention it, but #109920 is a new subtask, so this epic can move on and isn't stale
Updated by okurz over 2 years ago
- Status changed from Workable to Blocked
- Assignee set to okurz
Updated by okurz about 1 year ago
- Target version changed from Ready to Tools - Next