coordination #94105

openQA Project - coordination #102915: [saga][epic] Automated classification of failures

[epic] Use feedback from openqa-investigate to automatically inform on github pull requests, open tickets, weed out automatically failed tests

Added by okurz 11 months ago. Updated 17 days ago.

Status: Blocked
Priority: Normal
Assignee:
Target version:
Start date: 2021-07-20
Due date:
% Done: 29%
Estimated time: (Total: 40.00 h)

Description

User stories

  • US1: As a test writer introducing regressions in test code I want to be informed on my pull request in case my pull request is unambiguously identified as the culprit for failing openQA tests so that I can quickly provide a regression fix
  • US2: As a test writer I want to be informed on my pull request in case a failing openQA test is identified to be a clear test regression with multiple git commits as candidates that likely introduced a regression so that I can crosscheck my PR if it could have caused the failure
  • US3: As a QE squad PO I would like to receive a ticket in the openQA tests issue tracker in case a failing openQA test is identified to be a clear test regression so that we can plan on fixing that issue
  • US4: As an openQA infrastructure admin I would like to receive a ticket in the openQA infrastructure issue tracker in case a failing openQA test is identified to be a clear infrastructure regression so that we can plan on fixing that issue
  • US5: As a maintenance coordination engineer creating maintenance (release) requests I want to receive a notification in case a failing openQA test is identified to be a clear product regression, i.e. if submitted changes trigger the problem, to be able to fix my submission
  • US6: As an openQA test reviewer I want openqa-investigate to automatically create auto-review tickets that handle restarting tests that fail for the same reason so that I do not need to retrigger manually while test maintainers have time to fix the sporadic issue

Suggestions

  • Incorporate https://progress.opensuse.org/projects/openqav3/wiki/#Categorization-scheme into an automatic decision tree with specific actions
  • Consider the different state results from the combination of openqa-investigate jobs ("X" meaning failed, "V" passed, "O" other failure; 2^4=16 possible combinations of results):

    • S0: retry X, last_good_test X, last_good_build X, last_good_test+build X -> infrastructure issue => report ticket in progress.opensuse.org/projects/openqa-infrastructure/issues/ with e.g. "Urgent" priority
    • S1: retry X, last_good_test X, last_good_build X, last_good_test+build V -> sporadic issue => see S8-15
    • S2: retry X, last_good_test X, last_good_build V, last_good_test+build X -> sporadic issue => see S8-15
    • S3: retry X, last_good_test X, last_good_build V, last_good_test+build V -> reproducible product issue => for a QAM test, write a comment on IBS/OBS or smelt; for non-QAM, report a product bug
    • S4: retry X, last_good_test V, last_good_build X, last_good_test+build X -> sporadic issue => see S8-15
    • S5: retry X, last_good_test V, last_good_build X, last_good_test+build V -> reproducible test regression => bisect git log, inform on pull request, report ticket in progress.opensuse.org/projects/openqatests/issues/
    • S6: retry X, last_good_test V, last_good_build V, last_good_test+build X -> sporadic issue => see S8-15
    • S7: retry X, last_good_test V, last_good_build V, last_good_test+build V -> sporadic issue => see S8-15
    • S8-15: retry V (all 8 combinations) -> sporadic issue => automatically create auto-review tickets that handle restarting tests and retrigger original -> first step #94105
  • Intermediate steps from weekly 2021-07-16:

    • we can always start with just writing yet another comment on openQA jobs
  • Simplified approach: if last_good+build fails, a product regression is unlikely; write a comment stating that

  • If last_good_test+build fails, an infrastructure issue is likely
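The S0 to S15 table above can be read as a lookup from the four investigation results to a verdict. The following is a minimal sketch of that lookup, not the openqa-investigate implementation: the names `Verdict`, `REPRODUCIBLE_STATES` and `classify` are hypothetical, and because the ticket does not say how an "O" (other failure) result should be handled, the sketch assumes it is rejected.

```python
from enum import Enum


class Verdict(Enum):
    """Verdicts named in the S0..S15 table (hypothetical enum for this sketch)."""
    INFRASTRUCTURE = "infrastructure issue"
    PRODUCT = "reproducible product issue"
    TEST_REGRESSION = "reproducible test regression"
    SPORADIC = "sporadic issue"


# Tuple order: (retry, last_good_test, last_good_build, last_good_test+build)
# "X" = failed, "V" = passed. Every state not listed here is sporadic.
REPRODUCIBLE_STATES = {
    ("X", "X", "X", "X"): Verdict.INFRASTRUCTURE,   # S0
    ("X", "X", "V", "V"): Verdict.PRODUCT,          # S3
    ("X", "V", "X", "V"): Verdict.TEST_REGRESSION,  # S5
}


def classify(retry, last_good_test, last_good_build, last_good_test_and_build):
    """Map the results of the four investigation jobs to a verdict."""
    results = (retry, last_good_test, last_good_build, last_good_test_and_build)
    for result in results:
        # Assumption: "O" is not covered by the table, so it is not classified.
        if result not in ("X", "V"):
            raise ValueError(f"unsupported result {result!r}, expected 'X' or 'V'")
    # S1, S2, S4, S6, S7 and S8-15 (any passing retry) all fall through here.
    return REPRODUCIBLE_STATES.get(results, Verdict.SPORADIC)
```

For example, `classify("X", "V", "X", "V")` yields the test-regression verdict (S5), while any combination with a passing retry yields the sporadic verdict (S8-15). The follow-up actions (tickets, pull request comments, retriggering) are deliberately left out, since they depend on the subtasks below.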


Subtasks

action #95742: In openqa-investigate jobs add URL to original job as setting (Resolved, okurz)

action #95746: Identify likely "sporadic" openQA tests with "openqa-investigate" size:M (Resolved, Xiaojing_liu)

openQA Project - action #98862: Comment about intermittent/sporadic test issues on original job if openqa-investigate retry job passes (Blocked, okurz)

action #109920: Identify reproducible product issues using openqa-investigate (Blocked, okurz)

action #110176: [spike solution] [timeboxed:10h] Restart hook script in delayed minion job based on exit code size:M (Workable)

openQA Project - action #110518: Call job_done_hooks if requested by test setting (not only openQA config as done so far) size:M (Workable)

openQA Project - action #110530: Do NOT call job_done_hooks if requested by test setting (Blocked, okurz)


Related issues

Related to openQA Project - action #91773: Automatic replacement of openQA job URLs preview of openQA size:M (Resolved, 2021-04-26)

Related to QA - action #107014: trigger openqa-trigger-bisect-jobs from our automatic investigations whenever the cause is not already known size:M (Resolved, 2022-02-17)

History

#1 Updated by ilausuch 10 months ago

  • Description updated (diff)

#2 Updated by okurz 10 months ago

  • Tracker changed from action to coordination
  • Subject changed from Use feedback from openqa-investigate to automatically inform on github pull requests, open tickets, weed out automatically failed tests to [epic] Use feedback from openqa-investigate to automatically inform on github pull requests, open tickets, weed out automatically failed tests
  • Description updated (diff)
  • Assignee set to okurz
  • Parent task set to #39719

#3 Updated by okurz 10 months ago

  • Description updated (diff)

#4 Updated by okurz 10 months ago

  • Description updated (diff)

#5 Updated by okurz 10 months ago

  • Related to action #91773: Automatic replacement of openQA job URLs preview of openQA size:M added

#6 Updated by okurz 10 months ago

#91773 might be related, might help.

#7 Updated by okurz 10 months ago

  • Description updated (diff)
  • Assignee deleted (okurz)

Subtasks provided; more can be created based on the provided examples.

#8 Updated by okurz 10 months ago

  • Status changed from New to Workable

In the weekly estimation meeting we decided that an epic is actually "Workable" because we don't need to estimate the epic itself and the next task is simply to refine and create subtasks.

#9 Updated by okurz 6 months ago

  • Parent task changed from #39719 to #102915

#10 Updated by okurz 3 months ago

  • Related to action #107014: trigger openqa-trigger-bisect-jobs from our automatic investigations whenever the cause is not already known size:M added

#11 Updated by cdywan about 1 month ago

Redmine won't mention it, but #109920 is a new subtask and this epic can move on and isn't stale

#12 Updated by okurz about 1 month ago

  • Description updated (diff)

#13 Updated by okurz about 1 month ago

  • Description updated (diff)

#14 Updated by okurz 17 days ago

  • Status changed from Workable to Blocked
  • Assignee set to okurz
