action #109920

openQA Project - coordination #102915: [saga][epic] Automated classification of failures

coordination #94105: [epic] Use feedback from openqa-investigate to automatically inform on github pull requests, open tickets, weed out automatically failed tests

Identify reproducible product issues using openqa-investigate

Added by okurz 2 months ago. Updated 3 days ago.

Status: New
Priority: Normal
Assignee: -
Target version:
Start date: 2022-04-13
Due date:
% Done: 0%
Estimated time:

Description

Motivation

See parent #94105, where we identified multiple user stories about creating tickets or identifying direct or indirect users of openQA based on openqa-investigate results. As a next step we could try to identify product issues from openqa-investigate results, in particular the step "S3: retry X, last_good_test X, last_good_build V, last_good_test+build V -> reproducible product issue => if QAM test write comment on IBS/OBS or smelt, for non-QAM report product bug" from https://progress.opensuse.org/issues/94105#Suggestions

Acceptance criteria

  • AC1: On failed openQA jobs with the openqa-investigate results "retry X, last_good_test X, last_good_build V, last_good_test+build V" (X: failed, V: passed), a comment is written pointing to a likely product regression
  • AC2: No such comment is written on other jobs
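
The condition behind AC1 and AC2 can be sketched as a small predicate. This is only an illustration, assuming the outcome of each investigation job is available as a "passed" or "failed" string keyed by investigation type; the function name `is_likely_product_regression` and the dictionary layout are hypothetical and not part of openqa-investigate, which is a bash script.

```python
# Hypothetical sketch: neither the names nor the data layout come from openqa-investigate.

# Expected outcome per investigation job for step S3 (X: failed, V: passed).
S3_PATTERN = {
    "retry": "failed",
    "last_good_test": "failed",
    "last_good_build": "passed",
    "last_good_test+build": "passed",
}


def is_likely_product_regression(results):
    """Return True only if all four investigation results match the S3 pattern.

    `results` maps investigation names to "passed" or "failed". Missing or
    unfinished jobs (absent keys or any other value) never match, so no
    comment would be written for them (AC2).
    """
    return all(results.get(name) == expected for name, expected in S3_PATTERN.items())


# Usage: a matching combination, and one where the retry passed (likely sporadic).
matching = dict(S3_PATTERN)
sporadic = dict(S3_PATTERN, retry="passed")
print(is_likely_product_regression(matching))
print(is_likely_product_regression(sporadic))
```

Treating incomplete result sets as "no match" keeps the predicate safe to call before all sibling jobs have finished.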

Suggestions

  • Take a look at how we identify likely sporadic issues as a result of the "retry" job in https://github.com/os-autoinst/scripts/blob/master/openqa-investigate#L136=
  • Then try to fan in on the results of multiple investigation jobs to find the jobs with the combination "retry X, last_good_test X, last_good_build V, last_good_test+build V" (X: failed, V: passed). The challenge is that job done hooks are called on a single job, so one would need to identify the other sibling investigation jobs, and any of these jobs can finish before the others. Maybe we just call this investigation step on the "last_good_test" job and, if the other jobs are not finished by then, trigger another incarnation of the same minion job with a delay (exponential back-off?). Communicate by exit code? This would also avoid the need to run job_done_hooks on passed jobs.
  • Then add an openQA comment stating the observation about a likely product regression
  • Note: The bash script openqa-investigate itself must not know anything about "openQA minion jobs" or schedule any such jobs
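
The "retry with a delay, communicated by exit code" idea from the suggestions could look roughly like the sketch below. It is a simplified illustration under stated assumptions: the value of `RETRY_EXIT_CODE`, the function names, and the default delays are all hypothetical, and the `hook` callable merely stands in for the hook script. A real implementation would presumably enqueue a delayed minion job on the openQA side instead of sleeping in-process; the loop only shows the delay schedule and the exit code protocol.

```python
import time

# Hypothetical exit code meaning "sibling jobs not finished yet, try again later".
RETRY_EXIT_CODE = 142


def backoff_delays(base_delay, max_attempts):
    """Return the delay before each re-run: base, 2*base, 4*base, ..."""
    return [base_delay * 2 ** attempt for attempt in range(max_attempts)]


def run_hook_with_backoff(hook, base_delay=60, max_attempts=5, sleep=time.sleep):
    """Call `hook` until it returns something other than RETRY_EXIT_CODE.

    `hook` is a callable returning an exit code; it stands in for the hook
    script. At most `max_attempts` re-runs happen after the first call.
    Returns the last exit code seen.
    """
    exit_code = hook()
    for delay in backoff_delays(base_delay, max_attempts):
        if exit_code != RETRY_EXIT_CODE:
            break
        sleep(delay)  # stand-in for scheduling a delayed job
        exit_code = hook()
    return exit_code
```

Keeping the retry decision in the caller means the hook script only has to report "not ready yet" through its exit code and needs no knowledge of how or when it is re-run.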

Related issues

Copied to QA - action #110176: [spike solution] [timeboxed:10h] Restart hook script in delayed minion job based on exit code size:M, Resolved, 2022-06-15

History

#1 Updated by okurz 2 months ago

  • Description updated (diff)

#2 Updated by mkittler 2 months ago

  • Copied to action #110176: [spike solution] [timeboxed:10h] Restart hook script in delayed minion job based on exit code size:M added

#3 Updated by okurz 2 months ago

  • Description updated (diff)
  • Status changed from New to Blocked
  • Assignee set to okurz

#4 Updated by okurz 3 days ago

  • Status changed from Blocked to New
  • Assignee deleted (okurz)
