coordination #77899

closed

openQA Project - coordination #39719: [saga][epic] Detection of "known failures" for stable tests, easy test results review and easy tracking of known issues

[epic] Extend "auto-review" for failed jobs as well

Added by okurz over 3 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
Target version:
Start date: 2020-11-26
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)

Description

Motivation

SUSE QEM in particular suffers from the workload of manually reviewing openQA test results because of the comparatively high false-positive rate (the product is of higher quality after GM than products still in development before GM). The existing scenario-based "label carry-over" is much less useful for the current setup of QAM scenarios, which are spread over many different job groups. With "auto-review" we have a good solution to handle known incompletes, retrigger automatically where it makes sense, and find new, unknown incompletes easily. Because "auto-review" works regardless of the job result and depends only on the list of jobs passed to it, we should evaluate extending it to handle unlabeled failed results as well.

Acceptance criteria

  • AC1: Failed openQA jobs whose log(s) match a regex specified in a progress ticket with "auto_review" are labeled with the corresponding ticket, as is already done for incomplete jobs
  • AC2: No GitLab CI pipelines monitored by the team SUSE QE Tools fail when unlabeled, unknown failed jobs are encountered
  • AC3: Same for o3 and osd
  • AC4: Power users know about the feature and how it can be used
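
The matching described in AC1 can be sketched roughly as follows. This is only an illustration, not the actual auto-review implementation: it assumes the regex is embedded in the ticket subject as auto_review:"<regex>", and the function name, ticket ids and log text are hypothetical.

```python
import re
from typing import Dict, Optional

# Assumption: the regex is embedded in the ticket subject as auto_review:"<regex>"
AUTO_REVIEW = re.compile(r'auto_review:"(?P<regex>.+?)"')


def find_matching_ticket(log_text: str, tickets: Dict[int, str]) -> Optional[int]:
    """Return the id of the first ticket whose auto_review regex matches the log."""
    for ticket_id, subject in tickets.items():
        found = AUTO_REVIEW.search(subject)
        if found is None:
            continue  # ticket carries no auto_review marker
        try:
            if re.search(found.group("regex"), log_text):
                return ticket_id
        except re.error:
            continue  # skip tickets with a broken regex instead of aborting
    return None  # unknown failure: leave the job unlabeled


# Hypothetical tickets and log excerpt, for illustration only
tickets = {
    1001: 'test fails in foo auto_review:"Connection refused.*port 22"',
    1002: "Unrelated ticket without marker",
}
log = "ERROR: Connection refused on port 22"
label = find_matching_ticket(log, tickets)
```

A job for which the function returns None would stay unlabeled rather than fail a pipeline, in line with AC2.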

Suggestions

  • Don't fail GitLab CI pipelines when failed jobs are not known, as SUSE QE Tools can't handle that load of unreviewed, new, failed tests and should not be concerned with it
  • Start with o3 as "testbed" and extend to osd if the process on o3 runs in a convincing way
  • Consider including the solution within openQA itself, e.g. as a plugin that triggers a synchronous action when a job finishes and automatic label carry-over did not find a convincing candidate
  • Consider caching tickets to reduce repeated loading from the Redmine API while still ensuring that ticket updates, e.g. fixed auto-review regexes, take effect, e.g. cache only for 10s or 1m
  • Present to power users, e.g. documentation, blog article, feature video, workshop
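
The caching suggestion above could look something like this minimal sketch of a time-limited cache. The class name, the fetch callback and the injectable clock are assumptions made for illustration; a real version would call the Redmine API inside the fetch function.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Cache one fetched value and refresh it once it is older than ttl_seconds."""

    def __init__(
        self,
        fetch: Callable[[], Any],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch  # hypothetical callback that loads the tickets
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._value: Any = None
        self._expires_at: Optional[float] = None  # None means nothing cached yet

    def get(self) -> Any:
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Cache is empty or stale: reload so ticket updates take effect
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value
```

With a short lifetime such as 10s or 1m, a corrected auto-review regex is picked up after at most one cache period, while jobs finishing in quick succession share a single ticket load.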

Subtasks 4 (0 open, 4 closed)

action #80414: [proof-of-concept] Extend "auto-review" for failed jobs as well, start with o3 (Resolved, okurz, 2020-11-26)
action #80418: [learning] Fix parse errors in "openqa-investigate" "parse error: Invalid numeric literal at line 1, column 10" (Resolved, mkittler, 2020-11-26)
action #80806: Extend "auto-review" for failed jobs as well - Generalize openqa-monitor-investigation-candidates to look at more than just one job group (Resolved, okurz, 2020-12-07)
action #80808: Extend "auto-review" for failed jobs as well - enable same as on o3 but on osd (Resolved, okurz, 2020-12-07)

Related issues 1 (0 open, 1 closed)

Copied to QA - action #77944: Run "auto-review" more often but alarm less (Resolved, okurz, 2020-11-14)
