coordination #62420: [epic] Distinguish all types of incompletes - openQA Project - openSUSE Project Management Tool


coordination #62420

closed

coordination #39719: [saga][epic] Detection of "known failures" for stable tests, easy test results review and easy tracking of known issues

[epic] Distinguish all types of incompletes

Added by okurz almost 5 years ago. Updated about 3 years ago.

Status:

Resolved

Priority:

Normal

Assignee:

okurz

Category:

Feature requests

Target version:

Ready

Start date:

2018-12-12

Due date:

% Done:

100%

Estimated time:

(Total: 120.00 h)

Description

Motivation

As a test reviewer, I want to understand the reason for incompletes so that I know who should do what to fix them.

Acceptance criteria

AC1: All incompletes provide more details about the reason for the incompletion
AC2: Types of incompletes are distinguishable without needing to read logs
AC3: The incomplete reason is visualized in the UI (not only in logs)
AC4: If the reason is not known, all available log details are accessible from the job

Acceptance tests

AT1-1: Given an incomplete job, When reading out the job from the openQA database, Then the incomplete reason is given
AT1-2: Same as AT1-1 but for a failed job, Then no incomplete reason is given
AT2-1: Given an incomplete job, When reading out the job over the API, Then the incomplete reason is rendered
AT3-1: Given an incomplete job, When showing job details in the webui, Then the incomplete reason is visible (not only in logs)
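
AT1-1, AT1-2 and AT2-1 can be read as checks on a job record. The following is a minimal sketch in Python, assuming a hypothetical `Job` record with `result` and `reason` fields and a hypothetical `render_job` helper; none of these names are taken from the openQA schema or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    """Hypothetical job record; field names are illustrative only."""
    id: int
    result: str
    reason: Optional[str] = None


def incomplete_reason(job: Job) -> Optional[str]:
    """Return the reason for incomplete jobs (AT1-1), None otherwise (AT1-2)."""
    if job.result != "incomplete":
        return None
    return job.reason


def render_job(job: Job) -> dict:
    """Render a job as an API-style dict; 'reason' is only included when set (AT2-1)."""
    data = {"id": job.id, "result": job.result}
    reason = incomplete_reason(job)
    if reason is not None:
        data["reason"] = reason
    return data
```

For example, `render_job(Job(1, "incomplete", "setup failure"))` contains a `reason` key, while a job with result `failed` is rendered without one.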

Suggestions

Check where "setup failure" and other results are reported without further details from other services, e.g. the cache service, and ensure that there is a hint about the problem source.
Extend the API to also accept the incomplete reason
Extend the UI to also show the incomplete reason
Split "setup failure" into more specific types
For example, we use "setup failure" in multiple cases but do not forward the result to the webui; it only appears as a string in the log files. In the case here we do not even have an information string that points out what the real problem was or is. We should at least show the available logs or even extract log excerpts, but I guess we already have this covered by showing the logs in the details tab when there is no other information available.

Further details

What should we do when we asked the systemd service to stop because we want to reboot the machine? IMHO we should abort as fast as possible on TERM but provide better information. This and all the experience with different sources of incompletes from the past months bring me to the conclusion that we just want to pass the internal "reason" we already have on the worker to the webui, e.g. the existing "setup-failure". "worker-shutdown" can be another reason. Based on the reason we can also decide whether we should auto-duplicate: for a "compilation-error" no retrigger, for "worker-shutdown" yes.
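
The reason-based retrigger decision could be sketched as a simple lookup. This is a minimal illustration in Python, not the worker's actual code: the table name `AUTO_DUPLICATE_BY_REASON` and the function `should_auto_duplicate` are hypothetical, only the two reasons named above are listed, and falling back to "no retrigger" for any other reason is an assumption.

```python
from typing import Optional

# Hypothetical policy table: only the two reasons mentioned in this ticket.
AUTO_DUPLICATE_BY_REASON = {
    "worker-shutdown": True,     # worker went away, the job itself is fine
    "compilation-error": False,  # retriggering would hit the same error
}


def should_auto_duplicate(reason: Optional[str], default: bool = False) -> bool:
    """Decide whether to retrigger an incomplete job based on its reason.

    Reasons not in the table (including a missing reason) fall back to
    `default`; that fallback is an assumption of this sketch.
    """
    if reason is None:
        return default
    return AUTO_DUPLICATE_BY_REASON.get(reason, default)
```

Keeping the policy in one table means a newly introduced reason only needs one added entry to define whether it is retriggered.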

Started as ticket "Improve reporting on incompletes with result "setup-failure" and no further explanation".

See #62237. There were many incompletes with few details, e.g. https://openqa.suse.de/tests/3795872 shows just:

[2020-01-17T10:56:50.0830 CET] [info] [pid:110583] +++ setup notes +++
[2020-01-17T10:56:50.0830 CET] [info] [pid:110583] Start time: 2020-01-17 09:56:50
[2020-01-17T10:56:50.0830 CET] [info] [pid:110583] Running on QA-Power8-5-kvm:6 (Linux 4.12.14-lp151.27-default #1 SMP Fri May 10 14:13:15 UTC 2019 (862c838) ppc64le)
[2020-01-17T11:01:50.0997 CET] [info] [pid:110583] +++ worker notes +++
[2020-01-17T11:01:50.0998 CET] [info] [pid:110583] End time: 2020-01-17 10:01:50
[2020-01-17T11:01:50.0998 CET] [info] [pid:110583] Result: setup failure
[2020-01-17T11:01:51.0002 CET] [info] [pid:21796] Uploading autoinst-log.txt

We have the result "setup failure" (also written as "setup-failure"), but it is worker-internal and not forwarded to the webui.
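
Until the reason is forwarded, the worker-internal result can only be recovered from the uploaded log. A minimal sketch in Python of extracting it from log lines in the format shown above; the function name `extract_worker_result` is hypothetical, and normalizing to the hyphenated spelling is an assumption made here to unify the two variants.

```python
import re
from typing import Optional

# Matches lines such as "... [info] [pid:110583] Result: setup failure"
RESULT_LINE = re.compile(r"\[pid:\d+\] Result: (?P<result>.+?)\s*$")


def extract_worker_result(log_text: str) -> Optional[str]:
    """Return the first worker result found in the log, normalized to
    lowercase with hyphens (e.g. "setup-failure"), or None if absent."""
    for line in log_text.splitlines():
        match = RESULT_LINE.search(line)
        if match:
            return match.group("result").lower().replace(" ", "-")
    return None


SAMPLE_LOG = """\
[2020-01-17T11:01:50.0998 CET] [info] [pid:110583] End time: 2020-01-17 10:01:50
[2020-01-17T11:01:50.0998 CET] [info] [pid:110583] Result: setup failure
[2020-01-17T11:01:51.0002 CET] [info] [pid:21796] Uploading autoinst-log.txt
"""

print(extract_worker_result(SAMPLE_LOG))  # prints "setup-failure"
```

Parsing log text like this is fragile, which is one more argument for passing the reason to the webui as structured data.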
