action #64884: Distinguish test contributor errors from unexpected backend crashes - openQA Project (public) - openSUSE Project Management Tool


action #64884

closed

coordination #39719: [saga][epic] Detection of "known failures" for stable tests, easy test results review and easy tracking of known issues

coordination #62420: [epic] Distinguish all types of incompletes

Distinguish test contributor errors from unexpected backend crashes

Added by okurz about 5 years ago. Updated almost 5 years ago.

Status:

Resolved

Priority:

Normal

Assignee:

mkittler

Category:

Feature requests

Target version:

Start date:

2020-03-26

Due date:

% Done:

Estimated time:

Description

Motivation¶

See #62420#note-17. Jobs can end up incomplete because of test contributor errors, e.g. invalid settings or simple syntax mistakes in test code, or because of unexpected backend crashes, which are more likely of interest to instance admins. We should separate the two by the "incomplete reason".

Acceptance criteria¶

AC1: Test contributor errors, e.g. syntax mistakes in test code, produce a distinct incomplete reason
AC2: Unexpected backend crashes still yield "died: …"

Suggestions¶

Research how syntax or compilation errors in test code are treated by isotovideo
If necessary, put the error message not only in autoinst-log.txt but also somewhere accessible to openQA, e.g. in one of the JSON files read by openQA
In openQA, where we write "died: terminated prematurely, see log output for details", distinguish whether there is an error string about a known test contributor error and output the corresponding reason; otherwise fall back to "died: …"
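
The fallback logic in the last suggestion could look roughly like the sketch below. It is written in Python purely for illustration (openQA itself is written in Perl). The function name `incomplete_reason`, the pattern list and the "tests died:" prefix are assumptions made for this example, not openQA's actual code.

```python
import re
from typing import Optional

# Hypothetical patterns for errors that point at the test code
# rather than at the backend.
KNOWN_TEST_ERROR_PATTERNS = [
    re.compile(r"syntax error", re.IGNORECASE),
    re.compile(r"unable to load main\.pm", re.IGNORECASE),
]

# Generic reason quoted in the suggestion above.
GENERIC_REASON = "died: terminated prematurely, see log output for details"


def incomplete_reason(error: Optional[str]) -> str:
    """Return a distinct reason for known test contributor errors,
    otherwise fall back to the generic "died: ..." reason."""
    lines = (error or "").strip().splitlines()
    if lines:
        # Only the first line is used so the reason stays a single line.
        first_line = lines[0].strip()
        if any(p.search(first_line) for p in KNOWN_TEST_ERROR_PATTERNS):
            return f"tests died: {first_line}"
    return GENERIC_REASON
```

An error string that matches none of the patterns, or no error string at all, yields the generic reason, so unexpected backend crashes keep being reported as before.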

Related issues 1 (0 open — 1 closed)


Updated by mkittler almost 5 years ago

Assignee set to mkittler
Target version set to Current Sprint


Updated by mkittler almost 5 years ago

Blocked by action #64857: Put single-line error messages into incomplete reason for "died" added


Updated by mkittler almost 5 years ago

Status changed from Workable to Blocked

It makes sense to first implement forwarding the reason why isotovideo stopped early. Then we can differentiate between the different causes.


Updated by mkittler almost 5 years ago

Status changed from Blocked to In Progress


Updated by mkittler almost 5 years ago

PR for the openQA/worker part: https://github.com/os-autoinst/openQA/pull/3096


Updated by mkittler almost 5 years ago

PR for os-autoinst: https://github.com/os-autoinst/os-autoinst/pull/1409


Updated by mkittler almost 5 years ago

Status changed from In Progress to Feedback

With both PRs merged we should now see the reasons `tests died: ...` and `backend died: ...`. There might still be a plain `died: ...` in case something else within os-autoinst failed.

Note that the PR only affects errors when loading main.pm and the test modules. So it is mainly about compilation errors. Unhandled exceptions within the test execution were already distinguished before: The test result is set to failed and there's a failed step result with the exception message. I don't think it makes sense to duplicate that information into the "reason".

So when this works in production I consider both ACs done.
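
For reviewers, the three reason prefixes mentioned in this comment can be told apart with a simple prefix check. The following is only a sketch in Python; the function name `reason_category` and the category labels are made up for illustration and are not part of openQA.

```python
from typing import Optional

# Prefixes taken from the comment above; the labels are hypothetical.
# Each prefix ends with a colon, so "died:" cannot accidentally match
# a reason that starts with "tests died:" or "backend died:".
PREFIX_CATEGORIES = (
    ("tests died:", "test contributor error"),
    ("backend died:", "backend crash"),
    ("died:", "other os-autoinst failure"),
)


def reason_category(reason: Optional[str]) -> str:
    """Map an incomplete reason to a coarse category by its prefix."""
    if reason:
        for prefix, category in PREFIX_CATEGORIES:
            if reason.startswith(prefix):
                return category
    return "unknown"
```

A reason without any of these prefixes, or a job without a reason, ends up in the "unknown" category.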


Updated by mkittler almost 5 years ago

It seems to work on OSD and o3 (see `select jobs.id, t_started, t_finished, workers.host as worker_host, workers.instance as worker_instance, reason from jobs join workers on assigned_worker_id=workers.id where result = 'incomplete' order by id desc limit 100;`).

The error `tests died: unable to load main.pm, check the log for the cause (e.g. syntax error)` can now be observed on both instances.

On OSD there are several instances of this reason: `backend died: No map for 'Ã\u0083' at /usr/lib/os-autoinst/consoles/VNC.pm line 741.` Maybe we should also look into that issue, but it could also be caused by network issues or problems with the VNC server.

On o3 I've only seen several instances of this reason: `backend died: Migrate to file failed, it has been running for more than 240 at /usr/lib/os-autoinst/backend/qemu.pm line 258.` I've created a PR to include the unit in this message. I'm not sure whether this error needs actual fixing.
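
When going through many rows returned by such a query, it can help to count how often each reason occurs. Perl appends " at FILE line N." to messages raised via `die`, so the sketch below strips that suffix before counting. It is a rough Python illustration; the helper names `normalize_reason` and `tally_reasons` are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Matches the " at FILE line N." location suffix that Perl appends to
# die messages.
LOCATION_SUFFIX = re.compile(r" at \S+ line \d+\.?\s*$")


def normalize_reason(reason: str) -> str:
    """Drop the trailing Perl source location from a reason string."""
    return LOCATION_SUFFIX.sub("", reason)


def tally_reasons(reasons: Iterable[Optional[str]]) -> Counter:
    """Count normalized reasons, skipping jobs without a reason."""
    return Counter(normalize_reason(r) for r in reasons if r)
```

Calling `tally_reasons(...).most_common()` on the `reason` column then lists the most frequent reasons first.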


Updated by mkittler almost 5 years ago

Status changed from Feedback to Resolved
Target version deleted (~~Current Sprint~~)
