action #64884

coordination #39719: [saga][epic] Detection of "known failures" for stable tests, easy test results review and easy tracking of known issues

coordination #62420: [epic] Distinguish all types of incompletes

Distinguish test contributor errors from unexpected backend crashes

Added by okurz about 4 years ago. Updated almost 4 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Feature requests
Target version: -
Start date: 2020-03-26
Due date:
% Done: 0%
Estimated time:
Description

Motivation

See #62420#note-17. Jobs can incomplete due to test contributor errors, e.g. invalid settings or simple syntax mistakes in test code, or due to unexpected backend crashes, which are more likely of interest to instance admins. We should separate the two cases by the "incomplete reason".

Acceptance criteria

  • AC1: Test contributor errors, e.g. syntax mistakes in test code, produce a distinct incomplete reason
  • AC2: Unexpected backend crashes still yield "died: …"

Suggestions

  • Research how syntax or compilation errors in test code are treated by isotovideo
  • If necessary, put the error message not only in autoinst-log.txt but also somewhere accessible to openQA, e.g. in one of the JSON files read by openQA
  • In openQA, where we write "died: terminated prematurely, see log output for details", check whether there is an error string about a known test contributor error and output a corresponding reason; otherwise fall back to "died: …"
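
The last suggestion could be sketched roughly as follows. This is only a Python illustration of the fallback logic, not openQA's actual code (which is written in Perl); the function name, the pattern list and the "tests died: " prefix for the distinct reason are assumptions made for the example.

```python
import re

# Hypothetical patterns that mark an error as caused by test code or settings.
KNOWN_TEST_ERROR_PATTERNS = [
    re.compile(r"syntax error", re.IGNORECASE),
    re.compile(r"unable to load main\.pm", re.IGNORECASE),
]

# Generic reason quoted in the ticket, used when nothing more specific is known.
FALLBACK_REASON = "died: terminated prematurely, see log output for details"


def incomplete_reason(error_message):
    """Return a reason string for an incomplete job.

    error_message is the (possibly empty or missing) error text forwarded
    by the test backend.
    """
    lines = (error_message or "").strip().splitlines()
    if lines:
        # Only the first line is used so the reason stays short.
        first_line = lines[0]
        if any(p.search(first_line) for p in KNOWN_TEST_ERROR_PATTERNS):
            return "tests died: " + first_line
    return FALLBACK_REASON
```

Anything that does not match a known pattern deliberately ends up with the generic reason, so unexpected crashes are never mislabelled as test contributor errors.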

Related issues

Blocked by openQA Project - action #64857: Put single-line error messages into incomplete reason for "died" (Resolved, livdywan, 2020-03-26)

Actions #1

Updated by mkittler almost 4 years ago

  • Assignee set to mkittler
  • Target version set to Current Sprint
Actions #2

Updated by mkittler almost 4 years ago

  • Blocked by action #64857: Put single-line error messages into incomplete reason for "died" added
Actions #3

Updated by mkittler almost 4 years ago

  • Status changed from Workable to Blocked

It makes sense to first implement forwarding the reason why isotovideo stopped early. Then we can differentiate between the different causes.

Actions #4

Updated by mkittler almost 4 years ago

  • Status changed from Blocked to In Progress
Actions #5

Updated by mkittler almost 4 years ago

PR for the openQA/worker part: https://github.com/os-autoinst/openQA/pull/3096

Actions #7

Updated by mkittler almost 4 years ago

  • Status changed from In Progress to Feedback

With both PRs merged we should now see "tests died: ..." and "backend died: ...". There might still be a plain "died: ..." in case something else within os-autoinst failed.

Note that the PR only affects errors when loading main.pm and the test modules. So it is mainly about compilation errors. Unhandled exceptions within the test execution were already distinguished before: The test result is set to failed and there's a failed step result with the exception message. I don't think it makes sense to duplicate that information into the "reason".

So when this works in production I consider both ACs done.
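
For reviewing incompletes, the three reason prefixes mentioned in this comment could be mapped to the likely audience roughly like this. It is a hedged Python sketch: the function name and the category labels are made up for illustration and are not part of openQA.

```python
def reason_category(reason):
    """Map an incomplete reason to who most likely needs to look at it.

    The prefixes are the ones named in the ticket; the returned labels
    are hypothetical.
    """
    if not reason:
        return "unknown"
    if reason.startswith("tests died:"):
        return "test contributor"
    if reason.startswith("backend died:"):
        return "instance admin"
    if reason.startswith("died:"):
        # Something else within os-autoinst failed.
        return "unclassified"
    return "unknown"
```

For example, `reason_category("backend died: ...")` yields "instance admin", while a bare "died: ..." stays "unclassified".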

Actions #8

Updated by mkittler almost 4 years ago

It seems to work on OSD and o3, judging by the output of this query:

    select jobs.id, t_started, t_finished, workers.host as worker_host, workers.instance as worker_instance, reason from jobs join workers on assigned_worker_id=workers.id where result = 'incomplete' order by id desc limit 100;


The error "tests died: unable to load main.pm, check the log for the cause (e.g. syntax error)" can now be observed on both instances.

On OSD there are several instances of "backend died: No map for 'Ã\u0083' at /usr/lib/os-autoinst/consoles/VNC.pm line 741.". Maybe we should also look into that issue, but it could also be caused by network issues or problems with the VNC server.

On o3 I've only seen several instances of "backend died: Migrate to file failed, it has been running for more than 240 at /usr/lib/os-autoinst/backend/qemu.pm line 258.". I've created a PR to include the unit here. Not sure whether this error needs actual fixing.

Actions #9

Updated by mkittler almost 4 years ago

  • Status changed from Feedback to Resolved
  • Target version deleted (Current Sprint)