action #47219: [functional][u][easy] test fails in keymap_or_locale - rogue workqueue lockup - openQA Tests (public) - openSUSE Project Management Tool


action #47219

closed

[functional][u][easy] test fails in keymap_or_locale - rogue workqueue lockup

Added by szarate about 6 years ago. Updated almost 6 years ago.

Status: Resolved
Priority: Normal
Assignee: jorauch
Category: Bugs in existing tests
Target version: SUSE QA (private) - Milestone 23
Start date: 2019-02-26
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Difficulty:

Description

Observation

openQA test in scenario sle-15-SP1-Installer-DVD-aarch64-Build159.1-textmode@aarch64 fails in keymap_or_locale

Reproducible

Looks sporadic

Expected result

Last good: 158.4 (or more recent)

Further details

There's a rogue workqueue lockup present on the screen and in the serial0.txt log, but no further information about it.

Suggestions

- Write a new serial failure pattern, so that lockups like this can be detected.
- Modify the post fail hook to collect at least dmesg and the full journal log.
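
The second suggestion could be sketched roughly as follows. This is only an illustration in Python, not the post_fail_hook from opensusebasetest (the test distribution itself is written in Perl). `collect_logs`, `run_command`, `LOG_COMMANDS` and the output file names are hypothetical names chosen for the example, and it assumes `dmesg` and `journalctl` are available on the system under test.

```python
import subprocess
from pathlib import Path

# Hypothetical mapping of output file name -> command to run.
# The ticket only asks for dmesg and the full journal.
LOG_COMMANDS = {
    "dmesg.log": ["dmesg"],
    "journal.log": ["journalctl", "--no-pager"],
}


def run_command(argv):
    """Run a command and return its combined stdout/stderr as text.

    Failures are turned into a short note instead of an exception, so
    that one missing tool does not prevent the other logs from being saved.
    """
    try:
        done = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            errors="replace",
            timeout=120,
            check=False,
        )
        return done.stdout
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"could not run {argv[0]}: {exc}\n"


def collect_logs(target_dir, runner=run_command):
    """Write the output of every command in LOG_COMMANDS into target_dir.

    Returns the list of files that were written.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for filename, argv in LOG_COMMANDS.items():
        path = target / filename
        path.write_text(runner(argv))
        written.append(path)
    return written
```

Passing a fake `runner` makes it possible to exercise the collection logic on a machine where neither tool is installed.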

Subtasks 1 (0 open — 1 closed)


Updated by szarate about 6 years ago

Setting this to Workable right away; I don't think it should be complicated :)


Updated by szarate almost 6 years ago

Subject changed from [functional][u] test fails in keymap_or_locale - rogue workqueue lockup to [functional][u][easy] test fails in keymap_or_locale - rogue workqueue lockup


Updated by okurz almost 6 years ago

Target version changed from Milestone 22 to Milestone 23

yes, good idea.


Updated by jorauch almost 6 years ago

Assignee set to jorauch


Updated by jorauch almost 6 years ago

Status changed from Workable to In Progress

According to the logs no post_fail_hook is being executed.
When setting the error type to 'hard' the test fails and we get the PFH of opensusebasetest.

However the big question is: How am I supposed to verify this?
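
One possible way to check such a pattern without waiting for the sporadic lockup is to replay a captured serial line against it. The sketch below is plain Python and only mimics the idea of a serial failure entry with a type, a message and a pattern; it is not the os-autoinst API. `SerialFailure`, `find_serial_failures` and the surrounding lines of the sample log are hypothetical, and the lockup line is modelled on the kernel's "BUG: workqueue lockup" serial message.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SerialFailure:
    """Hypothetical stand-in for one serial failure entry."""
    type: str            # 'soft' or 'hard', the two values discussed in this ticket
    message: str         # human-readable description shown on a match
    pattern: re.Pattern  # compiled regex applied to each serial line


WORKQUEUE_LOCKUP = SerialFailure(
    type="hard",
    message="rogue workqueue lockup",
    pattern=re.compile(r"BUG: workqueue lockup"),
)


def find_serial_failures(serial_text, failures):
    """Return (failure, line) pairs for every serial line matching a pattern."""
    hits = []
    for line in serial_text.splitlines():
        for failure in failures:
            if failure.pattern.search(line):
                hits.append((failure, line.strip()))
    return hits


# Sample serial output; only the middle line reflects a real message format,
# the lines around it are made up for the example.
SAMPLE_SERIAL = """\
(hypothetical boot output)
BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 32s!
(hypothetical further output)
"""

if __name__ == "__main__":
    for failure, line in find_serial_failures(SAMPLE_SERIAL, [WORKQUEUE_LOCKUP]):
        print(f"[{failure.type}] {failure.message}: {line}")
```

Replaying a log without the lockup line should yield no hits, which gives a cheap negative check as well.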


Updated by jorauch almost 6 years ago

Status changed from In Progress to Feedback

Created PR:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/6851


Updated by szarate almost 6 years ago

@jorauch for the time being I guess it's not required, as we know that these kinds of errors are difficult to reproduce. Having the PFH sounds nice, however... in the case of this test it didn't cause any other test module to be tainted. Could you use soft instead, so that those minutes are not wasted and further testing does not get lost?

I thought for a moment that whenever a test inherited from opensusebasetest, it would also inherit the PFH...


Updated by jorauch almost 6 years ago

If we fail hard but the test is not set to 'fatal', we should not lose any tests that follow, should we?
Autoinst logs do not show any sign of the PFH, but the screenshots seem to.
Now I am really confused.

EDIT: Also, I think we should not softfail without a bugref, and if it appears in other modules the impact might be 'hard'.


Updated by okurz almost 6 years ago

szarate wrote:

I thought for a moment that whenever a test inherited from opensusebasetest, it would also inherit the PFH...

yes, that should be the case.


#10

Updated by jorauch almost 6 years ago

Status changed from Feedback to Resolved

PR merged. As it's just pattern detection for a random failure, I would close this, since we cannot be sure when we'll see it in production.


#11

Updated by okurz almost 6 years ago

Status changed from Resolved to Feedback

Please see https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/6851#issuecomment-467338617. Either we should avoid the word "bug" or add a bug reference on our bugzilla instance.


#12

Updated by jorauch almost 6 years ago

It's rogue; opening a bug will most likely have no effect, except that we would have a number to reference.
Since the module used is named 'known_bugs', I would like to keep the naming, and changing the commit message does not seem feasible to me.


#13

Updated by szarate almost 6 years ago

Status changed from Feedback to Resolved

We can already detect the bug :). This ticket was for detecting the bug. I opened #48419 as a follow-up and am closing this one.


#14

Updated by okurz almost 6 years ago

thanks


#15

Updated by okurz almost 6 years ago

Has duplicate action #48689: [functional][y] test fails in yast2_snapper - "rogue workqueue lockup bsc#1126782 - Serial error: BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 32s!" added


#16

Updated by okurz almost 6 years ago

Has duplicate deleted (action #48689: [functional][y] test fails in yast2_snapper - "rogue workqueue lockup bsc#1126782 - Serial error: BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 32s!")
