action #54488

[opensuse][kde][functional][u] test fails in kontact, then fails to login in post_fail_hook. Enable post_fail_hook for collecting logs

Added by okurz about 1 year ago. Updated about 2 months ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
Enhancement to existing tests
Target version:
SUSE QA tests - Milestone 31
Start date:
2019-07-20
Due date:
% Done:

0%

Estimated time:
42.00 h
Difficulty:

Description

Observation

openQA test in scenario opensuse-15.1-Argon-Live-x86_64-krypton-live@64bit-2G fails in kontact, then fails to log in in post_fail_hook. The system is probably stalled.

Reproducible

Fails since (at least) Build 1.27

Expected result

Regardless of whether the login succeeds, we should run our "stall detection" and system load checks, e.g. at least magic-sysrq to look for blocked tasks.

Suggestions

  • Move the return if get_var(NOLOGS) from x11test::post_fail_hook to opensusebasetest::post_fail_hook, but keep it after the call to show_tasks_in_blocked_state. DONE
  • Move the call to export_logs to the post_fail_hook inside opensusebasetest. DONE
  • Remove x11test::post_fail_hook. DONE
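
The ordering described in the suggestions above can be sketched as follows. This is a Python model of the control flow only, not the actual Perl code from os-autoinst-distri-opensuse: the class and method names mirror the ticket, while the `settings` dict and the `actions` list are hypothetical stand-ins for the openQA job variables and the real console interaction.

```python
class OpensuseBaseTest:
    """Stand-in for opensusebasetest; method bodies are illustrative only."""

    def __init__(self, settings, actions):
        self.settings = settings  # assumption: models get_var() lookups
        self.actions = actions    # assumption: records what the hook did

    def show_tasks_in_blocked_state(self):
        # The ticket asks for this check to run whether or not login works.
        self.actions.append("show_tasks_in_blocked_state")

    def export_logs(self):
        # Log collection, which needs a working console on the SUT.
        self.actions.append("export_logs")

    def post_fail_hook(self):
        # 1. Stall detection always comes first.
        self.show_tasks_in_blocked_state()
        # 2. The NOLOGS early return sits after the stall detection.
        if self.settings.get("NOLOGS"):
            return
        # 3. Log export now lives in the base class hook.
        self.export_logs()


class X11Test(OpensuseBaseTest):
    # No post_fail_hook override any more: the base class hook is inherited.
    pass


if __name__ == "__main__":
    actions = []
    X11Test({"NOLOGS": 1}, actions).post_fail_hook()
    print(actions)
```

With NOLOGS set, the model still records the blocked-task check and only skips the log export, which is the behaviour the first suggestion is after.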

Further details

Always latest result in this scenario: latest


Related issues

Related to openQA Tests - action #36126: [functional][u] post_fail_hook matches on "text_login_root" before actual tty switch and therefore never logs in (Resolved, 2018-05-14)

History

#1 Updated by okurz about 1 year ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: krypton-live
https://openqa.opensuse.org/tests/997651

#2 Updated by okurz about 1 year ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: krypton-live
https://openqa.opensuse.org/tests/1010661

#3 Updated by okurz about 1 year ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: krypton-live
https://openqa.opensuse.org/tests/1022219

#4 Updated by okurz about 1 year ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: krypton-live
https://openqa.opensuse.org/tests/1034012

#5 Updated by okurz 12 months ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: krypton-live
https://openqa.opensuse.org/tests/1045410

To prevent further reminder comments one of the following options should be followed:

  1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
  2. The openQA job group is moved to "Released"
  3. The label in the openQA scenario is removed

#6 Updated by okurz 11 months ago

  • Related to action #36126: [functional][u] post_fail_hook matches on "text_login_root" before actual tty switch and therefore never logs in added

#7 Updated by SLindoMansilla 11 months ago

  • Priority changed from Normal to High

#8 Updated by SLindoMansilla 11 months ago

  • Subject changed from [functional][u] test fails in kontact, then fails to login in post_fail_hook. Probably system is stalled but we should run our "stall detection" and system load checks regardless to [opensuse][kde] test fails in kontact, then fails to login in post_fail_hook. Probably system is stalled but we should run our "stall detection" and system load checks regardless

#9 Updated by zluo 10 months ago

  • Subject changed from [opensuse][kde] test fails in kontact, then fails to login in post_fail_hook. Probably system is stalled but we should run our "stall detection" and system load checks regardless to [opensuse][kde][functional][u] test fails in kontact, then fails to login in post_fail_hook. Probably system is stalled but we should run our "stall detection" and system load checks regardless
  • Status changed from New to In Progress
  • Assignee set to zluo

Actually, stall detection is already in place. I saw this yesterday on o3. Let me check this now.

#10 Updated by zluo 10 months ago

http://f40.suse.de/tests/5532#next_previous shows that out of 100 test runs only 1 failed:

http://f40.suse.de/tests/5517#step/kontact/12 kontact could not be started. It is most likely a worker issue or a performance issue.

This issue can be seen at https://openqa.opensuse.org/tests/1088376#step/kontact/12. At the moment we don't have the issue on o3, so I would say we now have a different situation.
Since post_fail_hook isn't called at all, I will look into this for now and check.

#11 Updated by zluo 10 months ago

  • Subject changed from [opensuse][kde][functional][u] test fails in kontact, then fails to login in post_fail_hook. Probably system is stalled but we should run our "stall detection" and system load checks regardless to [opensuse][kde][functional][u] test fails in kontact, then fails to login in post_fail_hook. Enable post_fail_hook for collecting logs

#13 Updated by SLindoMansilla 10 months ago

This issue can be seen at https://openqa.opensuse.org/tests/1088376#step/kontact/12. At the moment we don't have the issue on o3, so I would say we now have a different situation.
Since post_fail_hook isn't called at all, I will look into this for now and check.

This link shows that post_fail_hook was called, but it failed to log in. That is exactly the issue that the ticket describes.

#14 Updated by zluo 10 months ago

To discuss with the team:

How can we handle the case where post_fail_hook runs into a stalled SUT?

#15 Updated by szarate 10 months ago

  • Description updated (diff)

I think we can get away with just reordering the calls in the post_fail_hook subroutines.

#16 Updated by szarate 10 months ago

  • Description updated (diff)

#17 Updated by szarate 10 months ago

  • Blocks action #60188: [functional][u] test fails in libqt5_qtbase because "Emoticons --System Settings Module" window added

#18 Updated by szarate 10 months ago

  • Assignee deleted (zluo)

#19 Updated by szarate 10 months ago

  • Status changed from In Progress to Workable

#20 Updated by szarate 10 months ago

  • Target version set to Milestone 28
  • Estimated time set to 42.00 h

#21 Updated by mgriessmeier 9 months ago

  • Target version changed from Milestone 28 to Milestone 31

#22 Updated by SLindoMansilla 8 months ago

  • Description updated (diff)
  • Assignee set to SLindoMansilla

#23 Updated by SLindoMansilla 8 months ago

  • Status changed from Workable to In Progress

Merge x11test::post_fail_hook to opensusebasetest: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/9324 (merged)

#24 Updated by SLindoMansilla 8 months ago

  • Status changed from In Progress to Workable

Waiting for next occurrence in production.

#25 Updated by SLindoMansilla 7 months ago

  • Status changed from Workable to Resolved

post_fail_hook is triggered: https://openqa.opensuse.org/tests/1176518#step/kontact/21

But, the system is stalled: #63355

#26 Updated by okurz 7 months ago

  • Status changed from Resolved to Workable

SLindoMansilla wrote:

post_fail_hook is triggered: https://openqa.opensuse.org/tests/1176518#step/kontact/21

But, the system is stalled: #63355

But this is exactly what the original ticket observation states: #54488#Observation

So maybe you fixed some intermediate problem and are back to the original problem now? It might help to substantially increase the timeout for the initial login in the post_fail_hook, or to log in to the log console before any relevant test has a chance to fail.
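
One way to read the first idea is a login that is retried with a much larger timeout each time. The sketch below models that in Python rather than in the Perl test API. Note that it uses escalating retries instead of a single large timeout, and that `try_login`, the base timeout, the growth factor and the attempt count are all hypothetical values chosen for illustration.

```python
def login_with_growing_timeout(try_login, base_timeout=30, factor=4, attempts=3):
    """Call try_login(timeout) with increasing timeouts.

    Returns the timeout (in seconds) that succeeded, or None if every
    attempt failed. try_login is a hypothetical callable that returns
    True once the login prompt was handled within the given timeout.
    """
    timeout = base_timeout
    for _ in range(attempts):
        if try_login(timeout):
            return timeout
        # A stalled SUT may just be very slow, so wait much longer next time.
        timeout *= factor
    return None


if __name__ == "__main__":
    # Fake SUT that only answers when given at least 100 seconds.
    def slow_sut(timeout):
        return timeout >= 100

    print(login_with_growing_timeout(slow_sut))
```

If all attempts fail, the caller can still fall back to checks that do not need a login, instead of aborting the whole post_fail_hook.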

#27 Updated by okurz 7 months ago

  • Blocks deleted (action #60188: [functional][u] test fails in libqt5_qtbase because "Emoticons --System Settings Module" window)

#28 Updated by SLindoMansilla 3 months ago

  • Status changed from Workable to New
  • Assignee deleted (SLindoMansilla)

For grooming

#29 Updated by SLindoMansilla 2 months ago

  • Description updated (diff)

#30 Updated by szarate 2 months ago

  • Status changed from New to Resolved

The latest occurrences of errors in kontact are no longer related to stalls, so this ticket seems done as far as the acceptance criteria go. However, https://progress.opensuse.org/issues/68794 has been created as a follow-up to address the time wasted during the post_fail_hook.

okurz: if you disagree, please ask via Rocket.Chat before reopening, or remove the [u] tag and pick it up yourself.

#31 Updated by SLindoMansilla about 2 months ago

  • Assignee set to szarate
