action #54488

[opensuse][kde][functional][u] test fails in kontact, then fails to login in post_fail_hook. Enable post_fail_hook for collecting logs

Added by okurz 7 months ago. Updated 1 day ago.

Status: Workable
Start date: 20/07/2019
Priority: High
Due date:
Assignee: SLindoMansilla
% Done: 0%
Category: Enhancement to existing tests
Estimated time: 42.00 hours
Target version: SUSE QA tests - Milestone 31
Difficulty:
Duration:

Description

Observation

openQA test in scenario opensuse-15.1-Argon-Live-x86_64-krypton-live@64bit-2G fails in kontact, then fails to log in during post_fail_hook. The system is probably stalled.

Reproducible

Fails since (at least) Build 1.27

Expected result

Regardless of whether the login succeeds, we should run our "stall detection" and system load checks, e.g. at least magic-sysrq to look for blocked tasks.
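
As an illustration of what "looking for blocked tasks" means, here is a minimal Python sketch. It does not use magic-sysrq; it swaps in a simpler technique, scanning `ps`-style output for processes whose state starts with `D` (uninterruptible sleep). The function name `blocked_tasks` and the sample process list are made up for this example and are not taken from the test suite or from a real job.

```python
def blocked_tasks(ps_output):
    """Return (pid, command) pairs whose STAT field starts with 'D'.

    Expects text shaped like the output of `ps -eo pid,stat,comm`.
    """
    tasks = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue  # ignore malformed lines
        pid, stat, comm = fields
        if stat.startswith("D"):  # D = uninterruptible sleep, usually I/O
            tasks.append((int(pid), comm.strip()))
    return tasks


# Made-up sample data, for illustration only.
SAMPLE = """\
  PID STAT COMMAND
    1 Ss   systemd
  412 D    kworker/u4:2
 1733 Sl   kontact
 1801 D+   baloo_file
"""

if __name__ == "__main__":
    for pid, comm in blocked_tasks(SAMPLE):
        print(f"blocked: pid={pid} comm={comm}")
```

A check like this only works if a shell is reachable, which is exactly what fails in this ticket; magic-sysrq has the advantage of not needing a login.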

Suggestions

  • Move the return if get_var(NOLOGS) from x11test::post_fail_hook to opensusebasetest::post_fail_hook, but keep it below the call to show_tasks_in_blocked_state so that the blocked-task check still runs.
  • Move the call to export_logs to the post_fail_hook inside opensusebasetest
  • Remove x11test::post_fail_hook :)
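
The ordering proposed in the suggestions above can be sketched as follows. This is a Python model of the control flow only, not the real Perl modules: the classes `OpensuseBaseTest` and `X11Test` and the `calls` list are hypothetical stand-ins, and only the method names mirror the ones mentioned in this ticket.

```python
class OpensuseBaseTest:
    """Hypothetical stand-in for opensusebasetest (the real module is Perl)."""

    def __init__(self, settings=None):
        self.settings = settings or {}
        self.calls = []  # records which steps ran, for illustration

    def get_var(self, name):
        return self.settings.get(name)

    def show_tasks_in_blocked_state(self):
        # Stall detection: should not depend on a successful login.
        self.calls.append("show_tasks_in_blocked_state")

    def export_logs(self):
        # Log collection: needs a working console.
        self.calls.append("export_logs")

    def post_fail_hook(self):
        # 1. Always look for blocked tasks first.
        self.show_tasks_in_blocked_state()
        # 2. Only then honour NOLOGS.
        if self.get_var("NOLOGS"):
            return
        # 3. Collect logs in the base class, so every test type gets them.
        self.export_logs()


class X11Test(OpensuseBaseTest):
    """No post_fail_hook override: the base class hook is inherited."""
```

With this layout, `X11Test({"NOLOGS": 1}).post_fail_hook()` still runs the blocked-task check and skips only the log export.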

Further details

Always latest result in this scenario: latest


Related issues

Related to openQA Tests - action #36126: [functional][u] post_fail_hook matches on "text_login_roo... Resolved 14/05/2018
Blocks openQA Tests - action #60188: [functional][u] test fails in libqt5_qtbase because "Emot... New 22/11/2019

History

#1 Updated by okurz 7 months ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: krypton-live
https://openqa.opensuse.org/tests/997651

#2 Updated by okurz 6 months ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: krypton-live
https://openqa.opensuse.org/tests/1010661

#3 Updated by okurz 6 months ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: krypton-live
https://openqa.opensuse.org/tests/1022219

#4 Updated by okurz 5 months ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: krypton-live
https://openqa.opensuse.org/tests/1034012

#5 Updated by okurz 5 months ago

This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: krypton-live
https://openqa.opensuse.org/tests/1045410

To prevent further reminder comments one of the following options should be followed:
1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
2. The openQA job group is moved to "Released"
3. The label in the openQA scenario is removed

#6 Updated by okurz 4 months ago

  • Related to action #36126: [functional][u] post_fail_hook matches on "text_login_root" before actual tty switch and therefore never logs in added

#7 Updated by SLindoMansilla 4 months ago

  • Priority changed from Normal to High

#8 Updated by SLindoMansilla 4 months ago

  • Subject changed from [functional][u] test fails in kontact, then fails to login in post_fail_hook. Probably system is stalled but we should run our "stall detection" and system load checks regardless to [opensuse][kde] test fails in kontact, then fails to login in post_fail_hook. Probably system is stalled but we should run our "stall detection" and system load checks regardless

#9 Updated by zluo 3 months ago

  • Subject changed from [opensuse][kde] test fails in kontact, then fails to login in post_fail_hook. Probably system is stalled but we should run our "stall detection" and system load checks regardless to [opensuse][kde][functional][u] test fails in kontact, then fails to login in post_fail_hook. Probably system is stalled but we should run our "stall detection" and system load checks regardless
  • Status changed from New to In Progress
  • Assignee set to zluo

Actually, stall detection is already in place. I saw this yesterday on o3. Let me check this now.

#10 Updated by zluo 3 months ago

http://f40.suse.de/tests/5532#next_previous shows that 100 test runs had only 1 failure:

http://f40.suse.de/tests/5517#step/kontact/12 kontact could not be started. It is most likely a worker issue or a performance issue.

The same issue can be seen at https://openqa.opensuse.org/tests/1088376#step/kontact/12. At the moment we don't have this issue on o3, so I would say the situation is different now.
Since post_fail_hook isn't called at all, I will enable it for now and check.

#11 Updated by zluo 3 months ago

  • Subject changed from [opensuse][kde][functional][u] test fails in kontact, then fails to login in post_fail_hook. Probably system is stalled but we should run our "stall detection" and system load checks regardless to [opensuse][kde][functional][u] test fails in kontact, then fails to login in post_fail_hook. Enable post_fail_hook for collecting logs

#13 Updated by SLindoMansilla 3 months ago

zluo wrote:

The same issue can be seen at https://openqa.opensuse.org/tests/1088376#step/kontact/12. At the moment we don't have this issue on o3, so I would say the situation is different now.

Since post_fail_hook isn't called at all, I will enable it for now and check.

This link shows that post_fail_hook was called, but it failed to log in. That is exactly the issue that the ticket describes.

#14 Updated by zluo 3 months ago

To discuss with the team:

How can we handle the case where post_fail_hook runs into a stalled SUT?

#15 Updated by szarate 3 months ago

  • Description updated (diff)

I think we can get away with just shifting around the calls in the post_fail_hooks.

#16 Updated by szarate 3 months ago

  • Description updated (diff)

#17 Updated by szarate 3 months ago

  • Blocks action #60188: [functional][u] test fails in libqt5_qtbase because "Emoticons --System Settings Module" window added

#18 Updated by szarate 3 months ago

  • Assignee deleted (zluo)

#19 Updated by szarate 3 months ago

  • Status changed from In Progress to Workable

#20 Updated by szarate 3 months ago

  • Target version set to Milestone 28
  • Estimated time set to 42.00

#21 Updated by mgriessmeier about 1 month ago

  • Target version changed from Milestone 28 to Milestone 31

#22 Updated by SLindoMansilla about 1 month ago

  • Description updated (diff)
  • Assignee set to SLindoMansilla

#23 Updated by SLindoMansilla about 1 month ago

  • Status changed from Workable to In Progress

Merge x11test::post_fail_hook into opensusebasetest: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/9324 (merged)

#24 Updated by SLindoMansilla about 1 month ago

  • Status changed from In Progress to Workable

Waiting for next occurrence in production.

#25 Updated by SLindoMansilla 3 days ago

  • Status changed from Workable to Resolved

post_fail_hook is triggered: https://openqa.opensuse.org/tests/1176518#step/kontact/21

But, the system is stalled: #63355

#26 Updated by okurz 1 day ago

  • Status changed from Resolved to Workable

SLindoMansilla wrote:

post_fail_hook is triggered: https://openqa.opensuse.org/tests/1176518#step/kontact/21


But, the system is stalled: #63355

But this is exactly what the original ticket observation states: #54488#Observation

So maybe you fixed some intermediate problem and are now back at the original problem? It might help to substantially increase the timeout for the initial login in post_fail_hook, or to log in to the log console before any relevant test has a chance to fail.
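
The two ideas in the last comment can be sketched like this. It is a Python model under stated assumptions, not the os-autoinst API: `post_fail_hook`, its parameters, the `"log-console"` name, the 600-second default and the return strings are all hypothetical and chosen only for illustration.

```python
def post_fail_hook(select_console, collect_logs,
                   login_timeout=600, logged_in_early=False):
    """Try to reach a console with a generous timeout, then collect logs.

    select_console: callable(name, timeout=...) that raises TimeoutError
                    when no login is possible (assumed interface).
    collect_logs:   callable() that uploads logs (assumed interface).
    logged_in_early: True if the console was already opened before the
                     first test that may fail, so no login is needed now.
    """
    if not logged_in_early:
        try:
            # Option 1: allow much more time for the first login.
            select_console("log-console", timeout=login_timeout)
        except TimeoutError:
            # The SUT is probably stalled; report instead of failing again.
            return "login_failed"
    # Option 2 (logged_in_early=True) skips the login entirely.
    collect_logs()
    return "logs_collected"
```

Returning a status instead of raising keeps a stalled SUT from hiding the original test failure behind a second failure in the hook.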
