action #54488
closed
[opensuse][kde][functional][u] test fails in kontact, then fails to login in post_fail_hook. Enable post_fail_hook for collecting logs
0%
Description
Observation
openQA test in scenario opensuse-15.1-Argon-Live-x86_64-krypton-live@64bit-2G fails in kontact, then fails to log in in post_fail_hook. The system is probably stalled.
Reproducible
Fails since (at least) Build 1.27
Expected result
Regardless of whether the login succeeds, we should run our "stall detection" and system load checks, e.g. at least magic-sysrq to look for blocked tasks.
Suggestions
- Move the return if get_var(NOLOGS) from x11test::post_fail_hook to opensusebasetest::post_fail_hook, but still under show_tasks_in_blocked_state. DONE
- Move the call to export_logs to the post_fail_hook inside opensusebasetest. DONE
- Remove x11test::post_fail_hook. DONE
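The ordering described in the suggestions can be sketched as below. This is only a model in Python to show the call order; the real hooks are Perl code in os-autoinst-distri-opensuse, and every name here (the `variables` dict, the two callables, `run_hook`) is a stand-in invented for illustration, not the actual implementation.

```python
from typing import Callable, Dict, List


def post_fail_hook(
    variables: Dict[str, str],
    show_tasks_in_blocked_state: Callable[[], None],
    export_logs: Callable[[], None],
) -> None:
    """Model of a single base-class hook with the ordering suggested above."""
    # Stall detection comes first, so it runs even if later steps
    # (which may need a working login) fail.
    show_tasks_in_blocked_state()
    # The NOLOGS early return sits below the stall check.
    # Truthiness is simplified here compared to the real test variables.
    if variables.get("NOLOGS"):
        return
    # Log export, formerly in the x11-specific hook, now lives in the base hook.
    export_logs()


def run_hook(variables: Dict[str, str]) -> List[str]:
    """Run the model hook and return the names of the steps that were called."""
    calls: List[str] = []
    post_fail_hook(
        variables,
        lambda: calls.append("show_tasks_in_blocked_state"),
        lambda: calls.append("export_logs"),
    )
    return calls


if __name__ == "__main__":
    print(run_hook({}))
    print(run_hook({"NOLOGS": "1"}))
```

With this ordering, setting NOLOGS skips only the log export, and a failure during log export cannot prevent the blocked-task check from having run.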
Further details
Always latest result in this scenario: latest
Updated by okurz over 5 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: krypton-live
https://openqa.opensuse.org/tests/997651
Updated by okurz over 5 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: krypton-live
https://openqa.opensuse.org/tests/1010661
Updated by okurz about 5 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: krypton-live
https://openqa.opensuse.org/tests/1022219
Updated by okurz about 5 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: krypton-live
https://openqa.opensuse.org/tests/1034012
Updated by okurz about 5 years ago
This is an autogenerated message for openQA integration by the openqa_review script:
This bug is still referenced in a failing openQA test: krypton-live
https://openqa.opensuse.org/tests/1045410
To prevent further reminder comments one of the following options should be followed:
- The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
- The openQA job group is moved to "Released"
- The label in the openQA scenario is removed
Updated by okurz about 5 years ago
- Related to action #36126: [functional][u] post_fail_hook matches on "text_login_root" before actual tty switch and therefore never logs in added
Updated by SLindoMansilla about 5 years ago
- Priority changed from Normal to High
Updated by SLindoMansilla about 5 years ago
- Subject changed from [functional][u] test fails in kontact, then fails to login in post_fail_hook. Probably system is stalled but we should run our "stall detection" and system load checks regardless to [opensuse][kde] test fails in kontact, then fails to login in post_fail_hook. Probably system is stalled but we should run our "stall detection" and system load checks regardless
Updated by zluo about 5 years ago
- Subject changed from [opensuse][kde] test fails in kontact, then fails to login in post_fail_hook. Probably system is stalled but we should run our "stall detection" and system load checks regardless to [opensuse][kde][functional][u] test fails in kontact, then fails to login in post_fail_hook. Probably system is stalled but we should run our "stall detection" and system load checks regardless
- Status changed from New to In Progress
- Assignee set to zluo
Actually, stall detection is already in place. I saw this yesterday on o3. Let me check this now.
Updated by zluo almost 5 years ago
http://f40.suse.de/tests/5532#next_previous shows that 100 test runs had only 1 failure:
http://f40.suse.de/tests/5517#step/kontact/12 kontact could not be started. This is most likely a worker issue or a performance issue.
The issue can be seen at https://openqa.opensuse.org/tests/1088376#step/kontact/12. At the moment we don't have the issue on o3, so I would say we now have a different situation.
Since post_fail_hook isn't called at all, I will enable it for now and check.
Updated by zluo almost 5 years ago
- Subject changed from [opensuse][kde][functional][u] test fails in kontact, then fails to login in post_fail_hook. Probably system is stalled but we should run our "stall detection" and system load checks regardless to [opensuse][kde][functional][u] test fails in kontact, then fails to login in post_fail_hook. Enable post_fail_hook for collecting logs
Updated by zluo almost 5 years ago
Updated by SLindoMansilla almost 5 years ago
The issue can be seen at https://openqa.opensuse.org/tests/1088376#step/kontact/12. At the moment we don't have the issue on o3, so I would say we now have a different situation.
Since post_fail_hook isn't called at all, I will enable it for now and check.
This link shows that post_fail_hook was called, but it failed to log in. That is exactly the issue that the ticket describes.
Updated by zluo almost 5 years ago
To discuss with the team:
How can we handle the case where post_fail_hook runs into a stalled SUT?
Updated by szarate almost 5 years ago
- Description updated (diff)
I think we can get away with just shifting around the calls to post fail hooks
Updated by szarate almost 5 years ago
- Blocks action #60188: [functional][u] test fails in libqt5_qtbase because "Emoticons --System Settings Module" window added
Updated by szarate almost 5 years ago
- Status changed from In Progress to Workable
Updated by szarate almost 5 years ago
- Target version set to Milestone 28
- Estimated time set to 42.00 h
Updated by mgriessmeier almost 5 years ago
- Target version changed from Milestone 28 to Milestone 31
Updated by SLindoMansilla almost 5 years ago
- Description updated (diff)
- Assignee set to SLindoMansilla
Updated by SLindoMansilla almost 5 years ago
- Status changed from Workable to In Progress
Merge x11test::post_fail_hook into opensusebasetest: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/9324 (merged)
Updated by SLindoMansilla almost 5 years ago
- Status changed from In Progress to Workable
Waiting for next occurrence in production.
Updated by SLindoMansilla almost 5 years ago
- Status changed from Workable to Resolved
post_fail_hook is triggered: https://openqa.opensuse.org/tests/1176518#step/kontact/21
But, the system is stalled: #63355
Updated by okurz almost 5 years ago
- Status changed from Resolved to Workable
SLindoMansilla wrote:
post_fail_hook is triggered: https://openqa.opensuse.org/tests/1176518#step/kontact/21
But, the system is stalled: #63355
But this is exactly what the original ticket observation states: #54488#Observation
So maybe you fixed some intermediate problem and are back to the original problem now? It might help to substantially increase the timeout for the initial login in the post_fail_hook, or to log in to the log console before any relevant test has a chance to fail.
Updated by okurz over 4 years ago
- Blocks deleted (action #60188: [functional][u] test fails in libqt5_qtbase because "Emoticons --System Settings Module" window)
Updated by SLindoMansilla over 4 years ago
- Status changed from Workable to New
- Assignee deleted (SLindoMansilla)
For grooming
Updated by szarate over 4 years ago
- Status changed from New to Resolved
The latest occurrences of errors in kontact are no longer related to stalls, so this ticket seems done as far as the acceptance criteria are concerned. However, https://progress.opensuse.org/issues/68794 has been created as a follow-up to address the time wasted during the post_fail_hook.
@okurz: if you disagree, please ask via Rocket.Chat before reopening, or remove the [u] tag and pick it up yourself