action #65040

closed

coordination #68794: [qe-core][functional][epic] rework postfail hooks

[sle][functional][u] enhance post_fail_hook on OOM condition

Added by zluo almost 5 years ago. Updated over 4 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Enhancement to existing tests
Target version: SUSE QA (private) - Milestone 30
Start date: 2020-03-31
Due date:
% Done: 0%
Estimated time: 42.00 h
Difficulty:
Description

from @okurz:

We already use magic-sysrq-w to find out whether there are blocked tasks, so we could trigger other sysrq commands as well, e.g. magic-sysrq-m to read memory information and parse from it whether there is free memory. It might also be helpful to log into the log console in one of the test setup modules, so that in case of problems we can switch to the already logged-in console and not get stuck at the login prompt. If the system is responsive during post_fail_hook and no workarounds need to be tried, we could also read from the logs whether there was an OOM condition. In the end we should be able to clearly determine, from the information we can gather automatically from the SUT, what is wrong with kontact when it is only partially shown.
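
The idea of parsing the magic-sysrq-m output for free memory could look roughly like the following minimal sketch. It is not taken from the test distribution: `parse_free_kib` and `is_low_on_memory` are hypothetical helper names, the sample text is invented, and the sketch assumes that the dump contains a global `free:<pages>` field and that pages are 4 KiB, both of which can differ between kernels and architectures.

```perl
use strict;
use warnings;

# Assumption: pages are 4 KiB. This is not true on every architecture.
my $PAGE_SIZE_KIB = 4;

# Hypothetical helper: returns the free memory in KiB, or undef if the
# text contains no "free:<pages>" field.
sub parse_free_kib {
    my ($serial_output) = @_;
    # \b avoids matching inside names such as "nr_free:", and the colon
    # right after "free" excludes "free_pcp:" and "free_cma:".
    # Only the first match is used, assumed to be the global counter.
    return undef unless $serial_output =~ /\bfree:(\d+)/;
    return $1 * $PAGE_SIZE_KIB;
}

# Hypothetical helper: true if free memory is below the given threshold.
sub is_low_on_memory {
    my ($serial_output, $threshold_kib) = @_;
    my $free_kib = parse_free_kib($serial_output);
    return defined $free_kib && $free_kib < $threshold_kib;
}

# Invented sample, shaped like a shortened memory dump.
my $sample = <<'EOT';
sysrq: Show Memory
Mem-Info:
active_anon:120034 inactive_anon:4012 isolated_anon:0
 free:2048 free_pcp:12 free_cma:0
EOT

printf "free: %d KiB\n", parse_free_kib($sample);
print "low on memory\n" if is_low_on_memory($sample, 16 * 1024);
```

Keeping the parsing separate from the key sending means it can be checked against captured serial output without a running SUT.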

See #63355 for the assumption that post_fail_hook already failing at the login prompt could be related to OOM or to poor performance.

Let's check this together first.
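
Reading the logs for an OOM condition, as suggested above, might be sketched like this. The pattern table only loosely imitates a list of failure patterns; its field names, the helper `find_oom_conditions` and the sample log lines are all made up for illustration and are not the actual serial failures format. The two regular expressions target kernel messages of the form "invoked oom-killer" and "Out of memory: Kill(ed) process", whose exact wording depends on the kernel version.

```perl
use strict;
use warnings;

# Hypothetical pattern table; the field names are illustrative only.
my @oom_patterns = (
    {
        message => 'OOM killer invoked',
        pattern => qr/invoked oom-killer/,
    },
    {
        message => 'process killed by OOM killer',
        # matches both "Kill process" and "Killed process"
        pattern => qr/Out of memory: Kill(?:ed)? process \d+ \([^)]+\)/,
    },
);

# Hypothetical helper: returns one finding per matching log line.
sub find_oom_conditions {
    my ($log) = @_;
    my @findings;
    for my $line (split /\n/, $log) {
        for my $entry (@oom_patterns) {
            next unless $line =~ $entry->{pattern};
            push @findings, "$entry->{message}: $line";
        }
    }
    return @findings;
}

# Invented sample log lines.
my $sample_log = <<'EOT';
[  512.101] kontact invoked oom-killer: gfp_mask=0x100cca, order=0
[  512.140] Out of memory: Killed process 2311 (kontact) total-vm:2097152kB
[  512.200] systemd[1]: Started Session 3 of user root.
EOT

print "$_\n" for find_oom_conditions($sample_log);
```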

Tasks

  1. Add new show_memory_information sub (similar to show_tasks_in_blocked_state)
  2. Adapt the sub to show memory information properly
  3. Add show_memory_information to the post fail hook in lib/opensusebasetest.pm
  4. Use the serial failures feature to parse and search for an OOM condition derived from this

=head2 show_tasks_in_blocked_state

 show_tasks_in_blocked_state();

Dumps tasks that are in uninterruptible (blocked) state and waits for the
headline of the dump.

See L<https://github.com/torvalds/linux/blob/master/Documentation/admin-guide/sysrq.rst>.

=cut
sub show_tasks_in_blocked_state {
    # sending sysrqs doesn't work for svirt
    if (!check_var('BACKEND', 'svirt')) {
        send_key 'alt-sysrq-w';
        # info will be sent to serial tty
        wait_serial(qr/sysrq\s*:\s+show\s+blocked\s+state/i, 1);
        send_key 'ret';    # ensure clean shell prompt
    }
}
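
For tasks 1 and 2, a sub modelled on show_tasks_in_blocked_state could be sketched as below. This is an assumption about how such a sub might look, not the code that was eventually merged. The three stubs stand in for the openQA test API so that the sketch runs on its own; in a real test library they would come from `use testapi;`. The awaited headline "Show Memory" is also an assumption, mirroring the regex style of the existing sub.

```perl
use strict;
use warnings;

# Stand-ins for the openQA test API (normally provided by `use testapi;`).
my @sent_keys;
my %vars = (BACKEND => 'qemu');
sub check_var   { my ($name, $value) = @_; return ($vars{$name} // '') eq $value }
sub send_key    { my ($key) = @_; push @sent_keys, $key; return }
sub wait_serial { my ($regex, $timeout) = @_; return 1 }

# Hypothetical counterpart to show_tasks_in_blocked_state.
sub show_memory_information {
    # sending sysrqs doesn't work for svirt
    return if check_var('BACKEND', 'svirt');
    send_key('alt-sysrq-m');
    # info will be sent to serial tty; wait for the headline of the dump
    wait_serial(qr/sysrq\s*:\s+show\s+memory/i, 1);
    send_key('ret');    # ensure clean shell prompt
    return;
}

show_memory_information();
print join(', ', @sent_keys), "\n";
```

With the stubs removed, task 3 would amount to calling this sub from the post fail hook in lib/opensusebasetest.pm, next to the existing call for blocked tasks.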

Related issues 2 (0 open, 2 closed)

Related to openQA Tests (public) - action #63355: [opensuse][functional][u] test fails in kontact, kontact summary screen only partitially shown, then post_fail_hook fails to login – OOM? (Resolved, zluo, 2020-02-10)

Related to openQA Tests (public) - action #66607: [functional][u] Execute "SysRq t" when workqueue lockup is detected and publish kernel logs (Resolved, dheidler)

Actions #1

Updated by zluo almost 5 years ago

  • Related to action #63355: [opensuse][functional][u] test fails in kontact, kontact summary screen only partitially shown, then post_fail_hook fails to login – OOM? added
Actions #2

Updated by SLindoMansilla over 4 years ago

  • Category set to Enhancement to existing tests
Actions #3

Updated by SLindoMansilla over 4 years ago

  • Description updated (diff)
  • Status changed from New to Workable
  • Target version set to Milestone 30
  • Estimated time set to 42.00 h
Actions #4

Updated by dheidler over 4 years ago

  • Status changed from Workable to In Progress
  • Assignee set to dheidler
Actions #5

Updated by SLindoMansilla over 4 years ago

  • Related to action #66607: [functional][u] Execute "SysRq t" when workqueue lockup is detected and publish kernel logs added
Actions #6

Updated by dheidler over 4 years ago

  • Status changed from In Progress to Feedback
Actions #7

Updated by szarate over 4 years ago

Along with https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/10960 this can be considered done I guess.

Actions #8

Updated by szarate over 4 years ago

Actions #9

Updated by szarate over 4 years ago

Actions #10

Updated by szarate over 4 years ago

  • Parent task set to #68794
Actions #11

Updated by dheidler over 4 years ago

  • Status changed from Feedback to Resolved