action #49763

[functional][u] test fails in force_scheduled_tasks - enable problem detection post fail hook

Added by szarate over 2 years ago. Updated over 2 years ago.

Status: Resolved
Priority: High
Assignee:
Category: Bugs in existing tests
Target version: -
Start date: 2019-03-27
Due date:
% Done: 0%
Estimated time:
Difficulty:

Description

Observation

openQA test in scenario sle-12-SP5-Server-DVD-aarch64-create_hdd_gnome@aarch64 fails in force_scheduled_tasks

Test suite description

Image creation job used as parent for other jobs that test on top of an existing installation. To be used as START_AFTER_TEST=create_hdd_gnome

Motivation

The mentioned job failed due to bsc#1130701. However, there are no useful logs available, which leads to wasted effort. At the very least we should upload the dmesg log; console_tests_setup at least provides the much-needed logs.

Reproducible

Fails since (at least) Build 0132 (current job)

Expected result

Last good: 0127 (or more recent)

Further details

Always latest result in this scenario: latest


Related issues

Related to openQA Tests - action #50456: [functional][u] gather logs and information from SUT in failed state (Resolved, 2019-04-16)

History

#1 Updated by jorauch over 2 years ago

  • Assignee set to jorauch

#2 Updated by jorauch over 2 years ago

  • Status changed from Workable to In Progress

#3 Updated by jorauch over 2 years ago

  • Status changed from In Progress to Feedback

#4 Updated by SLindoMansilla over 2 years ago

After a second look, I am not sure that the changes will solve the problem.

The logs weren't uploaded because the SUT was in a failed state that is not handled by post_fail_hook: https://openqa.suse.de/tests/2738505#step/force_scheduled_tasks/14

Your verification run is not enough because it didn't fail. You need to provide a failing one, like adding:

assert_script_run('fail here')

force_scheduled_tasks uses consoletest as its base class, which has the following in its post_fail_hook (https://github.com/os-autoinst/os-autoinst-distri-opensuse/blob/master/lib/consoletest.pm#L21):

    my ($self) = shift;
    select_console('log-console');
    $self->SUPER::post_fail_hook;
    $self->remount_tmp_if_ro;
    # Export logs after failure
    assert_script_run("journalctl --no-pager -b 0 > /tmp/full_journal.log");
    upload_journal "/tmp/full_journal.log";
    assert_script_run("dmesg > /tmp/dmesg.log");
    upload_logs "/tmp/dmesg.log";
    # Export extra log after failure for further check gdm issue 1127317, also poo#45236 used for tracking action on Openqa
    script_run("tar -jcv -f /tmp/xorg.tar.bz2  /home/bernhard/.local/share/xorg");
    upload_logs('/tmp/xorg.tar.bz2', failok => 1);
    script_run("tar -jcv -f /tmp/sysconfig.tar.bz2  /etc/sysconfig");
    upload_logs('/tmp/sysconfig.tar.bz2', failok => 1);
    script_run("tar -jcv -f /tmp/gdm.tar.bz2  /home/bernhard/.cache/gdm");
    upload_logs('/tmp/gdm.tar.bz2', failok => 1);

So, your change will not solve the problem. We should revert your change, so that force_scheduled_tasks still uses the same post_fail_hook as consoletest, and add the handling of the blocked state to consoletest so that it applies to all child classes.

#5 Updated by jorauch over 2 years ago

The pfh of the base class is being executed, so there is no need to revert here. When adapting the pfh in the base class we should really replace those single commands.

#6 Updated by jorauch over 2 years ago

Ok, how are we supposed to work around a blocked state in the PFH?

#7 Updated by SLindoMansilla over 2 years ago

jorauch wrote:

The pfh of the base class is being executed, so there is no need to revert here. When adapting the pfh in the base class we should really replace those single commands.

No, because those commands need to be executed in every child.
After your PR you are uploading those logs twice.
Furthermore, since my PR got merged, the log uploading is unified: https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/7177

#8 Updated by SLindoMansilla over 2 years ago

jorauch wrote:

Ok, how are we supposed to work around a blocked state in the PFH?

That is something you have to investigate.

I would run this job in interactive mode and use send_key('ctrl-l') or send_key('ret') to clean the prompt before assert_screen('log-console') (https://openqa.suse.de/tests/2738505#step/force_scheduled_tasks/16), and then continue running the post_fail_hook.

#9 Updated by SLindoMansilla over 2 years ago

Useful information:

https://unix.stackexchange.com/questions/13019/description-of-kernel-printk-values

The kernel ring buffer, ttys and the serial console are three different concepts. Read up on them and you'll realize that there is certainly something you can do. From quick googling I found https://kernelnewbies.org/Linux_Kernel_Tester%27s_Guide_Chapter3 - hint: search (ctrl+f) for "console=" and then take a look at what we pass to the SUT in https://openqa.suse.de/tests/2738505#step/bootloader_uefi/7
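
To follow up on the "console=" hint, here is a minimal shell sketch for listing which consoles a kernel command line requests. The function name list_kernel_consoles and the sample command line are made up for illustration; they are not taken from the job linked above, so the real values have to be read from the bootloader_uefi step or from /proc/cmdline on the SUT.

```shell
#!/bin/sh
# Hypothetical helper: print every console= target found in a kernel
# command line, one per line. Without an argument it reads /proc/cmdline.
list_kernel_consoles() {
    cmdline="${1:-$(cat /proc/cmdline)}"
    # Unquoted on purpose: the command line is split on whitespace.
    for arg in $cmdline; do
        case "$arg" in
            console=*) printf '%s\n' "${arg#console=}" ;;
        esac
    done
}

# Sample command line (an assumption, not what openQA actually passes).
list_kernel_consoles 'root=/dev/vda2 console=ttyAMA0,115200 console=tty quiet'
# prints:
#   ttyAMA0,115200
#   tty
```

On a running system, `cat /proc/sys/kernel/printk` shows the current console log levels described in the stackexchange link above, and `dmesg -n 1` lowers the console log level so that fewer kernel messages are printed to the console. Whether that is enough to unblock the post_fail_hook in this scenario is untested here.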

#10 Updated by SLindoMansilla over 2 years ago

  • Status changed from Feedback to Workable

Removing from "feedback" queue.

#11 Updated by jorauch over 2 years ago

I would not consider this high priority anymore, as we are now running at least a useful PFH. The other problem should be handled in a separate ticket, as it goes far beyond what is requested in this one, imho.

#12 Updated by jorauch over 2 years ago

  • Related to action #50456: [functional][u] gather logs and information from SUT in failed state added

#13 Updated by jorauch over 2 years ago

  • Status changed from Workable to In Progress

Created https://progress.opensuse.org/issues/50456 as a follow-up for the improvements suggested by sergio. Closing this as discussed with mgriessmeier.

#14 Updated by jorauch over 2 years ago

  • Status changed from In Progress to Resolved
