action #47417

closed

[functional][u][svirt-xen] test fails in dracut - no serial output even though command returns 0

Added by dheidler about 5 years ago. Updated almost 5 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
Bugs in existing tests
Target version:
SUSE QA - Milestone 24
Start date:
2019-02-12
Due date:
% Done:

0%

Estimated time:
Difficulty:

Description

Observation

openQA test in scenario extra_tests_in_textmode fails in dracut

Reproducible

Fails since (at least) Build 0112

Expected result

Last good: https://openqa.suse.de/tests/2410341

Further details

Always latest result in this scenario: latest

This looks like a problem with the serial terminal.

Suggestions

  • Check what's the required parameter for serial_terminal to behave correctly for svirt backend
  • Unschedule the test if VIRTIO=1 is not set

Related issues 4 (0 open, 4 closed)

Related to openQA Tests - action #46895: [qam] test fails in dracut - output not matched on 15SP1 | Resolved | pstivanin | 2019-01-31 | 2019-01-31

Blocked by openQA Tests - action #47123: [functional][u][tw] extra_tests_in_textmode fails in zbar | Resolved | pstivanin | 2019-02-04

Copied to openQA Tests - action #48251: [qam] test fails in dracut to match no output including "dracut -f" but the found output looks like what we expect | Resolved | pstivanin | 2019-02-12

Copied to openQA Tests - action #51023: [sle][functional][u] take dracut back in functional tests | Resolved | zluo | 2019-02-12

Actions #1

Updated by okurz about 5 years ago

  • Assignee set to dheidler
  • Priority changed from Normal to Urgent

Please update scenario name and last good

Actions #2

Updated by dheidler about 5 years ago

  • Description updated (diff)
  • Status changed from New to Workable
  • Target version set to Milestone 23
Actions #3

Updated by dheidler about 5 years ago

  • Assignee deleted (dheidler)
Actions #4

Updated by okurz about 5 years ago

As discussed in planning: Please find related tickets, as we assume there are more tickets for "serial terminal" problems. If we find that retriggering is a valid workaround or that the problem is limited to only a few scenarios, then we can reduce the priority to "High" or even "Normal".

Actions #5

Updated by szarate about 5 years ago

  • Status changed from Workable to In Progress
  • Assignee set to szarate

Picking it up as agreed. A quick search didn't turn up the related problems with wait_serial.

Actions #6

Updated by szarate about 5 years ago

  • Description updated (diff)
  • Status changed from In Progress to Feedback
  • Priority changed from Urgent to Normal

Long story short:

https://openqa.suse.de/tests/2450422#step/dracut/6 tries to select the serial terminal but is running on the svirt-xen-hvm backend. The test seems to be intended to work with VIRTIO=1, which I'm not sure the svirt/xen backend supports; serial_terminal has support for this kind of scenario. The same test is passing for x86 scenarios:
https://openqa.suse.de/tests/2450399#step/dracut/26

I will try suggestion #2 later today and will lower the urgency. I don't believe it's urgent anymore, since it passes in other scenarios.

Actions #7

Updated by szarate about 5 years ago

  • Status changed from Feedback to Workable
  • Assignee deleted (szarate)

Setting back to workable, I think it will take a bit longer :)

Actions #8

Updated by szarate about 5 years ago

  • Related to action #46895: [qam] test fails in dracut - output not matched on 15SP1 added
Actions #9

Updated by okurz about 5 years ago

  • Subject changed from [functional][u] test fails in dracut - no serial output even though command returns 0 to [functional][u][svirt-xen] test fails in dracut - no serial output even though command returns 0

Back to "High", which is used as a label for currently failing tests.

Actions #10

Updated by okurz about 5 years ago

  • Priority changed from Normal to High
Actions #11

Updated by okurz about 5 years ago

  • Copied to action #48251: [qam] test fails in dracut to match no output including "dracut -f" but the found output looks like what we expect added
Actions #12

Updated by okurz about 5 years ago

I think the dracut test module was recently moved into a dedicated "extratests" group, so I suggest cross-checking the current fail ratio on production based on previously running jobs to see how severe the issue still is.

Actions #13

Updated by mgriessmeier about 5 years ago

  • Blocked by action #47123: [functional][u][tw] extra_tests_in_textmode fails in zbar added
Actions #14

Updated by mgriessmeier about 5 years ago

  • Status changed from Workable to Blocked
Actions #15

Updated by mgriessmeier about 5 years ago

  • Status changed from Blocked to Workable
  • Target version changed from Milestone 23 to Milestone 24

blocker resolved, moving to M24 as workable

Actions #16

Updated by zluo about 5 years ago

  • Status changed from Workable to In Progress
  • Assignee set to zluo

take over and check.

Actions #17

Updated by zluo about 5 years ago

dracut is disabled at the moment, so I'm trying to check the test on the last build of SLES 12:

http://f40.suse.de/tests/3348#live

Actions #20

Updated by zluo about 5 years ago

It seems we have a problem with the svirt-xen worker on osd. It has been running for 2 hours...

Actions #23

Updated by zluo about 5 years ago

https://openqa.suse.de/tests/2801515 shows the following issue:

[2019-04-12T10:11:30.724 CEST] [debug] output does not pass the code block:

[2019-04-12T10:11:30.828 CEST] [debug] output not validating at /var/lib/openqa/pool/12/os-autoinst-distri-opensuse/tests/console/dracut.pm line 28.

validate_script_output("dracut -f", sub { m/.*Executing: \/usr\/bin\/dracut -f\n|\b(?:Skipping|Including modules done|Including|Creating image|Creating initramfs)\b/ }, 180);

This command doesn't cause any trouble on x86_64,
and dracut --list-modules is not even triggered afterwards:
validate_script_output("dracut --list-modules", sub { m/.*Executing: \/usr\/bin\/dracut --list-modules\n(\w+|\n|-|d+)+/ });

I think the reason is that the SUT is busy (system I/O: busy):
https://openqa.suse.de/tests/2801515#step/dracut/31
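
To experiment with the matcher outside openQA, a rough shell sketch can help. It is not the validate_script_output() implementation: it only approximates the word alternatives from the check with grep -E, leaves out the \b anchors and the "Executing:" branch, and uses a hypothetical helper output_validates with made-up sample text.

```shell
# Simplified ERE version of the word alternatives from the dracut.pm check.
# Assumption: the \b anchors and the "Executing:" branch are left out for brevity.
pattern='Skipping|Including modules done|Including|Creating image|Creating initramfs'

# Hypothetical helper: succeeds when the given text contains one of the words.
output_validates() {
    printf '%s\n' "$1" | grep -Eq "$pattern"
}

# Made-up sample text, not a real dracut log line.
if output_validates "Creating initramfs"; then
    echo "sample text validates"
fi

# An empty capture can never match any of the alternatives.
if ! output_validates ""; then
    echo "empty output does not validate"
fi
```

If the captured output is empty for any reason, the check fails regardless of whether the command itself succeeded, which fits the "command returns 0 but output not validating" symptom.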

Actions #24

Updated by zluo about 5 years ago

https://openqa.suse.de/tests/2802493 shows the same failure as before.

Actions #25

Updated by zluo about 5 years ago

Checked the history of https://progress.opensuse.org/issues/47123.

comment#40

The 'extra_tests_dracut' has been added to all scenarios

--

In main_common.pm we already have:

sub load_extra_tests_dracut {
    loadtest "console/dracut";
}

Actions #26

Updated by zluo about 5 years ago

  • Status changed from In Progress to Feedback

https://openqa.suse.de/tests/2802456 (Maintenance QA only)

According to information from pstivanin, the dracut test module is unstable and breaks other tests, so it got removed from extra_tests.

Let's discuss in a meeting how to handle it.

Actions #27

Updated by zluo about 5 years ago

  • Status changed from Feedback to Workable
  • Priority changed from High to Normal
Actions #28

Updated by zluo about 5 years ago

  • Status changed from Workable to In Progress

Trying to add a post_fail_hook plus force_scheduled_tasks and cleanup_before_shutdown to see if we still have the problem with system I/O: busy.

Actions #30

Updated by zluo about 5 years ago

  • Status changed from In Progress to Workable
  • Assignee deleted (zluo)
  • Priority changed from Normal to High

https://openqa.suse.de/tests/2807621#step/dracut/12 shows the same issue as before. I'm pretty sure that the SUT was not able to execute the command after the initramfs image was created because system I/O becomes busy.

Please let's discuss this ticket and find a good way to solve the issue.

Actions #31

Updated by mgriessmeier about 5 years ago

  • Assignee set to zluo
Actions #32

Updated by zluo about 5 years ago

  • Assignee changed from zluo to michalnowak

Michal, what do you think about this ticket? Do we really need to test dracut on svirt-xen?
dracut is covered by tests from QA Maintenance now.

Actions #33

Updated by michalnowak about 5 years ago

  • Assignee changed from michalnowak to zluo

This is not about Xen, but about the test having a wrong assumption about where the output from dracut goes. dracut writes its output to stderr, but validate_script_output() seems to check stdout (which is empty in this case). The serial terminal seems to do some sort of "magic" to merge stderr and stdout, so it works there.

The easy fix is to run dracut like this: dracut -f 2>&1.
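
The stderr/stdout point can be reproduced in a plain shell. The sketch below assumes nothing about openQA internals; fake_dracut is a hypothetical stand-in that, like dracut as described above, prints its messages to stderr only.

```shell
# Hypothetical stand-in for dracut: writes its progress messages to stderr only.
fake_dracut() {
    echo "Executing: /usr/bin/dracut -f" >&2
    echo "Creating initramfs" >&2
}

# Command substitution captures stdout only, so this variable stays empty.
# stderr is sent to /dev/null here just to keep the terminal quiet.
stdout_only=$(fake_dracut 2>/dev/null)

# With 2>&1, stderr is merged into stdout and ends up in the capture.
merged=$(fake_dracut 2>&1)

printf 'stdout only: [%s]\n' "$stdout_only"
printf 'merged:      [%s]\n' "$merged"
```

Anything that validates only the captured stdout sees an empty string in the first case, even though the command exits with 0.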

Actions #34

Updated by zluo about 5 years ago

  • Status changed from Workable to In Progress

Then trying dracut -f 2>&1 in validate_script_output.

Actions #37

Updated by zluo about 5 years ago

http://f40.suse.de/tests/3368 shows that this change is working fine for extra_tests_in_textmode@64bit

Actions #38

Updated by zluo about 5 years ago

PR created:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/7293

The question is still: should we add dracut back to sle functional extra_test?

Actions #39

Updated by zluo about 5 years ago

  • Status changed from In Progress to Feedback
Actions #40

Updated by SLindoMansilla almost 5 years ago

  • Status changed from Feedback to Workable

PR merged. Waiting for a follow-up ticket. Please add the new ticket as related to this ticket. Feel free to resolve this ticket.

Actions #41

Updated by zluo almost 5 years ago

  • Copied to action #51023: [sle][functional][u] take dracut back in functional tests added
Actions #42

Updated by zluo almost 5 years ago

  • Status changed from Workable to Resolved

follow-up ticket created:
https://progress.opensuse.org/issues/51023

the test itself is working fine:
https://openqa.suse.de/tests/2814438

Setting it to resolved now.
