action #47417

[functional][u][svirt-xen] test fails in dracut - no serial output even though command returns 0

Added by dheidler over 2 years ago. Updated over 2 years ago.

Status: Resolved
Priority: High
Assignee:
Category: Bugs in existing tests
Target version: SUSE QA - Milestone 24
Start date: 2019-02-12
Due date:
% Done: 0%
Estimated time:
Difficulty:

Description

Observation

openQA test in scenario extra_tests_in_textmode fails in dracut

Reproducible

Fails since (at least) Build 0112

Expected result

Last good: https://openqa.suse.de/tests/2410341

Further details

Always latest result in this scenario: latest

This looks like a problem with the serial terminal.

Suggestions

  • Check what's the required parameter for serial_terminal to behave correctly for svirt backend
  • Unschedule the test if VIRTIO=1 is not set

Related issues

Related to openQA Tests - action #46895: [qam] test fails in dracut - output not matched on 15SP1 (Resolved, 2019-01-31, 2019-01-31)

Blocked by openQA Tests - action #47123: [functional][u][tw] extra_tests_in_textmode fails in zbar (Resolved, 2019-02-04)

Copied to openQA Tests - action #48251: [qam] test fails in dracut to match no output including "dracut -f" but the found output looks like what we expect (Resolved, 2019-02-12)

Copied to openQA Tests - action #51023: [sle][functional][u] take dracut back in functional tests (Resolved, 2019-02-12)

History

#1 Updated by okurz over 2 years ago

  • Assignee set to dheidler
  • Priority changed from Normal to Urgent

Please update scenario name and last good

#2 Updated by dheidler over 2 years ago

  • Description updated (diff)
  • Status changed from New to Workable
  • Target version set to Milestone 23

#3 Updated by dheidler over 2 years ago

  • Assignee deleted (dheidler)

#4 Updated by okurz over 2 years ago

As discussed in planning: Please find related tickets, as we assume there are more tickets for "serial terminal" problems. If we find that retriggering is a valid workaround or that the problem is limited to only a few scenarios, then we can reduce the priority to "High" or even "Normal".

#5 Updated by szarate over 2 years ago

  • Status changed from Workable to In Progress
  • Assignee set to szarate

Picking it up as agreed... a quick search didn't turn up the related problems with wait_serial.

#6 Updated by szarate over 2 years ago

  • Description updated (diff)
  • Status changed from In Progress to Feedback
  • Priority changed from Urgent to Normal

Long story short:

https://openqa.suse.de/tests/2450422#step/dracut/6 tries to select the serial terminal but is running on the svirt-xen-hvm backend. However, the test seems to be intended to work with VIRTIO=1, which I'm not sure the svirt/xen backend supports; serial_terminal has support for this kind of scenario. The same test is passing for x86 scenarios:
https://openqa.suse.de/tests/2450399#step/dracut/26

I will try suggestion #2 later today and will lower the urgency. I don't believe it's urgent anymore, since it passes on other scenarios.

#7 Updated by szarate over 2 years ago

  • Status changed from Feedback to Workable
  • Assignee deleted (szarate)

Setting back to workable, I think it will take a bit longer :)

#8 Updated by szarate over 2 years ago

  • Related to action #46895: [qam] test fails in dracut - output not matched on 15SP1 added

#9 Updated by okurz over 2 years ago

  • Subject changed from [functional][u] test fails in dracut - no serial output even though command returns 0 to [functional][u][svirt-xen] test fails in dracut - no serial output even though command returns 0

back to "High" as used as a label for currently failing tests

#10 Updated by okurz over 2 years ago

  • Priority changed from Normal to High

#11 Updated by okurz over 2 years ago

  • Copied to action #48251: [qam] test fails in dracut to match no output including "dracut -f" but the found output looks like what we expect added

#12 Updated by okurz over 2 years ago

I think the dracut test module was recently moved into a dedicated "extratests" group, so I suggest crosschecking the current fail ratio on production based on previously running jobs to see how severe the issue still is.

#13 Updated by mgriessmeier over 2 years ago

  • Blocked by action #47123: [functional][u][tw] extra_tests_in_textmode fails in zbar added

#14 Updated by mgriessmeier over 2 years ago

  • Status changed from Workable to Blocked

#15 Updated by mgriessmeier over 2 years ago

  • Status changed from Blocked to Workable
  • Target version changed from Milestone 23 to Milestone 24

blocker resolved, moving to M24 as workable

#16 Updated by zluo over 2 years ago

  • Status changed from Workable to In Progress
  • Assignee set to zluo

take over and check.

#17 Updated by zluo over 2 years ago

dracut is disabled at the moment, so I'm checking the test on the last build of SLES 12:

http://f40.suse.de/tests/3348#live

#18 Updated by zluo over 2 years ago

#20 Updated by zluo over 2 years ago

It seems we have a problem with the svirt-xen worker on osd. It has been running for 2 hours...

#23 Updated by zluo over 2 years ago

https://openqa.suse.de/tests/2801515 shows the following issue:

[2019-04-12T10:11:30.724 CEST] [debug] output does not pass the code block:

[2019-04-12T10:11:30.828 CEST] [debug] output not validating at /var/lib/openqa/pool/12/os-autoinst-distri-opensuse/tests/console/dracut.pm line 28.

validate_script_output("dracut -f", sub { m/.*Executing: \/usr\/bin\/dracut -f\n|\b(?:Skipping|Including modules done|Including|Creating image|Creating initramfs)\b/ }, 180);

This command doesn't cause any trouble on x86_64,
and dracut --list-modules is not even triggered afterwards:
validate_script_output("dracut --list-modules", sub { m/.*Executing: \/usr\/bin\/dracut --list-modules\n(\w+|\n|-|d+)+/ });

I think the reason is that the SUT is busy (system I/O: busy):
https://openqa.suse.de/tests/2801515#step/dracut/31
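
To make the failing check above more concrete, here is a small shell sketch of the keyword part of that pattern. It is only an approximation: `grep -E` has no `\b` or `(?:...)`, so the whole-word alternation is expressed with `grep -Ew` instead, and the sample strings are made up for illustration rather than taken from a real dracut run. The helper name `validates` is hypothetical and is not an openQA function.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the text contains, as whole words, one of
# the keywords from the alternation in the dracut.pm pattern quoted above.
validates() {
    printf '%s\n' "$1" | grep -Eqw 'Skipping|Including modules done|Including|Creating image|Creating initramfs'
}

# Made-up sample text that contains one of the keywords.
if validates "Creating initramfs"; then with_keyword=pass; else with_keyword=fail; fi

# An empty captured output cannot match any keyword.
if validates ""; then empty_output=pass; else empty_output=fail; fi

echo "with keyword: $with_keyword"
echo "empty output: $empty_output"
```

If the captured text is empty for any reason, a check of this shape fails no matter how well the command itself ran, which would be consistent with "output not validating" alongside a return code of 0.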

#24 Updated by zluo over 2 years ago

https://openqa.suse.de/tests/2802493 shows same failure as before.

#25 Updated by zluo over 2 years ago

checked the history of https://progress.opensuse.org/issues/47123.

comment#40

The 'extra_tests_dracut' has been added to all scenario

--

In main_common.pm we have already:

sub load_extra_tests_dracut {
    loadtest "console/dracut";
}

#26 Updated by zluo over 2 years ago

  • Status changed from In Progress to Feedback

https://openqa.suse.de/tests/2802456 (Maintenance QA only)

According to information from pstivanin, the dracut test module is unstable and breaks other tests, so it got removed from extra_tests.

Let's discuss in a meeting how we want to handle this.

#27 Updated by zluo over 2 years ago

  • Status changed from Feedback to Workable
  • Priority changed from High to Normal

#28 Updated by zluo over 2 years ago

  • Status changed from Workable to In Progress

Trying to add a post_fail_hook, plus force_scheduled_tasks and cleanup_before_shutdown, to see whether we still have the problem with system I/O: busy.

#30 Updated by zluo over 2 years ago

  • Status changed from In Progress to Workable
  • Assignee deleted (zluo)
  • Priority changed from Normal to High

https://openqa.suse.de/tests/2807621#step/dracut/12 shows the same issue as before. And I'm pretty sure that the SUT was not able to execute the command after the initramfs image was created because system I/O became busy.

Please let's discuss this ticket and find a good way to solve the issue.

#31 Updated by mgriessmeier over 2 years ago

  • Assignee set to zluo

#32 Updated by zluo over 2 years ago

  • Assignee changed from zluo to michalnowak

Michal, what do you think about this ticket? Do we really need to test dracut on svirt-xen?
dracut is covered in tests from QA Maintenance now.

#33 Updated by michalnowak over 2 years ago

  • Assignee changed from michalnowak to zluo

This is not about Xen, but about the test having a wrong assumption about where the output from dracut goes. dracut writes its output to stderr, but validate_script_output() seems to check stdout (which is empty in this case). The serial terminal seems to do some sort of "magic" to merge stderr and stdout, so it works there.

The easy fix is to run dracut like this: dracut -f 2>&1.
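
The effect can be reproduced without dracut at all. The following is a minimal shell sketch, assuming a stand-in function `emit_on_stderr` (hypothetical, not part of dracut or openQA) that prints only to stderr, the way dracut is described above. It shows that command substitution captures stdout only, unless stderr is redirected with `2>&1`.

```shell
#!/bin/sh
# Hypothetical stand-in for a tool that, like dracut as described above,
# writes its messages to stderr instead of stdout.
emit_on_stderr() {
    echo "Creating initramfs" >&2
}

# Command substitution captures stdout only, so this variable stays empty.
# (2>/dev/null merely keeps the stderr message off the terminal.)
stdout_only=$(emit_on_stderr 2>/dev/null)

# With 2>&1, stderr is duplicated onto stdout and is captured as well.
merged=$(emit_on_stderr 2>&1)

printf 'stdout only: [%s]\n' "$stdout_only"
printf 'merged:      [%s]\n' "$merged"
```

Anything that inspects only the captured stdout sees an empty string in the first case, even though the command succeeded.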

#34 Updated by zluo over 2 years ago

  • Status changed from Workable to In Progress

Then trying dracut -f 2>&1 in validate_script_output.

#36 Updated by zluo over 2 years ago

#37 Updated by zluo over 2 years ago

http://f40.suse.de/tests/3368 shows that this change is working fine for extra_tests_in_textmode@64bit

#38 Updated by zluo over 2 years ago

PR created:
https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/7293

The question is still: should we add dracut back to sle functional extra_test?

#39 Updated by zluo over 2 years ago

  • Status changed from In Progress to Feedback

#40 Updated by SLindoMansilla over 2 years ago

  • Status changed from Feedback to Workable

PR merged. Waiting for the follow-up ticket. Please add the new ticket as related to this ticket. Feel free to resolve this ticket.

#41 Updated by zluo over 2 years ago

  • Copied to action #51023: [sle][functional][u] take dracut back in functional tests added

#42 Updated by zluo over 2 years ago

  • Status changed from Workable to Resolved

follow-up ticket created:
https://progress.opensuse.org/issues/51023

the test itself is working fine:
https://openqa.suse.de/tests/2814438

Setting it to resolved now.
