action #35877

[functional][u] Find out in post-fail-hook if system is I/O-busy

Added by okurz almost 2 years ago. Updated about 1 year ago.

Status: Resolved
Start date: 03/05/2018
Priority: Normal
Due date:
Assignee: zluo
% Done: 0%
Category: Enhancement to existing tests
Target version: QA - future
Difficulty:
Duration:

Description

Motivation

Tests often fail because "something might be slow", but it is usually unclear whether the bottleneck is CPU or I/O. To answer this, we should check for I/O load in the post_fail_hooks.

Acceptance criteria

  • AC1: generic post_fail_hook in os-autoinst-distri-opensuse checks for I/O load on failures as one of the first steps

Suggestion

  • sed -n 's/^.*sda / /p' /proc/diskstats | cut -d' ' -f10 prints "0" if no I/O is pending on the first disk and a higher number if I/O is pending. The value is the kernel's count of I/Os currently in progress for that disk. Use this or something comparable as one of the very first steps in a generic post_fail_hook, e.g. the one in lib/opensusebasetest.pm
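To illustrate which column the suggested pipeline picks, here is a minimal sketch that runs it against a sample file instead of the live /proc/diskstats. The function name `pending_io` and the sample numbers are made up for illustration; only the sed/cut pipeline comes from the suggestion above.

```shell
#!/bin/sh
# pending_io DEVICE [FILE] (hypothetical helper): print the "I/Os currently
# in progress" counter for DEVICE, read from FILE (default: /proc/diskstats).
pending_io() {
    dev=$1
    stats=${2:-/proc/diskstats}
    # sed drops "major minor name" and leaves a leading space, so cut's
    # field 1 is empty and field 10 is the 9th statistic of the line.
    sed -n "s/^.*$dev / /p" "$stats" | cut -d' ' -f10
}

# Sample data with invented numbers in the layout of /proc/diskstats.
sample=$(mktemp)
cat > "$sample" <<'EOF'
   8       0 sda 4520 1210 301234 2890 1850 990 45678 3120 0 4100 6010
   8       1 sda1 4300 1200 298000 2800 1800 980 45000 3000 0 4000 5900
EOF

pending_io sda "$sample"   # prints 0 for this sample: nothing in flight
rm -f "$sample"
```

The partition line (sda1) is not printed because the pattern requires a space directly after "sda".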

Alternatives

  • sudo iotop --batch --iter=3 --processes --quiet --only, but this requires the "iotop" package, which IMHO is not available everywhere

References


Related issues

Related to openQA Tests - action #30805: [functional][opensuse][leap][medium][u] first test after ... Resolved 25/01/2018 14/08/2018
Blocks openQA Tests - action #43376: [functional][u] Adapt opensusebasetest to provide dmesg a... New 05/11/2018

History

#1 Updated by okurz almost 2 years ago

  • Description updated (diff)

#2 Updated by jorauch almost 2 years ago

  • Related to action #30805: [functional][opensuse][leap][medium][u] first test after reboot fails in krunner, potential system overload (was: test fails in inkscape - typing too fast?) added

#3 Updated by mgriessmeier almost 2 years ago

  • Due date deleted (19/06/2018)

Bulk removing Due Date

#4 Updated by okurz almost 2 years ago

  • Target version changed from Milestone 17 to future

#5 Updated by okurz almost 2 years ago

  • Target version changed from future to future

#6 Updated by okurz over 1 year ago

  • Status changed from New to Workable

#7 Updated by okurz over 1 year ago

  • Blocks action #43376: [functional][u] Adapt opensusebasetest to provide dmesg and journal log added

#8 Updated by okurz over 1 year ago

  • Description updated (diff)

#9 Updated by zluo over 1 year ago

  • Status changed from Workable to In Progress
  • Assignee set to zluo

take over

#10 Updated by zluo over 1 year ago

http://e13.suse.de/tests/11492#step/scc_registration/124 shows the I/O status after post_fail_hook was triggered.

#11 Updated by zluo over 1 year ago

sed -n 's/^.*da / /p' /proc/diskstats | cut -d' ' -f10 

This should cover sda and vda. Testing now.
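As a quick check of the broader pattern, the sketch below runs it against sample data containing both an sda and a vda disk. The wrapper name `inflight_per_disk` and the numbers are invented for illustration. Note that the output then has one line per matching disk, and that device names not ending in "da" (for example nvme0n1) would not match.

```shell
#!/bin/sh
# inflight_per_disk [FILE] (hypothetical helper): print the in-flight I/O
# counter for every whole disk whose name ends in "da" (sda, vda, ...).
inflight_per_disk() {
    sed -n 's/^.*da / /p' "${1:-/proc/diskstats}" | cut -d' ' -f10
}

# Sample data with invented numbers in the layout of /proc/diskstats.
sample=$(mktemp)
cat > "$sample" <<'EOF'
   8       0 sda 4520 1210 301234 2890 1850 990 45678 3120 0 4100 6010
 253       0 vda 910 12 40321 377 2203 1450 88112 9120 3 7800 9500
 253       1 vda1 880 12 39000 360 2100 1400 87000 9000 3 7700 9400
EOF

# Prints two lines for this sample: 0 (sda) and 3 (vda); vda1 is skipped.
inflight_per_disk "$sample"
rm -f "$sample"
```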

#13 Updated by zluo over 1 year ago

  • Status changed from In Progress to Feedback

Waiting for the PR to be merged.

#14 Updated by zluo over 1 year ago

need to provide new verification test run since I re-installed openQA...

#15 Updated by zluo over 1 year ago

  • Status changed from Feedback to In Progress

working on updating PR.

#16 Updated by zluo over 1 year ago

  • Status changed from In Progress to Feedback

http://f40.suse.de/tests/48#step/logs_from_installation_system/25 shows a successful test run for ipmi.
PR got updated.

#17 Updated by zluo over 1 year ago

  • Status changed from Feedback to In Progress

need to investigate why log-console does not work and then update PR.

#18 Updated by zluo over 1 year ago

The change in opensusebasetest.pm now looks like this:

if ($self->{in_wait_boot}) {
    record_info('shutdown', 'At least we reached target Shutdown') if (wait_serial 'Reached target Shutdown');
}
elsif ($self->{in_boot_desktop}) {
    record_info('Startup', 'At least Startup is finished.') if (wait_serial 'Startup finished');
}
# Find out in post-fail-hook if system is I/O-busy, poo#35877
else {
    select_console 'log-console';
    # One output line per matching disk (sda, vda); field 10 is the number of I/Os in flight
    my $io_status = script_output("sed -n 's/.*da / /p' /proc/diskstats | cut -d' ' -f10");
    # Any non-zero digit means at least one disk is busy (a value such as 10 must count as busy)
    record_info('System I/O status:', ($io_status =~ /[1-9]/) ? 'busy' : 'idle');
}

http://f40.suse.de/tests/66#step/yast2_lan_restart/54 shows expected results

Need more test scenarios...
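For trying out the idle/busy decision outside of openQA, the following is a minimal shell sketch, not the code from the PR. It assumes the counters arrive one per line (one line per matching disk) and reports "busy" as soon as any of them is non-zero. The function name `io_state` is hypothetical.

```shell
#!/bin/sh
# io_state (hypothetical helper): read in-flight I/O counters from stdin,
# one per line, and print "busy" if any counter is non-zero, else "idle".
io_state() {
    # "|| [ -n "$n" ]" also handles a last line without trailing newline.
    while read -r n || [ -n "$n" ]; do
        case $n in
            ''|0) ;;                       # empty line or zero: disk is idle
            *) echo busy; return 0 ;;      # any other value: I/O is pending
        esac
    done
    echo idle
}

# Usage on a live system, with the pipeline from this ticket:
# sed -n 's/.*da / /p' /proc/diskstats | cut -d' ' -f10 | io_state
printf '0\n10\n' | io_state   # prints busy: the second disk has 10 I/Os in flight
```

Comparing each line as a whole avoids classifying a value such as 10 as idle just because it ends in 0.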

#19 Updated by zluo over 1 year ago

http://f40.suse.de/tests/67#step/bootloader/10 shows that my part of post_fail_hook is not executed, which is correct.

#20 Updated by zluo over 1 year ago

http://localhost/tests/68#step/boot_from_pxe/30 shows another example which doesn't trigger my part of post_fail_hook.

#22 Updated by zluo over 1 year ago

http://f40.suse.de/tests/72#step/rabbitmq/20 shows correct results for opensuse tests.

#23 Updated by zluo over 1 year ago

PR updated now.

#24 Updated by zluo over 1 year ago

  • Status changed from In Progress to Feedback

#25 Updated by zluo about 1 year ago

  • Status changed from Feedback to Resolved
