action #35877

closed

[functional][u] Find out in post-fail-hook if system is I/O-busy

Added by okurz almost 6 years ago. Updated about 5 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Enhancement to existing tests
Target version:
Start date: 2018-05-03
Due date:
% Done: 0%
Estimated time:
Difficulty:

Description

Motivation

Tests often fail because "something might be slow", but it is usually unclear whether the bottleneck is CPU or I/O. To answer this, we should check for I/O load in post_fail_hooks.

Acceptance criteria

  • AC1: generic post_fail_hook in os-autoinst-distri-opensuse checks for I/O load on failures as one of the first steps

Suggestion

  • sed -n 's/^.*sda / /p' /proc/diskstats | cut -d' ' -f10 prints the number of I/Os currently in progress for the first disk (sda): "0" if no I/O is pending and a higher number if I/O is pending. Use this or something comparable as one of the very first steps in a generic post_fail_hook, e.g. the one in lib/opensusebasetest.pm
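As a comparable variant, the same counter can be read by column position with awk. This is only a sketch: the helper name `inflight_io`, its arguments, and the sample values are made up for illustration. It relies on the /proc/diskstats layout in which the device name is column 3 and the "I/Os currently in progress" counter is column 12.

```shell
#!/bin/sh
# Print the "I/Os currently in progress" counter for one block device.
# /proc/diskstats columns: major, minor, name, then the statistics fields;
# the in-flight counter is the 9th statistic, i.e. column 12 overall.
# `inflight_io` and its arguments are hypothetical names for this sketch.
inflight_io() {
    dev="$1"
    stats="${2:-/proc/diskstats}"
    awk -v dev="$dev" '$3 == dev { print $12 }' "$stats"
}

# Sample data in diskstats layout (values made up for illustration).
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
   8       0 sda 1200 30 45000 900 800 20 32000 700 0 1500 1600
   8       1 sda1 600 10 20000 400 300 5 12000 200 0 600 600
 253       0 vda 5000 100 90000 4000 7000 300 88000 9000 12 8000 13000
EOF

inflight_io sda "$sample"   # prints 0
inflight_io vda "$sample"   # prints 12
```

Matching the device name exactly avoids picking up partitions and makes it explicit which disk is being checked.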

Alternatives

  • sudo iotop --batch --iter=3 --processes --quiet --only would also work, but it requires the "iotop" package, which IMHO is not available everywhere
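Since iotop may be missing, a hook could first test whether the tool exists and otherwise fall back to /proc/diskstats. The following is a minimal sketch of that decision only; `io_probe_method` is a hypothetical helper name and not part of any existing test library.

```shell
#!/bin/sh
# Pick an I/O probe depending on what is installed.
# `io_probe_method` is a hypothetical helper name for this sketch.
io_probe_method() {
    tool="${1:-iotop}"
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool"
    else
        # Fall back to reading /proc/diskstats, which needs no extra package.
        echo diskstats
    fi
}

io_probe_method   # prints "iotop" if it is installed, "diskstats" otherwise
```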

References


Related issues 2 (0 open, 2 closed)

Related to openQA Tests - action #30805: [functional][opensuse][leap][medium][u] first test after reboot fails in krunner, potential system overload (was: test fails in inkscape - typing too fast?), Resolved, okurz, 2018-01-25, 2018-08-14

Blocks openQA Tests - action #43376: [functional][u] Adapt opensusebasetest to provide dmesg and journal log, Resolved, SLindoMansilla, 2018-11-05

Actions #1

Updated by okurz almost 6 years ago

  • Description updated (diff)
Actions #2

Updated by jorauch almost 6 years ago

  • Related to action #30805: [functional][opensuse][leap][medium][u] first test after reboot fails in krunner, potential system overload (was: test fails in inkscape - typing too fast?) added
Actions #3

Updated by mgriessmeier almost 6 years ago

  • Due date deleted (2018-06-19)

Bulk removing Due Date

Actions #4

Updated by okurz almost 6 years ago

  • Target version changed from Milestone 17 to future
Actions #5

Updated by okurz almost 6 years ago

  • Target version changed from future to future
Actions #6

Updated by okurz over 5 years ago

  • Status changed from New to Workable
Actions #7

Updated by okurz over 5 years ago

  • Blocks action #43376: [functional][u] Adapt opensusebasetest to provide dmesg and journal log added
Actions #8

Updated by okurz over 5 years ago

  • Description updated (diff)
Actions #9

Updated by zluo over 5 years ago

  • Status changed from Workable to In Progress
  • Assignee set to zluo

Taking over.

Actions #10

Updated by zluo over 5 years ago

http://e13.suse.de/tests/11492#step/scc_registration/124 shows the I/O status after the post_fail_hook was triggered.

Actions #11

Updated by zluo over 5 years ago

sed -n 's/^.*da / /p' /proc/diskstats | cut -d' ' -f10 

should cover both sda and vda. Testing now.
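The effect of the broader pattern can be checked offline against sample data. The sketch below runs the same pipeline on a temporary file instead of /proc/diskstats; the function name `io_pending` and all sample values are made up for illustration.

```shell
#!/bin/sh
# Sample lines in /proc/diskstats layout (values made up for illustration).
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
   8       0 sda 1200 30 45000 900 800 20 32000 700 0 1500 1600
   8       1 sda1 600 10 20000 400 300 5 12000 200 3 600 600
 253       0 vda 5000 100 90000 4000 7000 300 88000 9000 12 8000 13000
 253       1 vda1 100 0 800 50 0 0 0 0 7 50 50
EOF

# Same pipeline as in the comment, reading the given file instead of
# /proc/diskstats. Only whole-disk names ending in "da" match "da ";
# partitions such as sda1 or vda1 do not.
io_pending() {
    sed -n 's/^.*da / /p' "$1" | cut -d' ' -f10
}

io_pending "$sample"   # prints 0 and 12, one value per matching disk
```

If more than one disk name ends in "da", the pipeline prints one line per disk, so whatever consumes the output has to handle several values.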

Actions #13

Updated by zluo over 5 years ago

  • Status changed from In Progress to Feedback

Waiting for the PR to be merged.

Actions #14

Updated by zluo over 5 years ago

I need to provide a new verification test run since I re-installed openQA.

Actions #15

Updated by zluo over 5 years ago

  • Status changed from Feedback to In Progress

Working on updating the PR.

Actions #16

Updated by zluo over 5 years ago

  • Status changed from In Progress to Feedback

http://f40.suse.de/tests/48#step/logs_from_installation_system/25 shows a successful test run for IPMI.
The PR has been updated.

Actions #17

Updated by zluo over 5 years ago

  • Status changed from Feedback to In Progress

I need to investigate why log-console does not work and then update the PR.

Actions #18

Updated by zluo over 5 years ago

The change in opensusebasetest.pm now looks like this:

if ($self->{in_wait_boot}) {
    record_info('shutdown', 'At least we reached target Shutdown') if (wait_serial 'Reached target Shutdown');
}
elsif ($self->{in_boot_desktop}) {
    record_info('Startup', 'At least Startup is finished.') if (wait_serial 'Startup finished');
}
# Find out in post-fail-hook if system is I/O-busy, poo#35877
else {
    select_console 'log-console';
    # After the sed substitution, field 10 is the number of I/Os currently in progress
    my $io_status = script_output("sed -n 's/.*da / /p' /proc/diskstats | cut -d' ' -f10");
    # Busy if any reported counter is non-zero (a count such as 10 must not be treated as idle)
    record_info('System I/O status:', ($io_status =~ /[1-9]/) ? 'busy' : 'idle');
}
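The idle/busy decision itself can be tried outside openQA. This is a minimal shell sketch under the assumption that the counters arrive one per line; `io_state` is a hypothetical helper name. It treats the system as busy if any counter is non-zero, because matching only a trailing 0 would also accept counts such as 10 or 20.

```shell
#!/bin/sh
# Classify in-flight I/O counters read from stdin, one value per line.
# `io_state` is a hypothetical helper name for this sketch.
io_state() {
    # grep -q succeeds if any line contains a digit other than 0,
    # so "10" counts as busy while "0" and empty input count as idle.
    if grep -q '[1-9]'; then
        echo busy
    else
        echo idle
    fi
}

printf '0\n'    | io_state   # prints idle
printf '10\n'   | io_state   # prints busy
printf '0\n3\n' | io_state   # prints busy
```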

http://f40.suse.de/tests/66#step/yast2_lan_restart/54 shows the expected results.

More test scenarios are needed.

Actions #19

Updated by zluo over 5 years ago

http://f40.suse.de/tests/67#step/bootloader/10 shows that my part of the post_fail_hook is not executed, which is correct.

Actions #20

Updated by zluo over 5 years ago

http://localhost/tests/68#step/boot_from_pxe/30 shows another example which doesn't trigger my part of the post_fail_hook.

Actions #22

Updated by zluo over 5 years ago

http://f40.suse.de/tests/72#step/rabbitmq/20 shows correct results for openSUSE tests.

Actions #23

Updated by zluo over 5 years ago

The PR is updated now.

Actions #24

Updated by zluo over 5 years ago

  • Status changed from In Progress to Feedback
Actions #25

Updated by zluo about 5 years ago

  • Status changed from Feedback to Resolved