action #35877
closed [functional][u] Find out in post-fail-hook if system is I/O-busy
0%
Description
Motivation
Tests often fail because "something might be slow", but it is unclear whether the bottleneck is CPU or I/O. To answer this, we should check for I/O load in post_fail_hooks.
Acceptance criteria
- AC1: generic post_fail_hook in os-autoinst-distri-opensuse checks for I/O load on failures as one of the first steps
Suggestion
sed -n 's/^.*sda / /p' /proc/diskstats | cut -d' ' -f10
prints "0" if no I/O is pending on the first disk and a higher number if I/O is pending. The selected field is the "I/Os currently in progress" counter of /proc/diskstats. Use this or something comparable as one of the very first steps in a generic post_fail_hook, e.g. the one in lib/opensusebasetest.pm
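To make the field selection explicit, here is a minimal sketch that reads the same counter with awk instead of sed and cut. The helper name `inflight_io` and its optional file argument are assumptions for illustration, not part of os-autoinst-distri-opensuse.

```shell
#!/bin/sh
# inflight_io DEVICE [STATS_FILE]  (hypothetical helper, for illustration only)
# Prints the number of I/O requests currently in flight for DEVICE.
inflight_io() {
    dev=$1
    stats=${2:-/proc/diskstats}
    # Columns in /proc/diskstats: 1 major, 2 minor, 3 device name, 4+ counters.
    # Column 12 is the 9th counter: I/Os currently in progress.
    # The exact match on column 3 keeps partitions such as sda1 out of the result.
    awk -v dev="$dev" '$3 == dev { print $12 }' "$stats"
}
```

Calling `inflight_io sda` on a Linux system should print the same value as the sed/cut pipeline above; the second argument only exists so the helper can be tried against a saved copy of the file.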
Alternatives
sudo iotop --batch --iter=3 --processes --quiet --only
but this requires the "iotop" package to be installed, which IMHO is not available everywhere
References
Updated by jorauch almost 7 years ago
- Related to action #30805: [functional][opensuse][leap][medium][u] first test after reboot fails in krunner, potential system overload (was: test fails in inkscape - typing too fast?) added
Updated by okurz over 6 years ago
- Target version changed from Milestone 17 to future
Updated by okurz about 6 years ago
- Blocks action #43376: [functional][u] Adapt opensusebasetest to provide dmesg and journal log added
Updated by zluo about 6 years ago
- Status changed from Workable to In Progress
- Assignee set to zluo
Taking over.
Updated by zluo about 6 years ago
http://e13.suse.de/tests/11492#step/scc_registration/124 shows I/O status after post_fail_hook triggered.
Updated by zluo about 6 years ago
sed -n 's/^.*da / /p' /proc/diskstats | cut -d' ' -f10
should cover both sda and vda. Testing now.
Updated by zluo about 6 years ago
Updated by zluo about 6 years ago
- Status changed from In Progress to Feedback
Waiting for the PR to be merged.
Updated by zluo about 6 years ago
I need to provide a new verification test run since I re-installed openQA...
Updated by zluo about 6 years ago
- Status changed from Feedback to In Progress
Working on updating the PR.
Updated by zluo about 6 years ago
- Status changed from In Progress to Feedback
http://f40.suse.de/tests/48#step/logs_from_installation_system/25 shows a successful test run for ipmi.
The PR has been updated.
Updated by zluo about 6 years ago
- Status changed from Feedback to In Progress
Need to investigate why log-console does not work and then update the PR.
Updated by zluo about 6 years ago
The change in opensusebasetest.pm now looks like this:
if ($self->{in_wait_boot}) {
    record_info('shutdown', 'At least we reached target Shutdown') if (wait_serial 'Reached target Shutdown');
}
elsif ($self->{in_boot_desktop}) {
    record_info('Startup', 'At least Startup is finished.') if (wait_serial 'Startup finished');
}
# Find out in post-fail-hook if system is I/O-busy, poo#35877
else {
    select_console 'log-console';
    # After the sed rewrite, field 10 is the number of I/Os currently in progress
    my $io_status = script_output("sed -n 's/.*da / /p' /proc/diskstats | cut -d' ' -f10");
    # Anchor the match so only an exact "0" counts as idle (a value like "10" is busy)
    record_info('System I/O status:', ($io_status =~ /^0$/) ? 'idle' : 'busy');
}
http://f40.suse.de/tests/66#step/yast2_lan_restart/54 shows the expected results.
More test scenarios are needed...
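For trying the idle/busy decision outside of openQA, here is a minimal shell sketch of the same classification. It assumes that "busy" means any whole disk whose name ends in "da" reports a non-zero in-flight count; the helper name `io_state` and its optional file argument are hypothetical. Comparing numerically means a value such as 10 counts as busy, and several matching disks (e.g. sda and vda together) are handled in one pass.

```shell
#!/bin/sh
# io_state [STATS_FILE]  (hypothetical helper, for illustration only)
# Prints "busy" if any disk named *da has I/O in flight, otherwise "idle".
io_state() {
    stats=${1:-/proc/diskstats}
    # Column 3 is the device name, column 12 the I/Os currently in progress.
    # Partitions such as sda1 do not end in "da" and are therefore skipped.
    awk '$3 ~ /da$/ && $12 + 0 > 0 { busy = 1 }
         END { print (busy ? "busy" : "idle") }' "$stats"
}
```

Running `io_state` without arguments reads /proc/diskstats of the current system; passing a file path allows checking a diskstats snapshot saved from a failed test.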
Updated by zluo about 6 years ago
http://f40.suse.de/tests/67#step/bootloader/10 shows that my part of post_fail_hook is not executed, which is correct.
Updated by zluo about 6 years ago
http://localhost/tests/68#step/boot_from_pxe/30 shows another example that doesn't trigger my part of post_fail_hook.
Updated by zluo about 6 years ago
http://f40.suse.de/tests/70#step/gnote_first_run/18 shows correct behavior.
Updated by zluo about 6 years ago
http://f40.suse.de/tests/72#step/rabbitmq/20 shows correct results for opensuse tests.
Updated by zluo almost 6 years ago
- Status changed from Feedback to Resolved
Resolved now.
See example:
https://openqa.opensuse.org/tests/856186#step/java/26