action #12410
Updated by okurz about 8 years ago
## observation

https://openqa.suse.de/tests/448619

dasdfmt seems to be done but the wait_serial exited, maybe it was taking unusually long?

## steps to reproduce

TBC

## problem

Happened sporadically during the last week.

* H1. worker/s390-host specific
* H2. can happen everywhere
* H3. our recent changes in bootloader_s390 introduced some behaviour change
* H4. serial output gets lost
  * H4.1. REJECTED: output to serial gets lost randomly -> 1000/1000 runs of `assert_script_run("echo $_", 10, 'failed');` succeeded, see http://opeth.suse.de/tests/2825
  * H4.2. REJECTED: long timeouts cause serial output loss -> 10/10 runs of `assert_script_run("sleep 900 && echo $_", 1200, 'failed');` succeeded, see http://opeth.suse.de/tests/2825 and http://opeth.suse.de/tests/2866
  * H4.3. UNCLEAR: serial output only gets lost when dasdfmt is called with assert_script_run -> not reproducible at all on lord.arch, so E3-1 and E4-1 may be invalid

## suggestion

* <del>check logfiles, e.g. for exact timing sequence</del> -> wait_serial times out after 20 minutes on both occasions. From the video we can see that the actual formatting process had already finished before that
* E1-1. DONE: reproduce by calling dasdfmt repeatedly on another host (my host (okurz)) -> done, could not reproduce in http://lord.arch/tests/1582 in 17/17 runs of full dasdfmt on personal instance
* E2-1. find out if the problem only occurs on some hosts or a single host
* E3-1. @mgriessmeier: find an old test run from before we deployed the new backend that shows this error
* E4-1. DONE: @mgriessmeier: find an s390x host with a small disk (to save time) and format many times, i.e. call a for-loop with the assert_script_run on dasdfmt -> could not reproduce

## workaround

Sporadic, so restart.
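
## sketch of the repeat loop

For reference, the repeated runs behind H4.1, H4.2 and E4-1 could be driven by a small loop like the one below. This is only a sketch and not the code that was actually run: `repeat_check` is a hypothetical helper (it is not part of os-autoinst), and the stand-in check that always succeeds is there only so the snippet runs outside an openQA job. It assumes that `assert_script_run` dies on failure or timeout, which is why the call is wrapped in `eval`.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of os-autoinst): call $check->($i) for
# $i = 1..$runs and return the iterations where it died or returned false.
sub repeat_check {
    my ($runs, $check) = @_;
    my @failed;
    for my $i (1 .. $runs) {
        # eval catches a die from the check, so one failure does not abort the loop
        my $ok = eval { $check->($i) };
        push @failed, $i unless $ok;
    }
    return @failed;
}

# Inside an openQA test module the check could wrap the real call, e.g.
#   sub { assert_script_run("echo $_[0]", 10, 'failed'); 1 }
# A stand-in that always succeeds keeps this sketch runnable on its own.
my $runs   = 1000;
my @failed = repeat_check($runs, sub { 1 });
printf "%d/%d runs succeeded\n", $runs - scalar(@failed), $runs;
```

Collecting the failed iteration numbers instead of stopping at the first failure gives the "n/m runs succeeded" figures quoted in the hypotheses and shows which iterations to look up in the serial log.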