openQA Project - action #121774 (In Progress): LTP cgroup test appears to crash OpenQA worker ins... | https://progress.opensuse.org/issues/121774 | 2022-12-09T13:36:47Z | MDoucha (martin.doucha@suse.com)
<p>The LTP test cgroup_fj_stress_blkio_4_4_each on the latest SLE-15SP1 KOTD kernel appears to crash the openQA worker instance it is running on. The test itself succeeds, but the openQA job then stays stuck in <code>wait_serial()</code> for several hours, despite the 90-second timeout, until the whole job fails on MAX_JOB_TIME. There are 3 examples so far:<br>
<a href="https://openqa.suse.de/tests/10089424#step/cgroup_fj_stress_blkio_4_4_each/7" class="external">https://openqa.suse.de/tests/10089424#step/cgroup_fj_stress_blkio_4_4_each/7</a><br>
<a href="https://openqa.suse.de/tests/10111009#step/cgroup_fj_stress_blkio_4_4_each/7" class="external">https://openqa.suse.de/tests/10111009#step/cgroup_fj_stress_blkio_4_4_each/7</a><br>
<a href="https://openqa.suse.de/tests/10113099#step/cgroup_fj_stress_blkio_4_4_each/7" class="external">https://openqa.suse.de/tests/10113099#step/cgroup_fj_stress_blkio_4_4_each/7</a></p>
<p>I've seen this issue only on SLE-15SP1 KOTD builds 156 and 157. I have not seen any cases on other SLE versions.</p>
<p>Typical autoinst-log.txt entries related to the timeout:</p>
<pre><code>[2022-12-06T08:52:27.432374+01:00] [debug] &lt;&lt;&lt; testapi::script_run(cmd="vmstat -w", output="", quiet=undef, timeout=30, die_on_timeout=1)
[2022-12-06T08:52:27.432549+01:00] [debug] tests/kernel/run_ltp.pm:334 called testapi::script_run
[2022-12-06T08:52:27.432710+01:00] [debug] &lt;&lt;&lt; testapi::wait_serial(record_output=undef, regexp="# ", quiet=undef, no_regex=1, buffer_size=undef, expect_not_found=0, timeout=90)
[2022-12-06T10:39:58.278597+01:00] [debug] autotest received signal TERM, saving results of current test before exiting
[2022-12-06T10:39:58.278622+01:00] [debug] isotovideo received signal TERM
[2022-12-06T10:39:58.278748+01:00] [debug] backend got TERM
</code></pre>
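<p>To spot the same pattern in other jobs, the gap between a <code>wait_serial</code> call and the next log entry can be compared against the call's own timeout. The following is only a rough sketch: <code>find_overrun_waits</code> and <code>SAMPLE_LOG</code> are hypothetical names, not part of openQA or os-autoinst, and the line format is assumed from the excerpt above. For that excerpt, the gap is about 6450 seconds (roughly 1 h 47 min) against a timeout of 90 seconds.</p>

```python
import re
from datetime import datetime

# Assumed line format, taken from the excerpt: "[ISO timestamp] [level] message"
LOG_LINE = re.compile(r"^\[(?P<ts>[^\]]+)\] \[(?P<level>\w+)\] (?P<msg>.*)$")
TIMEOUT = re.compile(r"\btimeout=(\d+)")

# Excerpt from autoinst-log.txt quoted in this report.
SAMPLE_LOG = """\
[2022-12-06T08:52:27.432374+01:00] [debug] <<< testapi::script_run(cmd="vmstat -w", output="", quiet=undef, timeout=30, die_on_timeout=1)
[2022-12-06T08:52:27.432549+01:00] [debug] tests/kernel/run_ltp.pm:334 called testapi::script_run
[2022-12-06T08:52:27.432710+01:00] [debug] <<< testapi::wait_serial(record_output=undef, regexp="# ", quiet=undef, no_regex=1, buffer_size=undef, expect_not_found=0, timeout=90)
[2022-12-06T10:39:58.278597+01:00] [debug] autotest received signal TERM, saving results of current test before exiting
"""


def find_overrun_waits(lines):
    """Return (message, timeout, elapsed_seconds) for every wait_serial call
    whose following log entry arrives later than the call's timeout."""
    entries = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:  # skip anything that does not look like a log entry
            entries.append((datetime.fromisoformat(match["ts"]), match["msg"]))

    overruns = []
    # Pair each entry with the one that follows it.
    for (ts, msg), (next_ts, _) in zip(entries, entries[1:]):
        if "testapi::wait_serial(" not in msg:
            continue
        timeout_match = TIMEOUT.search(msg)
        if timeout_match is None:
            continue
        timeout = int(timeout_match.group(1))
        elapsed = (next_ts - ts).total_seconds()
        if elapsed > timeout:
            overruns.append((msg, timeout, elapsed))
    return overruns


if __name__ == "__main__":
    for msg, timeout, elapsed in find_overrun_waits(SAMPLE_LOG.splitlines()):
        print(f"timeout={timeout}s but next entry after {elapsed:.0f}s: {msg[:40]}")
```

<p>This only measures silence in the log after the call; it cannot tell whether the worker, the backend, or the SUT is the part that stopped responding.</p>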
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/10091628" class="external">4.12.14-150100.156.1.gb6c27ee</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=Server-DVD-Incidents-Kernel-KOTD&machine=64bit&test=ltp_controllers&version=15-SP1" class="external">latest</a></p>
<a name="Steps-to-reproduce"></a>
<h2 >Steps to reproduce<a href="#Steps-to-reproduce" class="wiki-anchor">¶</a></h2>
<ol>
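<li>Run the <code>ltp_controllers</code> test suite on the latest SLE-15SP1 KOTD.</li>
<li>Wait until the job reaches cgroup_fj_stress_blkio_4_4_each. The job then hangs in <code>wait_serial()</code> until it fails on MAX_JOB_TIME.</li>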
</ol>