openSUSE Project Management Tool: Issues https://progress.opensuse.org/ 2021-04-08T10:40:57Z openSUSE Project Management Tool Redmine
openQA Infrastructure - action #90857 (Resolved): Add redundancy for SAP multi machines tests - E... https://progress.opensuse.org/issues/90857 2021-04-08T10:40:57Z jadamek julien.adamek@suse.com
<p>The openQA QEM review reported an issue with our SAP HANA tests executed on the Maintenance TestRepo.<br>
We need either more resources or to run the tests on more of the existing machines.</p>
<p>First, let me summarize the situation:<br>
The timing is currently very tight because the Maintenance TestRepo is triggered twice a day.<br>
That means 2 x 6 OS versions to test (12-SP3 to 15-SP2), with one HANA test per OS version.<br>
Each run must complete before the next build, otherwise its jobs are tagged obsolete.<br>
One HANA test requires 49 GB RAM: 2 x 24 GB (HANA machines) + 1 GB for the support server machine.</p>
<p>For these tests we only use openqaworker8 (sap_sle12) and openqaworker9 (sap_sle15). We set it up this way to limit the memory usage of the openQA instance (<a href="https://progress.opensuse.org/issues/73246" class="external">https://progress.opensuse.org/issues/73246</a>).<br>
As a result, the HANA tests run serially for SLE12 as well as for SLE15.</p>
<p>For instance:<br>
The HANA test starts for 15 GA on openqaworker9 and lasts an hour and a half. Once it is done, the HANA test on 15 SP1 starts, and so on.<br>
Since we have 3 different SLE15 versions (GA, SP1, SP2), the tests take 4 hours and a half for SLE15 alone.<br>
For SLE12, the HANA test lasts one hour, so with 3 different SLE12 versions (SP3, SP4, SP5) the tests take 3 hours. 12-SP2 was removed recently.</p>
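<p>The serial timing above can be checked with a short sketch. The per-test durations and version lists come from the description; the function and variable names are only illustrative.</p>

```python
# Durations in hours, as stated in the description above.
SLE15_VERSIONS = ["15-GA", "15-SP1", "15-SP2"]
SLE12_VERSIONS = ["12-SP3", "12-SP4", "12-SP5"]


def serial_duration(versions, hours_per_test):
    """Total wall-clock time when the tests run one after another on one worker."""
    return len(versions) * hours_per_test


sle15_hours = serial_duration(SLE15_VERSIONS, 1.5)  # openqaworker9
sle12_hours = serial_duration(SLE12_VERSIONS, 1.0)  # openqaworker8

print(f"SLE15 on openqaworker9: {sle15_hours} h")
print(f"SLE12 on openqaworker8: {sle12_hours} h")
```

<p>This covers a single TestRepo trigger; with two triggers a day, each worker has to finish its whole series before the next build arrives.</p>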
<p>Besides that, both workers are also used for Maintenance incident tests, and we cannot know in advance how much capacity is needed there.</p>
<p>I agree the current setup isn't redundant at all: if one of the workers is down, the tests cannot be executed elsewhere.<br>
To speed up the tests, we could add memory to both workers (at least 64 GB per worker, not less, because the jobs of a multi-machine test are linked together).</p>
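<p>A rough sketch of why a smaller memory upgrade would not help: a multi-machine HANA test only fits if all of its jobs fit together. The figures are the ones quoted in this ticket; the helper names are hypothetical, and the calculation assumes the added memory is fully available to the tests, ignoring host overhead.</p>

```python
# Memory figures from the description; everything else is illustrative.
HANA_NODE_GB = 24
HANA_NODES = 2
SUPPORT_SERVER_GB = 1


def ram_per_test_gb():
    """RAM needed by one complete multi-machine HANA test."""
    return HANA_NODES * HANA_NODE_GB + SUPPORT_SERVER_GB


def extra_parallel_tests(added_gb):
    """Whole tests that fit in the added memory.

    Partial tests do not count, because the jobs of one test are linked
    and must all run at the same time.
    """
    return added_gb // ram_per_test_gb()


print(ram_per_test_gb())          # RAM per test in GB
print(extra_parallel_tests(32))   # a 32 GB upgrade
print(extra_parallel_tests(64))   # a 64 GB upgrade
```

<p>Under these assumptions a 32 GB upgrade adds no parallel capacity, while 64 GB leaves room for one additional HANA test per worker.</p>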
openQA Project - action #54464 (Rejected): qemu-img convert failed https://progress.opensuse.org/issues/54464 2019-07-19T10:14:45Z jadamek julien.adamek@suse.com
<p>I hit the issue below on two workers (openqaworker6 and openqaworker7):</p>
<p>[2019-07-18T16:24:32.567 CEST] [debug] qemu-img: Could not open '/var/lib/openqa/pool/11/raid/hd0-overlay8': Could not open backing file: Could not open backing file: Could not open backing file: Could not open backing file: Could not open backing file: Could not open backing file: Could not open backing file: Could not open backing file: Could not open backing file: Could not open '/var/lib/openqa/pool/11/sle-12-SP4-x86_64-ha-alpha-alpha-node01.qcow2': No such file or directory</p>
<p>This is not the first time I have seen this message; perhaps it's a known issue.</p>
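<p>When triaging messages like the one above, it can help to pull out the image that was opened and the file that is actually missing at the end of the backing chain. The following is a minimal sketch that only parses the error text; the function name is hypothetical, the sample message is a shortened version of the log line with just two nesting levels, and reading the number of repeated "backing file" errors as the overlay depth is an assumption.</p>

```python
import re


def parse_qemu_img_error(message):
    """Extract the opened image, the missing file and the nesting depth."""
    # Quoted paths appear after "Could not open '...'": the first is the
    # image qemu-img was asked to open, the last is the file that failed.
    paths = re.findall(r"Could not open '([^']+)'", message)
    depth = message.count("Could not open backing file")
    return {
        "image": paths[0] if paths else None,
        "missing": paths[-1] if paths else None,
        "backing_depth": depth,
    }


# Shortened sample based on the log line above (two nesting levels only).
sample = (
    "qemu-img: Could not open '/var/lib/openqa/pool/11/raid/hd0-overlay8': "
    "Could not open backing file: Could not open backing file: "
    "Could not open '/var/lib/openqa/pool/11/"
    "sle-12-SP4-x86_64-ha-alpha-alpha-node01.qcow2': "
    "No such file or directory"
)

result = parse_qemu_img_error(sample)
print(result["image"])
print(result["missing"])
print(result["backing_depth"])
```

<p>On a worker, <code>qemu-img info --backing-chain</code> on the overlay is the more direct way to inspect the chain, provided the files still exist.</p>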
<p>I know that other poo tickets are open for openqaworker7 (poo#54074 and poo#49694), but I'm not aware of any for openqaworker6.</p>
<p>List of failed tests:<br>
worker 6 <a href="https://openqa.suse.de/tests/3090100" class="external">https://openqa.suse.de/tests/3090100</a> <br>
worker 6 <a href="https://openqa.suse.de/tests/3088928" class="external">https://openqa.suse.de/tests/3088928</a><br>
worker 7 <a href="https://openqa.suse.de/tests/3087514" class="external">https://openqa.suse.de/tests/3087514</a></p>
<p>The test was re-triggered with a POST command and passed on openqaworker9.<br>
worker 9 <a href="https://openqa.suse.de/tests/3092394" class="external">https://openqa.suse.de/tests/3092394</a></p>
<p>I also see poo#54128, but it looks like a different issue.</p>