openSUSE Project Management Tool: Issueshttps://progress.opensuse.org/https://progress.opensuse.org/themes/openSUSE/favicon/favicon.ico?15829177842020-06-05T10:41:39ZopenSUSE Project Management Tool
Redmine openQA Tests - action #67768 (Resolved): [kernel][hpc] SLE15-SP2 GMC HPC exploratory checkshttps://progress.opensuse.org/issues/677682020-06-05T10:41:39Zsebchladsebastian.chlad@suse.com
<p>This should commence with GMC build</p>
openQA Tests - action #66907 (Rejected): Multimachine test fails in setup for ARM workershttps://progress.opensuse.org/issues/669072020-05-15T11:41:55Zsebchladsebastian.chlad@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>With the RC2 build for SLE15SP2 we again hit the known ARM multi-machine (MM) problems. These are recurring issues and require the usual attention from the QA tools team or people with access to the OSD machines.<br>
As this is the RC2 build validation day, I'm opening this ticket to perhaps get some traction on finding long-term solutions.</p>
<p>openQA test in scenario sle-15-SP2-Online-aarch64-hpc_DELTA_slurm_accounting_supportserver@aarch64 fails in<br>
<a href="https://openqa.suse.de/tests/4236631/modules/setup/steps/72" class="external">setup</a></p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<p>Slurm accounting tests with db configured and NFS shared folder provided. 1 ctl, multiple compute nodes. Maintainer: schlad</p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/4236631" class="external">194.1</a> (current job)</p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/4230231" class="external">191.1</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=aarch64&distri=sle&flavor=Online&machine=aarch64&test=hpc_DELTA_slurm_accounting_supportserver&version=15-SP2" class="external">latest</a></p>
openQA Tests - action #64039 (Resolved): test fails in slurm_masterhttps://progress.opensuse.org/issues/640392020-03-02T08:45:40Zsebchladsebastian.chlad@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-15-SP2-Online-x86_64-hpc_DELTA_slurm_master_accounting@64bit fails in<br>
<a href="https://openqa.suse.de/tests/3939585/modules/slurm_master/steps/313" class="external">slurm_master</a></p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<p>Slurm accounting tests with db configured and NFS shared folder provided. 1 ctl, multiple compute nodes. Maintainer: schlad</p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/3936274" class="external">146.1</a></p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/3922312" class="external">143.1</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=Online&machine=64bit&test=hpc_DELTA_slurm_master_accounting&version=15-SP2" class="external">latest</a></p>
<p>Longer Story:<br>
<a href="https://bugzilla.suse.com/show_bug.cgi?id=1165151" class="external">https://bugzilla.suse.com/show_bug.cgi?id=1165151</a><br>
<a href="https://jira.suse.com/browse/SLE-9356" class="external">https://jira.suse.com/browse/SLE-9356</a></p>
<p>One would assume that better tests for the DB should be in place; however, HPC tests are not really meant to do that. HPC tests are there to check the HPC components.</p>
<p>I will try to encourage Marita/QA SLE Pro Man to ensure better functional tests of the DB.</p>
openQA Tests - action #62858 (Resolved): [kernel][hpc] test fails in slurm_masterhttps://progress.opensuse.org/issues/628582020-01-31T12:44:09Zsebchladsebastian.chlad@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-15-SP2-Online-x86_64-hpc_ALPHA_slurm_master@64bit fails in<br>
<a href="https://openqa.suse.de/tests/3849070/modules/slurm_master/steps/89" class="external">slurm_master</a></p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<p>Simple HPC cluster with 1 slurmctl and multiple compute nodes. Maintainer: <a href="mailto:schlad@suse.de">schlad@suse.de</a>, <a href="mailto:pcervinka@suse.com">pcervinka@suse.com</a></p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/3791496" class="external">122.1</a></p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/3785815" class="external">122.1</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=Online&machine=64bit&test=hpc_ALPHA_slurm_master&version=15-SP2" class="external">latest</a></p>
openQA Tests - action #62645 (Resolved): [hpc] test hpchttps://progress.opensuse.org/issues/626452020-01-24T13:52:24Zsebchladsebastian.chlad@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>HPC test should work.</p>
<p>fix hpc test lib</p>
openQA Tests - action #62153 (Resolved): [hpc] test fails in slurm_masterhttps://progress.opensuse.org/issues/621532020-01-15T08:33:08Zsebchladsebastian.chlad@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-15-SP2-Online-aarch64-hpc_GAMMA_slurm_master_db@aarch64 fails in<br>
<a href="https://openqa.suse.de/tests/3787688/modules/slurm_master/steps/14" class="external">slurm_master</a></p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<p>Slurm accounting tests with db configured and NFS shared folder provided. 2 ctls, multiple compute nodes. Maintainer: schlad</p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/3758774" class="external">120.1</a></p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/3729513" class="external">105.4</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=aarch64&distri=sle&flavor=Online&machine=aarch64&test=hpc_GAMMA_slurm_master_db&version=15-SP2" class="external">latest</a></p>
openQA Tests - action #60731 (Resolved): [hpc] test fails in before_testhttps://progress.opensuse.org/issues/607312019-12-05T13:21:36Zsebchladsebastian.chlad@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-15-SP2-Online-aarch64-hpc_ALPHA_mpich_mpi_master@aarch64 fails in<br>
<a href="https://openqa.suse.de/tests/3666210/modules/before_test/steps/53" class="external">before_test</a></p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<p>Basic tests of mpich with CPU count=1. Maintainer: schlad <a href="mailto:schlad@suse.de">schlad@suse.de</a></p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/3618154" class="external">92.1</a></p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: (unknown) (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=aarch64&distri=sle&flavor=Online&machine=aarch64&test=hpc_ALPHA_mpich_mpi_master&version=15-SP2" class="external">latest</a></p>
openQA Tests - action #60728 (Resolved): [hpc] test fails in setuphttps://progress.opensuse.org/issues/607282019-12-05T13:11:43Zsebchladsebastian.chlad@suse.com
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-15-SP2-Online-x86_64-hpc_BETA_slurm_adv_supportserver@64bit fails in<br>
<a href="https://openqa.suse.de/tests/3663349/modules/setup/steps/40" class="external">setup</a></p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<p>HPC cluster with 2 slurm ctls and multiple compute nodes. NFS folder mounted for slurmctls, so the failover can be tested<br>
Maintainer: <a href="mailto:schlad@suse.de">schlad@suse.de</a></p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/3614243" class="external">92.1</a></p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: (unknown) (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=Online&machine=64bit&test=hpc_BETA_slurm_adv_supportserver&version=15-SP2" class="external">latest</a></p>
openQA Project - action #52517 (Rejected): Sorting out tests: Visualize parallel job dependencies...https://progress.opensuse.org/issues/525172019-06-03T15:14:01Zsebchladsebastian.chlad@suse.com
<p>The current openQA webUI seems to sort elements (tests) alphabetically.<br>
For instance: <a href="https://openqa.suse.de/tests/overview?distri=sle&version=15-SP1&build=228.2&groupid=130" class="external">https://openqa.suse.de/tests/overview?distri=sle&version=15-SP1&build=228.2&groupid=130</a></p>
<p>In the meantime, Marius implemented the feature of displaying dependencies within a test run.<br>
<a href="https://openqa.suse.de/tests/2923320#dependencies" class="external">https://openqa.suse.de/tests/2923320#dependencies</a></p>
<p>I'm introducing more tests for cluster testing, so there are multiple VMs (cluster nodes) that are displayed as separate 'tests', while in fact they are VMs constituting a single multi-node test run.<br>
The current webUI does not provide a good way to display such tests so that people can easily recognize them as belonging together.</p>
<p>I wonder if there is a quick way to take advantage of Marius' work and render the job group page for a given build in a way that groups certain tests together.</p>
<p>Of course this could be achieved by proper naming, but this still seems like a workaround and not a proper solution. </p>
<p>I wonder if people see a similar need and if there are any ideas on how to go about this use case.</p>
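<p>As a rough illustration of the grouping idea, the sketch below clusters jobs that are connected through parallel dependencies so they could be rendered together. It is only a sketch under assumptions: the function name <code>group_parallel_jobs</code> and the flat list of <code>(job_a, job_b)</code> pairs are hypothetical and do not reflect the actual openQA data model or webUI code.</p>

```python
from collections import defaultdict


def group_parallel_jobs(job_ids, parallel_edges):
    """Group job ids into clusters connected by parallel dependencies.

    job_ids: iterable of job ids shown on the overview page.
    parallel_edges: iterable of (job_a, job_b) pairs; a hypothetical,
    flattened form of the dependency data, not the real openQA schema.
    """
    parent = {job: job for job in job_ids}

    def find(job):
        # Jobs that only appear in an edge are registered on first use.
        parent.setdefault(job, job)
        # Walk up to the representative, shortening the path on the way.
        while parent[job] != job:
            parent[job] = parent[parent[job]]
            job = parent[job]
        return job

    for job_a, job_b in parallel_edges:
        root_a, root_b = find(job_a), find(job_b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(list)
    for job in list(parent):
        clusters[find(job)].append(job)
    # Sort members and clusters so the rendering order is stable.
    return sorted(sorted(members) for members in clusters.values())
```

<p>Each returned list would correspond to one multi-node test run; jobs without parallel dependencies end up in single-element lists and could be shown as they are today.</p>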
openQA Tests - action #52256 (Resolved): [hpc][kernel] Prepare support_server NFS serverhttps://progress.opensuse.org/issues/522562019-05-28T15:23:00Zsebchladsebastian.chlad@suse.com
<p>Some HPC tests require an NFS server so that the two nodes can share one directory.</p>
<p>Currently I use a custom test run() for this, but implementing it in support_server (as there is a flag for it) would be desirable.</p>
openQA Project - action #51932 (Resolved): Trimming of white spaces on "machine definition"https://progress.opensuse.org/issues/519322019-05-23T14:22:56Zsebchladsebastian.chlad@suse.com
<p>It appears that leading whitespace characters in machine definitions are not trimmed, which might result in fairly obscure problems, for instance with the scheduler picking up the jobs.</p>
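<p>A minimal sketch of the kind of normalization meant here, assuming the machine definition is available as a simple key/value mapping. The function name <code>normalize_machine_settings</code> and the sample keys are illustrative only; this is not how openQA stores or validates machines.</p>

```python
def normalize_machine_settings(settings):
    """Return a copy with surrounding whitespace stripped.

    Keys are always stripped; values are stripped only when they are
    strings, so numeric or boolean settings pass through unchanged.
    """
    cleaned = {}
    for key, value in settings.items():
        if isinstance(value, str):
            value = value.strip()
        cleaned[key.strip()] = value
    return cleaned
```

<p>Applied before saving, a value such as <code>" qemu_x86_64"</code> would be stored as <code>"qemu_x86_64"</code>, so a later exact string comparison would not fail because of an invisible leading space.</p>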
openQA Infrastructure - action #41189 (Resolved): [tools][monitoring] Worker 'reachable' notifica...https://progress.opensuse.org/issues/411892018-09-18T11:20:56Zsebchladsebastian.chlad@suse.com
<p>As we have an initial Grafana setup to monitor the state of machines/workers, we need to start sending email notifications from the existing Grafana monitoring.</p>
<p>See: <a href="http://docs.grafana.org/alerting/notifications/" class="external">http://docs.grafana.org/alerting/notifications/</a></p>
<p>Requirements:</p>
<ul>
<li>email notifications should be delivered to an open mailing list</li>
<li>QAM and QASLE (Marita and Heiko) are subscribed to those notifications</li>
</ul>
openQA Project - action #32593 (Rejected): Multiple ttySx consoles for qemuhttps://progress.opensuse.org/issues/325932018-03-01T11:56:43Zsebchladsebastian.chlad@suse.com
<p>For some testing it could be useful to be able to create additional ttyS devices.</p>
<p>Would it be useful to have another variable, say MSERIALDEV, where the test writer could specify how many ttyS devices he or she needs?</p>
<p>Or would it be better to rework SERIALDEV so that it can start more than one ttyS device?</p>
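<p>To make the first option more concrete: QEMU accepts the <code>-serial</code> option more than once, attaching one emulated serial port per occurrence. The sketch below turns a requested count (what the proposed MSERIALDEV might hold) into such arguments. The function name, the log file naming, and the idea of feeding it from MSERIALDEV are assumptions for illustration, not existing backend behaviour.</p>

```python
def serial_args(count, log_prefix="serial"):
    """Build QEMU arguments for `count` emulated serial ports.

    Each port is redirected to its own host file through a repeated
    -serial option; a count of zero yields no arguments.
    """
    args = []
    for index in range(count):
        # QEMU assigns serial ports in the order the options appear.
        args += ["-serial", f"file:{log_prefix}{index}.log"]
    return args
```

<p>With a count of 2 this yields two <code>-serial</code> options; how the resulting ports are named inside the guest depends on the architecture and machine type.</p>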
openQA Tests - action #12288 (Resolved): [test review] snapper_undochange fails to create first s...https://progress.opensuse.org/issues/122882016-06-10T11:59:00Zsebchladsebastian.chlad@suse.com
<a name="observation"></a>
<h2 >observation<a href="#observation" class="wiki-anchor">¶</a></h2>
<p>The test fails for no apparent reason, which indicates that it might be caused by openQA performance limitations.</p>
<a name="reproducible"></a>
<h2 >reproducible<a href="#reproducible" class="wiki-anchor">¶</a></h2>
<p>occurred at least twice: </p>
<a name="problem"></a>
<h2 >problem<a href="#problem" class="wiki-anchor">¶</a></h2>
<p>H1. test timeout of 30 seconds is too strict to ensure a snapshot can be created under all circumstances<br>
H2. a snapshot sometimes cannot be created at all, waiting for longer would not help<br>
H3. ppc64le specific</p>
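<p>One way to tell H1 from H2 would be to wait with a generous timeout and record how long the snapshot really takes: successes slower than 30 seconds would support H1, while no success at all would support H2. The following is a generic sketch of such a timed wait, not openQA test API; the name <code>wait_until</code> and its parameters are made up for illustration.</p>

```python
import time


def wait_until(check, timeout, interval=1.0, clock=time.monotonic, sleep=time.sleep):
    """Poll `check` until it returns True or `timeout` seconds pass.

    Returns (succeeded, elapsed_seconds) so the caller can log how long
    the operation took instead of only seeing pass or fail.
    """
    start = clock()
    while True:
        if check():
            return True, clock() - start
        if clock() - start >= timeout:
            return False, clock() - start
        sleep(interval)
```

<p>The clock and sleep functions are parameters only so that the helper can be exercised without real waiting.</p>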
<a name="suggestion"></a>
<h2 >suggestion<a href="#suggestion" class="wiki-anchor">¶</a></h2>
openQA Tests - action #12110 (Resolved): gnome_control_center: short timeout on details caused te...https://progress.opensuse.org/issues/121102016-05-24T14:01:52Zsebchladsebastian.chlad@suse.com
<a name="observation"></a>
<h2 >observation<a href="#observation" class="wiki-anchor">¶</a></h2>
<p>gnome_control_center fails because no details are displayed as expected, e.g. see <a href="https://openqa.suse.de/tests/396384" class="external">https://openqa.suse.de/tests/396384</a></p>
<a name="problem"></a>
<h2 >problem<a href="#problem" class="wiki-anchor">¶</a></h2>
<p>It seems that in staging we see an issue with the short timeout set to 3, so the needle does not match the expected result.</p>
<a name="suggestions"></a>
<h2 >suggestions<a href="#suggestions" class="wiki-anchor">¶</a></h2>
<ol>
<li><p>Perhaps increase the timeout; however, that is a bad solution.</p></li>
<li><p>Should we monitor system resources to get a better picture instead of just increasing timeouts?</p></li>
</ol>
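<p>For the second suggestion, a small sketch of what could be logged next to a timeout failure, assuming the host's load average is an acceptable first proxy for "system resources". The function name <code>describe_load</code> and the returned keys are hypothetical; <code>os.getloadavg()</code> is only available on Unix-like hosts.</p>

```python
import os


def describe_load(cpu_count=None, loadavg=None):
    """Summarize host load so it can be logged when a wait times out.

    Both arguments default to values read from the running host; they
    can be passed in explicitly, e.g. for samples collected elsewhere.
    """
    cpus = cpu_count or os.cpu_count() or 1
    one_minute = (loadavg or os.getloadavg())[0]
    return {
        "load_1min": one_minute,
        "cpus": cpus,
        # Values well above 1.0 suggest the host was overcommitted.
        "load_per_cpu": one_minute / cpus,
    }
```

<p>Comparing this value between passing and failing runs would give some evidence on whether the short timeout only fails on busy workers.</p>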