openSUSE Project Management Tool: Issues (https://progress.opensuse.org/, 2024-02-07T13:33:23Z, Redmine)
openQA Tests - action #155083 (In Progress): [qe-core] schedule openssl_nodejs on all SLE versions (https://progress.opensuse.org/issues/155083, 2024-02-07T13:33:23Z, mgrifalconi)
<p>Right now it is not running on SLE15 SP4 and SP5</p>
<p>Test runs fail on the same tests for both node versions:<br>
15 SP4 <a href="https://openqa.suse.de/tests/13450430#step/openssl_nodejs/9" class="external">https://openqa.suse.de/tests/13450430#step/openssl_nodejs/9</a> (full log: <a href="https://openqa.suse.de/tests/13450430/logfile?filename=serial_terminal.txt" class="external">https://openqa.suse.de/tests/13450430/logfile?filename=serial_terminal.txt</a>)<br>
15 SP5 <a href="https://openqa.suse.de/tests/13450434#step/openssl_nodejs/9" class="external">https://openqa.suse.de/tests/13450434#step/openssl_nodejs/9</a> (full log: <a href="https://openqa.suse.de/tests/13450434/logfile?filename=serial_terminal.txt" class="external">https://openqa.suse.de/tests/13450434/logfile?filename=serial_terminal.txt</a>)</p>
<pre><code>15 SP4
Some tests have failed:
Node v18.18.2-150400.9.15.1 Test test/parallel/test-tls-multi-key.js
Node v18.18.2-150400.9.15.1 Test test/parallel/test-tls-client-renegotiation-13.js
Node v18.18.2-150400.9.15.1 Test test/parallel/test-tls-client-getephemeralkeyinfo.js
Node v18.18.2-150400.9.15.1 Test test/parallel/test-tls-sni-option.js
Node v18.18.2-150400.9.15.1 Test test/parallel/test-tls-client-verify.js
Node v18.18.2-150400.9.15.1 Test test/parallel/test-tls-client-mindhsize.js
Node v18.18.2-150400.9.15.1 Test test/parallel/test-tls-dhe.js
Node v18.18.2-150400.9.15.1 Test test/parallel/test-tls-sni-server-client.js
Node v18.18.2-150400.9.15.1 Test test/parallel/test-tls-server-verify.js
Node v18.18.2-150400.9.15.1 Test test/parallel/test-tls-cert-regression.js
Node v18.18.2-150400.9.15.1 Test test/parallel/test-tls-peer-certificate-encoding.js
Node v18.18.2-150400.9.15.1 Test test/parallel/test-tls-client-auth.js
Node v18.18.2-150400.9.15.1 Test test/parallel/test-tls-multiple-cas-as-string.js
15 SP5
Some tests have failed:
Node v20.8.1-150500.11.3.1 Test test/parallel/test-tls-multi-key.js
Node v20.8.1-150500.11.3.1 Test test/parallel/test-tls-client-renegotiation-13.js
Node v20.8.1-150500.11.3.1 Test test/parallel/test-tls-client-getephemeralkeyinfo.js
Node v20.8.1-150500.11.3.1 Test test/parallel/test-tls-sni-option.js
Node v20.8.1-150500.11.3.1 Test test/parallel/test-tls-client-verify.js
Node v20.8.1-150500.11.3.1 Test test/parallel/test-tls-client-mindhsize.js
Node v20.8.1-150500.11.3.1 Test test/parallel/test-tls-dhe.js
Node v20.8.1-150500.11.3.1 Test test/parallel/test-tls-sni-server-client.js
Node v20.8.1-150500.11.3.1 Test test/parallel/test-tls-server-verify.js
Node v20.8.1-150500.11.3.1 Test test/parallel/test-tls-cert-regression.js
Node v20.8.1-150500.11.3.1 Test test/parallel/test-tls-peer-certificate-encoding.js
Node v20.8.1-150500.11.3.1 Test test/parallel/test-tls-client-auth.js
Node v20.8.1-150500.11.3.1 Test test/parallel/test-tls-multiple-cas-as-string.js
</code></pre>
openQA Tests - action #154039 (Blocked): [qe-core] [sporadic][timeout] test fails in mdadm (https://progress.opensuse.org/issues/154039, 2024-01-22T13:10:15Z, mgrifalconi)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>The failure seems sporadic but is becoming frequent today; it might be worth increasing a timeout at least.</p>
<p>openQA test in scenario sle-15-SP4-Server-DVD-Incidents-TERADATA-x86_64-mau-extratests2@64bit fails in<br>
<a href="https://openqa.suse.de/tests/13308075/modules/mdadm/steps/7" class="external">mdadm</a></p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<p>Testsuite maintained at <a href="https://gitlab.suse.de/qa-maintenance/qam-openqa-yml" class="external">https://gitlab.suse.de/qa-maintenance/qam-openqa-yml</a>. Run console tests against aggregated test repo</p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/13307913" class="external">:31365:tomcat</a></p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/13307450" class="external">:32173:MozillaFirefox</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=Server-DVD-Incidents-TERADATA&machine=64bit&test=mau-extratests2&version=15-SP4" class="external">latest</a></p>
QA - action #153886 (New): SMELT incidents and Release Requests IDs are not unique and may interf... (https://progress.opensuse.org/issues/153886, 2024-01-18T13:58:20Z, mgrifalconi)
<p>I would like to raise 2 issues (to be verified) about the current approval process of maintenance updates:</p>
<ul>
<li><p>SMELT incident IDs can be reused for multiple Release Requests (RRs), yet the process currently uses the incident ID to tag tests that are crucial for RR approval. The bot/dashboard combination works around this by deleting some openQA results from the dashboard DB (see <a href="https://github.com/openSUSE/qem-dashboard/pull/78/files" class="external">https://github.com/openSUSE/qem-dashboard/pull/78/files</a> ), but this makes the bot approval logic complex and splits it between bot and dashboard code. It would be nice to switch from the SMELT ID to the IBS RR ID (or to add the RR ID on top) to resolve the issue at its origin.</p></li>
<li><p>RRs are not unique either, but in a different way: an RR can be revoked and then reopened (maybe with different content to test? To be checked). I know the bot recognizes (some) changes and re-triggers incident tests, but what about aggregates? Is there a chance they could be wrongly considered for the approval decision? Incident channels could also be changed while the incident/RR combination is being tested, causing some confusion on the bot side. If this proves to be a real issue, one idea would be to make sure that test results related to an older 'version' of an RR are not considered and that the bot waits for new ones. Maybe also add a timestamp of the latest SMELT incident/IBS RR change to the SMELT ID/RR combination?</p></li>
</ul>
<p>I expect the valid argument that these are rare corner cases, but we should also consider that we are here to catch corner cases. Complex updates that get modified while being tested should get more attention, not less, IMO.</p>
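<p>The idea from the second point (ignore results that belong to an older 'version' of an RR) can be sketched as follows. This is a minimal illustration, not the bot's or the dashboard's actual code: the names <code>UpdateKey</code>, <code>JobResult</code> and <code>can_approve</code> are hypothetical, and it assumes each test result can be tagged with the incident ID, the RR ID and the timestamp of the latest change.</p>

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class UpdateKey:
    """Hypothetical identity of one 'version' of an update under test."""
    incident_id: int        # SMELT incident ID (may be reused across RRs)
    request_id: int         # IBS release request ID
    last_changed: datetime  # latest change of the incident or the RR


@dataclass(frozen=True)
class JobResult:
    """Hypothetical record of one openQA job result tagged with its update version."""
    job_id: int
    key: UpdateKey
    passed: bool


def results_for_current_version(results, current):
    """Keep only results recorded for exactly the current incident/RR/timestamp."""
    return [r for r in results if r.key == current]


def can_approve(results, current):
    """Approve only if at least one current result exists and none of them failed."""
    relevant = results_for_current_version(results, current)
    return bool(relevant) and all(r.passed for r in relevant)
```

<p>With such a key, a failed job recorded for an earlier RR of the same incident no longer influences the decision, and an update with no results for its current version is not approved until new results arrive.</p>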
openQA Infrastructure - action #151570 (New): [qe-core] Cleanup openQA jobgroups (https://progress.opensuse.org/issues/151570, 2023-11-28T07:41:52Z, mgrifalconi)
<p>On both openqa.suse.de and openqa.opensuse.org we have many old job groups that are unused but were never deleted, likely due to the missing option in the web-ui (see <a href="https://progress.opensuse.org/issues/57170" class="external">https://progress.opensuse.org/issues/57170</a>).</p>
<p>I understand that leaving them in place may have little performance impact, but if there is no business reason to keep them I think it is time to do some cleanup.</p>
<p>AC1: Find out whether old job groups can be deleted or whether there is some reason to keep them<br>
AC2: Check how difficult it would be to implement the delete button in the UI<br>
AC3: Clean up what is possible</p>
QA - action #123286 (Resolved): Bot and dashboard reference to wrong data and block update approv... (https://progress.opensuse.org/issues/123286, 2023-01-18T09:19:08Z, mgrifalconi)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>Hello, there is some inconsistency with the dashboard data about 27130:dragonbox</p>
<p>The link of the red SLE 15 SP4 box on the blocked page points to <a href="https://openqa.suse.de/tests/overview?build=%3A27130%3Afixmath&distri=sle&groupid=439" class="external">https://openqa.suse.de/tests/overview?build=%3A27130%3Afixmath&distri=sle&groupid=439</a><br>
which shows no failures</p>
<p>The link inside the update request page <a href="http://dashboard.qam.suse.de/incident/27130" class="external">http://dashboard.qam.suse.de/incident/27130</a> points to a different set of results, <a href="https://openqa.suse.de/tests/overview?build=%3A27130%3Alibmwaw" class="external">https://openqa.suse.de/tests/overview?build=%3A27130%3Alibmwaw</a>, which this time shows a failure</p>
<p>Bot approval job log:</p>
<pre><code> 2023-01-17 08:05:34 INFO Found failed, not-ignored job 10166069 for incident 27130
</code></pre>
<p>Interestingly enough, I restarted the month-old job and now even that is green.<br>
But still, the bot does not like it and keeps the 'box' red.<br>
<a href="https://openqa.suse.de/tests/10166069" class="external">https://openqa.suse.de/tests/10166069</a><br>
even if its clone is green: <a href="https://openqa.suse.de/tests/10331221" class="external">https://openqa.suse.de/tests/10331221</a></p>
<a name="Problem"></a>
<h2 >Problem<a href="#Problem" class="wiki-anchor">¶</a></h2>
<p>The problem here seems to be that incident 27130 was modified multiple times and references multiple packages, as visible in <a href="https://smelt.suse.de/incident/27130/" class="external">https://smelt.suse.de/incident/27130/</a></p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> The dashboard page and all links to openQA tests from dashboard reference the same consistent package(s) or no package at all, i.e. no "dragonbox" in dashboard but then pointing to "libmwaw" in openQA</li>
</ul>
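<p>A consistency check along the lines of AC1 could look like the sketch below. It assumes that incident builds follow the <code>:&lt;incident&gt;:&lt;package&gt;</code> pattern seen in the openQA links above (e.g. <code>:27130:fixmath</code>); the function names and the way the dashboard's package list is passed in are hypothetical.</p>

```python
def parse_incident_build(build):
    """Split an openQA incident build such as ':27130:fixmath' into (27130, 'fixmath')."""
    parts = build.split(":")
    if len(parts) != 3 or parts[0] != "" or not parts[1].isdigit() or not parts[2]:
        raise ValueError(f"unexpected build format: {build!r}")
    return int(parts[1]), parts[2]


def inconsistent_builds(incident_id, dashboard_packages, builds):
    """Return builds whose incident ID or package does not match the dashboard's view."""
    mismatches = []
    for build in builds:
        build_incident, package = parse_incident_build(build)
        if build_incident != incident_id or package not in dashboard_packages:
            mismatches.append(build)
    return mismatches
```

<p>For the case in this ticket, a dashboard entry that only lists "dragonbox" would flag both <code>:27130:fixmath</code> and <code>:27130:libmwaw</code> as inconsistent.</p>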
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Investigate if this is maybe <em>just a display issue</em> and in that case fix it</li>
<li>Ask mgrifalconi to update the ticket according to our ticket templates to help us understand what he really expects because we are not clear about that</li>
<li>Reconsider how we test maintenance requests before a release request is created while still supporting the "shift left" endeavour</li>
<li>Check if the data in the dashboard database regarding packages is consistent with SMELT (to rule out qem-bot involvement)</li>
</ul>
openQA Tests - action #121153 (New): [qe-core] remove qe_run tests (https://progress.opensuse.org/issues/121153, 2022-11-30T10:54:08Z, mgrifalconi)
<p>While the deprecation of qe_run (<a href="https://progress.opensuse.org/issues/99657" class="external">https://progress.opensuse.org/issues/99657</a>) did not progress, it turns out we still have some tests in our bucket using it:</p>
<p><a href="https://openqa.suse.de/tests/10055095#" class="external">https://openqa.suse.de/tests/10055095#</a><br>
<a href="https://openqa.suse.de/tests/10055092/modules/userspace_nfs/steps/1/src" class="external">https://openqa.suse.de/tests/10055092/modules/userspace_nfs/steps/1/src</a></p>
<p>We should either remove the tests or convert them to normal Perl modules.</p>
<p>Then we should either:</p>
<ul>
<li>decommission qe_run and remove the framework entirely</li>
<li>find a squad that is still willing to use it and hand over responsibility for it</li>
</ul>
QA - action #117619 (Resolved): Bot approved update request with failing tests size:M (https://progress.opensuse.org/issues/117619, 2022-10-06T09:26:57Z, mgrifalconi)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>Incident <a href="https://smelt.suse.de/incident/25982/" class="external">https://smelt.suse.de/incident/25982/</a><br>
Request that was approved by sle-qam-openqa: <a href="https://build.suse.de/request/show/280720" class="external">https://build.suse.de/request/show/280720</a><br>
Bot job: <a href="https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/1166058#L279" class="external">https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/1166058#L279</a><br>
<code>INFO: SUSE:Maintenance:25982:280720</code></p>
<p>Failing test: <a href="https://openqa.suse.de/tests/9642631#settings" class="external">https://openqa.suse.de/tests/9642631#settings</a><br>
Dashboard: <a href="https://dashboard.qam.suse.de/incident/25982" class="external">https://dashboard.qam.suse.de/incident/25982</a></p>
<p>Context on slack: <a href="https://suse.slack.com/archives/C02CANHLANP/p1665043765153419" class="external">https://suse.slack.com/archives/C02CANHLANP/p1665043765153419</a></p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1</strong>: We know the reason why the bot approved the request and didn't see the test failure</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Run <code>./qem-bot/bot-ng.py -c /etc/openqabot --token [MASKED] inc-approve --dry</code> (see <a href="https://github.com/openSUSE/qem-bot/#usage" class="external">https://github.com/openSUSE/qem-bot/#usage</a> for more info)</li>
<li>Look into the dashboard logs on qam2.suse.de <code>journalctl -u dashboard.service</code></li>
<li>Note: The journal only goes back 3 days currently (Oct 3), so for the incident in question it's too late.
Consider increasing the journal size as a first step</li>
<li>Consider adding code that only runs the bot on a single incident</li>
</ul>
QA - action #113345 (Resolved): qem-bot does not ignore Development/Leap job groups as it should ... (https://progress.opensuse.org/issues/113345, 2022-07-07T08:37:58Z, mgrifalconi)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>Bot does not ignore Development/Leap job groups and could block update approvals due to broken development tests.<br>
See <a href="http://dashboard.qam.suse.de/blocked" class="external">http://dashboard.qam.suse.de/blocked</a> and look for "leap"</p>
<a name="Problem"></a>
<h2 >Problem<a href="#Problem" class="wiki-anchor">¶</a></h2>
<p>Likely regression due to <a href="https://github.com/openSUSE/qem-bot/commit/d4d33720d183ba30b63529577e2bbad700b238cd" class="external">https://github.com/openSUSE/qem-bot/commit/d4d33720d183ba30b63529577e2bbad700b238cd</a><br>
or <a href="https://github.com/openSUSE/qem-bot/commit/c869a5cb7a56cdb5c3ba33f64e086f34c64ce5b9#diff-dbb33d499407c366ab760f232[…]e02dad0dd506c87b478b8007cf496ad" class="external">https://github.com/openSUSE/qem-bot/commit/c869a5cb7a56cdb5c3ba33f64e086f34c64ce5b9#diff-dbb33d499407c366ab760f232[…]e02dad0dd506c87b478b8007cf496ad</a></p>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<p>Look hard at the above commits and try it out with "--dry-run" and fix it. We at least know that jobs are still ignored, e.g. <a href="https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/1044528#L2663" class="external">https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/1044528#L2663</a> shows "INFO: Ignoring job '9078831' in development group 'Maintenance: Leap 15.4 Incidents'" so there is at least <em>some</em> ignoring going on</p>
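<p>The expected behaviour can be illustrated with a small sketch. It is not qem-bot's implementation: it assumes development groups can be recognised by name alone, and the pattern, the function names and the job dictionary keys are hypothetical.</p>

```python
import re

# Assumption: development groups are recognisable by their name; pattern is illustrative.
DEVELOPMENT_GROUP_PATTERN = re.compile(r"\b(Development|Leap)\b", re.IGNORECASE)


def is_development_group(group_name):
    """Return True if the job group name marks a development (non-blocking) group."""
    return DEVELOPMENT_GROUP_PATTERN.search(group_name) is not None


def blocking_failures(jobs):
    """Return failed jobs that should block approval, skipping development groups.

    Each job is a dict with the hypothetical keys 'id', 'group' and 'result'.
    """
    return [
        job for job in jobs
        if job["result"] == "failed" and not is_development_group(job["group"])
    ]
```

<p>With this filter, job 9078831 in 'Maintenance: Leap 15.4 Incidents' from the log line above would not count as a blocking failure.</p>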
QA - action #107671 (Resolved): No aggregate maintenance runs scheduled today on osd size:M (https://progress.opensuse.org/issues/107671, 2022-02-28T07:13:59Z, mgrifalconi)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>Seems a different issue than <a class="issue tracker-4 status-3 priority-6 priority-high2 closed child" title="action: No aggregate maintenance runs scheduled today on osd - dashboard.qem.suse.de down size:S (Resolved)" href="https://progress.opensuse.org/issues/106179">#106179</a> since the dashboard is accessible this time.</p>
<p>Link to list aggregate runs of the day:</p>
<p><a href="https://openqa.suse.de/tests/overview?arch=&flavor=&machine=&test=&modules=&module_re=&groupid=366&groupid=308&groupid=232&groupid=165&groupid=280&groupid=218&groupid=108&groupid=54&groupid=405&groupid=412&groupid=411&groupid=369&groupid=352&groupid=353&groupid=357&groupid=355&groupid=354&groupid=358&groupid=370&groupid=348&groupid=349&groupid=351&groupid=356&groupid=375&groupid=376&groupid=397&groupid=414&build=20220228-1#" class="external">https://openqa.suse.de/tests/overview?arch=&flavor=&machine=&test=&modules=&module_re=&groupid=366&groupid=308&groupid=232&groupid=165&groupid=280&groupid=218&groupid=108&groupid=54&groupid=405&groupid=412&groupid=411&groupid=369&groupid=352&groupid=353&groupid=357&groupid=355&groupid=354&groupid=358&groupid=370&groupid=348&groupid=349&groupid=351&groupid=356&groupid=375&groupid=376&groupid=397&groupid=414&build=20220228-1#</a><br>
(This was showing an empty list at that point)</p>
<p>Impact: update approval blocked</p>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>caused by downtime of <a href="http://download.suse.de" class="external">http://download.suse.de</a></li>
<li>read suggestions from <a class="issue tracker-4 status-3 priority-4 priority-default closed" title="action: openQABot pipeline failed: &quot;ERROR:root:Something bad happended during reading MR data from SMELT/... (Resolved)" href="https://progress.opensuse.org/issues/105603">#105603</a></li>
<li>Some GitLab CI steps are failing, but we allow them to fail so that other steps can continue. For example, in <a href="https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/886067" class="external">https://gitlab.suse.de/qa-maintenance/bot-ng/-/jobs/886067</a> "sync smelt" fails and is allowed to fail so that "sync incidents" can continue, but we also do not receive an alert about it and there is not sufficient retrying. We could split the steps into separate pipelines, make each step fatal, and add a configurable number of retries and interval between retries for each step in <a href="https://gitlab.suse.de/qa-maintenance/bot-ng/-/blob/master/.gitlab-ci.yml" class="external">https://gitlab.suse.de/qa-maintenance/bot-ng/-/blob/master/.gitlab-ci.yml</a>, e.g. for "sync smelt" retrying long enough to cover the weekly SUSE IT maintenance window, and less for other critical steps</li>
<li>For retrying we do not even need to change qem-bot, we could use just a wrapper in the gitlab CI job itself, e.g. <a href="https://github.com/okurz/leaky_bucket_error_count" class="external">https://github.com/okurz/leaky_bucket_error_count</a></li>
<li>Also look into gitlab CI options to either abort a previous pipeline if a new one is triggered or not start new ones as long as old ones are still running</li>
</ul>
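<p>The retry suggestion above (fatal steps with a configurable number of retries and interval per step) could be prototyped with a wrapper like this one. It is only a sketch: <code>run_with_retries</code> and <code>STEP_RETRY_POLICY</code> are hypothetical names, and the retry counts and intervals are placeholders, not values derived from the actual maintenance window.</p>

```python
import time


def run_with_retries(step, retries, interval, sleep=time.sleep):
    """Call step() until it succeeds, retrying at most `retries` times.

    The last exception is re-raised, so the step stays fatal once retries are exhausted.
    """
    attempt = 0
    while True:
        try:
            return step()
        except Exception:
            if attempt >= retries:
                raise
            attempt += 1
            sleep(interval)


# Hypothetical per-step settings: (retries, seconds between retries).
STEP_RETRY_POLICY = {
    "sync smelt": (12, 600),    # placeholder: retry for a long time
    "sync incidents": (3, 60),  # placeholder: fewer, shorter retries
}
```

<p>The <code>sleep</code> parameter exists only so the waiting can be replaced in tests; in a CI job the default <code>time.sleep</code> would be used.</p>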
QA - action #106179 (Resolved): No aggregate maintenance runs scheduled today on osd - dashboard.... (https://progress.opensuse.org/issues/106179, 2022-02-08T08:17:53Z, mgrifalconi)
<p>No aggregate runs scheduled today - dashboard.qem.suse.de down</p>
<p>Link to list aggregate runs of the day: <a href="https://openqa.suse.de/tests/overview?result=none&result=passed&result=softfailed&result=failed&result=incomplete&result=skipped&result=obsoleted&result=parallel_failed&result=parallel_restarted&result=user_cancelled&result=user_restarted&result=timeout_exceeded&state=scheduled&state=assigned&state=setup&state=running&state=uploading&state=done&state=cancelled&arch=&flavor=&machine=&test=&modules=&module_re=&groupid=366&groupid=308&groupid=232&groupid=165&groupid=280&groupid=218&groupid=108&groupid=54&groupid=405&groupid=412&groupid=411&groupid=369&groupid=352&groupid=353&groupid=357&groupid=355&groupid=354&groupid=358&groupid=370&groupid=348&groupid=349&groupid=351&groupid=356&groupid=375&groupid=376&groupid=397&groupid=414&build=20220208-1#" class="external">https://openqa.suse.de/tests/overview?result=none&result=passed&result=softfailed&result=failed&result=incomplete&result=skipped&result=obsoleted&result=parallel_failed&result=parallel_restarted&result=user_cancelled&result=user_restarted&result=timeout_exceeded&state=scheduled&state=assigned&state=setup&state=running&state=uploading&state=done&state=cancelled&arch=&flavor=&machine=&test=&modules=&module_re=&groupid=366&groupid=308&groupid=232&groupid=165&groupid=280&groupid=218&groupid=108&groupid=54&groupid=405&groupid=412&groupid=411&groupid=369&groupid=352&groupid=353&groupid=357&groupid=355&groupid=354&groupid=358&groupid=370&groupid=348&groupid=349&groupid=351&groupid=356&groupid=375&groupid=376&groupid=397&groupid=414&build=20220208-1#</a></p>
<p>This is blocking all update test/approval</p>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<p>Create an epic with feature requests based on this</p>
openQA Tests - action #101882 (New): [qe-core] aarch64 workers: test fails in patch_and_reboot (https://progress.opensuse.org/issues/101882, 2021-11-03T10:17:47Z, mgrifalconi)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>The test expects to find the bootloader needle ( <a href="https://openqa.suse.de/tests/7588098#step/patch_and_reboot/52" class="external">https://openqa.suse.de/tests/7588098#step/patch_and_reboot/52</a> ) but somehow it is not picked up by openQA and the test fails.<br>
I suppose we could check for the login needle at the same time and skip the bootloader step if the bootloader is not found.</p>
<p>openQA test in scenario sle-15-Server-DVD-Updates-aarch64-qam-gnome@aarch64-virtio fails in<br>
<a href="https://openqa.suse.de/tests/7596208/modules/patch_and_reboot/steps/61" class="external">patch_and_reboot</a></p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<p>Testsuite maintained at <a href="https://gitlab.suse.de/qa-maintenance/qam-openqa-yml" class="external">https://gitlab.suse.de/qa-maintenance/qam-openqa-yml</a>.</p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/7594031" class="external">20211103-1</a></p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/7588098" class="external">20211102-1</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=aarch64&distri=sle&flavor=Server-DVD-Updates&machine=aarch64-virtio&test=qam-gnome&version=15" class="external">latest</a></p>
QA - action #97274 (New): qam dashboard improvement ideas (https://progress.opensuse.org/issues/97274, 2021-08-20T06:48:13Z, mgrifalconi)
<p>Hello, doing openQA review I always used smelt comments to find out which test run needs to be checked to approve an update.</p>
<p>Ideally approval is automated, but when a single test fails (out of dozens/hundreds) it still needs some manual work to decide if such failures can be ignored for that particular test.</p>
<p>I won't mention cross-checking aggregate runs against previous days (see <a href="https://progress.opensuse.org/issues/97118" class="external">https://progress.opensuse.org/issues/97118</a>).</p>
<p>These are the current issues I found while using the dashboard for my week of review:</p>
<ul>
<li><strong>Sorting order</strong>: On smelt I like to sort by priority or due date to get an idea of the situation. Neither is available here. Incidents are sorted by incident ID, which I do not care about</li>
<li><strong>Missing Release Request ID</strong>: If I am given only an RR ID, I must go to smelt to find the incident and then back to the dashboard.</li>
<li><strong>Result History</strong>: I can only see the latest results, so I find it more painful to cross-check different days. I would be happier to see such a thing automated (see the other poo linked earlier), but in the meantime it is just more painful than before. I also only get a good overview of the situation near the end of the day, because in the morning all runs are still ongoing and I cannot review based on yesterday's results.</li>
<li><strong>Development Job Groups</strong>: such job groups are not ignored, and some test groups would also fit this category. This creates some confusion and wastes time.</li>
</ul>
<p>Extra thought: <br>
The dashboard and smelt might be duplicating some work. Why not have a link in smelt to the list of related tests on the dashboard? I would use the indexing/priority/information on smelt and then go to the dashboard to check tests, possibly with result history.<br>
What I am basically asking for is the same features as smelt comments, whichever implementation is used. </p>
QA - coordination #97121 (New): [epic] enable qem-bot comments on IBS (was: enable qa-maintenance... (https://progress.opensuse.org/issues/97121, 2021-08-18T12:25:23Z, mgrifalconi)
<a name="Motivation"></a>
<h2 >Motivation<a href="#Motivation" class="wiki-anchor">¶</a></h2>
<p>As a maintenance release coordinator I want to be notified if any failing openQA test is blocking automatic approval of a SLE maintenance update</p>
<a name="Acceptance-criteria"></a>
<h2 >Acceptance criteria<a href="#Acceptance-criteria" class="wiki-anchor">¶</a></h2>
<ul>
<li><strong>AC1:</strong> As soon as testing for an individual release request is finished and if there is at least one failing openQA test a comment is written in IBS informing about the failing openQA tests</li>
<li><strong>AC2:</strong> If there is no failing openQA test related to an individual release request no comment is written</li>
<li><strong>AC3:</strong> Only a single comment is ever written on a release request</li>
</ul>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Within qem-bot we already have the feature to send out comments, but so far it does not seem to look at the state of openQA jobs. It therefore writes a comment for all release requests whenever triggered, which means it even informs about all currently running jobs. Maybe the next best task is to actually look at the state and only inform about failing openQA tests</li>
<li>Think about moving the trigger point of sending a comment into the approval step or something so when no automatic approval is done instead a comment is written</li>
<li>Ensure that only a single comment is written, not multiple whenever qem-bot is called</li>
</ul>
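<p>The three acceptance criteria combine into a small decision rule, sketched below. This is an illustration only: <code>should_write_comment</code> is a hypothetical name, and representing an unfinished job as <code>None</code> is an assumption made for the example, not how qem-bot models job state.</p>

```python
def should_write_comment(job_results, already_commented):
    """Decide whether to comment on a release request.

    job_results: openQA result strings for the request, e.g. 'passed' or 'failed';
                 None marks a job that has not finished yet (hypothetical convention).
    already_commented: True if a comment was written on this request before.
    """
    if already_commented:
        # AC3: only a single comment is ever written on a release request.
        return False
    if any(result is None for result in job_results):
        # AC1: wait until testing for the release request has finished.
        return False
    # AC1/AC2: comment only if at least one openQA test failed.
    return any(result == "failed" for result in job_results)
```

<p>Keeping the rule in one pure function would also make it easy to call from the approval step, as the second suggestion proposes.</p>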
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Original motivation: <br>
While the qem dashboard is agreed to not yet be productive, smelt comments are very useful while doing manual approval of updates.</p>
<p>I would like to have this resource back, at least until <a href="https://progress.opensuse.org/issues/97118" class="external">https://progress.opensuse.org/issues/97118</a> <a class="issue tracker-4 status-4 priority-4 priority-default child" title="action: enhance bot automatic approval: check multiple days (Feedback)" href="https://progress.opensuse.org/issues/97118">#97118</a> is sorted out.</p>
openQA Tests - action #97007 (Blocked): [qa-core] test fails in dracut_enhanced (https://progress.opensuse.org/issues/97007, 2021-08-17T07:31:04Z, mgrifalconi)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>openQA test in scenario sle-15-SP3-Server-DVD-Updates-x86_64-qam-dracut-systemd@64bit fails in<br>
<a href="https://openqa.suse.de/tests/6867197/modules/dracut_enhanced/steps/34" class="external">dracut_enhanced</a></p>
<a name="Test-suite-description"></a>
<h2 >Test suite description<a href="#Test-suite-description" class="wiki-anchor">¶</a></h2>
<p>Maintainer: <a href="mailto:emiura@suse.com">emiura@suse.com</a></p>
<a name="Reproducible"></a>
<h2 >Reproducible<a href="#Reproducible" class="wiki-anchor">¶</a></h2>
<p>Fails since (at least) Build <a href="https://openqa.suse.de/tests/6838543" class="external">20210813-1</a></p>
<a name="Expected-result"></a>
<h2 >Expected result<a href="#Expected-result" class="wiki-anchor">¶</a></h2>
<p>Last good: <a href="https://openqa.suse.de/tests/6652918" class="external">20210805-1</a> (or more recent)</p>
<a name="Further-details"></a>
<h2 >Further details<a href="#Further-details" class="wiki-anchor">¶</a></h2>
<p>Always latest result in this scenario: <a href="https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=Server-DVD-Updates&machine=64bit&test=qam-dracut-systemd&version=15-SP3" class="external">latest</a></p>
openQA Tests - action #95362 (Workable): [qe-core] make zypper_call use serial terminal regardles... (https://progress.opensuse.org/issues/95362, 2021-07-12T06:46:59Z, mgrifalconi)
<a name="Observation"></a>
<h2 >Observation<a href="#Observation" class="wiki-anchor">¶</a></h2>
<p>Anything that calls things like zypper_call on aarch64, or on any architecture for that matter, can be shifted to the serial terminal.</p>
<a name="Suggestions"></a>
<h2 >Suggestions<a href="#Suggestions" class="wiki-anchor">¶</a></h2>
<ul>
<li>Modify zypper_call to always use the serial terminal, then switch back to the console it was using originally (testapi has <code>current_console</code> to get the current console :))</li>
<li>Add a parameter to support backwards compatibility by disabling this feature when needed (such as no_console_switch=1); assume that if the parameter is not set and VIRTIO_CONSOLE=1 is set, the switch is enabled by default</li>
<li>Advertise change in proper channels via RFC in os-autoinst-distri-opensuse (openqa, qa-sle ML, link in #testing in RC/Slack)</li>
</ul>